Given a set of data, biclustering aims to partition simultaneously its samples and the features used to represent them, grouping both into biclusters. A consistent biclustering makes it possible to recover the correct classification of the samples from the known classification of the features, and vice versa, which makes such biclusterings very useful for supervised classification. The problem of finding a consistent biclustering can be seen as a feature selection problem: features that are not relevant for classification are removed from the data set, while the number of retained features is maximized in order to preserve information. This feature selection problem can be formulated as a linear fractional 0-1 optimization problem. We propose a reformulation of this problem as a bilevel optimization problem, and we present a heuristic algorithm that solves the reformulated problem efficiently. Computational experiments show that the presented algorithm finds better solutions than those obtained with previously proposed heuristic algorithms.
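To make the notion of consistency concrete, the following minimal sketch checks whether a known sample classification can be recovered through the features. It assumes the usual centroid-based definition, which the abstract itself does not spell out: each feature is assigned to the class of samples in which its mean value is highest, and each sample is then reassigned to the class of features in which its mean value is highest. The names `class_means`, `argmax_classes` and `is_consistent` are hypothetical, ties are broken by taking the first maximal class (a simplification of the strict inequalities normally required), and the sketch covers neither the feature selection step nor the proposed bilevel heuristic.

```python
def class_means(rows, labels, k):
    """For each row, return its mean value over each of the k column classes."""
    means = []
    for row in rows:
        sums = [0.0] * k
        counts = [0] * k
        for value, label in zip(row, labels):
            sums[label] += value
            counts[label] += 1
        # An empty class gets -inf so that it can never be selected (assumption).
        means.append([s / c if c else float("-inf") for s, c in zip(sums, counts)])
    return means


def argmax_classes(means):
    """Assign each row to the class with the highest mean (first one on ties)."""
    return [max(range(len(m)), key=m.__getitem__) for m in means]


def is_consistent(A, sample_labels, k):
    """Check consistency of the biclustering induced by sample_labels.

    A is a list of rows, one per feature, with one column per sample.
    Returns (consistent, feature_labels).
    """
    # Step 1: classify the features from the known sample classes.
    feature_labels = argmax_classes(class_means(A, sample_labels, k))
    # Step 2: classify the samples back from the feature classes.
    A_T = [list(column) for column in zip(*A)]
    recovered = argmax_classes(class_means(A_T, feature_labels, k))
    # Consistent if the round trip reproduces the original sample classes.
    return recovered == list(sample_labels), feature_labels


# Hypothetical toy data: 4 features x 4 samples, two classes.
A_good = [[5, 4, 1, 0],
          [4, 5, 0, 1],
          [1, 0, 5, 4],
          [0, 1, 4, 5]]
# Hypothetical toy data in which the first sample is misclassified on the way back.
A_bad = [[1, 9, 0, 0],
         [2, 0, 5, 5]]

good_result = is_consistent(A_good, [0, 0, 1, 1], 2)
bad_result = is_consistent(A_bad, [0, 0, 1, 1], 2)
```

On this toy data, `good_result` reports a consistent biclustering, while `bad_result` does not: in `A_bad` the first sample has a higher mean on the feature assigned to the second class. In the feature selection setting described above, one would look for a largest subset of features for which this check succeeds.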