Feature Selection in Time-Series Motion Databases
F. Elain, A. Mucherino, L. Hoyet, R. Kulpa

The selection of relevant features in large databases is one of the most important and challenging problems in data mining. The samples forming a database are generally described by a predefined set of features, and real applications very often present the situation where not all of these features can be used for classification. This situation is typical when the database describes a phenomenon whose characteristics are not well known. In this context, extracting the relevant features can also provide additional information about the studied phenomenon. We tackle the feature selection problem from an optimization point of view, by reducing it to the problem of finding a maximal consistent "clustering" that groups together the samples and the features of the database. In this work, we extend this approach to dynamical databases, where each feature is not represented by a single real value but is instead given as a sequence of a predefined number of real values. Our main contribution is an alternative representation of the database that fits a tridimensional matrix with no missing entries, from which a consistent triclustering can be obtained.
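
To make the target data layout concrete, the following minimal sketch arranges a small dynamical database as a tridimensional array indexed by feature, sample, and time step, and rejects any entry that is missing or has the wrong length. It is only an illustration of the layout: the axis order, the helper name build_tensor, and the feature names knee_angle and hip_angle are assumptions made for this example, and the sketch does not reproduce the alternative representation or the triclustering procedure proposed in this work.

```python
from typing import Dict, List

Tensor = List[List[List[float]]]


def build_tensor(samples: List[Dict[str, List[float]]],
                 feature_names: List[str],
                 seq_len: int) -> Tensor:
    """Return tensor[f][s][t]: value of feature f for sample s at time step t.

    Hypothetical helper: the (feature, sample, time) axis order is an
    assumption of this sketch, not a convention taken from the paper.
    """
    tensor: Tensor = []
    for name in feature_names:
        rows: List[List[float]] = []
        for idx, sample in enumerate(samples):
            seq = sample.get(name)
            # Enforce the "no missing entries" requirement: every feature
            # must be a sequence of exactly seq_len real values.
            if seq is None or len(seq) != seq_len:
                raise ValueError(
                    f"sample {idx}: feature {name!r} is missing "
                    f"or does not have {seq_len} values"
                )
            rows.append([float(v) for v in seq])
        tensor.append(rows)
    return tensor


# Toy database: two samples, two features, three time steps each.
# The feature names and values are invented for illustration.
samples = [
    {"knee_angle": [0.1, 0.2, 0.3], "hip_angle": [1.0, 1.1, 1.2]},
    {"knee_angle": [0.4, 0.5, 0.6], "hip_angle": [0.9, 0.8, 0.7]},
]
tensor = build_tensor(samples, ["knee_angle", "hip_angle"], seq_len=3)

# (number of features, number of samples, sequence length)
shape = (len(tensor), len(tensor[0]), len(tensor[0][0]))
```

In this layout, fixing the time index gives an ordinary feature-by-sample matrix of the kind used in the static setting, which is why a complete tridimensional array is a natural starting point for a triclustering.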