-
ENSEEIHT, Toulouse, France. March 2010. Invited by Patrick R. Amestoy.
|
Consistent Biclustering for Supervised Classifications.
Biclustering aims at finding simultaneous partitions of the samples and of the features (used to represent the samples) of a given data set. It yields a classification of the samples while revealing the features that are relevant to that classification. Given a data set, a consistent biclustering makes it possible to construct a correct classification of the features from the classification of the samples, and vice versa. Consistent biclusterings can therefore be exploited for classification purposes: once a consistent biclustering is obtained from a training set, the resulting classification of its features can be used to classify samples that do not belong to the training set. The main issue is that training sets from real-life applications usually do not admit consistent biclusterings. In this seminar, I will present a feature selection problem for finding consistent biclusterings by removing a minimal subset of non-relevant features from a training set, and I will show a new heuristic algorithm for its solution. Experiments on sets of gene expression data will be presented.
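As a rough illustration of the consistency idea described above, the following sketch checks whether a given classification of the samples can be recovered from the classification of the features it induces. The centroid-based assignment rule, the tie-breaking by first index, and all function names are assumptions made for this example; it is not the heuristic algorithm presented in the seminar. It also assumes that every class contains at least one sample.

```python
def argmax(values):
    """Index of the largest value (the first one in case of ties)."""
    return max(range(len(values)), key=lambda k: values[k])


def classify_features(data, sample_classes, n_classes):
    """Assign each feature to the class where its average value is highest.

    data[i][j] is the value of feature i on sample j.
    Assumes every class has at least one sample.
    """
    feature_classes = []
    for row in data:
        means = []
        for c in range(n_classes):
            vals = [v for v, s in zip(row, sample_classes) if s == c]
            means.append(sum(vals) / len(vals))
        feature_classes.append(argmax(means))
    return feature_classes


def classify_samples(data, feature_classes, n_classes):
    """Assign each sample to the class whose features have the highest average on it."""
    n_samples = len(data[0])
    sample_classes = []
    for j in range(n_samples):
        means = []
        for c in range(n_classes):
            vals = [data[i][j] for i, f in enumerate(feature_classes) if f == c]
            # A class without features can never win the comparison.
            means.append(sum(vals) / len(vals) if vals else float("-inf"))
        sample_classes.append(argmax(means))
    return sample_classes


def is_consistent(data, sample_classes, n_classes):
    """True if the sample classes are recovered from the induced feature classes."""
    features = classify_features(data, sample_classes, n_classes)
    return classify_samples(data, features, n_classes) == list(sample_classes)
```

For example, with three features measured on four samples, `is_consistent([[5, 4, 1, 0], [6, 5, 0, 1], [0, 1, 7, 6]], [0, 0, 1, 1], 2)` holds, whereas the labelling `[0, 1, 0, 1]` on the same data does not pass the check.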
| |
-
IRISA, INRIA, Rennes, France. December 2009. Invited by Rumen Andonov.
See web page.
|
Recent Developments on the Molecular Distance Geometry Problem.
The Molecular Distance Geometry Problem (MDGP) is the problem of finding the conformation of a molecule starting from some known distances between pairs of its atoms. Such distances can be estimated through experimental techniques, such as Nuclear Magnetic Resonance. In its basic form, the MDGP is a constraint satisfaction problem, but it is usually reformulated as a global continuous optimization problem, where a penalty function is introduced and minimized in order to find solutions. When certain assumptions are satisfied, the MDGP can also be reformulated as a combinatorial optimization problem, referred to as the Discretizable MDGP (DMDGP). The combinatorial reformulation makes it possible to obtain better-quality solutions by employing an exact algorithm. During the seminar, I'll introduce the DMDGP and I'll discuss some recent related studies. In particular, since not all instances of the MDGP satisfy the assumptions of the DMDGP, I'll show some recent efforts devoted to methods for converting general instances of the MDGP into instances of the DMDGP. Finally, I'll present some strategies for managing instances affected by experimental errors and noise.
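To make the continuous reformulation concrete, here is a minimal sketch of one commonly used penalty function, the sum of squared differences between computed and known squared distances. The abstract does not specify which penalty function is used, so this particular choice, the function name, and the data layout are assumptions for illustration only.

```python
def mdgp_penalty(coords, distances):
    """Penalty for a candidate conformation.

    coords: list of (x, y, z) tuples, one per atom.
    distances: dict mapping an atom pair (i, j) to its known distance d_ij.
    Returns the sum over known pairs of (||x_i - x_j||^2 - d_ij^2)^2,
    which is zero exactly when all known distances are satisfied.
    """
    total = 0.0
    for (i, j), d in distances.items():
        squared = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
        total += (squared - d ** 2) ** 2
    return total
```

A conformation with penalty zero satisfies all the given distances; a global optimization method would search for coordinates that drive this value to zero.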
| |
-
Université Paris 11, Orsay, France. April 2009. Invited by Abdel Lisser.
LAMSADE, Université Paris Dauphine, Paris, France. March 2009. Invited by Dr. A. Ridha Mahjoub.
See web page.
|
A Combinatorial Reformulation for the Molecular Distance Geometry Problem.
Many real-life problems can be formulated as global optimization problems. Such problems are usually difficult to solve, because of the complexity of the objective function and constraints, and because of the large number of variables involved. The Molecular Distance Geometry Problem (MDGP) is the problem of finding the Cartesian coordinates of the atoms of a given molecule when some of the relative distances between pairs of atoms are known. This problem is usually formulated as a global continuous optimization problem. However, if some assumptions are satisfied, the continuous optimization problem can be reformulated as a combinatorial problem. This drastically reduces the search domain of the optimization problem and improves the quality of the obtained solutions. A Branch and Prune (BP) algorithm is employed for solving the combinatorial problem.
| |
-
DIIGA, Università Politecnica delle Marche, Ancona, Italy. February 2009. Invited by Fabrizio Marinelli.
See web page.
Download the slides.
|
Data Mining and Applications.
Data mining techniques are nowadays used in many real-life applications. Data mining is the extraction of previously unknown and potentially useful information from large sets of data. For example, the huge network of web pages and their interconnections is studied and analyzed by data mining techniques with the aim of improving the performance of web search engines. Another important example of a large data set to be analyzed is the human genome. The information extracted from these sets of data can be transformed into knowledge. The aim of this seminar is to briefly survey the most widely used techniques for solving data mining problems. Both clustering and classification techniques will be discussed, and the basic ideas behind some of these techniques will be presented. Among others, the k-means algorithm for clustering will be discussed, as well as techniques for classification, such as k-Nearest Neighbor and Support Vector Machines. Part of the seminar will be devoted to the recent biclustering techniques. Some real-life applications will be shown.
Main reference: A. Mucherino, P.J. Papajorgji, P.M. Pardalos, "Data Mining in Agriculture", Springer, 2009.
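As an illustration of the k-means algorithm mentioned in the abstract, the following is a minimal sketch of its standard two-step iteration (often called Lloyd's algorithm). It is not taken from the seminar or from the book; the function name, the choice of passing the initial centroids explicitly, and the handling of empty clusters are assumptions made for this example.

```python
def kmeans(points, centroids, max_iter=100):
    """Basic k-means on tuples of floats, starting from the given centroids.

    Returns the cluster label of each point and the final centroids.
    """
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its cluster.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(old)  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # no centroid moved: the partition is stable
        centroids = new_centroids
    return labels, centroids
```

Passing the initial centroids explicitly keeps the sketch deterministic; in practice they are often chosen at random, and the result depends on that choice.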
| |
-
Department of Industrial Engineering, University of Florida, USA. January 2007. Invited by Panos Pardalos.
|
A Global Optimization Problem Arising from a Geometric Model to Protein Folding.
The aim of this seminar is to give a brief overview of the research I performed during my PhD in Computational Biology. I will discuss a model for protein simulations which is mainly based on geometric properties of the protein conformations. This model leads to the formulation of a global optimization problem, which I'm currently solving by applying meta-heuristic searches, such as Simulated Annealing and Harmony Search. I will also present some experiences with a parallel computing environment.
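For readers unfamiliar with Simulated Annealing, here is a minimal generic sketch of the method on a one-dimensional objective. It does not implement the geometric protein model discussed in the seminar; the function name, the uniform neighbourhood move, the geometric cooling schedule, and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def simulated_annealing(objective, x0, step=0.5, t0=1.0, cooling=0.995,
                        n_iter=5000, seed=0):
    """Minimize a function of one real variable with simulated annealing.

    Returns the best point found and its objective value.
    """
    rng = random.Random(seed)
    x, fx = x0, objective(x0)
    best_x, best_f = x, fx
    t = t0
    for _ in range(n_iter):
        # Propose a random neighbour of the current point.
        cand = x + rng.uniform(-step, step)
        fc = objective(cand)
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature decreases.
        if fc <= fx or rng.random() < math.exp((fx - fc) / t):
            x, fx = cand, fc
            if fx < best_f:
                best_x, best_f = x, fx
        t *= cooling  # geometric cooling schedule
    return best_x, best_f
```

For instance, `simulated_annealing(lambda x: (x - 2.0) ** 2, x0=10.0)` should return a point close to 2. Accepting some worse moves at high temperature is what lets the search escape local minima on harder objectives.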
| |
-
IASI-CNR, Rome, Italy. September 2003. Invited by M. Sciandrone.
Download the slides (in Italian only, sorry).
|
Development of Parallel Software Procedures with MPI.
Parallel software for distributed-memory MIMD computers can be developed using the MPI system. During this seminar, the installation procedure of MPI on Beowulf clusters will be briefly described. Then, attention will be devoted to the main MPI functions. Simple parallel programs will be presented and discussed in detail. The compilation and execution of such parallel programs will be shown. Attention will also be given to groups of processes and virtual topologies of processes.
| |
|