Antonio Mucherino's Research
To maintain a webpage detailing ongoing research is very time-consuming. Every single discovery,
or even a single novel idea, makes its content ancient and a new revision is necessary.
Even scheduling updates every time a new article of mine appears requires considerable efforts,
which I finally prefer to devote to research.
In a smaller scale, I'm currently trying to maintain the section of my curriculum vitae
where I'm used to give some details about my current research. My curriculum vitae can be
download through this webpage.
In brief, my research interests mainly spread over the following three subjects:
Additional information can be found in my
Here below, some short research notes that I have written, over the last years, for this webpage.
They do not cover my whole research work.
The Distance Geometry Problem (DGP) asks whether a simple weighted undirected graph
G=(V,E,d) can be embedded in a K-dimensional Euclidean space so that the
distance between every pair of embedded vertices u and v belonging to V
corresponds to the weight d(u,v) assigned to the edge (u,v), when available.
The DGP is NP-hard and has several applications; one of the most interesting is the one in
the field of biology where vertices are atoms of a molecule, and the embedding of G
corresponds to suitable three-dimensional conformations of the molecule.
Over the last years, in collaboration with
Carlile Lavor and
and other people, we have been working on the discretization of the DGP.
In 2013, I co-edited a
devoted to the DGP and its applications; moreover, a
is about to appear.
A software tool called MD-jeep,
which is distributed under the GNU General Public License (v.2),
is under development. It implements a branch-and-prune framework for discretizable DGPs
and its sources can be downloaded from this
Data mining is the nontrivial extraction of previously unknown, potentially useful
and reliable patterns from a given set of data. Data mining techniques can be mainly
divided into two categories: classification techniques and clustering techniques.
I'm co-author, with
of the textbook
Data Mining in Agriculture.
The book is devoted to data mining techniques applied to the agricultural field.
Both classification and clustering techniques are presented in the book, and many examples of
applications in agriculture and related fields are discussed. The implementations in
of simple algorithms help the reader in understanding the topics and allow to perform simple experiments.
My research in data mining is devoted to consistent biclustering for supervised classification
and feature selection.
The flexibility in the conception and use of meta-heuristics opens to
a wide domain of development and applications.
My first experience with meta-heuristics dates back to my PhD in Italy, where I was working on a
geometrical model for the simulation of protein conformations. At that time, I had implemented
a simple Simulated Annealing, with some ad-hoc perturbations inspired by the problem at hand.
Then, during my postdoc at University of Florida, I worked on a novel meta-heuristic framework that
we called Monkey Search (see below). More recently, with other colleagues, I proposed the
simulation of an environment in evolutionary algorithms. So far, we tried this novel idea
together with Ant Colony Optimization.
Proteins described by a mathematician.
Proteins are heteropolymers. They are formed by smaller molecules called amino acids, that are bonded together to form a sort of "chain".
Living organisms contain plenty of proteins, which perform several functions of vital importance.
Each amino acid along the protein chain contains a Carbon atom called Cα,
which is chemically bonded to three different aggregates of atoms.
Two of these aggregates are common to each amino acid: the amino group (NH2) and the carboxyl group (COOH).
Only one aggregate of atoms makes each amino acid different from one another: the so-called side chain R.
Only 20 amino acids are involved into the protein synthesis.
These molecules are under genetic control: the amino acid sequence of a protein is contained into the
corresponding gene of the DNA molecule. The three-dimensional conformation of a protein molecule is of great interest,
because it defines its dynamics in the cell, and therefore its function.
Different properties of the side chains allow for classifying the amino acids on the basis of their features.
A very important feature of the amino acids is the polarity of their side chains R. A side chain is polar if
it has a global or local electric charge. There is an attraction between positive and negative charged molecules and
there is repulsion between molecules with the same charge. Since the protein solvent, i.e. water, is polar and globally uncharged,
there is attraction between the water molecules and the polar R groups. It follows that the amino acids having a polar side
chain are hydrophilic, because they tend to interact with the solvent. Other amino acids are buried inside the protein because
of the hydrophilic amino acid behavior. Therefore, they are called hydrophobic. This feature of the amino acids plays a
fundamental role in the stabilization of protein molecules. The attractive and repulsive van der Waals forces are important as well.
They can be modeled, for instance, by the Lennard Jones potential (see below).
in order to get more information on the amino acids and on their properties.
Each amino acid of the protein chain is bonded to the previous and the successive amino acid through a peptide bond. More precisely,
the carboxyl group of the ith amino acid is bonded to the amino group of the (i+1)th amino acid,
i.e. a covalent bond keeps bonded the C atom of the first group and the N atom of the second one.
The peptide bond is strong enough to force every atom between two consecutive Cα atoms on the same plane.
Therefore, the movement possibilities of the amino acids on the chain are constrained.
More information about proteins and the computational methods for finding their stable conformations can be found on
this web page
Arnold Neumaier web site.
Experimentally determined protein conformations can be downloaded from the
Protein Data Bank
(PDB), that currently contains about 148000
Why are we so interested in proteins?
C. M. Dobson, A. Sali and M. Karplus wrote:
"An understand of folding is important for the analysis of many events involved in cellular regulation,
the design of proteins with novel functions, the utilization of sequence information from the various genome
projects, and the development of novel therapeutic strategies for treating or preventing debilitating human diseases
that are associated with the failure of proteins to fold correctly".
Analyzing and simulating the conformation of protein molecules.
At the beginning of my career, I performed many analyses on protein conformations, where particular attention
was given to their geometric features. I also worked on a model for protein simulations which is mainly based
on geometric properties of protein conformations. Differently from other models for protein prediction, the considered
model does not rely on chemical and physical forces involved in the folding process, but it rather attempts the simulation
of protein conformations by exploiting their geometric properties. Computational experiments proved that the geometric properties
of protein conformations play an important role in the folding process. Indeed, among the conformations that were simulated,
there are some which are "very similar" to the ones protein molecules can actually have in nature.
Exploring the optimal conformations of clusters of molecules.
Finding the ground state conformations of clusters of molecules is one of the most difficult global optimization problems.
Such clusters are regulated by potential energy functions, such as the Lennard Jones and the Morse potential energy.
The Lennard Jones potential is very simple but sufficiently accurate for noble gas, and it also provides approximations for structures of nickel and gold.
The Morse potential has a parameter which determines the width of the potential well; Morse clusters show a wide variety of structural behaviors
in function of this parameter. Both potential functions have several local minima in which methods for optimization may get stuck at.
For instance, a Lennard Jones cluster formed by 13 atoms has about 1000 local minima, and the number of local minima increases exponentially
with the size of the cluster. The Lennard Jones and Morse optimal clusters usually have particular geometric motifs,
which can be exploited during the optimization process. The
Cambridge Cluster Database
contains experimentally obtained clusters. Some of them may just be local minima:
the database is open to newly generated clusters having a lower energy function value.
Climbing trees like a monkey.
A monkey can survive in a forest of trees because it is able to focus its food researches where it already found good food to eat.
During my period of work in the United States,
and I proposed a novel method for global optimization which is inspired by the behavior of a monkey.
We named this method Monkey Search. The trees the monkey explores contain solutions to the optimization problem to be solved,
and branches of the tree represent perturbations able to transform one solution into another. At each step,
the monkey chooses the branches to climb by applying a random mechanism, which is based on the objective function values of
the new corresponding solutions. The monkey prefers better solutions, but it is not forced to choose them.
Rather than a completely new meta-heuristic approach for global optimization, the Monkey Search can be considered as a general framework
(describing the monkey behavior) where other ideas and strategies for global optimization are also employed. Perturbations on candidate
solutions can be taken from other methods for optimization (such as Genetic Algorithms, Harmony Search, Ant Colony Optimization, etc)
and others can be inspired by the optimization problem itself. This leads to a wide range of variants for this meta-heuristic search.
We implemented the Monkey Search in C programming language for solving particular classes of optimization problems,
such as the problem of finding Lennard Jones and Morse clusters, as well as for solving the distance geometry problem (see above).
implemented the Monkey Search for solving the Multidimensional Assignment Problem (MAP).