MOLECULAR DESCRIPTORS
1. INTRODUCTION
In this chapter we will describe commonly used molecular descriptors that can
be applied to relatively large data sets. These include descriptors that represent
properties of whole molecules such as log P and molar refractivity; descriptors
that can be calculated from 2D graph representations of structures such as
topological indices and 2D fingerprints; and descriptors such as pharmacophore
keys that require 3D representations of structures. We conclude by considering
how descriptors can be manipulated and combined. Our emphasis will be
on descriptors that are properties of the whole molecule, rather than of substituents.
The latter type of descriptor was very important in the development of QSARs, and
will be briefly covered in Chapter 4.
54 An Introduction to Chemoinformatics
Perhaps the simplest descriptors are counts of features such as hydrogen bond
donors, hydrogen bond acceptors, ring systems (including aromatic rings) and
rotatable bonds, together with molecular weight. Many of these features
can be defined as substructures or molecular fragments and so their frequency
of occurrence can be readily calculated from a 2D connection table using the
techniques developed for substructure search. For most applications, however,
these descriptors are unlikely to offer sufficient discriminating power if used in
isolation and so they are often combined with other descriptors.
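The counting idea can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Atom` class, the `count_descriptors` function, the toy connection table (heavy atoms with implicit hydrogen counts, bonds as index pairs) and the deliberately crude rules that any N or O is an acceptor and any N or O bearing a hydrogen is a donor. The ring count is the cyclomatic number (bonds minus atoms plus connected components), which is not the same as a count of ring systems because fused rings are counted individually.

```python
from dataclasses import dataclass

# Average atomic masses (g/mol) for the few elements this sketch supports.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


@dataclass
class Atom:
    element: str    # element symbol of a heavy (non-hydrogen) atom
    hydrogens: int  # number of hydrogens attached to this atom


def count_descriptors(atoms, bonds):
    """Return simple count descriptors from a toy 2D connection table.

    atoms: list of Atom; bonds: list of (i, j) index pairs between heavy atoms.
    """
    # Simplified rules (assumptions): any N or O is an acceptor,
    # and any N or O carrying at least one hydrogen is a donor.
    acceptors = sum(1 for a in atoms if a.element in ("N", "O"))
    donors = sum(1 for a in atoms if a.element in ("N", "O") and a.hydrogens > 0)

    # Molecular weight: heavy-atom masses plus their attached hydrogens.
    weight = sum(
        ATOMIC_MASS[a.element] + a.hydrogens * ATOMIC_MASS["H"] for a in atoms
    )

    # Build an adjacency list and count connected components by depth-first search.
    neighbours = {i: [] for i in range(len(atoms))}
    for i, j in bonds:
        neighbours[i].append(j)
        neighbours[j].append(i)
    seen = set()
    components = 0
    for start in neighbours:
        if start in seen:
            continue
        components += 1
        stack = [start]
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(neighbours[node])

    # Cyclomatic number: the number of independent rings in the graph.
    rings = len(bonds) - len(atoms) + components

    return {
        "donors": donors,
        "acceptors": acceptors,
        "rings": rings,
        "molecular_weight": round(weight, 3),
    }


# Phenol, C6H5OH: ring carbons 0-5 (carbon 0 bears the hydroxyl), oxygen 6.
phenol_atoms = [Atom("C", 0)] + [Atom("C", 1) for _ in range(5)] + [Atom("O", 1)]
phenol_bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 6)]
result = count_descriptors(phenol_atoms, phenol_bonds)
print(result)
```

A production implementation would normally obtain these counts by substructure matching against a full connection table, as the text describes, rather than from element symbols alone.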
While the approach worked well within a congeneric series of compounds, the
π-values were found not to be additive across different series. For example, it is
inappropriate to use π-values derived from a benzene parent on electron-deficient
rings such as pyridine.
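The additivity assumption behind π-values can be sketched in a few lines of Python. The sketch assumes the usual substituent-constant definition, in which the π-value of a substituent X is log P of the substituted compound minus log P of the parent, so that log P of a derivative is estimated as the parent value plus the sum of its π-values. The function names are hypothetical, and the numbers (log P of about 2.13 for benzene and 2.84 for chlorobenzene) are commonly quoted figures used here only for illustration.

```python
def pi_value(log_p_substituted, log_p_parent):
    """Substituent constant: log P of the derivative minus log P of the parent."""
    return log_p_substituted - log_p_parent


def estimate_log_p(log_p_parent, pi_values):
    """Additive estimate: parent log P plus the sum of substituent pi-values.

    Only meaningful when the pi-values were derived from the same parent
    system as the one being estimated (for example, a benzene series).
    """
    return log_p_parent + sum(pi_values)


# Illustrative values (assumed, not taken from the text).
LOG_P_BENZENE = 2.13
LOG_P_CHLOROBENZENE = 2.84

pi_cl = pi_value(LOG_P_CHLOROBENZENE, LOG_P_BENZENE)

# Additive estimate for a dichlorobenzene within the same benzene series.
estimate = estimate_log_p(LOG_P_BENZENE, [pi_cl, pi_cl])
print(round(pi_cl, 2), round(estimate, 2))
```

Applying `pi_cl` to a pyridine parent would use the same arithmetic, but, as noted above, the result would not be reliable because the constant was derived from a benzene parent.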