PARENT SESSION 83 - QSAR Approaches; 8:00 AM to 6:30 PM, Wednesday, 15 May 2002; Exhibition Area
(83-14) Prediction of Activity Coefficients of Organic Compounds in Water at Infinite Dilution with an Integrated SOM-fuzzy ARTMAP Neural System.
Espinosa, Gabriela*,1, Arenas, Alex1, Giralt, Francesc1, Ferre-Giné, Joan1, Cohen, Yoram2, Amat, Luis3, Gironés, Xavier3, Carbó-Dorca, Ramón3
1 Universitat Rovira i Virgili, Departament d’Enginyeria Química, Tarragona, Catalunya, Spain
2 University of California, Los Angeles, Department of Chemical Engineering, Los Angeles, CA
3 Institut de Química Computacional, Universitat de Girona, Girona, Catalunya, Spain
ABSTRACT- Neural networks (NN) have become a powerful tool for developing quantitative structure-property relationships (QSPRs) that estimate physicochemical properties. In particular, various QSPRs have been proposed to estimate physicochemical and thermodynamic properties of chemicals of environmental concern. Among these properties, the infinite dilution activity coefficient is of particular interest since it is useful when estimating aqueous solubilities and Henry's law constants of organic compounds. In the present study, a new NN-based QSPR is proposed for the infinite dilution activity coefficient of organic compounds in water. The approach is an integrated methodology based on neural classifiers, such as Self-Organizing Maps (SOM), fuzzy ART, and predictive fuzzy ARTMAP, developed and applied to build a robust QSPR model for ln γ∞ in water. The approach consists of the following three steps:
1. selection of the optimal set of descriptors from a pool of available topological and quantum molecular parameters by using SOMs;
2. preclassification of the complete collection of data with fuzzy ART to separate the data into optimal groups to train and test the model;
3. QSPR model building for the target variable with the predictive fuzzy ARTMAP neural classifier.
The QSPR model for the infinite dilution activity coefficient was developed from a data set of 325 diverse organic compounds. A pool of topological and quantum chemical information, including molecular similarity measures, was selected to represent the molecular information for correlating the infinite dilution activity coefficient. SOMs were first used to classify the compounds according to these molecular parameters and their experimental activity. The best subset of descriptors was obtained by choosing, from each cluster, the index with the highest correlation with the target variable, in order of decreasing correlation. This process was terminated when the dissimilarity measure increased, indicating that the inclusion of additional indices would not add supplementary information. The optimal set of descriptors was used as input to a fuzzy ARTMAP architecture modified to provide predictive capabilities. Finally, a fuzzy ART network was applied to select the compounds for training and for testing the interpolation and extrapolation capabilities of the fuzzy ARTMAP-based QSPR model.
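As an illustration of the fuzzy ART preclassification step, the following is a minimal sketch of the standard fuzzy ART clustering scheme (complement coding, choice function, vigilance test, and weight update). It is not the authors' implementation: the class and function names, the parameter values (rho, alpha, beta), and the toy two-descriptor data are assumptions made for this example, and the descriptors are assumed to be scaled to [0, 1] beforehand.

```python
from typing import List, Sequence


def complement_code(x: Sequence[float]) -> List[float]:
    """Map x in [0, 1]^d to [x, 1 - x]; the city-block norm of the result is d."""
    return list(x) + [1.0 - v for v in x]


def fuzzy_and(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Component-wise minimum (the fuzzy AND operator)."""
    return [min(p, q) for p, q in zip(a, b)]


class FuzzyART:
    """Hypothetical minimal fuzzy ART clusterer (illustrative names and defaults)."""

    def __init__(self, rho: float = 0.75, alpha: float = 0.001, beta: float = 1.0):
        self.rho = rho        # vigilance: higher values give more, tighter categories
        self.alpha = alpha    # choice parameter (small positive constant)
        self.beta = beta      # learning rate; 1.0 corresponds to fast learning
        self.weights: List[List[float]] = []

    def train_one(self, x: Sequence[float]) -> int:
        """Present one pattern and return the index of the category it joins."""
        i = complement_code(x)
        norm_i = sum(i)

        def choice(j: int) -> float:
            w = self.weights[j]
            return sum(fuzzy_and(i, w)) / (self.alpha + sum(w))

        # Try existing categories in order of decreasing choice value.
        for j in sorted(range(len(self.weights)), key=choice, reverse=True):
            w = self.weights[j]
            overlap = fuzzy_and(i, w)
            if sum(overlap) / norm_i >= self.rho:  # vigilance test passed
                self.weights[j] = [
                    self.beta * m + (1.0 - self.beta) * old
                    for m, old in zip(overlap, w)
                ]
                return j
        # No category is close enough: commit a new one.
        self.weights.append(i)
        return len(self.weights) - 1


def cluster(data: Sequence[Sequence[float]], rho: float = 0.75) -> List[int]:
    """Assign each pattern to a fuzzy ART category in a single pass."""
    net = FuzzyART(rho=rho)
    return [net.train_one(x) for x in data]


if __name__ == "__main__":
    # Toy data: two descriptors per compound, already scaled to [0, 1].
    toy = [[0.10, 0.10], [0.12, 0.09], [0.90, 0.85], [0.88, 0.90]]
    print(cluster(toy, rho=0.75))
```

One possible use of the resulting category labels, again as an assumption rather than the procedure reported here, is to draw training and test compounds from every category so that both sets cover the same regions of descriptor space.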
Key words: Neural Networks, Self-Organizing Maps, QSPR/QSAR, Activity Coefficients