
PARENT SESSION

1F - QSAR
Hall 13
8:30 AM - 12:30 PM, Tuesday, 29 April 2003
Chair: Schüürmann, G.
Co-chairs: Verhaar, H.J.M.; Cronin, M.

(TU13/9) Carcinogenicity Prediction by using SOM Prototypes & Fuzzy ARTMAP Neural Networks.

Espinosa Porragas, Gabriela (1); Rallo, Robert (2); Arenas, Alex (2); Giralt i Prat, Francesc (1); Carbo-Dorca, Ramon (3); Cohen, Yoram (4)

(1) Departament d'Enginyeria Química. Escola Tècnica Superior d’Enginyeria Química (ETSEQ), Tarragona, Tarragona, Spain
(2) Departament d'Enginyeria Informàtica i Matemàtiques. Escola Tècnica Superior d’Enginyeria (ETSE), Tarragona, Tarragona, Spain
(3) Institut de Quimica computacional, Universitat de Girona, Girona, Girona, Spain
(4) Department of Chemical Engineering, University of California, Los Angeles (UCLA), Los Angeles, Los Angeles, USA

ABSTRACT: An integrated methodology combining self-organizing maps (SOMs) and fuzzy ARTMAP neural networks was applied to predict the carcinogenicity TD50 index of 104 aromatic compounds with nitrogen-containing substituents. SOMs were used to select, from a pool of 44 descriptors, the most relevant subset needed to build reliable QSAR models based on fuzzy ARTMAP. The descriptor pool incorporated both topological and quantum information. The quantum descriptors included measures of quantum similarity, i.e. of the resemblance between two molecules, and of quantum self-similarity, i.e. of the properties of the molecules themselves. The topological component maps (C-planes) of each molecular descriptor and of the target activity variable were obtained and then grouped into clusters, either by curvilinear component analysis or by applying Kohonen maps a second time in conjunction with the Davies-Bouldin index. The best subset of descriptors was obtained by choosing a representative from each cluster, namely the index with the highest correlation with the target variable, and then adding further indices in order of decreasing correlation. Variables were also ranked according to the correlation of their C-planes with that of the target variable. In all cases the selection process ended when a dissimilarity measure between the maps for the different sets of descriptors reached a minimum, indicating that further descriptors added noise rather than relevant information. The optimal subset of descriptors was finally used as input to several QSAR modelling algorithms, including MLR-PLS, backpropagation neural networks, and a fuzzy ARTMAP architecture modified to provide predictive capabilities. In all cases the proposed index-selection methodology yielded better predictions than statistical methods based on correlations and merit indices. Fuzzy ARTMAP yielded the best predictions of the carcinogenicity TD50 values.
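
The ordering rule described in the abstract (one representative per cluster, chosen by correlation with the target, followed by the remaining indices in order of decreasing correlation) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `pearson` and `rank_descriptors`, the use of the absolute Pearson correlation between flattened component planes, and the toy data are all assumptions made for illustration. The clustering step and the dissimilarity-based stopping criterion are not reproduced; cluster labels are taken as given.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def rank_descriptors(planes, target_plane, clusters):
    """Order descriptors: cluster representatives first, then all others.

    planes       -- dict: descriptor name -> flattened component plane (hypothetical input)
    target_plane -- flattened component plane of the target variable
    clusters     -- dict: descriptor name -> cluster label, assumed to come
                    from a separate clustering step not shown here
    """
    # Assumption: "highest correlation" is read as highest absolute correlation.
    score = {name: abs(pearson(p, target_plane)) for name, p in planes.items()}

    # Keep the best-scoring descriptor of each cluster as its representative.
    best = {}
    for name, label in clusters.items():
        if label not in best or score[name] > score[best[label]]:
            best[label] = name

    reps = sorted(best.values(), key=lambda n: -score[n])
    rest = sorted((n for n in planes if n not in reps), key=lambda n: -score[n])
    return reps + rest


# Toy data, invented for illustration only.
planes = {
    "d1": [1, 2, 3, 4],
    "d2": [1, 2, 3, 5],
    "d3": [4, 3, 1, 2],
    "d4": [1, 0, 1, 0],
}
target = [1, 2, 3, 4]
clusters = {"d1": 0, "d2": 0, "d3": 1, "d4": 1}

print(rank_descriptors(planes, target, clusters))  # ['d1', 'd3', 'd2', 'd4']
```

In the toy example, d3 precedes d2 even though d2 correlates more strongly with the target, because d2 shares a cluster with d1 and so contributes largely redundant information. A full procedure would add descriptors from this ordered list one at a time and stop when the map dissimilarity measure reaches its minimum.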

Key words: Neural Networks, QSAR, Carcinogenicity, Modelling