(PW026) QSAR prediction of carcinogenicity of diverse chemicals with neural networks.
Tanabe, K. (1), Ohmori, N. (1), Suzuki, T. (2), Ono, S. (1); (1) Chiba Institute of Technology, Narashino, Chiba, Japan; (2) Toyo University, Bunkyoku, Tokyo, Japan
ABSTRACT- Several QSAR systems for predicting carcinogenicity have already been developed, and many researchers have entered them in the Predictive Toxicology Challenge (PTC) contest. The performance of those systems is uniformly low (correct classification rates were below 70%), because they are based on a linear relation between chemical structure and carcinogenicity. In this study, a QSAR model was developed to predict the carcinogenicity of diverse chemicals with high accuracy from chemical structure information alone, and its performance was compared with that of the contestants. A neural network (NN) is a powerful tool for analyzing a nonlinear relation between carcinogenicity and structure. QSAR models relating structure to carcinogenicity were constructed by applying a multilayer NN trained with the back-propagation algorithm. The NN was used to classify the chemicals studied into two categories, inactive or active. A training set of 324 chemicals and a test set of 168 chemicals from the PTC database were characterized by means of three sets of molecular descriptors: Dragon, tReymers and Helma. These descriptors were fed to the input layer of a three-layered NN, and the carcinogenicity data were supplied as target values for the output layer (0 for noncarcinogenic and 1 for carcinogenic chemicals). To avoid over-learning (overfitting), which is a serious problem in an NN, the training set was divided equally into a learning set and a validation set. While the NN was trained on the learning set, the errors between the output and the teaching data for the learning, validation and test sets were recorded at each cycle. The error on the learning set decreased gradually over the training cycles, while the error on the validation set passed through a minimum, which was taken as the optimal training cycle. At that cycle, the classification ability of the NNs with different descriptor sets was tested on the male rat data for the 168 test chemicals.
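The validation-based stopping rule described above can be illustrated with a minimal sketch. This is not the authors' implementation: the descriptor matrix is synthetic random data, and the hidden-layer size, learning rate, number of cycles, and all function names are assumptions chosen only for illustration. The sketch trains a three-layer network by back-propagation on the learning half of a 324-sample training set, records the learning and validation errors at every cycle, and keeps the weights from the cycle at which the validation error is lowest.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, W1, b1, W2, b2):
    """Forward pass of a three-layer (input, hidden, output) network."""
    H = sigmoid(X @ W1 + b1)
    out = sigmoid(H @ W2 + b2).ravel()
    return H, out

def train_with_early_stopping(X_learn, y_learn, X_val, y_val,
                              n_hidden=5, n_cycles=300, lr=0.5, seed=0):
    """Back-propagation training; returns the weights at the validation minimum.

    n_hidden, n_cycles and lr are illustrative assumptions, not source values.
    """
    rng = np.random.default_rng(seed)
    n_in = X_learn.shape[1]
    W1 = rng.normal(0.0, 0.5, (n_in, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.5, (n_hidden, 1))
    b2 = np.zeros(1)
    n = len(y_learn)
    best = {"cycle": 0, "val_error": np.inf, "weights": None}
    history = []  # (learning error, validation error) per cycle
    for cycle in range(1, n_cycles + 1):
        # Gradient of the squared error, propagated back through both layers.
        H, out = forward(X_learn, W1, b1, W2, b2)
        d_out = (out - y_learn) * out * (1.0 - out)
        d_hid = (d_out[:, None] @ W2.T) * H * (1.0 - H)
        W2 -= lr * (H.T @ d_out[:, None]) / n
        b2 -= lr * d_out.mean()
        W1 -= lr * (X_learn.T @ d_hid) / n
        b1 -= lr * d_hid.mean(axis=0)
        # Record errors on both subsets after this cycle's update.
        learn_err = np.mean((forward(X_learn, W1, b1, W2, b2)[1] - y_learn) ** 2)
        val_err = np.mean((forward(X_val, W1, b1, W2, b2)[1] - y_val) ** 2)
        history.append((learn_err, val_err))
        if val_err < best["val_error"]:
            best = {"cycle": cycle, "val_error": val_err,
                    "weights": (W1.copy(), b1.copy(), W2.copy(), b2.copy())}
    return best, history

def correct_classification_rate(X, y, weights, threshold=0.5):
    """Fraction of samples whose thresholded output matches the 0/1 label."""
    _, out = forward(X, *weights)
    return float(np.mean((out >= threshold) == (y == 1)))

# Synthetic stand-in for descriptor data (assumption: 10 random descriptors).
rng = np.random.default_rng(42)
X_train = rng.normal(size=(324, 10))
y_train = (X_train[:, 0] * X_train[:, 1] + X_train[:, 2] > 0).astype(float)
X_test = rng.normal(size=(168, 10))
y_test = (X_test[:, 0] * X_test[:, 1] + X_test[:, 2] > 0).astype(float)

# Divide the training set equally into learning and validation subsets.
X_learn, y_learn = X_train[:162], y_train[:162]
X_val, y_val = X_train[162:], y_train[162:]

best, history = train_with_early_stopping(X_learn, y_learn, X_val, y_val)
rate = correct_classification_rate(X_test, y_test, best["weights"])
print("optimal cycle:", best["cycle"])
print("test-set correct classification rate:", rate)
```

Because the weights are saved at the validation minimum rather than at the final cycle, the test set plays no part in choosing when to stop; it is used only once, to report the correct classification rate.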
The correct classification rates obtained were 67.7%, 72.5% and 74.9%, using 18 tReymers, 24 Helma, and 42 tReymers + 24 Helma descriptors, respectively. The prediction accuracy is significantly improved compared with values reported in earlier attempts that used statistical methods such as regression analysis and partial least squares. Most of the earlier reported values were about 60%, and the best was 67.6%, reported by T. Okada. These results demonstrate the superiority of an NN as a nonlinear modeling method.
Key words: carcinogenicity, QSAR, neural network
All content is Copyright © 2004 SETAC