QSAR model reproducibility and applicability: a case study of rate constants of hydroxy radical reaction models applied to Polybrominated Diphenyl Ethers and (Benzo-)Triazoles
Article
Publication date:
2011
Abstract:
The crucial importance of the three central OECD principles for quantitative structure-activity relationship
(QSAR) model validation is highlighted in a case study of tropospheric degradation of volatile organic compounds
(VOCs) by OH, applied to two CADASTER chemical classes (PBDEs and (benzo-)triazoles). The application of any
QSAR model to chemicals without experimental data largely depends on model reproducibility by the user. The reproducibility
of an unambiguous algorithm (OECD Principle 2) is guaranteed by redeveloping MLR models based on both
an updated version of the DRAGON software for molecular descriptor calculation and some freely available online descriptors.
The Genetic Algorithm (GA) has confirmed its ability to always select the most informative descriptors, independently
of the input pool of variables. The ability of the GA-selected descriptors to model chemicals not used in model development
is verified by three different splittings (random by response, K-ANN and K-means clustering), thus ensuring the
external predictivity of the new models, independently of the training/prediction set composition (OECD Principle 4).
The relevance of checking the structural applicability domain (OECD Principle 3) becomes very evident on comparing
the predictions for CADASTER chemicals, using the new models proposed herein, with those obtained by EPI Suite.
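The two checks named in the abstract, external predictivity and the structural applicability domain, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the split is a plain random split rather than any of the three splittings used in the study, and the leverage (hat value) criterion with the cut-off h* = 3(p+1)/n is one common way to define a structural domain, not necessarily the one applied by the authors. The helper names `fit_mlr`, `predict_mlr`, `q2_ext` and `leverages` are hypothetical.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    """Apply fitted coefficients (intercept first) to a descriptor matrix."""
    return np.column_stack([np.ones(len(X)), X]) @ coef

def q2_ext(y_train, y_ext, y_ext_pred):
    """External Q2 (F1 form): 1 - PRESS_ext / SS of y_ext about the training mean."""
    press = np.sum((y_ext - y_ext_pred) ** 2)
    ss = np.sum((y_ext - y_train.mean()) ** 2)
    return 1.0 - press / ss

def leverages(X_train, X_query):
    """Hat values of query chemicals relative to the training design matrix."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    Q = np.column_stack([np.ones(len(X_query)), X_query])
    G = np.linalg.pinv(A.T @ A)
    return np.einsum("ij,jk,ik->i", Q, G, Q)

# Synthetic stand-in for a descriptor matrix and a response (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
y = 1.5 * X[:, 0] - 0.8 * X[:, 1] + 0.3 * X[:, 2] + rng.normal(scale=0.1, size=60)

# Plain random split into training and external prediction sets (assumption).
idx = rng.permutation(60)
train, ext = idx[:45], idx[45:]

coef = fit_mlr(X[train], y[train])
q2 = q2_ext(y[train], y[ext], predict_mlr(coef, X[ext]))

# Leverage cut-off h* = 3(p+1)/n, with p descriptors and n training chemicals.
h_star = 3 * (X.shape[1] + 1) / len(train)

# A query far from the training descriptor space should fall outside the domain.
far_query = np.array([[10.0, -10.0, 10.0]])
inside = leverages(X[train], far_query) <= h_star
```

A query chemical flagged as outside the domain still receives a numerical prediction from `predict_mlr`; the flag only signals that the value is an extrapolation and should be treated with caution.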
CRIS type:
Journal article
Keywords:
reproducible algorithm; molecular descriptors; external validation; applicability domain; CADASTER chemicals
List of authors:
Roy, PARTHA PRATIM; Kovarich, Simona; Gramatica, Paola
Link to full record:
Published in: