Combination of QSAR models and chemical clustering as a NAM for computational toxicology.
In the course of setting up consistent alternatives to animal testing in agreement with regulatory guidelines, QSAR based methodologies have been thoroughly assessed. The present retrospective study exemplifies the robustness of five Quantitative Structure-Activity Relationship (QSAR) models, namely carcinogenic, mutagenic, reprotoxic, persistent/bio-accumulative/toxic (PBT), and endocrine disruption. These models, which relate to the founding endpoints of the REACH SVHC list, were built from their respective datasets using selected chemical descriptors and a Random Forest type algorithm. The performance of the models was measured through their sensitivity on a dataset of 256 compounds derived from the REACH SVHC list. Results showed an overall agreement rate of 75.4% with regard to the actual classification of the SVHC dataset, with an applicability score of 100% (only organic compounds are considered). Moreover, each of the models was able to correctly classify the compounds, with sensitivities for carcinogenic, mutagenic, reprotoxic, PBT, and endocrine disruption of 93.3%, 85.7%, 68.0%, 62.1%, and 100.0%, respectively. It is noteworthy that the carcinogenic, mutagenic, and endocrine disruption QSARs showed the best prediction rates, with a sensitivity greater than 85%.
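For readers unfamiliar with the metric, sensitivity is the fraction of truly positive compounds that a model flags as positive, TP / (TP + FN). The following is a minimal sketch of that calculation; the function name and the toy labels are illustrative assumptions, not the study's data or code.

```python
def sensitivity(y_true, y_pred):
    """Return TP / (TP + FN) for binary labels (1 = toxic, 0 = non-toxic)."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    false_neg = sum(1 for t, p in pairs if t == 1 and p == 0)
    if true_pos + false_neg == 0:
        raise ValueError("sensitivity is undefined without positive compounds")
    return true_pos / (true_pos + false_neg)


# Hypothetical example: five known positives, one of which the model misses.
toy_true = [1, 1, 1, 1, 1]
toy_pred = [1, 1, 1, 0, 1]
toy_sensitivity = sensitivity(toy_true, toy_pred)
```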
To investigate the selected SVHC dataset further, a complementary clustering analysis was carried out: for each toxicity endpoint, the molecules of the subset were grouped by chemical similarity using the proprietary spherical harmonic (SH) descriptors, which take into account the shape and the physico-chemical properties of the molecules. Interestingly, the clustering obtained for each endpoint pointed out “representative compounds” that could be further used as molecular templates to screen novel compounds with the aim of designing safer ones. Altogether, the consistency and robustness of the QSARs combined with a clustering analysis support the suitability of this new approach methodology (NAM) for prospective screening to characterize the potential toxicity of novel substances from synthetic or natural sources.
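One common way to pick a "representative compound" from a cluster is to take its medoid, the member most similar on average to the other members. The abstract does not say how the representatives were chosen, so the sketch below is only an assumption about one plausible approach; the function name and the similarity matrix are hypothetical.

```python
def pick_representatives(similarity, clusters):
    """For each cluster (a list of compound indices), return the index of the
    member with the highest summed similarity to the cluster's members."""
    representatives = []
    for members in clusters:
        medoid = max(members, key=lambda i: sum(similarity[i][j] for j in members))
        representatives.append(medoid)
    return representatives


# Hypothetical pairwise similarities between four compounds (1.0 = identical).
toy_similarity = [
    [1.0, 0.8, 0.3, 0.1],
    [0.8, 1.0, 0.7, 0.2],
    [0.3, 0.7, 1.0, 0.1],
    [0.1, 0.2, 0.1, 1.0],
]
toy_clusters = [[0, 1, 2], [3]]
toy_representatives = pick_representatives(toy_similarity, toy_clusters)
```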
This NAM is a part of SAFETY BY DESIGN®, the new solution of services and software for toxicity prediction and characterization of chemical substances.
New spherical harmonic based descriptors to efficiently fuel QSAR methodology: Endocrine disruptor case study
Two-dimensional quantitative structure-activity relationship (2D QSAR) modelling has been a standard methodology for the last decade, whereas multiple three-dimensional (3D) descriptors have been tested with mixed success. In the present study, we propose a new set of highly informative and compact 3D descriptors derived from spherical harmonic (SH) based representations covering both the geometrical shape and the pharmacophoric features of a molecule. The process consists in placing a molecule on three different axes, each of which is captured by its own set of spherical harmonics. SH expansions are then used to create compact, rotation-independent descriptors (e.g. 32 floating-point coefficients) that describe a conformer of the molecule. These descriptors were applied to a QSAR model of toxicity built from the reference dataset of the CERAPP project, a collaborative project that developed a consensus model of toxicity for endocrine disruption. The QSAR model was trained with SH based descriptors, with binding activity to the estrogen receptor as the endpoint considered in this study. The resulting model yielded a balanced accuracy of 0.87 on the evaluation dataset. Furthermore, when SH descriptors were combined with 2D descriptors from the RDKit suite, the resulting QSAR model reached a balanced accuracy of 0.91 on the evaluation dataset, placing its performance at the same high level as the consensus model obtained by the CERAPP project.
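A standard way to obtain rotation-independent numbers from an SH expansion is to take, for each degree l, the norm of the coefficients over all orders m. Whether the descriptors in this study are built exactly this way is not stated in the abstract, so the sketch below is an assumption that only illustrates the general idea. It checks invariance for the simple case of a rotation about the z axis, which multiplies each coefficient c(l, m) by a phase factor exp(-i m alpha) (the sign convention varies between texts). The coefficient values are made up.

```python
import cmath
import math


def degree_norms(coeffs):
    """Map {(l, m): complex coefficient} to a list of per-degree norms
    sqrt(sum over m of |c_lm|^2), one value per degree l = 0..l_max."""
    l_max = max(l for l, _ in coeffs)
    return [
        math.sqrt(sum(abs(c) ** 2 for (l, _), c in coeffs.items() if l == degree))
        for degree in range(l_max + 1)
    ]


def rotate_about_z(coeffs, alpha):
    """Rotate the expanded function about z: each c_lm picks up a pure phase."""
    return {(l, m): c * cmath.exp(-1j * m * alpha) for (l, m), c in coeffs.items()}


# Hypothetical SH coefficients up to degree 2.
toy_coeffs = {
    (0, 0): 1.0 + 0.0j,
    (1, -1): 0.5 - 0.2j, (1, 0): 0.3j, (1, 1): -0.5 - 0.2j,
    (2, -2): 0.1 + 0.1j, (2, -1): 0.0j, (2, 0): 0.4 + 0.0j,
    (2, 1): 0.0j, (2, 2): 0.1 - 0.1j,
}
original = degree_norms(toy_coeffs)
rotated = degree_norms(rotate_about_z(toy_coeffs, alpha=0.7))
```

The two lists agree to floating-point precision, because a phase factor does not change the magnitude of a coefficient. The same per-degree norms are also preserved under arbitrary 3D rotations, which mix the orders m within a degree but conserve their total norm.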
GESSE: The “magic triangle”
The “magic triangle” of “drugs, targets, side effects” (SEs) is the new “holy grail” of the pharmaceutical industry. This figure shows a subset of a triangular matrix associating the most significant drug–target relationships predicted by the authors’ GES algorithm with the SEs for those drugs predicted by the same authors’ GESSE approach. Combining GES with GESSE allows the physicochemical space of drugs, the polypharmacologically relevant biological subspace of drug targets, and the phenotypic space of SEs to be related computationally.
GES Polypharmacology Fingerprints: A Novel Approach for Drug Repositioning
Polypharmacology is now recognized as an increasingly important aspect of drug design. We previously introduced the Gaussian ensemble screening (GES) approach to predict relationships between drug classes rapidly without requiring thousands of bootstrap comparisons as in current promiscuity prediction approaches. Here we present the GES “computational polypharmacology fingerprint” (CPF), the first target fingerprint to encode drug promiscuity information. The similarity between the 3D shapes and chemical properties of ligands is calculated using PARAFIT and our HPCC programs to give a consensus shape-plus-chemistry ligand similarity score, and ligand promiscuity for a given set of targets is quantified using the GES fingerprints. To demonstrate our approach, we calculated the CPFs for a set of ligands from DrugBank that are related to some 800 targets. The performance of the approach was measured by comparing our CPF with an in-house “experimental polypharmacology fingerprint” (EPF) built using publicly available experimental data for the targets that comprise the fingerprint. Overall, the GES CPF gives very low fall-out while still giving high precision. We present examples of polypharmacology relationships predicted by our approach that have been experimentally validated. This demonstrates that our CPF approach can successfully describe drug–target relationships and can serve as a novel drug repurposing method for proposing new targets for preclinical compounds and clinical drug candidates.
A highly specific and sensitive pharmacophore model for identifying CXCR4 antagonists. Comparison with docking and shape-matching virtual screening performance
HIV infection is initiated by fusion of the virus with the target cell through binding of the viral gp120 protein with the CD4 cell surface receptor protein and the CXCR4 or CCR5 coreceptors. There is currently considerable interest in developing novel ligands that can modulate the conformations of these coreceptors and, hence, ultimately block virus–cell fusion. Herein, we present a highly specific and sensitive pharmacophore model for identifying CXCR4 antagonists that could potentially serve as HIV entry inhibitors. Its performance was compared with docking and shape-matching virtual screening approaches using the 3OE6 CXCR4 crystal structure and high-affinity ligands as query molecules, respectively. The methods were compared by virtually screening a library assembled by us, consisting of 228 high-affinity known CXCR4 inhibitors from 20 different chemotype families and 4696 similar presumed inactive molecules. The area under the ROC plot (AUC), the enrichment factors, and the diversity of the resulting virtual hit lists were analyzed. Results show that our pharmacophore model achieves the highest VS performance among all the docking and shape-based scoring functions used. Its high selectivity and sensitivity make our pharmacophore a very good filter for identifying CXCR4 antagonists.
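The two ranking metrics named above can be sketched as follows. The AUC is computed here as the fraction of (active, inactive) pairs in which the active receives the higher score, and the enrichment factor as the hit rate in the top-ranked fraction divided by the hit rate in the whole library. The function names, scores, and labels are illustrative assumptions and do not reproduce the study's library or results.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random active outranks a random inactive
    (ties count as half a win). Labels: 1 = active, 0 = inactive."""
    actives = [s for s, y in zip(scores, labels) if y == 1]
    inactives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if a > d else 0.5 if a == d else 0.0
        for a in actives
        for d in inactives
    )
    return wins / (len(actives) * len(inactives))


def enrichment_factor(scores, labels, fraction):
    """Hit rate among the top-scoring `fraction` of the library, divided by
    the hit rate of the whole library."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, round(fraction * len(ranked)))
    hits_in_top = sum(y for _, y in ranked[:n_top])
    return (hits_in_top / n_top) / (sum(labels) / len(labels))


# Hypothetical screening scores for ten molecules, three of them active.
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.05, 0.02, 0.01]
toy_labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
toy_auc = roc_auc(toy_scores, toy_labels)
toy_ef20 = enrichment_factor(toy_scores, toy_labels, fraction=0.2)
```

In this toy case the actives win 20 of the 21 active/inactive pairs, and both molecules in the top 20% are active while only 30% of the library is.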