Model Details

ptp


Project
Pharmbio
Project Owner
hamzaimran08@gmail.com

Version

v0.1.0

Uploaded

Aug. 19, 2021, 12:50 p.m.



Model Card for Predictive Target Profile

API and Docs


API

The API for the model can be accessed via this link.

Two user interfaces, PredGUI and PredGUIMM, are also built on the API.

Model Details


Overview

This model predicts the interaction of a molecule, represented by its chemical structure, with 31 targets (ACHE, ADORA2A, ADRB1, ADRB2, AR, AVPR1A, CCKAR, CHRM1, CHRM2, CHRM3, CNR1, CNR2, DRD1, DRD2, EDNRA, HTR1A, HTR2A, KCNH2, LCK, MAOA, NR3C1, OPRD1, OPRK1, OPRM1, PDE3A, PTGS1, PTGS2, SCN5A, SLC6A2, SLC6A3, SLC6A4) selected for their utility in broad early hazard assessment. The model uses conformal prediction to deliver a prediction interval for each prediction, and accepts chemical structures as input in SMILES or MOL format.


Version

1.0

Owners

  • Jonathan Alvarsson

References

  • Lampa S, Alvarsson J, Arvidsson Mc Shane S, Berg A, Ahlberg E, Spjuth O. Predicting off-target binding profiles with confidence using Conformal Prediction. Frontiers in Pharmacology 9, 1256 (2018). DOI: 10.3389/fphar.2018.01256

Model Architecture

Chemicals were represented by the signature molecular descriptor, and support vector machines were used as the underlying machine learning method. With conformal prediction, each prediction is returned as a confidence p-value for each class. Data preprocessing and model development were implemented as a SciPipe workflow to make the models reproducible. For more details, including hyper-parameter tuning, see the scientific manuscript https://dx.doi.org/10.3389/fphar.2018.01256.
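To make the notion of a per-class p-value concrete, the following is a minimal sketch of how a conformal p-value is commonly computed in an inductive, class-conditional setup. This setup is an assumption for illustration. The nonconformity measure used by the model is not described in this card, so the function name and the scores below are hypothetical.

```python
from typing import Sequence


def conformal_p_value(calibration_scores: Sequence[float], test_score: float) -> float:
    """Return the p-value of a test compound for one candidate class.

    calibration_scores: nonconformity scores of calibration compounds that
        belong to the candidate class (higher = less typical of the class).
    test_score: nonconformity score of the test compound under that class.
    """
    # Count calibration compounds that look at least as atypical as the test compound.
    at_least_as_strange = sum(1 for s in calibration_scores if s >= test_score)
    # The +1 terms account for the test compound itself.
    return (at_least_as_strange + 1) / (len(calibration_scores) + 1)


# Hypothetical nonconformity scores for calibration compounds labelled active.
active_calibration = [0.1, 0.4, 0.35, 0.8, 0.05]
p_active = conformal_p_value(active_calibration, test_score=0.5)
print(f"p-value for the active class: {p_active:.3f}")
```

A small p-value means that few calibration compounds of that class look as atypical as the test compound, which is evidence against the class.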


Input Format

  • A chemical structure in SMILES or MOL format
  • A confidence level for predictions

Output Format

A prediction interval for each of the 31 targets associated with chemical liabilities.
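As an illustration of how the confidence level relates to the output, here is a minimal sketch that turns per-class p-values into a set of predicted labels for each target. It assumes the standard conformal classification rule (a label is kept when its p-value exceeds 1 minus the confidence level). The function name, the data layout, and the p-values are hypothetical and do not reflect the actual API response format.

```python
from typing import Dict, Set


def prediction_set(p_values: Dict[str, float], confidence: float) -> Set[str]:
    """Keep every label whose p-value exceeds the significance level."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    significance = 1.0 - confidence
    return {label for label, p in p_values.items() if p > significance}


# Hypothetical p-values for three of the 31 targets ("A" = active, "N" = non-active).
example = {
    "DRD2": {"A": 0.62, "N": 0.08},
    "KCNH2": {"A": 0.31, "N": 0.27},
    "ACHE": {"A": 0.03, "N": 0.15},
}

for confidence in (0.8, 0.9):
    sets = {target: prediction_set(p, confidence) for target, p in example.items()}
    print(confidence, sets)
```

Under this rule, a higher confidence level can only keep or enlarge each label set. An empty set or a set containing both labels signals that the model cannot commit to a single label at that confidence level.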

Considerations


Use Cases

  • For novel compounds in drug discovery, the model can provide early alerts of potential off-target interactions that warrant additional experiments to rule out chemical liabilities.

Accuracy



Efficiency metrics (M Criterion, Observed Fuzziness and Class-Averaged Observed Fuzziness) for Dataset1, Dataset2, and Dataset3. (A) Dataset2 without extending with assumed non-actives. Circles show individual results from the three replicate runs, while the lines show the median of the replicate results. Targets are sorted by number of active compounds. (B) Dataset2 after extending with assumed non-actives. Circles show individual results from the three replicate runs, while the lines show the median of the replicate results. Targets are sorted by number of active compounds. (C) Dataset3, the 10 largest target datasets, which were not extended with assumed non-actives. Targets are sorted by total number of compounds.
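The three efficiency metrics named in the caption can be sketched as follows. The definitions used here are the ones common in the conformal prediction literature and are an assumption about the exact variants used for this model: Observed Fuzziness as the mean sum of p-values for the incorrect labels, its class-averaged form as the mean of the per-class values, and the M Criterion as the fraction of prediction sets with more than one label. The function names and example data are hypothetical. Lower values are better for all three.

```python
from statistics import mean
from typing import Dict, List, Tuple

# One prediction: (observed label, p-value per candidate label).
Prediction = Tuple[str, Dict[str, float]]


def observed_fuzziness(preds: List[Prediction]) -> float:
    """Mean over compounds of the summed p-values of the incorrect labels."""
    return mean(
        sum(p for label, p in p_values.items() if label != observed)
        for observed, p_values in preds
    )


def class_averaged_observed_fuzziness(preds: List[Prediction]) -> float:
    """Observed fuzziness computed per observed class, then averaged over classes."""
    by_class: Dict[str, List[Prediction]] = {}
    for observed, p_values in preds:
        by_class.setdefault(observed, []).append((observed, p_values))
    return mean(observed_fuzziness(group) for group in by_class.values())


def m_criterion(preds: List[Prediction], confidence: float) -> float:
    """Fraction of prediction sets that contain more than one label."""
    significance = 1.0 - confidence
    multiple = sum(
        1
        for _, p_values in preds
        if sum(p > significance for p in p_values.values()) > 1
    )
    return multiple / len(preds)


# Hypothetical predictions for one target ("A" = active, "N" = non-active).
preds = [
    ("A", {"A": 0.60, "N": 0.10}),
    ("N", {"A": 0.30, "N": 0.50}),
    ("N", {"A": 0.10, "N": 0.70}),
    ("N", {"A": 0.15, "N": 0.90}),
]

of = observed_fuzziness(preds)
caof = class_averaged_observed_fuzziness(preds)
mc = m_criterion(preds, confidence=0.8)
print(of, caof, mc)
```

Averaging per class keeps a large non-active majority from dominating the score, which matters when active compounds are rare.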



Predicted vs. observed labels, for all targets, for the prediction data, at confidence level 0.8 (A) and 0.9 (B). "A" denotes active compounds, and "N" denotes non-active compounds. The x-axis shows observed labels (as found in ExCAPE-DB), while the y-axis shows the set of predicted labels. The areas of the circles are proportional to the number of SAR data points for each observed label/predicted label combination.
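The counts behind such a plot can be sketched as a simple cross-tabulation of observed labels against predicted label sets. The example below assumes the standard conformal rule (keep a label when its p-value exceeds 1 minus the confidence level) and uses hypothetical p-values. The function name `tally` is illustrative only.

```python
from collections import Counter
from typing import Dict, List, Tuple


def tally(
    preds: List[Tuple[str, Dict[str, float]]], confidence: float
) -> Counter:
    """Count each (observed label, predicted label set) combination."""
    significance = 1.0 - confidence
    counts: Counter = Counter()
    for observed, p_values in preds:
        # Sorted tuple so that the same label set always maps to the same key.
        predicted = tuple(sorted(l for l, p in p_values.items() if p > significance))
        counts[(observed, predicted)] += 1
    return counts


# Hypothetical data points ("A" = active, "N" = non-active).
data = [
    ("A", {"A": 0.60, "N": 0.12}),
    ("N", {"A": 0.30, "N": 0.50}),
    ("N", {"A": 0.05, "N": 0.70}),
    ("N", {"A": 0.15, "N": 0.90}),
]

counts_08 = tally(data, confidence=0.8)
counts_09 = tally(data, confidence=0.9)
print(counts_08)
print(counts_09)
```

In this toy data, moving from confidence 0.8 to 0.9 shifts two data points from single-label sets to the two-label set.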

