Small Molecular Graph Generation for Drug Discovery

Deep learning can speed up drug development by learning how the structure and composition of molecules relate to their chemical properties. A generative model can then propose drug candidates by extrapolating to molecular structures it has not seen. This project uses one of the most popular generative models, the Generative Adversarial Network (GAN). The generator is an MLP, and the discriminator is an R-GCN followed by an MLP. Many open-source datasets are available for this purpose, such as the QM9 (Quantum Machines 9) dataset. The GAN is trained on QM9, and its performance is assessed with the following molecular metrics: quantitative estimate of druglikeness (QED), solubility (defined as the log octanol-water partition coefficient, or logP), synthetizability, natural product, drug candidate, valid, unique, novel, and diversity.
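
As a rough illustration of what the R-GCN part of the discriminator computes, the sketch below implements one relational graph convolution step in plain Python. The README does not describe the layer in detail, so the function name `rgcn_layer`, the tanh activation, and the per-bond-type mean over neighbors are assumptions made for illustration, not the project's actual implementation.

```python
import math


def rgcn_layer(h, adj, w_rel, w_self):
    """One relational graph convolution step (illustrative sketch).

    h:      node features, a list of N vectors of length D_in
    adj:    adj[r][i][j] == 1 if atoms i and j share a bond of type r
    w_rel:  one D_in x D_out weight matrix per bond type
    w_self: D_in x D_out weight matrix for the self-connection
    """
    def matvec(vec, mat):
        # Row vector (D_in) times matrix (D_in x D_out) -> vector (D_out).
        return [sum(v * row[k] for v, row in zip(vec, mat))
                for k in range(len(mat[0]))]

    n = len(h)
    out = []
    for i in range(n):
        total = matvec(h[i], w_self)  # self-connection term
        for r, a_r in enumerate(adj):
            neighbors = [j for j in range(n) if a_r[i][j]]
            for j in neighbors:
                msg = matvec(h[j], w_rel[r])
                # Assumed normalization: mean over neighbors of this bond type.
                total = [t + m / len(neighbors) for t, m in zip(total, msg)]
        out.append([math.tanh(t) for t in total])  # assumed activation
    return out


# Toy graph: 3 atoms, bond type 0 links atoms 0-1, bond type 1 links atoms 1-2.
h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adj = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
]
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]
out = rgcn_layer(h, adj, w_rel=[identity, zeros], w_self=identity)
print([[round(x, 3) for x in row] for row in out])  # every entry is tanh(1), about 0.762
```

Each bond type gets its own weight matrix, which lets the discriminator treat, for example, single and double bonds differently when it aggregates information from neighboring atoms.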

Experiment

All experiments are summarized in this notebook.

Result

Quantitative Result

The table below shows the model's performance over 6561 runs, each using a latent vector sampled from a normal distribution, evaluated against the QM9 dataset.

| Metrics | Score |
| --- | --- |
| QED | 0.406 |
| Solubility | 0.317 |
| Synthetizability | 0.344 |
| Natural Product | 0.758 |
| Drug Candidate | 0.478 |
| Valid | 0.797 |
| Unique | 0.033 |
| Novel | 0.790 |
| Diversity | 0.567 |
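
To make the valid, unique, and novel scores more concrete, here is a minimal sketch of how such ratios are commonly computed for molecular GANs. The README does not state the exact definitions used, so the ones below are assumptions: validity as valid over all generated samples, uniqueness as distinct valid samples over valid samples, and novelty as valid samples absent from the training set over valid samples. The function name `generation_metrics` and the toy SMILES lists are hypothetical.

```python
def generation_metrics(generated, training_set):
    """Return (valid, unique, novel) ratios under the assumed definitions.

    generated:    list of canonical SMILES strings, with None for any sample
                  that could not be parsed into a molecule
    training_set: set of canonical SMILES strings from the training data
    """
    valid = [s for s in generated if s is not None]
    if not generated or not valid:
        return 0.0, 0.0, 0.0
    valid_ratio = len(valid) / len(generated)
    unique_ratio = len(set(valid)) / len(valid)
    novel_ratio = sum(s not in training_set for s in valid) / len(valid)
    return valid_ratio, unique_ratio, novel_ratio


# Toy data: five generated samples, one of them invalid.
samples = ["CCO", "CCO", None, "CCN", "C1CC1"]
train = {"CCO", "C"}
print(generation_metrics(samples, train))  # (0.8, 0.75, 0.5)
```

Under these definitions, a high valid score combined with a low unique score means that most valid samples are repeats of a small number of distinct molecules.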

Loss Curve

loss_curve
Generator and discriminator loss curves during GAN training.

Qualitative Result

Here are some sample molecules generated by the model.

qualitative_result
Generated molecules, each shown with its chemical structure, SMILES representation, and QED score.

Credit