This page lists papers and tools published before 2023 that were previously included in the main readme of this repo.
- MolDQN - Optimization of molecules via Deep Reinforcement Learning. [Paper]
- Pasithea - Deep Molecular Dreaming: Inverse Machine Learning for de-novo molecular design and interpretability with surjective representations. [Paper]
- fragment-based-dgm - A Deep Generative Model for fragment-based molecule generation. [Paper]
- MAT - Molecule Attention Transformer for molecular prediction tasks. [Paper]
- GLAMOUR - Chemistry-informed Macromolecule Graph Representation for Similarity Computation, Unsupervised and Supervised Learning. [Paper]
- Transformer-M - One Transformer that can understand both 2D & 3D molecular data. [Paper]
- SynNet - An amortized approach to synthetic tree generation using neural networks. This model can serve as both a synthesis planning tool and as a tool for synthesizable molecular design. [Paper]
- SPIB - State Predictive Information Bottleneck (SPIB), a Deep Learning-based framework that learns reaction coordinates from high-dimensional molecular simulation trajectories. [Paper]
- MolT5 - A self-supervised learning framework for pretraining models on a vast amount of unlabeled natural language text and molecule strings. [Paper]
- DIONYSUS - An extensive study of the calibration and generalizability of probabilistic Machine Learning models on small chemical datasets. [Paper]
- NVIDIA-PCQM4Mv2 - Heterogeneous ensemble of models for Molecular Property Prediction. [Paper]
- JAEGER - JT-VAE Generative Modeling (JAEGER), a deep generative approach for small-molecule design based on the Junction-Tree Variational Auto-Encoder (JT-VAE) method. [JT-VAE paper]
- MoLFormer - A large-scale chemical language model trained on small molecules represented as SMILES strings. [Paper]
- Mol-CycleGAN - A generative model for molecular optimization. [Paper]
- EDM - Equivariant Diffusion for Molecule Generation in 3D. [Paper]
- Meaningful Protein Representation - Learning meaningful representations of protein sequences using a VAE. [Article] [Paper]
- TAPE - Tasks Assessing Protein Embeddings (TAPE), a set of five biologically relevant semi-supervised learning tasks spread across different domains of protein biology. [Paper]
- Protein Sequence Embedding - Learning protein sequence embeddings using information from structure. [Paper]
- IdpGAN - A GAN to generate different 3D conformations for intrinsically disordered proteins given their sequences. [Article]
- PocketMiner - A tool for predicting the locations of cryptic pockets from single protein structures. [Paper]
- progen2 - A suite of open-sourced projects and models for protein engineering and design. [Paper]
- TransformerCPI - Improving compound–protein interaction prediction by sequence-based Deep Learning with self-attention mechanism and label reversal experiments. [Paper]
- EvoBind - In silico directed evolution of peptide binders with AlphaFold2. [Paper]
- ProtGPT2 - A deep unsupervised language model for protein design. [Article]
- Bio Embeddings - General purpose Python embedders based on open models trained on biological sequence representations. [Paper]
- ProteinMPNN - Robust Deep Learning based protein sequence design. [Paper]
- LigandMPNN - Atomic context-conditioned protein sequence design. [Paper]
- MoLPC - Predicting the structure of large protein complexes using AlphaFold and Monte Carlo tree search. [Article]
- foldingdiff - A diffusion model for generating novel protein backbone structures. [Paper]
- ProGen - Suite of open-sourced projects and models for protein engineering and design. [Paper]
- DeepAb - Antibody structure prediction using interpretable Deep Learning. [Paper]
- cdna-display-proteolysis-pipeline - Mega-scale experimental analysis of protein folding stability in biology and protein design. [Paper]
- GearNet - Geometric pretraining methods for Protein Structure Representation Learning. [Paper]
- RFdiffusion - Broadly applicable and accurate protein design by integrating structure prediction networks and diffusion generative models. [Paper]
- ProteinGLUE - A multi-task benchmark suite for self-supervised protein modeling. [Paper]