PyTorch Releases Drug Discovery Platform “TorchDrug”

TorchDrug covers many recent techniques such as graph machine learning, deep generative models, and reinforcement learning.

PyTorch recently announced the release of TorchDrug, a machine learning platform intended to accelerate drug discovery research. The library is open source. If PyTorch and torch-scatter are already installed, it can be installed through pip using

pip install torchdrug

or through conda using

conda install -c milagraph -c conda-forge torchdrug

The platform covers techniques such as graph machine learning, deep generative models and reinforcement learning, and provides reusable training and evaluation routines for popular drug discovery tasks, including property prediction, pretrained molecular representations, de novo molecule design, retrosynthesis and biomedical knowledge graph reasoning. These techniques and modules make it straightforward to build a prototype for one’s own dataset and application.

For advanced users, the platform provides building blocks at multiple levels to meet different customisation demands: low-level data structures and operations (e.g. molecules and graph masking), mid-level layers and modules (e.g. graph convolutions and GNNs) and high-level task routines (e.g. property prediction). It also provides graph data structures and operations for manipulating biomedical objects, as well as reusable layers, models and tasks for building machine learning models.

The core data structures of TorchDrug are graphs, which can represent a wide range of biological objects, including molecules, proteins and biomedical knowledge graphs. The library's visualisation API can be used to inspect graph objects.

A batch of variable-size graphs can be created with the PackedGraph data structure, which builds one unified large graph and re-indexes each small graph in the batch.

Code for creating a batch of four molecules:

from torchdrug import data

# Build one packed batch from four SMILES strings, then move it to the GPU.
mols = data.PackedMolecule.from_smiles(["CCSCCSP(=S)(OC)OC", "CCOC(=O)N", "N(Nc1ccccc1)c2ccccc2", "NC(=O)c1cccnc1"])
mols = mols.cuda()
print(mols)
# PackedMolecule(batch_size=4, num_nodes=[12, 6, 14, 9], num_edges=[22, 10, 30, 18], device='cuda:0')
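
To make the packing idea concrete, here is a minimal pure-Python sketch of how several small graphs might be merged into one large graph by shifting node indices. It is an illustration only, not TorchDrug's implementation: the pack_graphs helper, the (num_nodes, edges) input format and the returned dictionary keys are all assumptions made for this example.

```python
from itertools import accumulate


def pack_graphs(graphs):
    """Pack small graphs into one large graph (hypothetical helper).

    Each graph is a (num_nodes, edges) pair, where edges is a list of
    (source, target) tuples using node indices local to that graph.
    """
    num_nodes = [n for n, _ in graphs]
    # Offset of each graph = total number of nodes in the graphs before it.
    offsets = [0] + list(accumulate(num_nodes))[:-1]
    packed_edges = []
    for (_, edges), offset in zip(graphs, offsets):
        # Re-index: shift local node ids into the packed graph's id space.
        packed_edges.extend((u + offset, v + offset) for u, v in edges)
    return {
        "num_nodes": num_nodes,
        "num_edges": [len(edges) for _, edges in graphs],
        "offsets": offsets,
        "edges": packed_edges,
    }


# Three tiny graphs of different sizes: a path, a single edge, a triangle.
graphs = [
    (3, [(0, 1), (1, 2)]),
    (2, [(0, 1)]),
    (3, [(0, 1), (1, 2), (2, 0)]),
]
batch = pack_graphs(graphs)
print(batch["offsets"])  # [0, 3, 5]
print(batch["edges"])    # [(0, 1), (1, 2), (3, 4), (5, 6), (6, 7), (7, 5)]
```

Because every graph keeps its own contiguous range of node ids, the per-graph sizes are enough to recover each small graph from the packed one, which is why a batch of this kind can hold graphs of different sizes without padding.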

Graphs also support a wide range of indexing operations, typically used to apply node masking, edge masking or graph masking. For training, an optimiser is defined over the parameters of the task, and everything is combined in the core engine, which provides convenient routines for training and testing. Testing the model on the validation set takes only one line.
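
As a rough illustration of what node and edge masking mean, the sketch below applies both to a plain edge list. The node_mask and edge_mask functions, their parameters and the compact flag are hypothetical helpers written for this example; TorchDrug's own method names and exact semantics may differ.

```python
def node_mask(num_nodes, edges, keep, compact=False):
    """Keep only the edges whose two endpoints are both in `keep`."""
    keep_set = set(keep)
    kept_edges = [(u, v) for u, v in edges if u in keep_set and v in keep_set]
    if not compact:
        # Node ids are unchanged; masked-out nodes simply lose their edges.
        return num_nodes, kept_edges
    # Compact: re-index the kept nodes to 0..k-1 in ascending order.
    new_id = {old: new for new, old in enumerate(sorted(keep_set))}
    return len(new_id), [(new_id[u], new_id[v]) for u, v in kept_edges]


def edge_mask(edges, keep):
    """Keep only the edges at the given positions in the edge list."""
    keep_set = set(keep)
    return [edge for i, edge in enumerate(edges) if i in keep_set]


# A 4-node cycle, written as directed edges for brevity.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]

masked = node_mask(4, edges, keep=[1, 2, 3])
compacted = node_mask(4, edges, keep=[1, 2, 3], compact=True)
subset = edge_mask(edges, keep=[0, 3])

print(masked)     # (4, [(1, 2), (2, 3)])
print(compacted)  # (3, [(0, 1), (1, 2)])
print(subset)     # [(0, 1), (3, 0)]
```

Graph masking follows the same pattern one level up: it selects whole graphs from a batch rather than nodes or edges within a single graph.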

TorchDrug is designed to support development at every level, from low-level data structures and operations, through mid-level layers and models, to high-level tasks. Modules at any level can be customised with little effort by using building blocks from the level below.


The correspondence between modules and the hierarchical interface is:

  • torchdrug.data: Graph data structures and graph operations; e.g. molecules.
  • torchdrug.datasets: Datasets; e.g. QM9.
  • torchdrug.layers: Neural network layers and loss layers; e.g. message-passing layer.
  • torchdrug.models: Representation learning models; e.g. message passing neural network.
  • torchdrug.tasks: Task-specific routines; e.g. molecule property prediction.
  • torchdrug.core: Engine for training and evaluation.

Machine learning for drug discovery is a fast-growing area, and the PyTorch team expects that TorchDrug will help more people get involved in this interdisciplinary field. To learn more about TorchDrug, see the Colab tutorials, which cover basic usage and several drug discovery tasks.

Victor Dey
Victor is an aspiring data scientist who holds a Master of Science in Data Science & Big Data Analytics. He is a researcher, a data science influencer and a former university football player. A keen learner of new developments in data science and artificial intelligence, he is committed to growing the data science community.
