PyTorch Releases Drug Discovery Platform “TorchDrug”

TorchDrug covers many recent techniques such as graph machine learning, deep generative models, and reinforcement learning.

PyTorch recently announced the release of TorchDrug, a machine learning platform intended to accelerate drug discovery research. The library is open source. If PyTorch and torch-scatter are already installed, it can be installed through pip using

pip install torchdrug

or through conda using

conda install -c milagraph -c conda-forge torchdrug

TorchDrug covers many recent techniques such as graph machine learning, deep generative models, and reinforcement learning. It also provides reusable training and evaluation routines for popular drug discovery tasks, including property prediction, pretrained molecular representations, de novo molecule design, retrosynthesis and biomedical knowledge graph reasoning. It is easy to build a prototype for one’s own dataset and application based on these techniques and modules.

For advanced users, the platform provides building blocks at multiple levels to meet different customisation needs: low-level data structures and operations (e.g. molecules and graph masking), mid-level layers and models (e.g. graph convolutions and GNNs) and high-level task routines (e.g. property prediction). It also provides graph data structures and operations for manipulating biomedical objects, as well as reusable layers, models and tasks for building machine learning models.

The core data structures of TorchDrug are graphs, which can represent a wide range of biological objects, including molecules, proteins and biomedical knowledge graphs. The library's visualisation API can be used to inspect graph objects.

A batch of variable-size graphs can be created with the PackedGraph data structure, which builds one unified large graph and re-indexes each small graph in the batch.

Code for constructing a batch of 4 molecules:

from torchdrug import data

mols = data.PackedMolecule.from_smiles(["CCSCCSP(=S)(OC)OC", "CCOC(=O)N", "N(Nc1ccccc1)c2ccccc2", "NC(=O)c1cccnc1"])
mols = mols.cuda()  # move the batch to the GPU (requires a CUDA device)
print(mols)
# PackedMolecule(batch_size=4, num_nodes=[12, 6, 14, 9], num_edges=[22, 10, 30, 18], device='cuda:0')
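
The idea of packing several small graphs into one large graph can be illustrated without the library. The sketch below is plain Python, not TorchDrug's implementation: the function name pack_graphs and the (num_nodes, edges) representation are assumptions made for illustration. Each graph's node ids are shifted by the number of nodes that precede it in the batch, which is the re-indexing step described above.

```python
from itertools import accumulate


def pack_graphs(graphs):
    """Pack small graphs into one large graph (illustrative helper).

    Each graph is a (num_nodes, edges) pair, where edges is a list of
    (src, dst) tuples using local node ids 0..num_nodes-1.
    """
    num_nodes = [n for n, _ in graphs]
    # offsets[i] = total number of nodes in all graphs before graph i
    offsets = [0] + list(accumulate(num_nodes))[:-1]
    packed_edges = []
    num_edges = []
    for (_, edges), offset in zip(graphs, offsets):
        # Shift local node ids into the id space of the packed graph
        packed_edges.extend((src + offset, dst + offset) for src, dst in edges)
        num_edges.append(len(edges))
    return {
        "num_nodes": num_nodes,
        "num_edges": num_edges,
        "offsets": offsets,
        "edges": packed_edges,
    }


# Two toy graphs with undirected edges stored in both directions:
# a 3-node path and a 2-node pair.
path = (3, [(0, 1), (1, 0), (1, 2), (2, 1)])
pair = (2, [(0, 1), (1, 0)])
packed = pack_graphs([path, pair])
print(packed["edges"])
```

Because the per-graph node and edge counts are kept alongside the shifted edge list, any single graph can be recovered from the packed batch by slicing and subtracting its offset.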

Graphs also support a wide range of indexing operations. Typical usages include applying node masking, edge masking or graph masking. For training, an optimiser is defined over the parameters of the task, and everything is combined into core.Engine. The engine provides convenient routines for training and testing; evaluating the model on the validation set takes only one line.
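
To make the masking operations concrete, here is a minimal plain-Python sketch of node masking and edge masking on an edge list. It is one possible interpretation, not TorchDrug's code: the helper names node_mask and edge_mask are hypothetical, and the choice to renumber the surviving nodes compactly in node_mask is an assumption.

```python
def node_mask(num_nodes, edges, keep):
    """Return the subgraph induced by the node ids in `keep`.

    Surviving nodes are renumbered 0..len(keep)-1 in sorted order, and
    only edges whose two endpoints both survive are retained.
    """
    kept_nodes = sorted(set(keep))
    new_id = {old: new for new, old in enumerate(kept_nodes)}
    kept_edges = [
        (new_id[src], new_id[dst])
        for src, dst in edges
        if src in new_id and dst in new_id
    ]
    return len(kept_nodes), kept_edges


def edge_mask(num_nodes, edges, keep):
    """Keep only the edges at the positions in `keep`; nodes are unchanged."""
    return num_nodes, [edges[i] for i in keep]


# A directed 4-node chain: 0 -> 1 -> 2 -> 3
chain_edges = [(0, 1), (1, 2), (2, 3)]
sub_nodes, sub_edges = node_mask(4, chain_edges, keep=[1, 2, 3])
same_nodes, fewer_edges = edge_mask(4, chain_edges, keep=[0, 2])
print(sub_nodes, sub_edges)
print(same_nodes, fewer_edges)
```

Graph masking follows the same pattern one level up: it selects whole graphs out of a packed batch instead of nodes or edges within one graph.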

TorchDrug is designed to cater to all kinds of development. This ranges from low-level data structures and operations, mid-level layers and models, to high-level tasks. One can easily customise modules at any level with minimal effort by utilising building blocks from a lower level.

The correspondence between modules and the hierarchical interface is:

  • torchdrug.data: Graph data structures and graph operations; e.g. molecules.
  • torchdrug.datasets: Datasets; e.g. QM9.
  • torchdrug.layers: Neural network layers and loss layers; e.g. message-passing layer.
  • torchdrug.models: Representation learning models; e.g. message passing neural network.
  • torchdrug.tasks: Task-specific routines; e.g. molecule property prediction.
  • torchdrug.core: Engine for training and evaluation.
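
As a rough illustration of how these modules fit together, the sketch below follows the pattern of the library's quick-start example as best recalled: a dataset, a representation model, a task, an optimiser and an engine. The wrapper function run_quickstart, its default arguments and the hyperparameters are illustrative assumptions, and running it requires torch and torchdrug to be installed and the ClinTox dataset to be available at the given path.

```python
def run_quickstart(dataset_path="~/molecule-datasets/", num_epoch=1, gpus=None):
    """Train a GIN property predictor on ClinTox and evaluate on validation."""
    # Imported inside the function so that loading this sketch does not
    # itself require torch or torchdrug to be installed.
    import torch
    from torchdrug import core, datasets, models, tasks

    # torchdrug.datasets: load a dataset and split it 80/10/10
    dataset = datasets.ClinTox(dataset_path)
    lengths = [int(0.8 * len(dataset)), int(0.1 * len(dataset))]
    lengths.append(len(dataset) - sum(lengths))
    train_set, valid_set, test_set = torch.utils.data.random_split(dataset, lengths)

    # torchdrug.models: a graph isomorphism network as the representation model
    model = models.GIN(input_dim=dataset.node_feature_dim,
                       hidden_dims=[256, 256, 256, 256],
                       short_cut=True, batch_norm=True, concat_hidden=True)

    # torchdrug.tasks: wrap the model in a property prediction routine
    task = tasks.PropertyPrediction(model, task=dataset.tasks,
                                    criterion="bce", metric=("auprc", "auroc"))

    # Optimiser over the task's parameters, combined into torchdrug.core.Engine
    optimizer = torch.optim.Adam(task.parameters(), lr=1e-3)
    solver = core.Engine(task, train_set, valid_set, test_set, optimizer,
                         batch_size=1024, gpus=gpus)
    solver.train(num_epoch=num_epoch)
    # Evaluating on the validation split is a single call
    return solver.evaluate("valid")
```

Calling run_quickstart() with the defaults trains for one epoch on the CPU; passing gpus=[0] would move training to the first GPU, assuming one is available.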

Machine learning for drug discovery is a fast-growing area, and the PyTorch team expects TorchDrug to help more people get involved in this interdisciplinary field. To learn more about TorchDrug, you can check out the Colab tutorials, which cover basic usage and several drug discovery tasks.

Victor Dey
Victor is an aspiring Data Scientist & is a Master of Science in Data Science & Big Data Analytics. He is a Researcher, a Data Science Influencer and also an Ex-University Football Player. A keen learner of new developments in Data Science and Artificial Intelligence, he is committed to growing the Data Science community.
