According to a report published last year, the cost of finding a new medicine is a whopping $2.5 billion, up from $1.8 billion in 2010. A popular drug discovery methodology is virtual screening, where computer programs are used to find compounds that react with a target in the desired way and so may have therapeutic properties. Combing through a large database of chemicals is a daunting task. Analysing structures in drug design involves massive dimensionality, and this is where machine learning comes into the picture.
Machine learning for drug discovery typically involves training a model on a set of known active and known inactive compounds, estimating the probability that each candidate compound is active, and then ranking the candidates by that probability.
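The train, score and rank workflow described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the method used in the research: it swaps in a tiny logistic regression for the classifier, and the 4-bit feature vectors, the compound names and the helper functions (train_logistic, predict_proba, rank_by_activity) are all hypothetical.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, labels, epochs=500, lr=0.5):
    """Fit a logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log loss w.r.t. the raw score
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def predict_proba(model, x):
    """Estimated probability that a compound with features x is active."""
    weights, bias = model
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def rank_by_activity(model, candidates):
    """Return (name, probability) pairs, most likely active first."""
    scored = [(name, predict_proba(model, x)) for name, x in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy training set (made up): each bit flags a hypothetical substructure.
train_x = [[1, 1, 0, 0], [1, 0, 1, 0], [0, 0, 1, 1], [0, 1, 0, 1]]
train_y = [1, 1, 0, 0]  # 1 = known active, 0 = known inactive

model = train_logistic(train_x, train_y)

# Unlabelled candidates to be screened virtually (also made up).
candidates = {
    "cand_A": [1, 1, 1, 0],
    "cand_B": [0, 0, 0, 1],
    "cand_C": [1, 0, 0, 1],
}
ranking = rank_by_activity(model, candidates)
for name, prob in ranking:
    print(f"{name}: {prob:.3f}")
```

In practice the feature vectors would come from a molecular representation such as fingerprints or a molecular graph, and the top of the ranking would be passed on for physical testing.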
In order to explore how machine learning can help accelerate the drug discovery process, Google teamed up with pharma partners and demonstrated a novel virtual screening method that uses a graph convolutional neural network (GCNN).
Improving on existing “virtual screening” methods, which find potential molecules computationally rather than in a lab, is an active area of research. However, the challenge here is to build a method that works well enough across a wide range of chemical space to be useful for finding small molecules that have a physically verified, useful interaction with a protein of interest, i.e., “hits”.
Overview Of The Model
In a recently published work titled “Machine learning on DNA-encoded libraries: A new paradigm for hit-finding”, Google’s Accelerated Science team collaborated with X-Chem Pharmaceuticals to demonstrate an effective new method for finding biologically active molecules. The method combines physical screening with DNA-encoded small-molecule libraries and virtual screening using a graph convolutional neural network (GCNN).
This research, in turn, has led to the creation of an initiative called Chemome, a cooperative project between Google and ZebiAI to enable the discovery of many more small molecule chemical probes for biological research.
A chemical library is a collection of stored chemicals with associated information such as the chemical structure, purity, quantity, and physicochemical characteristics of each compound.
The physical part of the screening process uses DNA-encoded small molecule libraries (DELs), which contain many distinct small molecules in one pool, each of which is attached to a fragment of DNA serving as a unique barcode for that molecule.
Given the physical screening data returned for a particular protein, the researchers built an ML model using a graph convolutional neural network to predict whether an arbitrarily chosen small molecule will interact with that protein. The physical screening with the DEL provides positive and negative examples for an ML classifier: the small molecules that remain at the end of the screening process are positive examples, and all other molecules are negative examples.
Unlike in many other uses of virtual screening, the process of selecting the molecules to test was automated, or easily automatable, given the results of the model, without the need for a trained chemist.
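The labelling rule described above, where molecules that survive the screen are positives and the rest are negatives, can be illustrated with a short Python sketch. It is a deliberate simplification under stated assumptions: the barcodes, the SMILES strings and the function label_del_library are made up for illustration, and a real pipeline would likely work with sequencing read counts rather than a simple present/absent check.

```python
def label_del_library(library, retained_barcodes):
    """Turn DEL screening output into labelled training examples.

    library: dict mapping a DNA barcode to the molecule it tags.
    retained_barcodes: barcodes read back after the screen; repeats are allowed.
    Returns (molecule, label) pairs, where 1 = positive and 0 = negative.
    """
    retained = set(retained_barcodes)
    return [
        (molecule, 1 if barcode in retained else 0)
        for barcode, molecule in library.items()
    ]


# Hypothetical three-member library: barcode -> SMILES string.
library = {
    "ACGT": "CCO",
    "TTGA": "c1ccccc1",
    "GGCA": "CC(=O)O",
}

# Hypothetical barcodes recovered at the end of the screen.
sequenced = ["TTGA", "TTGA", "GGCA"]

examples = label_del_library(library, sequenced)
for molecule, label in examples:
    print(molecule, label)
```

The resulting (molecule, label) pairs are the kind of input a classifier such as the GCNN would be trained on.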
The Chemome Initiative is expected to efficiently deliver new chemical probes to the research community for thousands of human proteins of interest and ultimately apply the algorithms to further understanding of disease pathways.
“This breakthrough will enable significant new biological discoveries and ultimately accelerate [the] discovery of new therapeutics to treat intractable diseases,” said Rick Wagner, Founder and Director of ZebiAI.
Today, the biopharmaceutical industry is facing unprecedented challenges to its fundamental business model, which may not be sustainable without sufficient innovation. Declining R&D productivity is arguably the most important challenge the industry faces, and R&D investments, both financial and intellectual, must be focused on transforming the traditional biopharmaceutical model. Initiatives such as Chemome could spur significant new biological discoveries and ultimately accelerate new therapeutic discovery for the world.