What is IBM’s GT4SD?

Our goal behind making the toolkit open-source was for research to progress faster in the domain of generative modelling.

As cliched as it may sound, data is truly the oil of the 21st century. However, not all of this data is usable in its raw form, so the trick lies in developing models and algorithms that can analyse it and derive useful information from it. One of the most promising ways to do this is through generative models, which serve many short-term and long-term applications. They hold the potential to automatically learn the natural features of a dataset, whether categories, dimensions or something else. They are being used as a starting point to design and discover new drugs, to create solutions to challenging problems in healthcare, sustainability and other fields, and to generate new knowledge.

IBM has long worked to accelerate innovation and to foster an open community around scientific discovery. In a blog post, IBM says it wants to make technologies like AI available as tools for carrying out research more quickly and efficiently, rather than as something that requires very specific domain knowledge from the user. To that end, the company recently launched the Generative Toolkit for Scientific Discovery (GT4SD), an open-source library that accelerates hypothesis generation in scientific discovery by making state-of-the-art generative models easy to adopt. GT4SD includes models for generating new molecule designs conditioned on properties such as target proteins, target omic profiles, scaffold distances, binding energies and other targets relevant to materials and drug discovery.

What is GT4SD?

In an interview, the lead creator of GT4SD, Matteo Manica, said that the motivation behind GT4SD came from the urge to simplify access to AI technologies and the need to fill a gap in generative models. To this end, Manica and his team focused on developing the technology quickly. In an effort that lasted 10-11 months, the team gathered the algorithms already developed at IBM Research and looked outside IBM for additional components the library might require.

The GT4SD library offers an environment for running inference and for fine-tuning generative models for specific domains using custom data sets. It is compatible with popular deep learning libraries such as PyTorch, PyTorch Lightning and HuggingFace Transformers, as well as with the GuacaMol and MOSES molecular generation benchmarks, and serves a range of applications such as drug discovery.

The library makes it easier to develop problem-specific intelligence through automatic workflows for retraining on user data covering molecular structures and properties. This replaces manual processes and removes human bias from the discovery process, accelerating the development of expert knowledge overall. GT4SD’s common framework makes generative models easily available and accessible to a broad community, including AI practitioners who want to deploy their models with just a few lines of code. The library offers a centralised environment in which scientists and students can access and explore a variety of pretrained models. It provides commands and interfaces for inference and retraining, with customisable parameters, across different generative models.
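To make the idea of a common framework concrete, here is a minimal, self-contained sketch of how several generative models might sit behind one interface with customisable parameters. It does not use GT4SD and does not reproduce its API: `REGISTRY`, `register`, `GeneratorConfig`, `sample` and both toy samplers are hypothetical names invented for this illustration, and the "models" simply append characters to an input string rather than generate real molecules.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical registry mapping an algorithm name to a sampling function.
# These names are illustrative assumptions, not part of GT4SD.
Sampler = Callable[[str, random.Random], str]
REGISTRY: Dict[str, Sampler] = {}


def register(name: str) -> Callable[[Sampler], Sampler]:
    """Decorator that adds a sampler to the shared registry under `name`."""
    def decorator(fn: Sampler) -> Sampler:
        REGISTRY[name] = fn
        return fn
    return decorator


@register("toy-scaffold-decorator")
def decorate_scaffold(target: str, rng: random.Random) -> str:
    # Toy stand-in for a model: append one random fragment to the input string.
    return target + rng.choice(["C", "O", "N", "Cl", "F"])


@register("toy-chain-extender")
def extend_chain(target: str, rng: random.Random) -> str:
    # Toy stand-in for a model: append one to three carbon symbols.
    return target + "C" * rng.randint(1, 3)


@dataclass
class GeneratorConfig:
    algorithm: str   # key into REGISTRY
    target: str      # conditioning input, e.g. a scaffold written as a string
    seed: int = 0    # fixed seed so that sampling is reproducible


def sample(config: GeneratorConfig, n: int) -> List[str]:
    """Draw `n` candidates from whichever algorithm the config selects."""
    sampler = REGISTRY[config.algorithm]
    rng = random.Random(config.seed)
    return [sampler(config.target, rng) for _ in range(n)]


if __name__ == "__main__":
    # The same call works for every registered algorithm.
    for name in REGISTRY:
        config = GeneratorConfig(algorithm=name, target="c1ccccc1", seed=42)
        print(name, sample(config, 3))
```

The point of the sketch is the shape of the call: the caller picks an algorithm by name, supplies a target and parameters, and asks for samples, without needing to know how each model works internally.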

GT4SD is available on GitHub. IBM’s short-term goal for GT4SD is to expand the toolkit’s portfolio and release new algorithms, frameworks and pretrained models. The ultimate goal is to build an open community that accelerates scientific discovery and creates solutions to some of the most challenging problems in the world.

“Our goal behind making the toolkit open-source was for research to progress faster in the domain of generative modelling. For any company, revenue is obviously important, and it can be difficult to see how you can make revenue from an open-source project. But the main return we’re looking for here is to create a community of users and contributors that helps us to build better models and who we can help empower to build better models,” said Manica.

Shraddha Goled

I am a technology journalist with AIM. I write stories focused on the AI landscape in India and around the world with a special interest in analysing its long term impact on individuals and societies. Reach out to me at shraddha.goled@analyticsindiamag.com.