How Artificial Intelligence Is Reviving Proteomics

Proteomics is a field of study that deals with the analysis of the protein component of a cell or a tissue under a set of defined conditions. It is used to detect protein expression patterns under a particular stimulus and determine the functional protein networks at a cell or tissue level. Proteomics has major applications in medicine and drug development.

Over time, proteomics has grown into a leading method for identifying and characterising proteins, thanks to the copious genomic sequence data available today. Developments in mass spectrometry, protein fractionation techniques and bioinformatics have taken proteomics to the next level.


A proteomics workflow involves:

  • Fractionating or separating complex protein or peptide mixtures
  • Acquiring, by mass spectrometry, the data needed to identify individual proteins
  • Analysing and assembling the mass spectrometry data with bioinformatics

Mass Spectrometry-Based Proteomics

The field of ‘omics’, which includes genomics, proteomics, and metabolomics, has been a game-changer in personalised medicine and healthcare. In proteomics, which can also reveal the protein products of abnormal genes, protein profiling through novel mass spectrometric methods can help identify and classify thousands of proteins.

Mass spectrometry is an analytical method for characterising biological samples. Because it supports targeted, untargeted, and high-throughput analyses, it is a preferred method in proteomics. It generates large datasets, which call for informatics approaches such as machine learning to analyse and interpret the data. According to the paper ‘Application of machine learning to proteomics data: classification and biomarker identification in postgenomics biology’, machine learning techniques can be applied in two ways:

  • Directly on the mass spectral peaks
  • On the proteins identified by sequence database searching
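The first route, learning directly on spectral peaks, can be illustrated with a minimal sketch. It assumes each spectrum is a list of (m/z, intensity) pairs, bins the peaks into a fixed-length vector, and classifies with a nearest-centroid rule. The function names, bin settings, labels and toy spectra are all hypothetical, and a real study would use far richer preprocessing and models.

```python
from math import dist

def bin_spectrum(peaks, mz_min=0.0, mz_max=1000.0, bin_width=100.0):
    """Turn (m/z, intensity) peaks into a normalised fixed-length vector."""
    n_bins = int((mz_max - mz_min) / bin_width)
    vector = [0.0] * n_bins
    for mz, intensity in peaks:
        if mz_min <= mz < mz_max:
            vector[int((mz - mz_min) // bin_width)] += intensity
    total = sum(vector)
    # Normalise so spectra with different overall intensity are comparable.
    return [v / total for v in vector] if total else vector

def centroid(vectors):
    """Element-wise mean of equally long vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def train(labelled_spectra):
    """Compute one centroid per class label."""
    by_label = {}
    for label, peaks in labelled_spectra:
        by_label.setdefault(label, []).append(bin_spectrum(peaks))
    return {label: centroid(vectors) for label, vectors in by_label.items()}

def classify(model, peaks):
    """Return the label whose centroid is closest to the binned spectrum."""
    vector = bin_spectrum(peaks)
    return min(model, key=lambda label: dist(model[label], vector))

# Hypothetical toy data, invented purely for illustration.
TRAINING = [
    ("healthy", [(150.0, 10.0), (420.0, 5.0)]),
    ("healthy", [(160.0, 8.0), (430.0, 6.0)]),
    ("disease", [(150.0, 2.0), (820.0, 12.0)]),
    ("disease", [(170.0, 3.0), (810.0, 9.0)]),
]

model = train(TRAINING)
print(classify(model, [(155.0, 9.0), (425.0, 4.0)]))  # prints "healthy"
```

The second route works the same way in principle, except that the feature vector holds abundances of identified proteins rather than binned peak intensities.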

A wide range of proteins can be identified from the analysed samples using mass spectrometry and machine learning techniques. These techniques have been pivotal in biomarker discovery for many types of diseases. The approach has advantages over two-dimensional gel electrophoresis, enzyme-linked immunosorbent assays (ELISAs), protein arrays and affinity separation.

Recent Developments in the Use of AI/ML in Proteomics

Mass spectrometry, in its conventional form, poses some challenges in recognising protein patterns correctly. The technique does not measure proteins directly. Instead, it analyses smaller fragments: amino acid sequences of up to 30 building blocks. The measured spectra of these sequences are then compared with a database and assigned to specific proteins. Since the evaluation software uses only a part of the spectra for comparison, certain proteins are not recognised correctly or completely.
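The database comparison step can be sketched in a heavily simplified form. The example below cuts protein sequences into peptides using the usual trypsin rule (cleave after K or R, but not before P), computes each peptide's monoisotopic mass, and looks up an observed mass within a tolerance. This matches peptide masses only, which is a stand-in for the fragment spectrum comparison that real search engines perform. The protein IDs, sequences and tolerance are hypothetical.

```python
import re

# Standard monoisotopic residue masses in daltons.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini

def tryptic_digest(protein):
    """Split after K or R unless the next residue is P; drop empty pieces."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]

def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified, uncharged peptide."""
    return sum(MONO[residue] for residue in peptide) + WATER

def match_mass(observed, database, tolerance=0.01):
    """Return (protein_id, peptide) pairs whose mass is within tolerance."""
    hits = []
    for protein_id, sequence in database.items():
        for peptide in tryptic_digest(sequence):
            if abs(peptide_mass(peptide) - observed) <= tolerance:
                hits.append((protein_id, peptide))
    return hits

# Hypothetical two-protein database, invented for illustration.
DATABASE = {"protA": "AAKPGGRTTK", "protB": "GGKAAR"}

observed = peptide_mass("TTK")  # stands in for a measured peptide mass
print(match_mass(observed, DATABASE))
```

With a mass-only lookup like this, different peptides of similar mass become ambiguous, which is one reason real pipelines compare fragment spectra as well.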

Recognising this, a team from the Technical University of Munich used proteomic data to train a neural network that recognises protein patterns quickly and with almost no errors. The resulting AI software, called Prosit, is considered a breakthrough in proteomics research. Prosit was trained on 100 million mass spectra and can be used with all common mass spectrometers without additional training.
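Models of this kind are generally described as predicting the spectrum a given peptide should produce, so that the prediction can be scored against the measured spectrum. The sketch below shows one commonly used score, the normalised spectral angle, computed over two aligned intensity vectors. It is not Prosit's code or API. The function name and the intensity values are hypothetical, and the vectors are assumed to list the same fragment ions in the same order.

```python
from math import acos, pi, sqrt

def spectral_angle(predicted, observed):
    """Normalised spectral angle: 1.0 for identical shape, 0.0 for orthogonal."""
    dot = sum(p * o for p, o in zip(predicted, observed))
    norm = sqrt(sum(p * p for p in predicted)) * sqrt(sum(o * o for o in observed))
    if norm == 0:
        return 0.0  # an empty spectrum carries no evidence
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / norm))
    return 1.0 - 2.0 * acos(cosine) / pi

# Hypothetical fragment intensities, invented for illustration.
predicted = [0.10, 0.60, 0.25, 0.05]
good_match = [0.12, 0.55, 0.28, 0.05]
poor_match = [0.60, 0.05, 0.05, 0.30]

print(spectral_angle(predicted, good_match))
print(spectral_angle(predicted, poor_match))
```

A candidate peptide whose predicted intensities line up with the measurement scores close to 1, which gives the identification step more evidence than the presence or absence of peaks alone.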

Apart from optimising mass spectrometry for proteomics, AI is also useful in speeding up the analysis of massive datasets. Conventional methods such as microscopy and fluorescence resonance energy transfer (FRET) techniques call for a high level of expertise. Researchers from the Novo Nordisk Foundation Center for Protein Research and the Niels Bohr Institute have developed a machine learning algorithm that recognises protein patterns quickly, allowing datasets to be classified in mere seconds.

Wrapping Up

The study of proteomics is crucial for the early diagnosis, prognosis, and monitoring of diseases as serious as cancer. It also plays a vital role in drug development. One major challenge in this field is that the proteome, the set of proteins in a cell, tissue or organism, fluctuates over time. In such cases, artificial intelligence and machine learning techniques can help with quick and accurate protein pattern recognition and classification.

Shraddha Goled
I am a technology journalist with AIM. I write stories focused on the AI landscape in India and around the world with a special interest in analysing its long term impact on individuals and societies. Reach out to me at
