Why Do Companies Prefer Pre-Trained Models?

Building a model from scratch requires a great deal of time and effort. When innovations are happening at breakneck speed, concentrating all efforts on building models could prove counterproductive. Beyond time and effort, building models can also set companies back financially, especially startups.

Enter pre-trained models. Put simply, a pre-trained model is one that a third party has already built and trained to solve a particular problem. It may not be 100 percent accurate for every use case, but it saves the time and effort of building a model from the ground up.

Advantages

Building a custom model involves collecting relevant training data, extracting features, developing frameworks, and creating an interface, among other steps. Further, you might need data engineers, data scientists, platform engineers and business domain experts to build an efficient model.

For fields such as cybersecurity, medicine, and autonomous vehicles, a custom model is a good idea as it gives a competitive advantage. But for most fields, it may turn out to be a long-drawn-out process that takes the focus away from the actual task.

Pre-trained models are generally supplied by a vendor and accessed as a service through APIs. The vendor handles the training data, feature extraction, hyperparameters, and frameworks, while a business uses the vendor’s API to send its own sample data to the pre-trained model. Such models require minimal setup and can be quickly integrated into an application. They also need little upfront investment, since users access them on a cloud platform via an API. They are usually trained with finely tuned parameters, which tends to result in high accuracy.
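
To make the API-as-a-service idea concrete, here is a minimal Python sketch of how an application might send sample data to a hosted pre-trained model. The endpoint URL, the API key, the "instances" payload field and the function names are all hypothetical placeholders, not any particular vendor's API; a real service will document its own request format.

```python
import json
import urllib.request

# Hypothetical endpoint and key: placeholders, not a real vendor API.
API_URL = "https://api.example.com/v1/models/sentiment/predict"
API_KEY = "YOUR_API_KEY"


def build_prediction_request(samples, url=API_URL, api_key=API_KEY):
    """Package sample data as a JSON POST request for a hosted model."""
    # The "instances" field name is an assumption made for this sketch.
    body = json.dumps({"instances": samples}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def predict(samples, timeout=10):
    """Send the samples to the hosted model and return the parsed JSON reply."""
    request = build_prediction_request(samples)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

The application only prepares data and reads the response; training, tuning and hosting stay on the vendor's side, which is where the savings in setup time come from.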

For example, NVIDIA recently announced the release of its production-ready pre-trained models. The company said the pre-trained models would support conversational AI applications and Transfer Learning Toolkit 3.0.

Transfer Learning

A 2019 Dimensional Research study revealed that up to 96 percent of organisations face training data problems, both in quality and quantity. A machine learning model requires up to 100,000 data samples to perform effectively, the study said.

Transfer learning takes the knowledge captured by a pre-trained model and applies it to a different but related task, so the new task needs far less training data. One of the most recent and popular examples of transfer learning is GPT-3, a language prediction model.
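
The idea can be illustrated with a toy Python sketch: a "pre-trained" feature extractor is kept frozen, and only a small new classifier head is trained on a handful of examples from the new task. The weights, the dataset and the function names below are made up for illustration; in practice the frozen part would be a real model obtained from a vendor or model hub.

```python
import math

# Stand-in for a pre-trained feature extractor. These weights are made-up
# constants for illustration; real ones would come from a trained model.
PRETRAINED_WEIGHTS = [[1.0, -1.0], [0.5, 0.5]]

# Tiny dataset for the new task (label is 1 when the first value is larger).
NEW_TASK_DATA = [
    ([2.0, 0.0], 1),
    ([0.0, 2.0], 0),
    ([1.5, 0.5], 1),
    ([0.5, 1.5], 0),
]


def extract_features(x):
    """Frozen layer: a fixed linear map followed by tanh. Never updated."""
    return [
        math.tanh(sum(w * xi for w, xi in zip(row, x)))
        for row in PRETRAINED_WEIGHTS
    ]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(data, epochs=200, lr=0.5):
    """Train only a new logistic-regression head on the frozen features."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, label in data:
            feats = extract_features(x)
            pred = sigmoid(sum(w * f for w, f in zip(weights, feats)) + bias)
            error = pred - label  # gradient of the log loss w.r.t. the logit
            weights = [w - lr * error * f for w, f in zip(weights, feats)]
            bias -= lr * error
    return weights, bias


def predict(x, weights, bias):
    """Probability that x belongs to class 1 under the trained head."""
    feats = extract_features(x)
    return sigmoid(sum(w * f for w, f in zip(weights, feats)) + bias)


head_weights, head_bias = train_head(NEW_TASK_DATA)
```

Because the extractor is reused as-is, only three numbers (two head weights and a bias) are learned from four examples. That is the reason transfer learning eases the data quantity problem described above.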

Tech companies such as Microsoft, IBM, AWS, and NVIDIA use transfer learning toolkits to eliminate the need for building models from scratch. The approach has also helped address the data quality and quantity challenges and has improved performance on machine learning tasks.

Model-As-A-Service

With more and more startups cropping up, the demand for pre-trained models has escalated. Startups are looking for easy and cost-effective deployment of pre-trained models.

The popular marketplaces for pre-trained models include:

Amazon AWS Marketplace: AWS offers models for purchase and deployment through its SageMaker service. SageMaker labels and prepares data, chooses an algorithm, and trains and tunes it. This ‘marketplace’ includes algorithms for computer vision, speech recognition, image text, audio and video, and NLP.

ModelDepot: This machine learning marketplace provides pre-trained weights and allows the deployment of pre-trained image classification using the REST API. Under its free plan, the website offers training for up to 20,000 images but with no option to save the custom model. For the hosted plan, ModelML offers 1,000 trained images for $2. ModelDepot’s offerings are most suitable for image recognition, text recognition, and creating generative models.

Modzy: Modzy Marketplace offers access to pre-trained and retrainable models from leading tech companies. Every model available on the platform comes with a detail page containing information about model architecture, training data, validation data, and performance metrics.

BigML: It offers ML models for classification, regression, and time-series forecasting. BigML’s offerings are ideal for areas such as energy, financial services, IoT, and electronics. It has an ‘API-first’ approach as it delivers all features first to the REST API and has libraries for all popular languages.

Shraddha Goled

I am a technology journalist with AIM. I write stories focused on the AI landscape in India and around the world with a special interest in analysing its long term impact on individuals and societies. Reach out to me at shraddha.goled@analyticsindiamag.com.