Last updated August 2, 2023

Working with Generative AI Just Got Faster!

With Monster API, users can access powerful generative AI models without the hassle of managing GPU infrastructure or breaking the bank.


Published on August 2, 2023

by K L Krithika


Open source models such as Falcon, Llama, Stable Diffusion and GPT-J are not easy to work with. It gets even more complicated when you have to test all of them against your requirements and specific use cases, and it is definitely an expensive affair.

But, not anymore.

“You can now test Llama 2 in less than 10 minutes,” said AI expert Santiago, introducing Monster API, a new tool that lets you access powerful generative AI models such as Falcon, Llama, Stable Diffusion and GPT-J without having to worry about managing the models or scaling them up to handle lots of requests.

Santiago said that he has been working with the Monster API platform for a while now and is impressed with the level of accessibility it provides to open source generative AI models. “They take care of the GPU infrastructure, containerisation, Kubernetes clusters, scalability, etc,” he said, adding that you only need to focus on your code integration.

Further, he said that it leverages a distributed GPU network, so that users can access these models at a fraction of the cost.

Here is the source code of this example: https://t.co/BTWnlaraWu.

You can also join @monsterapis’ Discord server for the latest updates, free credits, and special offers: https://t.co/A7axoM858K.

Thanks to the team @monsterapis for partnering with me on this post.
— Santiago (@svpino) July 31, 2023

Decentralised GPU

Founded by brothers Gaurav Vij and Saurabh Vij in June 2023, Monster API taps the idle computing power of millions of decentralised crypto mining rigs worldwide, optimises them for machine learning and packages them with popular generative AI models.

In other words, it uses distributed computing to bring down the cost of training a foundational model. Cryptocurrency mining needs high levels of compute deployed on GPUs. Now that interest in crypto is in decline, many of these devices are gathering dust. Gaurav Vij, founder of QBlocks, said, “We eliminate the need to worry about GPU infrastructure, containerization, setting up a Kubernetes cluster, and managing scalable API deployments as well as offering the benefits of lower costs. One early customer has saved over $300,000 by shifting their ML workloads from AWS to Monster API’s distributed GPU infrastructure.”

His company offers data scientists, researchers, designers and developers a decentralised GPU network at rates that are up to 10x more affordable. He said, “Most of the machine learning developers today rely on AWS, Google Cloud, Microsoft Azure to get resources and end up spending a lot of money.”

The artificial intelligence world is struggling to find enough hardware to meet its need for computing power. Demand has outstripped supply, says Saurabh Vij, founder and CEO of Monster API. He further explained, “You can take a pre-trained foundational model; you can take datasets from free datasets like Hugging Face and quickly start fine-tuning these foundational models for your custom dataset. This can be done for under 30 to 40 dollars instead of hundreds of dollars, which you otherwise could spend on fine-tuning these models.” The company has cut fine-tuning costs by up to 90% through optimisation, with fees of around $30 per model.
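To put those figures side by side, here is a minimal back-of-the-envelope sketch. It assumes the "up to 90%" reduction and the roughly $30 fee describe the same fine-tuning job, which the company's statements do not spell out. The function name `baseline_cost` is purely illustrative and is not part of any Monster API tooling.

```python
def baseline_cost(discounted_cost: float, reduction: float) -> float:
    """Return the pre-discount cost implied by a discounted cost.

    `reduction` is the fractional saving, e.g. 0.90 for a 90% cut.
    """
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be at least 0 and below 1")
    return discounted_cost / (1 - reduction)


# Assumption: the quoted fee and the quoted cut refer to the same job.
fee_per_model = 30.0   # dollars, from the article
claimed_cut = 0.90     # "up to 90%", from the article

implied = baseline_cost(fee_per_model, claimed_cut)
print(f"Implied cost without optimisation: about ${implied:.0f}")
print(f"Implied saving per model: about ${implied - fee_per_model:.0f}")
```

Under that assumption, a $30 fee after a 90% cut implies a starting cost of roughly $300, which fits the "hundreds of dollars" Saurabh mentions.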

Their website also offers developers no-code fine-tuning of large language models, with support for Llama 2, Alpaca, Falcon 7B, StableLM 3B and more.

Other Similar Platforms

Monster API is not alone. Several tools, including the likes of Gooey.AI, Illusion AI and H2O.ai, built with languages like PHP, Python and Java, are now cropping up around these models, acting as the middleman between the models and the user. They take care of the more difficult processes, such as providing on-demand access to a pool of GPUs, reducing the cost of training and refining, and providing APIs for natural language technologies and computer vision applications. They also make fine-tuning accessible to a wider audience through no-code visual or graphical interfaces, making it possible for more people to take advantage of state-of-the-art models.

H2O recently introduced Driverless AI as a tool. Driverless AI automates complex data science and machine learning tasks, including feature engineering, model validation, tuning, selection, and deployment. In practice, it selects the best features, fine-tunes the models, and creates a simple and fast way to use them in real-world applications. AT&T has scaled its adoption of H2O Driverless AI, with more than 380 employees using it across 80 business units, saving the company money.

These are only two examples of an entire ecosystem providing no-code/low-code platforms for generative AI models. These platforms have considerable drawbacks, however. Businesses are under pressure to deliver applications faster, and these options are cost-effective, but their security and scalability are limited as of now. A number of open source projects and companies, like Alteryx, KNIME and Dataiku, are constantly improving on these issues.
