Last updated February 28, 2024
In Innovation in AI

Why Stability AI Trails Behind RunwayML

Stability AI’s new animation SDK is trying to stand up to RunwayML.

Published on May 15, 2023
by Anirudh VK


Of late, RunwayML has enjoyed something of a runaway success as the trend of AI-generated videos grows rapidly. From pizza commercials to mockups of early 2000s home videos and short films, text-to-video is quickly becoming the new paradigm of generative AI.

In line with the trend, Stability AI has released a new SDK for Stable Diffusion that allows for the creation of animations. With the SDK, users can prompt with just text, an image with text, or a video with text to create output animations. What began with Meta’s Make-A-Video has now become the new frontier of generative AI algorithms. However, a few key players are conspicuously missing from the lineup.

Too little, too late

The new release by Stability AI is a software development kit which works with Stable Diffusion 2.0 and Stable Diffusion XL. The SDK can influence the output through a variety of parameters, from general-purpose parameters like style presets, cadence, and FPS (frames per second) to more in-depth parameters that influence characteristics like colours, 3D depth, and post-processing.
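To make the general-purpose parameters concrete, here is a minimal Python sketch of how FPS and cadence might interact. It is not the SDK's actual API: the `AnimationSettings` class, its field names, and the helper functions are hypothetical, and it assumes that cadence means a fresh frame is diffused once every `cadence` frames, with the frames in between filled in by other means.

```python
from dataclasses import dataclass


# Hypothetical settings object for illustration only; these names are
# assumptions and do not come from the Stable Animation SDK.
@dataclass
class AnimationSettings:
    fps: int = 12                # frames shown per second of output
    cadence: int = 2             # assumed: diffuse one frame every `cadence` frames
    style_preset: str = "none"   # placeholder for a named look


def total_frames(settings: AnimationSettings, seconds: float) -> int:
    """Number of frames needed for a clip of the given length."""
    return round(settings.fps * seconds)


def diffused_frame_indices(settings: AnimationSettings, seconds: float) -> list[int]:
    """Indices of frames that would be freshly diffused under the assumed cadence."""
    return list(range(0, total_frames(settings, seconds), settings.cadence))
```

Under these assumptions, a two-second clip at 12 FPS needs 24 frames, and a cadence of 3 would mean only 8 of them are generated by the model directly.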

While this SDK is a good step forward for Stability AI, it seems they are late to the party. Similar solutions, built on Stability’s own models, have existed in the market for a while now. Deforum, an online community of AI image creators and artists, has created a web demo for text-to-animation. However, Deforum is fairly basic, as it just melds similar images generated by SD into each other, creating the illusion of an animation.
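One simple way to picture images being melded into each other is a linear cross-fade between two generated frames. The sketch below is an illustration only, not Deforum's actual pipeline: the `blend` and `crossfade` functions are hypothetical, and frames are represented as plain nested lists of grayscale values to keep the example self-contained.

```python
def blend(frame_a, frame_b, t):
    """Linearly mix two equally sized frames; t=0 gives frame_a, t=1 gives frame_b."""
    return [
        [(1 - t) * a + t * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]


def crossfade(frame_a, frame_b, steps):
    """Return `steps` in-between frames, excluding the two endpoint frames."""
    return [blend(frame_a, frame_b, i / (steps + 1)) for i in range(1, steps + 1)]
```

Playing the in-between frames in order between each pair of generated images gives the impression of motion even though no frame was generated with knowledge of the others.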

The true competitor to the Stable Animation SDK is RunwayML’s Gen-2, a text-to-video service. This new model, whose paper is yet to be released, builds upon Gen-1’s capabilities of style transfer and video modification to generate video from just a text prompt. Similar to the Stable Animation SDK, users can use text, images, or videos as prompts to generate videos from scratch.

While RunwayML’s Gen-2 can only be accessed through a waitlist, it is a complete product which can be used without any technical knowledge. The Stable Animation SDK, on the other hand, is targeted at developers who wish to extend the capabilities of Stable Diffusion’s models.

Even as video generation is emerging as the next big genAI technology, it seems that many of the companies that capitalised on text-to-image are nowhere to be found.

RunwayML: The new DALL-E?

Early last year, OpenAI released DALL-E 2, an image generation algorithm, which kickstarted a wave of innovation. Then came Midjourney, Stable Diffusion, Imagen, and more, catapulting generative AI into the mainstream. However, with the innovations surrounding text-to-video, a lot of these companies have stayed silent, especially OpenAI.

With the release of ChatGPT, and subsequently GPT-4, it seems that OpenAI is content with grooming its golden goose. As such, we have not seen any improvements to DALL-E, apart from its integration into Bing Chat. There is also no talk about any text-to-video model from the AI giant, counting it out of the newest wave of innovation.

Midjourney has also not provided any information on possible text-to-video algorithms, instead choosing to focus on increasing its market lead by adding new features to its image generator. However, it seems that research is leading to innovation, as it did just before the explosion of text-to-image models.

Meta’s AI research wing released a paper in September last year that detailed an approach to generating video without the need for text-video data pairs. Similarly, ByteDance, the company behind TikTok, also released a research paper harnessing the power of diffusion models to generate videos. While neither of these models has been released to the public, the research suggests that the ideas behind these approaches are sound, backed up by the variety of generated videos on their websites.

Google, in collaboration with the Korea Advanced Institute of Science and Technology, followed suit with a paper on projected latent video diffusion models. This paper, however, was published with code, allowing for the replication of this approach. Building on the concept of feature-to-video diffusion models, a team from Alibaba released ModelScope on Hugging Face, which is open for all to use. This is the only service, apart from Deforum, that is open for use.

While the text-to-video market is still in its infancy, the AI-generated commercials show but an inkling of what is possible with video-generating algorithms. Meta has also released a set of generative AI tools targeted at advertisers on its platforms, so it is not implausible to think that Make-A-Video could be integrated into them in the future. Just as with any generative AI solution, the potential for innovation is boundless.


