
From Losing the AI Art Race to Winning It, Meta Says ‘Make A Video’

The company on Thursday announced Make-A-Video, a new AI system that turns text prompts into brief, soundless video clips


AI art tools are changing the idea of creativity and getting wackier every week. In the span of just a few years, AI art generators have gone from producing incomprehensible pictures to strikingly realistic content. Now, researchers at Meta AI have taken the next leap in prompt-driven generation: the company on Thursday announced Make-A-Video, a new AI system that turns text prompts into brief, soundless video clips.

“Generative AI research is pushing creative expression forward by giving people tools to quickly and easily create new content,” Meta said in a blog post on Thursday. “With just a few words or lines of text, Make-A-Video can bring imagination to life and create one-of-a-kind videos full of vivid colours and landscapes.”

The Make-A-Video webpage includes short clips of home-video quality that look fairly realistic, generated from prompts such as “A robot dancing at Times Square” and “Hyper-realistic spaceship landing on mars”.

Apart from text-to-video generation, the tool can add motion to static images and fill in the content between two images. Given an existing video, Make-A-Video can also generate different variations of it. Head to Make-A-Video’s webpage to see more of what it can do.

The Tech Check 

In his Facebook post, Mark Zuckerberg highlighted how much harder it is to generate videos than photos: beyond correctly generating each pixel, the system also has to predict how those pixels change over time.

The key technology behind Make-A-Video, and the reason it has arrived sooner than anticipated, is that it builds on existing text-to-image synthesis work used in image generators like OpenAI’s DALL-E.

Instead of training the Make-A-Video model on labelled text-video pairs, Meta combined image synthesis data with unlabelled video footage, so the model learns both what a text or image prompt should look like and how scenes move through time and space. It can then predict what comes after a still image and display the scene in motion for clips of under five seconds.
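To make that recipe concrete, here is a minimal, hypothetical PyTorch sketch of the general “frozen spatial layers plus trainable temporal layers” idea: a pretrained per-frame network is kept fixed while new attention layers, which only mix information across frames, are trained on unlabelled video. All class and layer names below are illustrative stand-ins, not Meta’s actual architecture.

```python
import torch
import torch.nn as nn

class TemporalAttention(nn.Module):
    """Self-attention over the time axis only: each spatial position
    attends across frames, so this layer learns motion, not appearance."""
    def __init__(self, channels: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, c, h, w = x.shape                       # (batch, time, chan, H, W)
        # Fold spatial positions into the batch so attention mixes frames only.
        seq = x.permute(0, 3, 4, 1, 2).reshape(b * h * w, t, c)
        out, _ = self.attn(seq, seq, seq)
        return out.reshape(b, h, w, t, c).permute(0, 3, 4, 1, 2)

class PseudoVideoModel(nn.Module):
    """A frozen per-frame conv stands in for the pretrained text-to-image
    network; only the temporal layer would be trained on unlabelled video."""
    def __init__(self, channels: int = 32):
        super().__init__()
        self.spatial = nn.Conv2d(channels, channels, 3, padding=1)
        for p in self.spatial.parameters():
            p.requires_grad = False                   # reuse image knowledge as-is
        self.temporal = TemporalAttention(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, c, h, w = x.shape
        frames = self.spatial(x.reshape(b * t, c, h, w)).reshape(b, t, c, h, w)
        return self.temporal(frames)

clip = torch.randn(1, 8, 32, 16, 16)   # 8 frames of 16x16 feature maps
print(PseudoVideoModel()(clip).shape)  # torch.Size([1, 8, 32, 16, 16])
```

The real system is considerably more elaborate (the paper describes spatiotemporal layers plus frame-interpolation and super-resolution networks), but the split above captures why paired text-video data was not required.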

In the paper, Meta’s researchers note that Make-A-Video is trained on pairs of images and captions, along with unlabelled video footage. The training content was sourced from the WebVid-10M and HD-VILA-100M datasets, which contain millions of videos spanning thousands of hours of footage, including stock footage from sites like Shutterstock that was scraped from the web. As per the paper, the model has several technical limitations beyond blurry footage and disjointed animation. For instance, its training methods cannot learn information that might only be inferred by a human watching a video.

What can go wrong?

Make-A-Video isn’t yet available to the public. The preview examples show its potential, but, as with every machine learning model, there are worrying prospects. Anna Eshoo, a California Democrat, expressed some of those concerns, noting in a September letter that Stable Diffusion had been used “to create photos of violently beaten Asian women and pornography depicting real people”.

The Meta research team preemptively scrubbed the Make-A-Video training dataset of NSFW imagery and toxic phrasing. But the opportunity for misuse of Make-A-Video is not a small one: the output of such tools could readily be used for misinformation and propaganda.

Earlier this year, a group of researchers from Tsinghua University and the Beijing Academy of Artificial Intelligence (BAAI) released CogVideo, the only other publicly available text-to-video model. Unfortunately, it suffers from the same limitations as Meta’s model.

The tool is not yet available to the public, but you can sign up here to join the waitlist for future access.


Tasmia Ansari

Tasmia is a tech journalist at AIM, looking to bring a fresh perspective to emerging technologies and trends in data science, analytics, and artificial intelligence.