Google has introduced Lumiere, a text-to-video diffusion model designed to synthesise videos with realistic, diverse, and coherent motion. Unlike existing models, which typically generate sparse keyframes and then fill in the frames between them, Lumiere generates the entire clip in a single, temporally consistent pass, thanks to its Space-Time U-Net architecture.
The model is aimed at creative visual-content generation, producing realistic or surrealistic video clips up to five seconds long.
It can animate still images, generate video from natural-language text prompts, and perform video inpainting. It is built on a Space-Time U-Net architecture wrapped around a text-to-image (T2I) model that operates in pixel space; because the base model works at low resolution, a spatial super-resolution module is applied to produce high-resolution output.
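To make the space-time idea concrete, here is a minimal PyTorch sketch, not Lumiere's actual code, of a factorized space-time block of the kind such an architecture could stack: a per-frame spatial convolution followed by a per-pixel temporal convolution whose stride halves the number of frames. All class, variable names, and shapes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SpaceTimeBlock(nn.Module):
    """Hypothetical factorized space-time block (illustrative only)."""
    def __init__(self, channels: int):
        super().__init__()
        # Spatial conv applied per frame (kernel size 1 on the time axis).
        self.spatial = nn.Conv3d(channels, channels,
                                 kernel_size=(1, 3, 3), padding=(0, 1, 1))
        # Temporal conv applied per pixel (kernel size 1 on spatial axes);
        # stride 2 along time halves the number of frames.
        self.temporal = nn.Conv3d(channels, channels,
                                  kernel_size=(3, 1, 1), stride=(2, 1, 1),
                                  padding=(1, 0, 0))
        self.act = nn.SiLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, frames, height, width)
        x = self.act(self.spatial(x))
        return self.act(self.temporal(x))

video = torch.randn(1, 64, 16, 32, 32)   # 16 frames of 32x32 features
out = SpaceTimeBlock(64)(video)
print(out.shape)  # torch.Size([1, 64, 8, 32, 32]) -- time halved
```

Downsampling in time as well as space is what lets the network reason over the whole clip at once rather than frame by frame.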
Furthermore, Lumiere offers stylised generation: given a single reference image, it can produce videos in the target style by leveraging fine-tuned text-to-image model weights. The model can also animate a still image, or only a selected region of it, and can fill in missing areas with high-quality results.
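As a rough illustration of how fine-tuned T2I weights could drive stylised generation, the sketch below linearly blends a base checkpoint with a style-fine-tuned one. The function, the `alpha` parameter, and the state-dict keys are hypothetical assumptions, not Lumiere's actual API.

```python
import torch

def blend_weights(base: dict, styled: dict, alpha: float = 0.5) -> dict:
    """Linearly interpolate between base and style-fine-tuned weights.

    alpha=0 keeps the base model; alpha=1 uses the styled weights fully.
    (Illustrative sketch; keys and blending scheme are assumptions.)
    """
    return {name: (1 - alpha) * base[name] + alpha * styled[name]
            for name in base}

# Toy state dicts standing in for real T2I checkpoints.
base = {"unet.weight": torch.randn(4, 4)}
styled = {"unet.weight": base["unet.weight"] + 0.1 * torch.randn(4, 4)}
blended = blend_weights(base, styled, alpha=0.7)
```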
Lumiere does have limitations: it is not designed to generate videos consisting of multiple shots or involving transitions between scenes. Even so, it represents a significant advance in text-to-video generation. It remains a research project, and its release for broader use may depend on addressing various policy considerations.
At the time of writing, OpenAI does not offer a publicly available video generation model through its API. The company is actively researching the area, however, and there are hints that something may be in the works alongside the release of GPT-5.