Published on August 22, 2022

After Text-to-Image, Now it’s Text-to-Video

Text-to-video generation is challenging because of high compute costs and a lack of good datasets; however, researchers are now breaking these barriers.

By Pritam Bordoloi

When OpenAI announced DALL-E in 2021, the internet fell in love with the text-to-image AI generator. It helped AI become more mainstream. While its successor, DALL-E 2, is the most popular, there are other budding AI image generators such as Midjourney, Craiyon, and Imagen.

Development in the text-to-video segment, however, has faced several hurdles. The computation cost of text-to-video generation is far higher, which makes training from scratch nearly unaffordable. The lack of relevant datasets adds to the problem. Even so, researchers across the globe are now slowly breaking these barriers. Let’s look at some of the most recent, noteworthy developments in this space.

Stable Diffusion teams up with Runway

Stable Diffusion is a new text-to-image generator launched earlier in August 2022, and it is completely open source. In an interview with Yannic Kilcher, Emad Mostaque said, “DALL-E 2 was a fantastic experience, but Stable

Pritam Bordoloi

I have a keen interest in creative writing and artificial intelligence. As a journalist, I dive deep into the world of technology and analyse how it’s restructuring business models and reshaping society.