In a recent interview with the Wall Street Journal, OpenAI CTO Mira Murati shared that the company will make its latest text-to-video generator, Sora, publicly accessible later this year. When Sora was unveiled in February, users marvelled at its hyper-realistic videos, with many calling it the “ChatGPT moment for video.”
The model generates realistic scenes from text prompts, and the initial rollout will primarily target visual artists and filmmakers. Murati also disclosed plans to incorporate sound and more flexible editing into Sora-generated videos.
Sora was trained on publicly available and licensed data, including content from Shutterstock. In July, the company expanded its partnership with OpenAI, signing a new six-year agreement to provide high-quality training data.
However, Murati noted that because Sora is “much more expensive” to run, the team is still working to make it financially viable, as OpenAI has done with its other products such as DALL·E and ChatGPT.
The team behind Sora is led by Tim Brooks and William Peebles, both research scientists at OpenAI, and Aditya Ramesh, the creator of DALL·E and head of Videogen.
OpenAI’s all-new text-to-video tool can create videos of up to 60 seconds featuring highly detailed scenes, complex camera motion, and multiple characters with vivid emotions. Where language models operate on text tokens, Sora operates on visual patches: chunks of video spanning space and time that play a role analogous to tokens.
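To make the patch idea concrete, here is a minimal, hypothetical sketch of how a video tensor might be split into spacetime patches. The patch sizes and shapes are illustrative assumptions, not Sora’s actual values.

```python
import numpy as np

def to_spacetime_patches(video, pt=2, ph=16, pw=16):
    """Split a (T, H, W, C) video into flattened spacetime patches.

    Each patch covers `pt` frames and a `ph` x `pw` pixel region, and
    becomes one "token"-like row. Patch sizes here are assumptions.
    """
    T, H, W, C = video.shape
    assert T % pt == 0 and H % ph == 0 and W % pw == 0
    x = video.reshape(T // pt, pt, H // ph, ph, W // pw, pw, C)
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)   # group the patch-grid axes first
    return x.reshape(-1, pt * ph * pw * C)  # one flattened row per patch

video = np.zeros((8, 64, 64, 3))            # 8 frames of 64x64 RGB
patches = to_spacetime_patches(video)
print(patches.shape)                        # (64, 1536): 64 patches, 1536 values each
```

A transformer can then attend over this sequence of patches just as an LLM attends over a sequence of text tokens.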
Sora is a stepping stone toward AGI, as the company teaches the model to understand and simulate real-world interactions. Meanwhile, GPT-4 turns one year old today, and it may be time for OpenAI to come up with GPT-5.