OpenAI chief Sam Altman used Sora to bring Cred founder Kunal Shah’s imaginative prompt to life, creating a video of a bicycle race on the ocean.
Altman had a busy Thursday as OpenAI introduced its text-to-video generation model, Sora. Eager to engage with users, he took to X and urged them to contribute prompt ideas, which served as the creative fuel for the videos he crafted.
Shah’s prompt read, “A bicycle race on the ocean with different animals as athletes riding the bicycles with a drone camera view.”
OpenAI’s Sora is designed to understand and simulate complex scenes featuring multiple characters, specific types of motion, and intricate details of the subject and background. The model not only interprets user prompts accurately but also keeps characters and visual style consistent throughout the generated video.
One of Sora’s standout features is its ability to take existing still images and breathe life into them, animating the content with precision and attention to detail. It can also extend an existing video or fill in its missing frames, showcasing its versatility in manipulating visual data.
Sora builds on past research in DALL·E and GPT models. It uses the recaptioning technique from DALL·E 3, which involves generating highly descriptive captions for the visual training data.
While Sora’s capabilities are impressive, OpenAI acknowledges certain weaknesses, such as difficulty accurately simulating the physics of complex scenes and occasional confusion about spatial details in prompts.