Stability AI recently announced the launch of a new text-to-image generator, ‘Stable Diffusion’. The image-generating tool competes with the likes of DALL-E 2, Midjourney, Imagen and others. Unlike other text-to-image models, Stable Diffusion is open-source and has no content filter.
Recently, the company also launched the beta version of its platform, DreamStudio, which allows users to log in and generate their first 200 images for free.
The model has been released under the CreativeML OpenRAIL-M license, a permissive license that allows both commercial and non-commercial use.
With Stable Diffusion, the company aims to empower billions of people to create stunning art within seconds. The team calls it a breakthrough in speed and quality, as the model can run on consumer GPUs.
Tesla Director of AI Andrej Karpathy lauded Stable Diffusion, claiming that the tool marks a day of historic proportions for human creativity.
Stable Diffusion is the brainchild of researchers at Stability AI, a London- and Los Altos-based startup. The researchers include Patrick Esser from Runway and Robin Rombach from the Machine Vision & Learning research group at LMU Munich, who have previously worked on Latent Diffusion Models, with support from communities at EleutherAI, RunwayML, LMU Munich, LAION, and others.
The company is also planning to build an alternative for PowerPoint.
About Stable Diffusion
The text-to-image generation model ‘Stable Diffusion’ has been built upon the work of the team at CompVis and Runway on their widely used latent diffusion model, combined with insights from the conditional diffusion models of their lead generative AI developer Katherine Crowson, DALL-E 2 by OpenAI, Imagen by Google Brain and others.
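For readers unfamiliar with diffusion models, the core idea behind this family of systems can be sketched in a few lines: an image is gradually corrupted with Gaussian noise over many steps, and a network is trained to reverse that corruption. The snippet below is a toy illustration of the closed-form forward noising process only, with a hypothetical linear variance schedule; it is not Stable Diffusion's actual code or schedule.

```python
import numpy as np

# Toy forward noising process used in diffusion models (hypothetical
# linear schedule, for illustration only):
#   x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise

rng = np.random.default_rng(0)

T = 1000                             # number of diffusion steps
betas = np.linspace(1e-4, 0.02, T)   # per-step noise variances
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)      # cumulative signal-retention factor

def add_noise(x0, t):
    """Sample x_t from q(x_t | x_0) in closed form."""
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * noise

x0 = np.ones((4, 4))        # a stand-in "image"
x_early = add_noise(x0, 10)  # mostly signal at small t
x_late = add_noise(x0, 999)  # almost pure noise at large t

# The signal coefficient sqrt(alpha_bar_t) shrinks toward zero as t grows,
# which is what the trained model learns to undo step by step.
print(float(np.sqrt(alpha_bars[10])), float(np.sqrt(alpha_bars[999])))
```

A latent diffusion model such as Stable Diffusion applies this process in a compressed latent space rather than pixel space, which is a large part of why it can run on consumer GPUs.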
“We are delighted that AI media generation is a cooperative field and hope it can continue this way to bring the gift of creativity to all,” shared the team at Stability AI, in its blog post.
Stability AI’s founder Emad Mostaque said the company will continue to release faster and better models. “Not just in image, audio next month, then we move onto 3D, video,” added Mostaque.