Following Dall-E 2 and Midjourney, the deep learning model Stable Diffusion (SD) marked a leap forward in the text-to-image domain. Developed by Stability AI, SD democratises text-conditional image generation because it is efficient enough to run on consumer-grade GPUs.
SD is amazing, but unfortunately, it isn’t trivial to set up (especially for people without good GPUs).
Here’s a list of tools built on SD with zero technical skills needed!
One-Click Tools
Programmes that bundle SD into a single installable package: no separate setup, minimal git or technical skill needed, and usually one or more UIs included.
- Diffusion Bee
With a one-click installer, Diffusion Bee is a very simple way to run SD locally on an M1 Mac. No dependencies or technical knowledge are needed. It runs locally on your computer; no data is sent to the cloud except requests to download the weights and check for software updates.
System Requirement(s):
- M1/M2 Mac
- 16 GB RAM is preferred, as it runs slowly with 8 GB RAM
- MacOS 12.5.1 or later
Check the GitHub repository here.
- Stable Diffusion UI
Another one-click installer which provides a browser UI for generating images from text and image prompts. Just enter your text prompt, and see the generated image. Currently, it does not run on Mac.
System Requirement(s):
- Windows 10/11 or Linux. Experimental support for Mac is coming soon.
- NVIDIA graphics card, preferably with 4GB or more of VRAM. Without a compatible graphics card, it’ll automatically run in the slower “CPU Mode”.
- Minimum 8 GB of RAM.
Check the GitHub repository here.
- Charl-E
CHARL-E packages SD into a simple application. No complex setup, dependencies, or internet connection is required; just download it and say what you want to see.
Check the GitHub repository here.
- NMKD Stable Diffusion GUI – AI Image Generator
An ML toolkit for text-to-image generation on your local hardware. Currently, the programme only works on Nvidia GPUs (AMD GPUs are not supported).
System Requirement(s):
Minimum:
- GPU: Nvidia GPU with 4 GB VRAM, Maxwell Architecture (2014) or newer
- RAM: 8 GB RAM (Note: Pagefile must be enabled as swapping will occur with only 8 GB!)
- Disk: 12 GB (another free 2 GB for temporary files recommended)
Recommended:
- GPU: Nvidia GPU with 8 GB VRAM, Pascal Architecture (2016) or newer
- RAM: 16 GB RAM
- Disk: 12 GB on SSD (another free 2 GB for temporary files recommended)
Check the GitHub repository here.
- ImaginAIry
Pythonic generation of SD images with just pip install imaginairy. “Just works” on Linux and macOS (M1). Recent updates include memory efficiency improvements, prompt-based editing, face enhancement, upscaling, tiled images, img2img, prompt matrices, prompt variables, and BLIP image captions, along with a Dockerfile and a Colab notebook.
System Requirement(s):
- ~10 GB space for models to download.
- A computer with either a CUDA-supported graphics card or an M1 processor.
- Preferably Python 3.10 installed.
- For macOS, rust and setuptools-rust must be installed to compile the tokenizer library. (They can be installed via: curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh and pip install setuptools-rust.)
Check the GitHub repository here.
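The requirements listed above can be checked before installing. The following is a minimal sketch and not part of ImaginAIry itself: the function name missing_prerequisites and its parameters are hypothetical, it only looks for rustc on the PATH (not setuptools-rust), and it does not try to detect a CUDA card or an M1 chip.

```python
import platform
import shutil
import sys


def missing_prerequisites(python_version, system, has_rustc, free_gb):
    """Return a list of warnings based on the requirements listed above.

    All parameters are plain values so the check is easy to reason about:
    python_version is a (major, minor, ...) tuple, system is the value of
    platform.system(), has_rustc is a bool, free_gb is free disk space in GB.
    """
    problems = []
    if tuple(python_version[:2]) != (3, 10):
        problems.append("Python 3.10 is preferred")
    if system == "Darwin" and not has_rustc:
        problems.append("macOS needs rust to compile the tokenizer library")
    if free_gb < 10:
        problems.append("about 10 GB of free space is needed for the models")
    return problems


if __name__ == "__main__":
    # Gather the real values for the current machine and report any warnings.
    free_gb = shutil.disk_usage(".").free / 1e9
    issues = missing_prerequisites(
        sys.version_info,
        platform.system(),
        shutil.which("rustc") is not None,
        free_gb,
    )
    for issue in issues:
        print("warning:", issue)
```

An empty list means none of the three checks failed; it says nothing about the GPU requirement.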
Web Distros
- Mage Space
Unfiltered SD for text-to-image generation. The latest feature is Image2Image, which lets you choose an image to combine with your prompt.
Check out the website here.
- Dreamlike.art
At the time of writing, the website is completely free for several more days. If you run out of credits, go to the “Buy Credits” page and click “Buy”. You won’t be charged. The balance will be reset once the site exits its beta test and adds payments.
Check out the website here.
- FindAnything.App
Finding images through a search engine is difficult, and you may end up accidentally publishing copyrighted images or spending a lot of money to get the images you need.
This browser extension adds novel images alongside your Google image searches. You are no longer limited to a few options, as is the case with most stock images.
Check out the website here.
Major SD Forks
A fork is a copy of a repository that lets you make changes to a project without affecting the original. You can fetch updates from the original repository or submit changes to it with pull requests.
- Automatic1111 – SD Web UI
A browser interface for SD based on the Gradio library. It offers the original text-to-image and image-to-image modes and a one-click install-and-run script (though you must still install Python and git). Features include outpainting, inpainting, prompt matrix, Stable Diffusion upscale, and more.
Ensure the required dependencies are met and follow the installation instructions for NVIDIA (recommended) or AMD GPUs.
Check the GitHub repository here.
- InvokeAI
This SD version features a slick WebGUI, an interactive command-line script that combines text-to-image and image-to-image functionality in a “dream bot” style interface, and a range of other enhancements. It runs on Windows, Mac and Linux machines.
System Requirement(s):
- An NVIDIA-based graphics card with ~4 GB or more of VRAM.
- Or an Apple computer with an M1 chip.
- ~12 GB of main memory (RAM).
- ~12 GB of disk space for the ML model, Python, and all its dependencies.
Check the GitHub repository here.
- Waifu Diffusion
Waifu Diffusion is a project based on CompVis/Stable-Diffusion. The Stable Diffusion model is fine-tuned on anime-style artwork: more than 56k images from Danbooru (an anime/manga drawing site).
System Requirement(s):
- ~30 GB of VRAM is needed.
- ~30 GB of storage if you don’t mind cleaning up every so often.
Check the GitHub repository here.
- Basujindal: Optimized Stable Diffusion
This repository is a modified version, optimised to use less VRAM than the original at the cost of inference speed. To reduce VRAM usage, the Stable Diffusion model is divided into four parts, which are sent to the GPU only when needed. After computation, each part is moved back to the CPU. The attention calculation is also done in parts.
Check the GitHub repository here.
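To illustrate the two ideas described above, here is a toy sketch in plain Python. It is not code from the repository: the part sizes are made-up numbers, and peak_resident_mb and attention_in_chunks are hypothetical names. The first function shows why keeping only one part on the GPU at a time lowers peak memory; the second shows that attention computed over slices of the queries gives the same result as computing it in one go.

```python
import math


def peak_resident_mb(part_sizes_mb, offload):
    """Peak 'GPU' memory when model parts run one after another.

    With offload=True only the active part is resident at any time;
    otherwise every part stays loaded for the whole run.
    """
    if offload:
        return max(part_sizes_mb)
    return sum(part_sizes_mb)


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    top = max(row)
    exps = [math.exp(x - top) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention on lists of row vectors."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    width = len(values[0])
    out = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
        weights = softmax(scores)
        out.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(width)]
        )
    return out


def attention_in_chunks(queries, keys, values, chunk_size):
    """Same result as attention(), but handles chunk_size queries at a time.

    In a tensor implementation this caps the score matrix at
    chunk_size x len(keys) instead of len(queries) x len(keys).
    """
    out = []
    for start in range(0, len(queries), chunk_size):
        out.extend(attention(queries[start:start + chunk_size], keys, values))
    return out
```

With four hypothetical parts of 1000, 800, 600 and 400 MB, offloading brings the peak down from the sum of all parts to the size of the largest one. The price is the repeated transfer between CPU and GPU, which is where the slower inference comes from.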