Google researchers recently introduced a new text-based image editing method called ‘Imagic’. Imagic takes an input image and a text prompt describing the desired edit, and produces the edited image as output. The researchers claim that, unlike previous methods, it requires no additional inputs beyond the image and the prompt.
The semantic image editing method can perform a wide variety of edits, including style changes, colour changes, and object additions, while preserving the features of the original image. It is the first of its kind, enabling complex semantic edits on a single input image. In one of their examples, the researchers applied different target texts, such as ‘a sitting dog’ and ‘a jumping dog’, to a single input image of a dog and successfully generated the corresponding edited images.
Check out the different prompts they used for multiple single input images here.
Imagic builds on a pre-trained text-to-image diffusion model, a class of models capable of generating high-quality images that match a given text prompt. The method first optimises a text embedding so that it aligns with both the input image and the target text, and then fine-tunes the diffusion model around that embedding, which lets the final edit maintain fidelity to the original image.
By handling a range of edit types on images across different domains, Imagic has demonstrated both the quality and the versatility of its approach.