Choose your player – DALL·E 2 or Midjourney

If you require a more detailed, higher-resolution image and are willing to spend a few dollars, Midjourney is definitely the way to go.
Text-to-image generation tools were the defining AI trend of at least the first half of the year. Not just the tech world but everyone with a curious bone in their body rushed to try them. OpenAI's DALL·E started it, and the market soon filled with similar tools; even giants like Google and Meta jumped in with their own versions.

Today, we compare two of the most powerful text-to-image generators on the market – DALL·E 2 and Midjourney – with identical prompts and dive deep into what makes them unique.

The Technical Titbits

When OpenAI launched DALL·E 2 in April 2022, it changed how the world perceives AI art. DALL·E 2 is a generative model that can create stunning images from natural language instructions or contextual clues.

DALL·E 2 is a large model with 3.5B parameters, but it is not nearly as large as GPT-3 and, interestingly, is smaller than its predecessor, DALL·E (12B). Despite its smaller size, DALL·E 2 generates images at 4x the resolution of DALL·E, and human judges prefer it for caption matching and photorealism over 70 percent of the time. CLIP (Contrastive Language-Image Pre-training) is one of the most important building blocks in the DALL·E 2 architecture, as it is the primary link between text and images.

OpenAI founder Sam Altman recently tweeted about making DALL·E 2 available to 1 million users. As part of this initiative, each user will receive 50 free credits during the first month of use and 15 free credits each month thereafter. In the first beta phase, users can also buy additional credits on top of the free monthly allowance, in increments of 115 credits for USD 15. Each credit pays for one original DALL·E 2 prompt, or one edit or variation prompt. DALL·E 2 produces four images for each natural language prompt and three images for each edit or variation prompt.
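To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above (USD 15 for 115 credits, four images per original prompt, three per edit or variation, 50 free credits in the first month and 15 after that); the function and constant names are our own and purely illustrative, and the pricing may well change after the beta.

```python
# Figures quoted in the article; names are illustrative, not an official API.
PACK_PRICE_USD = 15.0       # price of one paid credit pack
PACK_CREDITS = 115          # credits in one paid pack
IMAGES_PER_GENERATION = 4   # images returned per original prompt
IMAGES_PER_EDIT = 3         # images returned per edit or variation prompt
FIRST_MONTH_FREE = 50       # free credits in the first month
LATER_MONTH_FREE = 15       # free credits in each later month


def cost_per_credit() -> float:
    """Price of a single paid credit in USD."""
    return PACK_PRICE_USD / PACK_CREDITS


def cost_per_image(images_per_credit: int) -> float:
    """Effective price of one image, given how many images a credit yields."""
    return cost_per_credit() / images_per_credit


def free_credits(months: int) -> int:
    """Total free credits received over the first `months` months of use."""
    if months <= 0:
        return 0
    return FIRST_MONTH_FREE + LATER_MONTH_FREE * (months - 1)


if __name__ == "__main__":
    print(f"Per credit:          ${cost_per_credit():.4f}")
    print(f"Per generated image: ${cost_per_image(IMAGES_PER_GENERATION):.4f}")
    print(f"Per edited image:    ${cost_per_image(IMAGES_PER_EDIT):.4f}")
    print(f"Free credits, first 12 months: {free_credits(12)}")
```

On these assumptions, a paid credit works out to roughly 13 cents, or a little over 3 cents per generated image.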

On the other hand, Midjourney is from an independent research lab with the same name whose overarching mission is to “explore new mediums of thought.” They launched a text-to-image service in 2022, which, given a natural language prompt, generates visual depictions that are accurate to the description. 

Prompt: Titanic hitting the iceberg on a snowy night

Midjourney uses invite-only onboarding and runs through a Discord bot, which sends requests to its AI servers and returns the results. When a natural language query is issued, the bot returns four low-resolution images in roughly 30 seconds. At this point, you can generate variants and new generations to get closer to the result you have in mind. You can change the aspect ratio of the output, up to a maximum resolution of 2048×1280, while DALL·E 2 is fixed at 1024×1024.
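For a sense of what that resolution gap means, the short Python sketch below compares the two maximum sizes quoted above. The resolutions come from this article; the helper names are our own, and the numbers say nothing about perceived image quality, only pixel counts and shape.

```python
from math import gcd

# Maximum output sizes quoted in the article (width, height).
MIDJOURNEY_MAX = (2048, 1280)
DALLE2_MAX = (1024, 1024)


def pixel_count(size: tuple[int, int]) -> int:
    """Total number of pixels in an image of the given size."""
    width, height = size
    return width * height


def aspect_ratio(size: tuple[int, int]) -> tuple[int, int]:
    """Width-to-height ratio reduced to lowest terms."""
    width, height = size
    divisor = gcd(width, height)
    return width // divisor, height // divisor


if __name__ == "__main__":
    ratio = pixel_count(MIDJOURNEY_MAX) / pixel_count(DALLE2_MAX)
    print("Midjourney aspect ratio:", aspect_ratio(MIDJOURNEY_MAX))
    print("DALL·E 2 aspect ratio:  ", aspect_ratio(DALLE2_MAX))
    print(f"Midjourney has {ratio:.1f}x the pixels of DALL·E 2")
```

At these sizes, Midjourney's largest image is a wide 8:5 frame with 2.5 times as many pixels as DALL·E 2's square output.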

Once you’ve dug down and found your preferred variant, you can upscale it and pull it down to your local machine. Midjourney, unlike DALL·E 2, combines CLIP with a constantly changing set of image generation methods.

Prompt: A bowl of soup that looks like a monster knitted out of wool
Prompt: An astronaut riding a horse in a photorealistic style

Prompt: Teddy bears mixing sparkling chemicals as mad scientists as a 1990s Saturday morning cartoon

Final Thoughts

Given that both of these tools are works in progress, picking a winner is difficult. DALL·E 2 is good at close-up photographs and discrete objects. It recognises a wide range of pop culture references, especially those in visual media or literary works with film adaptations. DALL·E 2 can create impressively high-quality charcoal or pencil sketches, paintings in the styles of various famous artists, and stranger things like “medieval illuminated manuscripts.”

It works especially well with art styles like “impressionist watercolour painting” or “pencil sketch,” which are more forgiving of flaws in the details. DALL·E 2 can create some absolutely stunning artwork with the right prompts and cherry-picking.

Midjourney can do all of the above and more. It’s exceptional at creating larger scenes. However, cracking the right prompt is perhaps the toughest part. 

Prompt: Wide angle aerial photograph; floating city of Shevat

In the end, it depends on what the user wants to do. If you require a more detailed, higher-resolution image and are willing to spend a few dollars, Midjourney is definitely the way to go.

Sri Krishna
Sri Krishna is a technology enthusiast with a professional background in journalism. He believes in writing on subjects that evoke a thought process towards a better world. When not writing, he indulges his passion for automobiles and poetry.
