Over the last six months, one-upmanship in GenAI image generation has been at a boil. First, Google released Imagen 3, available in Gemini, in August 2024. More recently, in March 2025, ChatGPT added image generation to its GPT-4o model, replacing DALL-E 3 as its default image generator. The release of GPT-4o image generation even led to a viral trend of users creating Studio Ghibli-style cartoon images, which is still causing an uproar.
At MG Labs, we’ve been extensively testing both tools, particularly for creating stock-style B2B and B2G images. Here’s what we found.
GPT-4o vs. Imagen 3: Strengths and weaknesses
Over the course of our trials, we’ve found that Imagen 3 slightly outperforms GPT-4o when it comes to creating B2B stock images (Ghibli-style images notwithstanding). In our experience, Imagen 3 also typically renders images faster. However, Imagen 3 struggles with complex prompts, which GPT-4o handles much more adeptly.
There are also several specific limitations across both platforms, including:
- “AI glow” that keeps most images from looking truly photorealistic
- Challenges with text rendering, which can be garbled and contain misspellings
- Inaccuracy with prompt interpretation, where details don’t exactly end up the way the user intended (despite multiple attempts)
- Inability to accurately create complex scenes or intricate detail (an area where Imagen 3 particularly struggles relative to GPT-4o)
The table below compares the general strengths and weaknesses of GPT-4o vs. Imagen 3 image generation.
| Area | GPT-4o (ChatGPT) | Imagen 3 (Gemini) |
| --- | --- | --- |
| Photorealism | Occasionally lacks true realism (i.e., “AI glow”) | Struggles with accurate proportions |
| Text rendering | Improved but still error-prone | Often unreadable or distorted |
| Interpretation of complex prompts | Tends to be literal | May misrepresent scene elements |
| Detail accuracy | Generally high | Issues with fine details |
With that context, we thought it would be fun to show a few apples-to-apples visual results from the two tools, along with a stock image original to compare them to. We focused on generating tech-relevant, marketing-style images that B2B and B2G customers could use for campaigns, websites, and other common projects.
In the table below, you’ll see three images. The one on the far left is the stock image, which was the inspiration for the prompts that led to the AI-generated images. The generated images are shown side by side, with GPT-4o in the middle column and Imagen 3 in the right column.
As you can see, the results from GPT-4o and Imagen 3 vary depending on the image and its contents. The prompts were the same in both cases, so the differences came down to how each model interpreted the prompt and generated its version of the image.
Testing AI image prompt generation for yourself
Before we wrap up, here is a prompt template you can use for image generation. We didn’t use the full prompt for every image above: each prompt needs to be customized for the image type and application, and some models handle specific asks better than others. Still, it should give you a good starting point for generating very specific images that can be used alongside, or as an alternative to, stock images.
“Create a high-quality image of [main subject or scene], located in [setting or environment], captured from a [camera angle or framing, e.g., aerial view, close-up, over-the-shoulder] perspective.
The subject(s) should be [specific emotion or action, e.g., laughing joyfully, reaching for an object, standing confidently], and the overall mood should be [atmosphere or lighting style, e.g., intense contrast, cozy indoor lighting].
Render the image in [art style or visual influence, e.g., oil painting, cyberpunk, anime, photorealism].
Include [key elements, objects, characters, or background details] that define or enrich the scene.
Generate [number] image variations that differ by [aspects to vary, e.g., background, facial expression, clothing, weather].
Negative prompt: Avoid [elements or styles to exclude, e.g., blurry details, text, logos, specific colors, certain objects].
Final image should be sized at [dimensions or aspect ratio, e.g., 1024×1024, 4:5, 1920×1080].”
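If you plan to reuse the template above across many images, it can help to fill in the bracketed slots programmatically. Below is a minimal Python sketch of that idea. The `ImagePrompt` class, its field names, and the sample values are our own illustrative assumptions, not part of either tool; the script only builds the prompt text, which you would then paste into ChatGPT or Gemini yourself.

```python
from dataclasses import dataclass, field


@dataclass
class ImagePrompt:
    """Hypothetical container for the bracketed slots in the template above."""
    subject: str
    setting: str
    camera: str
    action: str
    mood: str
    style: str
    key_elements: str
    variations: int = 1          # optional: number of variations to request
    vary_by: str = ""            # optional: what should differ between variations
    avoid: list[str] = field(default_factory=list)  # optional negative prompt items
    size: str = "1024x1024"

    def render(self) -> str:
        """Assemble the prompt text, skipping optional lines that are unset."""
        parts = [
            f"Create a high-quality image of {self.subject}, located in "
            f"{self.setting}, captured from a {self.camera} perspective.",
            f"The subject(s) should be {self.action}, and the overall mood "
            f"should be {self.mood}.",
            f"Render the image in {self.style}.",
            f"Include {self.key_elements} that define or enrich the scene.",
        ]
        if self.variations > 1 and self.vary_by:
            parts.append(
                f"Generate {self.variations} image variations that differ "
                f"by {self.vary_by}."
            )
        if self.avoid:
            parts.append("Negative prompt: Avoid " + ", ".join(self.avoid) + ".")
        parts.append(f"Final image should be sized at {self.size}.")
        return "\n".join(parts)


# Sample values are made up for illustration only.
prompt = ImagePrompt(
    subject="a network engineer reviewing a dashboard",
    setting="a modern government operations center",
    camera="over-the-shoulder",
    action="pointing confidently at a monitor",
    mood="cool, even office lighting",
    style="photorealism",
    key_elements="multiple monitors and a headset",
    variations=3,
    vary_by="background and clothing",
    avoid=["blurry details", "text", "logos"],
    size="1920x1080",
)
print(prompt.render())
```

Keeping the optional lines (variations and negative prompt) conditional means you can start with a short prompt and add constraints only when a model misinterprets the scene.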
Want to learn more about the latest advancements in AI-powered tools and solutions? Check out the MG Labs page, where we publish our testing insights and share the AI news you need to know.