Key takeaways
- Extended token windows deliver greater consistency over time: GPT-5 is better at retaining context and staying consistent across conversations, which makes it useful for extended marketing and PR projects.
- Hallucinations and image editing remain a challenge: Despite advancements, GPT-5 still struggles with factual accuracy and image editing, requiring human oversight for verification and refinement.
- GPT-5 works best with a human in the loop: GPT-5 is best used for initial content generation and brainstorming, working as a co-pilot alongside human experts who ensure accuracy and audience resonance.
As a Senior Technology Writer at Merritt Group who’s also a member of the MG Labs team, I was among the many AI enthusiasts eagerly awaiting the release of GPT-5. MG Labs frequently evaluates new AI tools and models to discover how they can help PR pros and marketers drive better results. So when OpenAI’s CEO, Sam Altman, first teased GPT-5 in November 2023, I was excited to put it to the test.
Fast-forward to August of this year, and OpenAI finally released GPT-5 after initially delaying the launch. The rollout has not been without its bumps, however.
It started when OpenAI abruptly removed earlier models without notifying users in advance. Many were upset about losing access to critical workflows they’d spent months fine-tuning. What’s more, GPT-5 failed to live up to the initial hype. Users frequently complained that GPT-5 was “too stiff and formal,” with the model often defaulting to shorter, more abrupt responses rather than the conversational, thoughtfully articulated outputs that GPT-4 is known for. It also made several errors when responding to prompts during a live-streamed rollout event. Despite subsequent updates to the model, GPT-5 continues to struggle to gain traction with users.
Rather than simply taking others’ word for it, I decided to go straight to the source and ask ChatGPT how GPT-5 was an upgrade from previous models.
ChatGPT’s take on the switch from GPT-4 to GPT-5
When I prompted ChatGPT to walk me through the latest updates to GPT-5, it identified eight key differences between the current model and GPT-4:
- Unified, smart model with auto-routing: GPT‑5 replaces the need for manual model selection. It uses a dynamic router to automatically choose between quick, high-throughput responses and deeper “thinking” reasoning based on the task, intent, or content complexity.
- Deeper reasoning and logical consistency: A built-in “thinking” mode allows GPT‑5 to engage in multi-step, structured analysis, improving its ability to solve complex problems compared to earlier GPT models.
- Better accuracy, fewer hallucinations: ChatGPT reports that GPT‑5 delivers more reliable results and fewer hallucinations (inaccurate but confident responses). It is also supposed to say “I don’t know” when uncertain to help boost transparency and trust.
- Significant coding and creative boosts: OpenAI has frequently touted GPT-5 as its strongest model yet for coding. According to ChatGPT, GPT-5 can generate complex front-end code, debug large codebases, and even support “vibe coding” for intuitive app and web design. Writing has also improved: it is more structured, clearer, and better suited for professional, persuasive, or branded content. Some users, however, find it less emotionally striking in creative contexts.
- Expanded context and multimodal capabilities: GPT‑5 supports long-form, coherent conversations—thanks to an extended token window (e.g., 256K context). It’s also natively multimodal, trained simultaneously on text and image data from scratch.
- Personalization and personality features: Users can now select from new “personalities” like Cynic, Robot, Listener, or Nerd to tailor tone and style. Additional personalization options cover voice tone, UI accent colors, and themes.
- Smarter tools and app integration: GPT‑5 bridges into real-world tools; it can integrate with Gmail and Google Calendar, enabling more personalized assistance and productivity support.
- Enhanced safety and alignment: The model integrates safer completion strategies—refusing risky queries while minimizing unnecessary denials for harmless ones. It’s less likely to be sycophantic, offering more nuanced responses.
In short, ChatGPT believes that GPT-5 brings a smarter, faster, and more intuitive user experience with deep reasoning, fewer hallucinations, long-context handling, native multimodal skills, rich personalization, seamless tool integration, and improved safety—all wrapped into a single unified model.
However, our experience testing GPT-5 revealed a different story.
Evaluating GPT-5 reveals familiar accuracy and image generation challenges
When testing GPT-5, I decided to start with a familiar use case: drafting SEO-optimized blog content. This provided GPT-5 with specific parameters to follow in terms of the SEO keywords and search queries that I wanted to target, while still evaluating its ability to accurately source information and create a compelling narrative.
More specifically, I asked GPT-5 to write an SEO-optimized blog post about Cybersecurity Maturity Model Certification (CMMC) compliance because the Department of Defense (DoD) is still finalizing the program. I wanted to see if GPT-5 could distinguish between outdated information that’s still available online and the most up-to-date DoD guidance. The resulting blog post was significantly shorter than I would have preferred, and it lacked the depth I was used to with GPT-4. More importantly, I noticed some key factual errors.
When I asked GPT-5 to describe the cybersecurity controls at CMMC Level 1, it incorrectly stated that there are 17 controls. This was true under earlier versions of the DoD’s CMMC Level 1 guidance, but the DoD condensed its list from 17 controls to 15 in September 2024 when it released version 2.13 of the CMMC Level 1 Assessment Guide. Even when I told GPT-5 that its answer was incorrect, it was still unable to identify the correct list of controls. It also had to “think” for nearly two and a half minutes before generating this second, still incorrect response.
While GPT-5 is correct that CMMC Level 1 pulls its list of required controls from Federal Acquisition Regulation (FAR) Clause 52.204-21, version 2.13 of the CMMC Level 1 Assessment Guide clearly states that companies will only be measured against 15 controls at this level, not 17.
Next, I decided to put GPT-5’s multimodal capabilities to the test by asking it to create a photorealistic virtual background for our in-house podcast, Lay of the Brand. I provided the model with a PNG file of the Lay of the Brand logo and some real-life use case examples, and asked it to format the background at 1920×1080 pixels (a 16:9 aspect ratio) for YouTube.
Background 1 was a good start, but I wanted to move the logo to the top corner so it wouldn’t be directly behind our podcast host.
While the first image it generated wasn’t bad, GPT-5 placed the Lay of the Brand logo in the center of the image — meaning it would have been blocked by anyone using the background. For round two, I simply asked GPT-5 to move the logo to the upper right or left-hand corner. Instead, the model generated an entirely new image with a fake microphone that would have looked odd when used as a virtual background. I also didn’t like how the logo stood out against the rest of the image.
While GPT-5 was able to relocate the logo, I didn’t like how it lay directly against the left-hand side of the image with no spacing or shadowing to make it feel like a natural part of the background. I also worried that the microphone would be completely obstructed by anyone using the virtual background.
For my final attempt, I asked GPT-5 to go back to the drawing board and generate a new image at 1920×1080 pixels (a 16:9 aspect ratio). I instructed GPT-5 to place the logo in the top right- or left-hand corner, leave space along the bottom third of the image for closed captioning, and blend the logo seamlessly with the rest of the background. While it followed my instructions relatively well, the resulting image still included some details that would be obstructed by anyone using it as a virtual background.
The final image was much closer to what I wanted, but I still worried about the two chairs and microphones looking odd when used as a virtual background.
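For anyone who wants to sanity-check a generated background against the same constraints, here is a minimal Python sketch. It only does the arithmetic behind the brief: it confirms that 1920×1080 reduces to 16:9, then marks out the bottom third reserved for closed captions and a top-right box for the logo. The function names, the 320×180 logo size, and the 48-pixel margin are illustrative assumptions, not values from my prompts or from any GPT-5 feature.

```python
from math import gcd


def aspect_ratio(width, height):
    """Reduce pixel dimensions to the simplest width:height ratio."""
    divisor = gcd(width, height)
    return width // divisor, height // divisor


def layout_zones(width, height, logo_size=(320, 180), margin=48):
    """Return (left, top, right, bottom) pixel boxes for the layout brief.

    logo_size and margin are hypothetical values chosen for illustration.
    """
    # The bottom third of the frame is kept clear for closed captions.
    caption_zone = (0, height * 2 // 3, width, height)

    # The logo sits in the top-right corner, inset by the margin
    # so it does not touch the edge of the image.
    logo_w, logo_h = logo_size
    logo_zone = (width - margin - logo_w, margin, width - margin, margin + logo_h)

    return {"caption": caption_zone, "logo_top_right": logo_zone}


ratio = aspect_ratio(1920, 1080)
zones = layout_zones(1920, 1080)
print(ratio)
print(zones)
```

Numbers like these can go straight into a follow-up prompt or be used to eyeball an output in an image editor, which is often quicker than describing placement in words alone.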
Final takeaways
Ultimately, I don’t believe that GPT-5 is a transformative update from previous models. It still produces inaccurate information that needs to be fact-checked, and I often have to use follow-up prompts to get the right level of depth for long-form content. Despite its native multimodality, GPT-5 also struggles to properly edit previously generated images, often opting to create a net-new image instead. Many of these problems are also common in GPT-4, which makes it difficult to meaningfully distinguish between the two models.
However, one area where GPT-5 excels is its ability to retain memory and maintain consistency across conversations. Multiple members of the MG Labs team have commented on GPT-5’s ability to recall information from previous conversations and retain that context when responding to follow-up prompts. The head of Merritt Group’s Government Practice even used GPT-5 to rewrite a recent Merritt Group eBook, “Navigating AI Search: Your Guide to Generative Engine Optimization (GEO),” for a government-specific audience. When grounded by strong source material or directions from previous prompts, GPT-5 can produce valuable outputs — likely thanks to its extended token window.
While GPT-5 may not have lived up to the sky-high expectations that accompanied its launch, it’s not without value. For marketers, PR professionals, and other business users, the model can accelerate brainstorming, provide structure for first drafts, and maintain context across longer projects — advantages that can save time and spark creativity. At the same time, its lingering accuracy issues, stiffness in tone, and difficulty refining creative outputs make it clear that the need for human expertise and ingenuity isn’t going away anytime soon.
Like many emerging AI tools, GPT-5 is best approached as a partner rather than a solution. Its strengths shine when paired with strong human direction, careful fact-checking, and creative oversight. Whether you view it as overhyped or simply misunderstood, one thing is certain: GPT-5 represents another step in the evolution of AI, and its true potential will depend less on the model itself and more on how we choose to use it.
To learn more about the latest and greatest AI-powered use cases for marketing and PR, be sure to check out our MG Labs page, where we share everything from tool evaluations to our perspective on emerging AI trends.