Generative AI continues to transform so many facets of content creation. With Google DeepMind’s recent release of Veo3, everybody from marketers to creative professionals can harness the power of generative AI for video production. The technology represents a leap forward in how we think about producing video content quickly, intelligently, and with astonishing realism.
Veo3 only hit the market in May of this year, which raises the question: is the tool mature enough for professional-quality videos? Read on to find out.
What Is Veo3 and How Does It Work?
At its core, Veo3 is a tool that enables users to generate 8-second, high-quality videos using just a text or image prompt. Within one to two minutes, users receive a visual scene that comes complete with synchronized audio, such as voice, music, or ambient sound.
Now, how does it do this, exactly? It starts with using a large language model (LLM) to understand the text and/or image prompt that a user inputs through the Gemini App, Google Vids, or Flow—Google’s new scene-building tool. From there, a diffusion model is employed to generate the video frames with realistic motion and physics. Finally, an audio generation model creates the high-quality accompanying sound to pair with the video.
Veo3 in Action: Google Vids vs. Gemini vs. Flow
There are three main Google tools where Veo3 can be used to make AI-generated videos: Google Vids, Gemini, and Flow—all of which offer unique benefits:
- Google Vids is tailored for the enterprise and small business community and allows users to create, write, produce, edit, collaborate on, and share videos all through the platform.
- Gemini is ideal for quick, high-quality outputs and almost instant downloads.
- Flow offers more advanced scene-building features, making it ideal for complex visual storytelling. With Flow, users can create visual jumps, scene transitions, and extend scenes.
For professionals, Google Vids and Flow often produce the highest quality videos. For everyday use of Veo3, Gemini is a great go-to due to its download speed.
Breaking Down the Pros and Cons of Veo3
It’s truly remarkable that generative AI can now create realistic video clips, with audio no less, from a single-sentence prompt. However, Veo3 is still very much in its infancy and therefore has some weaknesses.
But first, let’s start with its strengths:
- Production and download speeds are extremely fast.
- The LLM has a strong understanding of simple prompts, often resulting in accurate visuals and voices—especially when using Veo3 in Flow.
- Scene-building capabilities are intuitive and user-friendly.
Weaknesses of Veo3 include:
- Digital humans featured in video clips lack diversity. This is often the case in wide shots that feature groups of people, who tend to be dressed alike, with similar hair and skin color.
- On-screen text is still a huge weak spot for Veo3, as it is for AI video creation tools in general. Spelling errors are prevalent in text that appears on-screen. Without going into too much technical detail, this is due to the way the model that generates the video frames treats text: it interprets text as part of the visual pattern and overall composition rather than as semantic language.
- Unlike other generative AI tools that let you refine the initial result, Veo3 creates an entirely new video rather than changing one aspect of a scene. For example, if you ask Veo3 to change a person’s shirt color from green to blue, it will create an entirely different scene with a new digital human rather than adjusting the original shirt color. Similarly, Veo3 struggles to maintain consistency across a series of clips, even when using the same prompt.
- While Veo3 generates speech and dialogue, control over nuanced details like timing, emotion, and tone is very limited.
- There is a cap on the number of Veo3 videos that can be made per day.
Our take? Veo3 hasn’t quite reached the quality level needed for public-facing video content, due to the weaknesses outlined above. However, as with other generative AI tools, we’ll likely see enhancements in the quality of Veo3 videos soon. In the meantime, it’s worth trying out the tool to learn its key features and experiment with prompting.
The Merritt Group team can help you with your video production needs to take your marketing content to the next level. Visit our MG Studio page to learn more.