State-of-the-art video and image generation with Veo 2 and Imagen 3

Earlier this year, we introduced our video generation model, Veo, and our latest image generation model, Imagen 3. Since then, it’s been exciting to watch people bring their ideas to life with help from these models: YouTube creators are exploring the creative possibilities of video backgrounds for their YouTube Shorts, enterprise customers are enhancing creative workflows on Vertex AI and creatives are using VideoFX and ImageFX to tell their stories. Together with collaborators ranging from filmmakers to businesses, we’re continuing to develop and evolve these technologies.

Today we're introducing a new video model, Veo 2, and the latest version of Imagen 3, both of which achieve state-of-the-art results. These models are now available in VideoFX, ImageFX and our newest Labs experiment, Whisk.

Veo 2: state-of-the-art video generation

Veo 2 creates high-quality videos across a wide range of subjects and styles. In head-to-head comparisons judged by human raters, Veo 2 achieved state-of-the-art results against leading models.

It brings an improved understanding of real-world physics and the nuances of human movement and expression, which improves its detail and realism overall. Veo 2 also understands the unique language of cinematography: ask it for a genre, specify a lens or suggest cinematic effects, and Veo 2 will deliver, at resolutions up to 4K and lengths extended to minutes. Ask for a low-angle tracking shot that glides through the middle of a scene, or a close-up shot on the face of a scientist looking through her microscope, and Veo 2 creates it. Suggest “18mm lens” in your prompt and Veo 2 knows to craft the wide-angle shot that this lens is known for; put “shallow depth of field” in your prompt to blur out the background and focus on your subject.
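As a rough illustration of how such cinematography cues can be combined into one prompt, here is a minimal Python sketch. The `ShotPrompt` class and its fields are hypothetical names invented for this example; they are not part of Veo 2 or any Google tool, and the phrasing it produces is only one plausible way to word a prompt.

```python
from dataclasses import dataclass, field


@dataclass
class ShotPrompt:
    """Hypothetical helper that assembles a text prompt from cinematography cues."""
    subject: str
    shot: str = ""   # e.g. "low-angle tracking shot"
    lens: str = ""   # e.g. "18mm lens"
    effects: list[str] = field(default_factory=list)  # e.g. ["shallow depth of field"]

    def render(self) -> str:
        # Lead with the shot type so the framing is stated first.
        opening = f"{self.shot} of {self.subject}" if self.shot else self.subject
        # Append the lens and effect cues as short comma-separated phrases.
        cues = [cue for cue in [self.lens, *self.effects] if cue]
        text = ", ".join([opening, *cues])
        # Capitalize the first letter and close the sentence.
        return text[:1].upper() + text[1:] + "."


prompt = ShotPrompt(
    subject="a scientist looking through her microscope",
    shot="close-up shot",
    effects=["shallow depth of field"],
)
print(prompt.render())
```

The resulting string would then be pasted into a tool such as VideoFX as an ordinary text prompt.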

Cinematic shot of a female doctor in a dark yellow hazmat suit, illuminated by the harsh fluorescent light of a laboratory. The camera slowly zooms in on her face, panning gently to emphasize the worry and anxiety etched across her brow. She is hunched over a lab table, peering intently into a microscope, her gloved hands carefully adjusting the focus. The muted color palette of the scene, dominated by the sickly yellow of the suit and the sterile steel of the lab, underscores the gravity of the situation and the weight of the unknown she is facing. The shallow depth of field focuses on the fear in her eyes, reflecting the immense pressure and responsibility she bears.

Examples of Veo 2's high-quality video generation capabilities. All videos were generated by Veo 2 and have not been modified.

This medium shot, with a shallow depth of field, portrays an adorable cartoon girl with wavy brown hair and lots of character, sitting upright in a 1980s kitchen. Her hair is medium length and wavy. She has a small, slightly upturned nose, and small, rounded ears. She is very animated and excited as she talks to the camera, laughing and giggling with a huge grin.

The camera floats gently through rows of pastel-painted wooden beehives, buzzing honeybees gliding in and out of frame. The motion settles on the refined farmer standing at the center, his pristine white beekeeping suit gleaming in the golden afternoon light. He lifts a jar of honey, tilting it slightly to catch the light. Behind him, tall sunflowers sway rhythmically in the breeze, their petals glowing in the warm sunlight. The camera tilts upward to reveal a retro farmhouse with mint-green shutters, its walls dappled with shadows from swaying trees. Shot with a 35mm lens on Kodak Portra 400 film, the golden light creates rich textures on the farmer’s gloves, marmalade jar, and weathered wood of the beehives.

A low-angle shot captures a flock of pink flamingos gracefully wading in a lush, tranquil lagoon. The vibrant pink of their plumage contrasts beautifully with the verdant green of the surrounding vegetation and the crystal-clear turquoise water. Sunlight glints off the water's surface, creating shimmering reflections that dance on the flamingos' feathers. The birds' elegant, curved necks are submerged as they walk through the shallow water, their movements creating gentle ripples that spread across the lagoon. The composition emphasizes the serenity and natural beauty of the scene, highlighting the delicate balance of the ecosystem and the inherent grace of these magnificent birds. The soft, diffused light of early morning bathes the entire scene in a warm, ethereal glow.

A perfect cube rotates in the center of a soft, foggy void. The surface shifts between different hyper-real textures—smooth marble, velvety suede, hammered brass, and raw concrete. Each material reveals subtle details: marble veins slowly spreading, suede fibers brushing with wind, brass tarnishing in slow motion, and concrete crumbling to reveal polished stone inside. Ends with a soft glow surrounding the cube as it transitions to a smooth mirrored surface, reflecting infinity.

A cinematic shot captures a fluffy Cockapoo, perched atop a vibrant pink flamingo float, in a sun-drenched Los Angeles swimming pool. The crystal-clear water sparkles under the bright California sun, reflecting the playful scene. The Cockapoo's fur, a soft blend of white and apricot, is highlighted by the golden sunlight, its floppy ears gently swaying in the breeze. Its happy expression and wagging tail convey pure joy and summer bliss. The vibrant pink flamingo adds a whimsical touch, creating a picture-perfect image of carefree fun in the LA sunshine.

The sun rises slowly behind a perfectly plated breakfast scene. Thick, golden maple syrup pours in slow motion over a stack of fluffy pancakes, each one releasing a soft, warm steam cloud. A close-up of crispy bacon sizzles, sending tiny embers of golden grease into the air. Coffee pours in smooth, swirling motion into a crystal-clear cup, filling it with deep brown layers of crema. Scene ends with a camera swoop into a fresh-cut orange, revealing its bright, juicy segments in stunning macro detail.

While video models often “hallucinate” unwanted details — extra fingers or unexpected objects, for example — Veo 2 produces these less frequently, making outputs more realistic.

Our commitment to safety and responsible development has guided Veo 2. We have been intentionally measured in growing Veo’s availability, so we can help identify, understand and improve the model’s quality and safety while slowly rolling it out via VideoFX, YouTube and Vertex AI.

Like the rest of our image and video generation models, Veo 2 outputs include an invisible SynthID watermark that identifies them as AI-generated, which helps reduce the chances of misinformation and misattribution.

Today, we're bringing our new Veo 2 capabilities to our Google Labs video generation tool, VideoFX, and expanding the number of users who can access it. Visit Google Labs to sign up for the waitlist. We also plan to expand Veo 2 to YouTube Shorts and other products next year.

Note: Find prompts for all videos at the bottom of this post: Scientist¹, Cartoon character², Bees³, Flamingos⁴, Cube⁵, Dog⁶, Pancakes⁷

Imagen 3: state-of-the-art image generation

We've also improved our Imagen 3 image-generation model, which now generates brighter, better-composed images. It can render more diverse art styles with greater accuracy, from photorealism to impressionism and from abstract to anime. This upgrade also follows prompts more faithfully and renders richer details and textures. In side-by-side comparisons judged by human raters, Imagen 3 achieved state-of-the-art results against leading image generation models.

Starting today, the latest Imagen 3 model is rolling out globally in ImageFX, our image generation tool from Google Labs, in more than 100 countries. Visit ImageFX to get started.

Examples of Imagen 3's rich detail, image quality and composition

Note: Find prompts for all images at the bottom of this post: Potter⁸, Squirrel⁹, Train station¹⁰, Woman¹¹, Strawberry bird¹²

Whisk: a fun new tool that lets you prompt with images to visualize your ideas

Whisk, our newest experiment from Google Labs, lets you input or create images that convey the subject, scene and style you have in mind. Then, you can bring them together and remix them to create something uniquely your own, from a digital plushie to an enamel pin or sticker.

Under the hood, Whisk combines our latest Imagen 3 model with Gemini’s visual understanding and description capabilities. The Gemini model automatically writes a detailed caption for each of your images, then feeds those captions into Imagen 3. This process lets you easily remix your subjects, scenes and styles in fun, new ways.
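The caption-then-generate flow described above can be sketched roughly as follows. This is only an illustration of the general idea, not Whisk's actual implementation: the `remix` function, the `Captioner` and `Generator` types, and the way captions are joined into a prompt are all assumptions made for this example, and the two `fake_*` functions are stand-ins so the sketch runs without calling any real model.

```python
from typing import Callable, Dict

# Hypothetical signatures: these stand in for whatever captioning and
# image-generation calls a real system would make. They are not real APIs.
Captioner = Callable[[bytes], str]
Generator = Callable[[str], bytes]


def remix(images: Dict[str, bytes], caption: Captioner, generate: Generator) -> bytes:
    """Caption each input image, merge the captions into one prompt, then generate."""
    roles = ("subject", "scene", "style")
    # Step 1: describe each supplied image in text.
    captions = {role: caption(images[role]) for role in roles if role in images}
    # Step 2: combine the descriptions into a single text prompt (assumed format).
    prompt = "; ".join(f"{role}: {text}" for role, text in captions.items())
    # Step 3: hand the text prompt to the image generator.
    return generate(prompt)


# Stand-in models so the sketch runs without any external service.
def fake_caption(image: bytes) -> str:
    return f"an image of {len(image)} bytes"


def fake_generate(prompt: str) -> bytes:
    return prompt.encode("utf-8")


result = remix({"subject": b"abc", "style": b"de"}, fake_caption, fake_generate)
print(result.decode("utf-8"))
```

Because the images are reduced to text before generation, the output follows the captions rather than the original pixels, which is one way to read why remixed results capture the essence of an input instead of copying it exactly.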

Whisk lets you quickly visualize and remix ideas.

Whisk is launching in the U.S. today. Read more about Whisk and try it out at labs.google/Whisk.
