Along with updates to the Gemini 2.5 lineup and upgrades for the Gemini app across all its subscription tiers, Google made several announcements regarding its generative media models last week at I/O 2025. Key highlights include Veo 3, which adds audio generation and is available to subscribers of Google's new Ultra plan; new capabilities for Veo 2; the introduction of Flow; the launch of Imagen 4; and expanded availability of Lyria 2.

Veo 3 Breaks New Ground with Audio Integration

Veo 3, Google's latest video generation model, is the first in the Veo family to generate videos with synchronized audio, including ambient sounds, dialogue, and environmental noise, alongside high-quality visuals. According to Google, Veo 3 excels at understanding complex prompts, allowing users to describe entire short stories that are then brought to life in video format with realistic physics and accurate lip-syncing.

Veo 3 is currently available to Ultra subscribers in the US through the Gemini app and Flow platform. Enterprise users can find Veo 3 on Vertex AI.

Enhanced Creative Control with Veo 2 Updates

Veo 2, the video generation model most customers have access to, is receiving an update that adds several filmmaker-focused features. These include the ability to use images as references for character and scene consistency, precise camera controls for professional shots, outpainting for aspect ratio adjustments, and object manipulation tools for adding or removing elements from scenes.

Currently, reference-based video generation and camera controls are available only through Flow (more below). They will be integrated into Vertex AI soon and into other products in the future.

Flow is an AI-powered video creation tool that integrates the Veo, Imagen, and Gemini models into a single platform. By drawing on some of Google's best media generation models, Flow allows creators to manage story elements (cast, locations, objects, and styles) in one interface. Creators can then combine their reference media with a natural-language description of the scene they have in mind and let Flow generate the result.

Imagen 4 and Lyria 2 Round Out the Suite

Imagen 4 was designed to deliver images with an outstanding level of detail, such as that found in intricate fabrics, water droplets, and animal fur. According to Google, Imagen 4 handles both photorealistic and abstract styles well and can generate images in a variety of aspect ratios at up to 2K resolution. Imagen 4 also features improved spelling and typography support, unlocking use cases like printables and presentations. Imagen 4 is already available to all users across a range of products, including the Gemini app, Vertex AI, Whisk, and Workspace.

Lyria 2, an audio generation model that debuted in Google's Music AI Sandbox, will now be available through YouTube Shorts for creators and through Vertex AI for enterprise customers. Additionally, LyriaRealTime is getting its own API and is now available in Google AI Studio. LyriaRealTime generates music based on its own output and user input while remaining responsive to user controls. LyriaRealTime is also available through MusicFX DJ.

All new models include SynthID watermarking for content authenticity. Moreover, Google has launched a new portal, called SynthID Detector, that helps people check whether a file contains SynthID-watermarked content, in whole or in part.