Google Flow launched in March 2026 by absorbing Whisk and ImageFX into a single AI creation workspace. Combined with the Veo 3.1 model update (native audio generation, start/end frame control, clip extension, 1080p), Flow is now the most integrated image-to-video platform available from any major provider.
What previously required three separate Google tools now lives in one interface. You generate an image, edit it with lasso-based natural language tools, convert it to video, extend the clip, add audio, and export. The lasso interaction model is genuinely new: draw a selection, type what you want changed, and Flow handles the rest.
What Is Google Flow?
Flow is Google's unified AI creation workspace, available free through Google Labs. It consolidates three previously separate products:
- Whisk (image remixing and style transfer)
- ImageFX (text-to-image generation)
- Veo (text-to-video and image-to-video generation)
The core concept: you stay in one tool for the entire creation pipeline. Generate a base image from a text prompt. Refine it with targeted edits. Convert it to video. Extend the clip. Add camera motion. Generate an audio track. Export.
That matters because the typical AI video workflow involves bouncing between a separate image generator, a separate video generator, and sometimes a separate editor. Flow collapses those steps.
Flow runs in the browser. No download, no GPU requirement on your machine. Google handles inference server-side. Access is through labs.google/flow with a Google account.
Veo 3.1 New Features
Veo 3.1 shipped alongside the Flow workspace launch. The model upgrades focus on controllability and output quality rather than just generation speed.
| Feature | Description |
|---|---|
| Native audio generation | Veo 3.1 generates synchronized audio alongside video, not as a separate post-processing step |
| Start/end frame control | Specify exact start and end frames to control scene transitions and maintain visual continuity |
| Clip extension | Extend generated clips beyond their initial duration, to 8 seconds or longer through chaining |
| 1080p output | Full HD resolution output, up from previous 720p default |
| Physics-accurate simulation | Improved handling of gravity, fluid dynamics, cloth movement, and object interactions |
| Camera motion orchestration | Specify pan, tilt, zoom, dolly, and tracking shots through text prompts |
| Spatial audio | Audio output reflects spatial positioning of sound sources within the scene |
| LTX Studio API access | Veo 3.1 available through LTX Studio for developer and enterprise integration |
The native audio generation is the standout addition. Previous versions required running a separate audio model or manually adding sound. Veo 3.1 generates contextually appropriate audio: footsteps on gravel sound different from footsteps on tile, a car engine revs differently at idle versus acceleration.
Flow's Editing Tools
Flow's editing layer is what separates it from standalone generators. These tools work on both images and video frames.
Lasso selection + natural language editing
Draw a freeform selection around any object or region, then type a natural language instruction. Examples:
- Lasso a shirt, type "make it dark blue denim"
- Lasso the sky, type "dramatic sunset with orange and purple clouds"
- Lasso a face, type "add reading glasses"
The model interprets both the spatial selection and the text instruction to produce a targeted edit without affecting the rest of the scene.
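Flow does not document how it represents a lasso edit internally. As a purely illustrative sketch, each edit can be thought of as a closed polygon paired with an instruction, where the polygon decides which pixels are eligible to change. The `LassoEdit` class and its `contains` method are hypothetical names, and the even-odd ray-casting test is a standard geometry technique, not Flow's implementation.

```python
from dataclasses import dataclass
from typing import Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class LassoEdit:
    """Hypothetical model of a lasso edit: a closed polygon plus an instruction."""
    polygon: Tuple[Point, ...]  # vertices in image pixel coordinates
    instruction: str

    def contains(self, x: float, y: float) -> bool:
        """Even-odd ray casting: toggle on each edge crossed by a ray going right from (x, y)."""
        inside = False
        n = len(self.polygon)
        for i in range(n):
            x1, y1 = self.polygon[i]
            x2, y2 = self.polygon[(i + 1) % n]
            # Only edges that straddle the horizontal line through y can be crossed
            if (y1 > y) != (y2 > y):
                # x coordinate where the edge meets that horizontal line
                x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
                if x < x_cross:
                    inside = not inside
        return inside


# A rectangular "lasso" around a shirt, using one of the instructions above
shirt = LassoEdit(
    polygon=((100, 200), (300, 200), (300, 500), (100, 500)),
    instruction="make it dark blue denim",
)
```

Here `shirt.contains(200, 350)` is true and `shirt.contains(50, 350)` is false, which mirrors the idea that an edit applies inside the selection and leaves the rest of the scene alone.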
Object add and remove
Describe what to add or remove from a scene without needing to select anything:
- "Add a golden retriever sitting on the left side of the frame"
- "Remove the car in the background"
- "Place a coffee cup on the table"
Flow handles object insertion with appropriate lighting, shadows, and perspective matching.
Camera motion control
Specify camera behavior through text prompts when generating or extending video:
- "Slow dolly forward toward the subject"
- "Pan left to right across the landscape"
- "Crane shot rising above the city"
- "Handheld tracking shot following the runner"
Camera instructions can be combined with scene descriptions in a single prompt.
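If you assemble prompts programmatically before pasting them into Flow, combining a scene with camera instructions is plain string handling. The helper below is a minimal sketch. `compose_prompt` is a hypothetical name, and the sentence-per-instruction layout is an assumption rather than a format Flow requires.

```python
def compose_prompt(scene: str, *camera_moves: str) -> str:
    """Join a scene description and camera instructions into one prompt string."""
    parts = [scene, *camera_moves]
    # Skip empty parts, trim whitespace, and end each part with exactly one period
    sentences = [p.strip().rstrip(".") + "." for p in parts if p.strip()]
    return " ".join(sentences)


prompt = compose_prompt(
    "A runner crossing a suspension bridge at dawn",
    "Handheld tracking shot following the runner",
)
print(prompt)
# A runner crossing a suspension bridge at dawn. Handheld tracking shot following the runner.
```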
Style transfer
Apply visual styles across frames or entire clips:
- "Apply film noir lighting with high contrast"
- "Shift to warm analog film grain, 1970s color palette"
- "Studio Ghibli watercolor style"
Style transfer maintains subject consistency while changing the visual treatment.
Clip concatenation
Chain multiple generated clips into a sequence within Flow. Each segment can have different prompts, camera angles, and styles. Flow attempts to maintain visual continuity between segments at transition points.
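When planning a sequence, it can help to write it down as data first. The sketch below models each segment with its own prompt, camera, and style, then reports the total running time and the transition points where continuity matters. `Segment`, `total_duration`, and `transitions` are hypothetical names, and the 8-second default is an assumed segment length, not a documented Flow value.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Segment:
    prompt: str
    camera: str
    style: str
    seconds: float = 8.0  # assumed per-segment length, not a documented Flow value


def total_duration(segments: List[Segment]) -> float:
    """Sum the running time of all segments in the sequence."""
    return sum(s.seconds for s in segments)


def transitions(segments: List[Segment]) -> List[Tuple[Segment, Segment]]:
    """Pair each segment with the next one; these are the cut points to review."""
    return list(zip(segments, segments[1:]))


sequence = [
    Segment("Rainy city street at night", "Pan left to right", "film noir lighting"),
    Segment("Detective enters a diner", "Slow dolly forward", "film noir lighting"),
    Segment("Close-up of a coffee cup", "Static shot", "warm analog film grain"),
]

print(total_duration(sequence))    # 24.0
print(len(transitions(sequence)))  # 2
```

Listing the transitions explicitly makes it easy to spot where adjacent segments change style, which is where visual continuity is most likely to need a manual check.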
Workflow: Image to Published Video in Flow
This is a step-by-step walkthrough of the full creation pipeline within Flow, from a blank canvas to an export-ready video.
Step 1: Generate a base image
Start with a text prompt describing the scene you want:
A minimalist home office with a large window overlooking a rainy city skyline,
warm desk lamp, MacBook on a walnut desk, shallow depth of field, photorealistic
Flow generates multiple variations. Select the one closest to your vision as the starting point.
Step 2: Refine with lasso edits
Use the lasso tool to make targeted adjustments:
- Lasso the window view, type "add neon signs reflecting in the rain"
- Lasso the desk surface, type "add a ceramic coffee mug and a small succulent plant"
- Lasso the lighting, type "warmer, more golden tone from the desk lamp"
Each edit preserves the rest of the image. Iterate until the frame matches your intent.
Step 3: Convert to video with Veo 3.1
Select the refined image and choose "Generate Video." Add motion instructions:
Gentle camera push-in toward the desk. Rain streaks slowly down the window.
Steam rises from the coffee mug. City lights flicker softly in the background.
Veo 3.1 uses the image as the start frame and generates motion according to the prompt.
Step 4: Extend clip and add camera motion
If the initial clip is too short, use clip extension to continue the scene:
Continue the scene. Camera slowly pans right to reveal a bookshelf with warm backlighting.
The rain intensifies slightly. A car passes on the street below, headlights reflecting.
Chain extensions to build longer sequences with evolving camera work.
Step 5: Generate audio track
Veo 3.1 can generate audio during video generation. For clips already created, add audio separately:
Soft rain on glass, distant city traffic, quiet lo-fi ambient music,
occasional thunder rumble far away
The spatial audio system positions sounds to match visual elements: rain is louder near the window, the desk lamp hum is centered.
Step 6: Export for publishing
Export the final clip in 1080p. Flow provides direct download as MP4. For social media, select platform-specific aspect ratios (9:16 for Reels/Shorts, 1:1 for feed posts).
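To see what those aspect ratios mean for a 1920x1080 frame, the sketch below computes the largest centered crop for a target ratio. It assumes you are reframing an exported clip in an external tool; how Flow itself produces each ratio is not described here. `center_crop` is a hypothetical helper, and rounding down to even dimensions reflects a common video encoder expectation rather than a Flow requirement.

```python
from typing import Tuple


def center_crop(width: int, height: int, ratio_w: int, ratio_h: int) -> Tuple[int, int, int, int]:
    """Return (x, y, crop_width, crop_height) of the largest centered crop with the given ratio."""
    # Compare width/height against ratio_w/ratio_h with integers to avoid float error
    if width * ratio_h > height * ratio_w:
        # Source is wider than the target: keep full height, trim the sides
        crop_h = height
        crop_w = height * ratio_w // ratio_h
    else:
        # Source is taller than or equal to the target: keep full width, trim top and bottom
        crop_w = width
        crop_h = width * ratio_h // ratio_w
    # Round down to even dimensions, which many video encoders expect
    crop_w -= crop_w % 2
    crop_h -= crop_h % 2
    return ((width - crop_w) // 2, (height - crop_h) // 2, crop_w, crop_h)


print(center_crop(1920, 1080, 9, 16))  # (657, 0, 606, 1080)
print(center_crop(1920, 1080, 1, 1))   # (420, 0, 1080, 1080)
```

A 9:16 crop keeps less than a third of the horizontal frame, so compose the subject near the center if you expect to publish vertical versions.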
Google Flow vs Competitors
This comparison covers the major AI creation tools as of March 2026.
| Capability | Google Flow | Runway Gen-4 | Pika 2.1 | CapCut AI |
|---|---|---|---|---|
| Image generation | Built-in (ImageFX) | No (import only) | No (import only) | Basic templates |
| Video generation | Veo 3.1 | Gen-4 Turbo | Pika 2.1 | Seedance 2.0 |
| In-tool editing | Lasso + NL | Brush + keyframes | Motion brush | Timeline editor |
| Native audio | Yes (Veo 3.1) | No (separate step) | No | Music library |
| Unified workspace | Yes | Partial | No | Yes (different focus) |
| Max resolution | 1080p | 4K (upscaled) | 1080p | 1080p |
| Pricing | Free (Google Labs) | From $12/mo | From $8/mo | Free tier + Pro |
| Best for | End-to-end creation | Cinematic quality | Quick social clips | Edit-heavy workflows |
Flow's main advantage is the unified workspace: you do not need to leave the tool at any point from concept to export. Its main limitation is relative to Runway, which still produces higher peak visual quality in standalone generation and offers more granular professional controls for detailed shot composition.
Compared to Pika and CapCut, Flow covers more of the pipeline but has a less mature export and publishing workflow. CapCut's timeline-based editing is still stronger for projects that need precise multi-track synchronization.
Veo 3.1 API Access
Developers can access Veo 3.1 through two paths:
Google Cloud / Gemini API: Veo 3.1 is available as part of Google's Gemini model family. Access requires a Google Cloud project with the Generative AI API enabled. Pricing follows Google's standard per-generation model, though exact rates have not been publicly finalized as of March 2026.
LTX Studio partnership: LTX Studio integrates Veo 3.1 as one of its available video generation backends. This gives developers access through LTX Studio's API, which adds storyboard-level orchestration on top of raw generation.
For teams already using the Gemini API for text or image tasks, adding Veo 3.1 video generation is a relatively small integration step. The API supports both text-to-video and image-to-video modes.
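The difference between the two modes comes down to which inputs a request carries. The sketch below illustrates that choice with a plain dictionary. Every field name, the `build_video_request` helper, and the file path are hypothetical and do not reflect the actual Gemini API or LTX Studio request schema, so consult the official reference before writing real integration code.

```python
from typing import Dict, Optional


def build_video_request(prompt: Optional[str] = None,
                        image_path: Optional[str] = None) -> Dict[str, str]:
    """Build an illustrative request dict (hypothetical fields, not a real API schema)."""
    if image_path is None and not prompt:
        # With no start image, the prompt is the only input describing the video
        raise ValueError("text-to-video requires a prompt")
    request = {"mode": "image-to-video" if image_path else "text-to-video"}
    if prompt:
        request["prompt"] = prompt
    if image_path:
        request["image_path"] = image_path
    return request


text_request = build_video_request(prompt="Crane shot rising above the city")
image_request = build_video_request(
    prompt="Gentle camera push-in toward the desk",
    image_path="home_office.png",  # hypothetical local file used as the start frame
)
print(text_request["mode"])   # text-to-video
print(image_request["mode"])  # image-to-video
```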
FAQ
Is Google Flow free?
Yes. Flow is currently free through Google Labs. Google has not announced pricing for a paid tier. Usage may be subject to daily generation limits, which vary based on demand and account standing.
How does Veo 3.1 compare to Sora?
Veo 3.1 and OpenAI's Sora target similar use cases but differ in integration. Veo 3.1 is embedded within Flow's unified workspace, which includes image generation and editing. Sora operates as a standalone video generator within ChatGPT. On raw video quality, both produce 1080p output with strong motion coherence. Veo 3.1's native audio generation is a feature Sora does not currently match.
Can I use Flow for commercial projects?
Google's terms for Labs products generally permit personal and experimental use. Commercial licensing for Flow outputs has not been explicitly detailed as of March 2026. Check Google's current terms of service before using Flow outputs in commercial production.
What happened to Whisk and ImageFX?
Both products were absorbed into Google Flow. Whisk's image remixing and style transfer features are available as Flow editing tools. ImageFX's text-to-image generation is Flow's image creation layer. The standalone versions are being phased out.
Does Flow support 4K output?
Not currently. Veo 3.1 outputs at 1080p maximum. For 4K, you would need to upscale externally using a tool like Topaz or Runway's upscaler.
Can I access Veo 3.1 via API?
Yes. Veo 3.1 is accessible through the Google Cloud Generative AI API and through LTX Studio's API integration. Both support text-to-video and image-to-video generation modes.

