Seedance 2.0
ByteDance's flagship AI video model, ranked #1 on the Artificial Analysis Video Arena across all categories. 2K resolution, stereo audio, quad-modal input.
What's New in 2.0
Every major dimension upgraded — from resolution and duration to a fundamentally new multimodal architecture.
| FEATURE | 1.5 PRO | 2.0 |
|---|---|---|
| Max Resolution | 1080p | 2K |
| Max Duration | 12s | 15s |
| Audio | Basic sync | Stereo spatial |
| Image Refs | Limited | Up to 9 |
| Video Input | None | Up to 3 clips |
| Audio Input | None | Up to 3 files |
| Motion Quality | Baseline | 2x better |
| Speed | Baseline | 30% faster |
Seedance 2.0 introduces a unified Dual-Branch Diffusion Transformer that generates video and audio in a single pass — unlike 1.5 Pro's separate processing pipelines. This architectural shift enables stereo spatial audio, physics-aware motion, and seamless multi-shot narratives.
Key Features
Quad-Modal Input
Combine text, images, video clips, and audio files in a single generation. The @ Reference System lets you tag up to 12 files in total for precise creative control.
Stereo Spatial Audio
Native dual-channel audio with spatial positioning. Material-specific sounds, environmental acoustics, and phoneme-level lip sync in 8+ languages.
Director-Level Camera
Control dolly (Hitchcock) zooms, rack focuses, tracking shots, POV switches, and orbit movements through natural language or reference video.
Physics-Aware Motion
Training penalizes physically impossible motion, which yields realistic weight transfer, momentum conservation, and natural body dynamics, even in complex choreography.
Multi-Shot Narratives
Generate coherent multi-shot sequences with natural cuts and transitions. Use the "lens switch" keyword to signal cuts while preserving character continuity.
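As a rough illustration of the "lens switch" keyword, the sketch below joins several shot descriptions into one prompt. The helper name `build_multi_shot_prompt`, the comma-separated placement of the keyword, and the sample wording are all assumptions made for illustration; only the keyword itself comes from the text above.

```python
CUT_KEYWORD = "lens switch"  # keyword quoted in the text; its placement here is an assumption


def build_multi_shot_prompt(character: str, shots: list[str]) -> str:
    """Join shot descriptions into one prompt, marking each cut with the keyword.

    Hypothetical helper: repeating one character description up front is a
    guess at how to keep the subject consistent across shots.
    """
    if not shots:
        raise ValueError("at least one shot description is required")
    # Normalize whitespace and trailing periods so the joined prompt reads cleanly.
    cleaned = [shot.strip().rstrip(".") for shot in shots]
    body = f", {CUT_KEYWORD}, ".join(cleaned)
    return f"{character}. {body}."


prompt = build_multi_shot_prompt(
    "A courier in a red jacket",
    ["Wide shot on a bridge.", "Close-up of the parcel label"],
)
print(prompt)
```

A two-shot prompt built this way contains the keyword exactly once, at the point where the cut is intended.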
Video Editing & Extension
Make targeted modifications to existing clips without full regeneration. Extend videos forward or backward, replace characters, and add or remove elements.
#1 Across All Categories
Seedance 2.0 leads the Artificial Analysis Video Arena — the industry's most comprehensive human-preference leaderboard for AI video models.
It ranks first in each category: prompt adherence and visual fidelity from text descriptions; video quality with native audio co-generation; animation quality from reference image input; and image animation with synchronized audio.
How It Works
Write Your Video Prompt
Combine text with up to 9 images, 3 video clips, and 3 audio files using the @ Reference System. Tag references like @Image1 or @Video1 in your prompt to control characters, backgrounds, camera choreography, and audio rhythm all at once.
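The per-type caps above can be checked before a prompt is submitted. The following is a minimal sketch, assuming tags follow the `@Image1` / `@Video1` pattern shown in the text and that audio tags look like `@Audio1` (an assumption by analogy). The function name `check_references` is hypothetical and not part of any published Seedance API.

```python
import re
from collections import Counter

# Per-type caps quoted in the text; the tag syntax for audio is assumed.
LIMITS = {"Image": 9, "Video": 3, "Audio": 3}
TAG_PATTERN = re.compile(r"@(Image|Video|Audio)(\d+)")


def check_references(prompt: str) -> Counter:
    """Count distinct @ tags per media type and raise if a cap is exceeded."""
    # A tag repeated in the prompt refers to the same file, so count it once.
    distinct = {(kind, int(index)) for kind, index in TAG_PATTERN.findall(prompt)}
    counts = Counter(kind for kind, _ in distinct)
    for kind, cap in LIMITS.items():
        if counts[kind] > cap:
            raise ValueError(f"{counts[kind]} {kind} references exceed the cap of {cap}")
    return counts


prompt = (
    "@Image1 walks through the market from @Image2, "
    "following the camera move in @Video1 to the beat of @Audio1. "
    "Keep @Image1 in frame."
)
counts = check_references(prompt)
print(counts["Image"], counts["Video"], counts["Audio"])  # 2 1 1
```

Counting distinct tags rather than raw occurrences lets a prompt mention the same reference several times without using up the allowance.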
Choose Settings
Output at up to 2K resolution in clips from 4 to 15 seconds. Stereo spatial audio is generated natively — dual-channel with positional sound, material-specific effects, and phoneme-level lip sync in 8+ languages. Use the "lens switch" keyword to create multi-shot sequences with natural cuts.
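The settings described above amount to a small set of constraints, sketched here as a validated dataclass. Only the 4 to 15 second range and the 2K ceiling come from the text. The class name `ClipSettings`, the field names, and the lower resolution labels are hypothetical.

```python
from dataclasses import dataclass

MIN_SECONDS, MAX_SECONDS = 4, 15       # clip length range quoted in the text
RESOLUTIONS = ("720p", "1080p", "2K")  # assumed labels; only the 2K ceiling is from the text


@dataclass(frozen=True)
class ClipSettings:
    """Hypothetical container for the output settings described above."""

    duration_seconds: int
    resolution: str = "2K"

    def __post_init__(self) -> None:
        # Reject values outside the documented clip length range.
        if not MIN_SECONDS <= self.duration_seconds <= MAX_SECONDS:
            raise ValueError(
                f"duration must be {MIN_SECONDS}-{MAX_SECONDS} seconds, "
                f"got {self.duration_seconds}"
            )
        if self.resolution not in RESOLUTIONS:
            raise ValueError(f"unsupported resolution: {self.resolution}")


settings = ClipSettings(duration_seconds=10)
print(settings)
```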
Generate & Download
Standard clips finish in about 60 seconds with a 90%+ first-attempt success rate. Physics-aware motion keeps weight transfer and momentum realistic even in complex choreography. Download your MP4 with full stereo audio baked in.
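To put the two figures together: if every attempt takes about 60 seconds and succeeds independently with probability 0.9, the expected number of attempts is 1 / 0.9, so the expected wait is a little over a minute. The sketch below works this out. It assumes failed attempts take as long as successful ones and that retries are independent, neither of which is stated in the text.

```python
def expected_seconds(per_attempt_seconds: float, success_rate: float) -> float:
    """Expected time until the first successful generation, retrying on failure."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    # The number of attempts until first success is geometric with mean 1 / p.
    expected_attempts = 1 / success_rate
    return per_attempt_seconds * expected_attempts


print(round(expected_seconds(60, 0.90), 1))  # 66.7
```

Because the quoted rate is "90%+", this is an upper estimate under the stated assumptions.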
Start Creating with Seedance 1.5 Pro
While Seedance 2.0 integration is on its way, generate stunning videos today with our current models.