On January 4, 2026, OpenAI officially released Sora 2.0, the second generation of its video generation model. The main breakthrough is the model’s ability to generate not only video footage but also synchronized audio (voice, music, and sound effects) in a single rendering pass.
Key Innovations:
- Hyper-Realistic Audio: The neural network now automatically creates a soundtrack, precisely synchronizing characters’ lip movements with speech and layering in ambient sounds (footsteps, wind, city noise).
- “Director” Mode: Users can now change camera angles and lighting in an already generated video without regenerating the entire scene.
- Generation Speed: Thanks to architectural optimizations, a one-minute 1080p clip is now created in under 30 seconds.
Industry Reaction:
Experts call the Sora 2.0 release the “final nail in the coffin” for stock video services. Analysts expect that in 2026, up to 40% of advertising video content will be created without the use of cameras.
Source: OpenAI press release and demonstration at CES 2026.



