Kuaishou Technology has launched Kling VIDEO 2.6, a model whose "Native Audio" feature generates video clips together with their audio in a single workflow. Like its competitor Veo 3, Kling VIDEO 2.6 eliminates the traditional two-step workflow of creating silent footage and then adding audio separately.
The "Native Audio" feature supports two modes of generation: traditional text-to-audio-visual generation based on text prompts only, and image-to-audio-visual generation that accepts text prompts alongside reference images as input. Both generation modes enable users to create videos up to 10 seconds long.
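To make the two modes concrete, the following is a minimal sketch of how a client might model such a request. It is not Kling's actual API: the class name `NativeAudioRequest` and its fields are hypothetical, and only the two modes and the 10-second limit come from the description above.

```python
from dataclasses import dataclass
from typing import Optional

# From the article: both modes produce clips of up to 10 seconds.
MAX_DURATION_SECONDS = 10


@dataclass(frozen=True)
class NativeAudioRequest:
    """Hypothetical request shape for illustration; not Kling's real API."""

    prompt: str                            # text prompt, used in both modes
    duration_seconds: int                  # requested clip length
    reference_image: Optional[str] = None  # hypothetical: path or URL of a reference image

    def __post_init__(self) -> None:
        # Both modes accept a text prompt, so an empty one is rejected.
        if not self.prompt.strip():
            raise ValueError("a text prompt is required in both modes")
        # Enforce the 10-second ceiling described in the article.
        if not 0 < self.duration_seconds <= MAX_DURATION_SECONDS:
            raise ValueError(
                f"duration must be between 1 and {MAX_DURATION_SECONDS} seconds"
            )

    @property
    def mode(self) -> str:
        # A reference image switches the request to image-to-audio-visual mode.
        if self.reference_image is not None:
            return "image-to-audio-visual"
        return "text-to-audio-visual"


text_only = NativeAudioRequest(prompt="A busker sings in the rain", duration_seconds=10)
with_image = NativeAudioRequest(
    prompt="The product spins while a narrator describes it",
    duration_seconds=5,
    reference_image="product.png",
)
```

Here the mode is derived from whether a reference image is present rather than set by hand, so a request cannot claim one mode while carrying the inputs of the other.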
Kling VIDEO 2.6 can generate videos with synchronized human voices, sound effects, and ambient audio in a single pass. The model supports Chinese and English voice generation and maintains a world-leading position in Chinese voice generation. On the technical side, Kling VIDEO 2.6 excels in three critical areas: audio-visual synchronization that tightly aligns voice rhythm with visual motion, high-quality audio output with rich layering that mirrors professional mixing standards, and robust semantic understanding of complex storylines and colloquial expressions.
The technology supports diverse audio types, such as speech, dialogue, narration, singing, rap, and mixed sound effects. This makes Kling VIDEO 2.6 valuable across advertising, social media, and e-commerce applications. Advertisers can generate complete product showcases, including narration and sound effects, with a single click, while social media creators can produce multi-character dialogues and music performances more efficiently. The release notes for Kling VIDEO 2.6 showcase several example outputs suited to these and other use cases.