
Uninterrupted diffusion with Imagen Video

by Dmitry Spodarets

The Google Brain team has introduced Imagen Video, a text-to-video AI system that generates video clips from a text prompt. The text-conditioned video diffusion model can generate videos at up to 1280×768 resolution at 24 frames per second.

Given a text prompt, Imagen Video generates high-definition video using a base video generation model followed by a sequence of alternating spatial and temporal video super-resolution models.
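The cascade idea can be sketched with toy stand-ins. This is a minimal illustration, not the actual Imagen Video pipeline: `spatial_sr`, `temporal_sr`, `run_cascade`, and the `STAGES` schedule (stage order, count, and upsampling factors) are all hypothetical, and simple pixel/frame repetition replaces the learned super-resolution models.

```python
import numpy as np

def spatial_sr(video, factor):
    """Stand-in for a spatial super-resolution model: repeat pixels along H and W."""
    return video.repeat(factor, axis=1).repeat(factor, axis=2)

def temporal_sr(video, factor):
    """Stand-in for a temporal super-resolution model: repeat frames along T."""
    return video.repeat(factor, axis=0)

# Hypothetical toy schedule: the order and factors are illustrative assumptions,
# not the values used by Imagen Video.
STAGES = [("temporal", 2), ("spatial", 2), ("temporal", 2), ("spatial", 2)]

def run_cascade(base_video, stages=STAGES):
    """Pass a low-resolution base clip through alternating upsampling stages."""
    video = base_video
    for kind, factor in stages:
        if kind == "temporal":
            video = temporal_sr(video, factor)
        else:
            video = spatial_sr(video, factor)
    return video

# Toy low-resolution base clip with layout (time, height, width, channels).
base = np.zeros((4, 6, 10, 3))
out = run_cascade(base)
print(out.shape)  # frame count and frame size both grow stage by stage
```

Each stage only has to solve a smaller problem (more frames or more pixels, given the previous output), which is the motivation for splitting generation into a cascade.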

Imagen Video is built on so-called "diffusion" models. It consists of a text encoder (frozen T5-XXL), a base video diffusion model, and alternating spatial and temporal super-resolution diffusion models. A diffusion model generates new data (e.g., video) by learning how to "break down" existing data samples with noise and then "restore" them.
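The "break down" and "restore" steps can be illustrated with the standard Gaussian noising formula used by many diffusion models. This is a generic sketch, not Imagen Video's exact parameterization: the function names `add_noise` and `recover_clean` and the value `alpha_bar=0.5` are assumptions, and the true noise stands in for the prediction of a trained network.

```python
import numpy as np

def add_noise(x0, alpha_bar, rng):
    """Forward ("break down") step: blend a clean sample with Gaussian noise."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps
    return x_t, eps

def recover_clean(x_t, eps_hat, alpha_bar):
    """Reverse ("restore") step: invert the blend, given an estimate of the noise."""
    return (x_t - np.sqrt(1.0 - alpha_bar) * eps_hat) / np.sqrt(alpha_bar)

rng = np.random.default_rng(0)
# Toy clip with layout (time, height, width, channels), values in [-1, 1].
video = rng.uniform(-1.0, 1.0, size=(4, 8, 8, 3))

noisy, eps = add_noise(video, alpha_bar=0.5, rng=rng)

# A trained network would predict the noise from the noisy clip and the text
# embedding. Here the true noise is used as a stand-in, to show that a perfect
# prediction restores the original clip.
restored = recover_clean(noisy, eps, alpha_bar=0.5)
print(np.allclose(restored, video))
```

In training, the network's noise estimate is compared with the noise that was actually added; at generation time the model starts from pure noise and applies the restore step repeatedly.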

A notable design feature is the Video U-Net architecture: its spatial operations are performed independently on each frame with shared parameters, treating the input as a tensor of shape (batch × time, height, width, channels), while its temporal operations act on the entire 5-dimensional tensor (batch, time, height, width, channels).
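The two tensor layouts can be sketched as follows. This is only an illustration of the reshaping pattern: `spatial_op` and `temporal_op` are hypothetical names, and mean subtraction and time averaging are simple stand-ins for the learned spatial and temporal layers.

```python
import numpy as np

def spatial_op(x):
    """Per-frame operation: fold time into the batch axis, then process frames."""
    b, t, h, w, c = x.shape
    frames = x.reshape(b * t, h, w, c)  # (batch x time, height, width, channels)
    # Stand-in for a shared-parameter spatial layer: subtract each frame's
    # per-channel mean. Every frame is processed without seeing the others.
    frames = frames - frames.mean(axis=(1, 2), keepdims=True)
    return frames.reshape(b, t, h, w, c)

def temporal_op(x):
    """Cross-frame operation on the full (batch, time, height, width, channels) tensor."""
    # Stand-in for a temporal layer: replace every frame with the average over time,
    # so each output frame depends on all input frames.
    return np.broadcast_to(x.mean(axis=1, keepdims=True), x.shape).copy()

x = np.random.default_rng(0).standard_normal((2, 4, 8, 8, 3))
y_s = spatial_op(x)
y_t = temporal_op(x)
print(y_s.shape, y_t.shape)  # both keep the 5-D input shape
```

Folding time into the batch axis lets the same spatial weights be reused on every frame, while the temporal step is the only place where frames exchange information.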

Imagen Video not only generates high-fidelity video, but also offers a high degree of controllability and world knowledge, including the ability to generate diverse videos and text animations in various artistic styles and with a 3D understanding of objects.

Imagen Video is based on Google's Imagen, an image generation system comparable to DALL-E 2. As previously reported, DALL-E 2 no longer has a beta waiting list, so users can start using it at any time.
