Google shows off Lumiere, a space-time diffusion model for realistic AI videos


Lumiere was trained on a dataset of 30 million videos, along with their text captions, and is capable of generating 80 frames at 16 fps, or five seconds of video. The source of this data, however, remains unclear at this stage.

While these capabilities are not new in the industry and are already offered by players like Runway and Pika, the authors claim that most existing models handle the added temporal dimension of video generation (representing a state in time) with a cascaded approach: a base model produces sparse keyframes, and further models fill in the intermediate frames with temporal super-resolution. This works, but it makes temporal consistency difficult to achieve, often restricting video duration, overall visual quality, and the degree of realistic motion. Lumiere, for its part, addresses this gap with a Space-Time U-Net architecture that generates the entire temporal duration of the video at once, in a single pass through the model, yielding more realistic and coherent motion.
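The contrast above can be sketched with array shapes alone. The following is a rough illustration, not Google's code: the frame counts, resolution, and the use of linear interpolation as a stand-in for a learned temporal super-resolution model are all assumptions made for the example.

```python
import numpy as np

# Lumiere generates 80 frames; the 128x128 resolution is a placeholder.
T, H, W = 80, 128, 128

def cascaded_generate():
    """Cascaded approach: generate a few distant keyframes, then fill in
    the frames between them with temporal super-resolution (here, naive
    linear interpolation stands in for a learned upsampler)."""
    keyframes = np.random.rand(T // 8, H, W, 3)  # 10 sparse keyframes
    idx = np.linspace(0, keyframes.shape[0] - 1, T)
    lo = np.floor(idx).astype(int)
    hi = np.ceil(idx).astype(int)
    w = (idx - lo)[:, None, None, None]
    return (1 - w) * keyframes[lo] + w * keyframes[hi]

def single_pass_generate():
    """Space-Time U-Net style: the model processes the whole clip at
    once, downsampling in both space and time inside the network."""
    video = np.random.rand(T, H, W, 3)  # full-duration input
    # One space-time downsampling step: halve T, H and W together.
    coarse = video[::2, ::2, ::2, :]
    assert coarse.shape == (T // 2, H // 2, W // 2, 3)
    return video

print(cascaded_generate().shape, single_pass_generate().shape)
```

Both paths end with an 80-frame clip, but the cascaded path never lets one model see all 80 frames jointly, which is where temporal inconsistencies creep in; the single-pass path keeps the full duration in view throughout.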

Source: VentureBeat

Read more on: Google, realistic AI videos, space-time diffusion model

Related news:

Former CEO of Google has been quietly working on a military startup for "suicide" attack drones

Duet AI comes to the classroom, and Google teases new Chromebooks

Google announces new AI-powered features for education