Tom and Jerry One-Minute Video Generation with Test-Time Training


A new approach using Test-Time Training (TTT) layers to generate coherent, minute-long videos from text.

Adding TTT layers to a pre-trained Transformer enables it to generate one-minute videos from text storyboards with strong temporal consistency and motion smoothness. In human evaluation, TTT-MLP outperforms all baselines in temporal consistency, motion smoothness, and overall aesthetics, as measured by Elo scores.
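
For readers unfamiliar with Elo scores, the sketch below shows how pairwise human preferences can be turned into ratings using the standard Elo update. It is an illustration only: the source does not say which K-factor, starting rating, or comparison protocol the evaluation used, so the values here (K = 32, starting rating 1000) and the method names and votes are assumptions.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A is preferred over B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(ratings: dict, winner: str, loser: str, k: float = 32.0) -> None:
    """Apply one pairwise comparison to the ratings dict in place.

    k = 32 is an assumed K-factor, not a value taken from the source.
    """
    exp_win = expected_score(ratings[winner], ratings[loser])
    delta = k * (1.0 - exp_win)  # an upset win moves ratings further
    ratings[winner] += delta
    ratings[loser] -= delta


# Hypothetical methods and votes, for illustration only.
ratings = {"method_a": 1000.0, "method_b": 1000.0}
votes = [
    ("method_a", "method_b"),  # (preferred, not preferred)
    ("method_a", "method_b"),
    ("method_b", "method_a"),
]
for winner, loser in votes:
    update_elo(ratings, winner, loser)

print(ratings)
```

Each update is zero-sum, so the ratings only express relative preference between the compared methods; a higher score means evaluators picked that method more often in head-to-head comparisons.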
