Video-Guided Foley Sound Generation with Multimodal Controls



We generate a synchronized soundtrack for silent videos given a text prompt. Example prompts: bird chirping, rooster crowing, male speaking, sheep bleating, typewriter, playing piano, typing on computer keyboard.

Read more on:

video

multimodal controls

Related news:

‘Surreal Elderhood’ using OpenAI’s text-to-video model, Sora

Horizon, Death Stranding and The Last of Us star in PlayStation's nostalgia-fuelled 30th anniversary thank you video

Show HN: Visualizing website carbon footprints using steam and robotics [video]