Get the latest tech news
Pushing the frontiers of audio generation
Our pioneering speech generation technologies are helping people around the world interact with more natural, conversational and intuitive digital assistants and AI tools.
Over the past few years, we’ve been pushing the frontiers of audio generation, developing models that can create high quality, natural speech from a range of inputs, like text, tempo controls and particular voices. This technology powers single-speaker audio in many Google products and experiments — including Gemini Live, Project Astra, Journey Voices and YouTube’s auto dubbing — and is helping people around the world interact with more natural, conversational and intuitive digital assistants and AI tools. Authors of this work: Zalán Borsos, Matt Sharifi, Brian McWilliams, Yunpeng Li, Damien Vincent, Félix de Chaumont Quitry, Martin Sundermeyer, Eugene Kharitonov, Alex Tudor, Victor Ungureanu, Sertan Girgin, Jonas Rothfuss, Jake Walker and Marco Tagliasacchi.
Or read this on Hacker News