Moonshine, the new state of the art for speech to text


Can you imagine using a keyboard where it took a key press two seconds to show up on screen? That’s the typical latency for most voice interfaces, so it’s no wonder they’ve failed…

Today we’re open sourcing Moonshine, a new speech to text model that returns results faster and more efficiently than the current state of the art, OpenAI’s Whisper, while matching or exceeding its accuracy. Moonshine doesn’t just help us with products like Torre; its unique design also makes it possible to fit full automatic speech recognition on true embedded hardware. Even the smallest Whisper model requires at least 30MB of RAM, since modern transformers create large dynamic activation layers that can’t be stored in flash or other read-only memory.


