Moonshine, the new state of the art for speech to text
Can you imagine using a keyboard where it took a key press two seconds to show up on screen? That’s the typical latency for most voice interfaces, so it’s no wonder they’ve failed…
Today we’re open sourcing Moonshine, a new speech to text model that returns results faster and more efficiently than the current state of the art, OpenAI’s Whisper, while matching or exceeding its accuracy. Moonshine doesn’t just help us with products like Torre; its unique design also makes it possible to fit full automatic speech recognition on true embedded hardware. Even the smallest Whisper model requires at least 30 MB of RAM, since modern transformers create large dynamic activation layers which can’t be stored in flash or other read-only memory.