
Hume launches new text-to-speech model Octave that generates custom AI voices with adjustable emotions


Hume emphasized that its Octave TTS pricing is around half that of ElevenLabs, a competing AI voice creation startup.

“We’re launching the first LLM for text-to-speech, a model that understands words in context, predicting the right emotions, rhythm, cadence, and emphasis, making speech sound more human than ever before,” said Alan Cowen, Hume AI’s co-founder and CEO, in a video call interview with VentureBeat.

“We collected data from people recording themselves through webcams, reacting naturally to videos, telling stories, and talking to others, including friends and family, to capture a wide range of emotional expressions,” Cowen said.

Hume has built technical guardrails into its website and API that prohibit the creation of realistic children’s voices and imitations of specific individuals. Beyond those restrictions, the model is open to use across a wide range of content and subject matter, potentially including not-safe-for-work scenes such as those found in popular romance novels.

Read the full story on VentureBeat.

