Hume launches new text-to-speech model Octave that generates custom AI voices with adjustable emotions
Hume emphasized that its Octave TTS pricing is around half that of competing AI voice creation startup ElevenLabs.
“We’re launching the first LLM for text-to-speech, a model that understands words in context, predicting the right emotions, rhythm, cadence, and emphasis, making speech sound more human than ever before,” said Alan Cowen, Hume AI’s co-founder and CEO, in a video call interview with VentureBeat.

“We collected data from people recording themselves through webcams, reacting naturally to videos, telling stories, and talking to others, including friends and family, to capture a wide range of emotional expressions,” Cowen said.

Hume has technical guardrails built into its website and API that prohibit the creation of realistic children’s voices and imitations of specific individuals. Beyond those limits, the model is open to use across a wide range of content and subject matter, including potentially not-safe-for-work scenes such as those in popular romance novels.
Source: VentureBeat