OpenAI’s new voice AI model gpt-4o-transcribe lets you add speech to your existing text apps in seconds


Today, the ChatGPT maker unveiled three all-new proprietary voice models: gpt-4o-transcribe, gpt-4o-mini-transcribe and gpt-4o-mini-tts. They are available initially through its application programming interface (API), so third-party software developers can build their own apps on top of them, and on a custom demo site, OpenAI.fm, that individual users can access for limited testing and fun. The new transcription models are meant to supersede OpenAI’s two-year-old open source Whisper speech-to-text model, offering lower word error rates across industry benchmarks and improved performance in noisy environments, with diverse accents, and at varying speech speeds, across 100+ languages. Another startup, Hume AI, offers a new model, Octave TTS, with sentence-level and even word-level customization of pronunciation and emotional inflection, based entirely on the user’s instructions rather than on any pre-set voices.

Read the full story on VentureBeat.
