OpenAI’s New Voice Models Can Now Talk Like Customer Service Agents. Their Next Destination: Call Centers

  • These models are based on GPT-4o and GPT-4o-mini.

  • They aim to outperform Whisper and OpenAI’s previous text-to-speech tools.

  • Developers can access these models through an API to enhance their projects.

Javier Márquez, Writer · Adapted by Alba Mora


Since early this year, major tech companies have had a clear goal: encouraging conversations with AI. OpenAI, Microsoft, Google, and Meta have been adding voice features to their assistants, but this seems to be just the start. The industry is moving at a rapid pace, and the way users interact with these tools continues to evolve.

Say hello to voice agents. OpenAI has focused on text-based agents for months, with tools such as Operator and Computer-Using Agents. However, the company is preparing its next big move to stand out in the AI development race: the launch of a powerful new generation of voice agents.

New models are entering the scene. On Thursday, OpenAI announced the release of new audio models for converting speech to text and vice versa. These models aren’t part of ChatGPT but will be available through an API, allowing developers to create voice agents. Notably, these models aim to be much more accurate and offer enhanced customization.

OpenAI’s new models are built on GPT-4o and GPT-4o-mini. They promise improvements over Whisper and previous text-to-speech tools, which will remain available via the API. However, performance isn’t the only focus. These models can also modulate their tone, for instance to “talk like a sympathetic customer service agent” and more.
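As a rough illustration of the speech-to-text side, here’s a minimal sketch using OpenAI’s Python SDK. The model identifier gpt-4o-transcribe and the audio file name are assumptions for illustration; the announcement only says the new models are exposed through the existing audio API.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Speech to text: transcribe a local recording with one of the new models.
# "gpt-4o-transcribe" and "support_call.wav" are illustrative assumptions.
with open("support_call.wav", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="gpt-4o-transcribe",
        file=audio_file,
    )

print(transcript.text)
```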


Next destination: call centers. OpenAI has made its intentions clear with the recent release. “For the first time, developers can also instruct the text-to-speech model to speak in a specific way… unlocking a new level of customization for voice agents. This enables a wide range of tailored applications, from more empathetic and dynamic customer service voices to expressive narration for creative storytelling experiences,” the company says.
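Based on that description, the text-to-speech call might look like the following sketch. The model name gpt-4o-mini-tts, the voice, and the instructions parameter reflect OpenAI’s announcement materials, but treat the exact values as assumptions rather than a definitive recipe.

```python
from openai import OpenAI

client = OpenAI()

# Text to speech with a style instruction, per the customization
# described above. Model name, voice, and wording are illustrative.
response = client.audio.speech.create(
    model="gpt-4o-mini-tts",
    voice="coral",
    input=(
        "Thanks for calling. I'm sorry about the mix-up with your order. "
        "Let's get it sorted out right away."
    ),
    instructions="Speak like a sympathetic customer service agent: warm, calm, and reassuring.",
)

# Save the synthesized audio to disk.
with open("reply.mp3", "wb") as f:
    f.write(response.read())
```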

According to OpenAI, this technology will facilitate the creation of much richer “conversational experiences.” ChatGPT launched in November 2022 powered by GPT-3.5, a reminder of how rapid progress has been since then. More importantly, all signs indicate that these models will eventually be deployed in call centers.

Interactions might be somewhat limited initially, but they’ll certainly surpass current voice systems. These new models will move beyond traditional automated assistants, becoming far more natural in their interactions. Over time, conversing with an AI could become nearly indistinguishable from conversing with a human.

Images | LumenSoft Technologies | OpenAI

Related | I Tested ChatGPT’s Advanced Voice Mode. It’s the Beginning of Something Huge
