ChatGPT Advanced Voice is out — 9 examples showing why you should be excited
After months of anticipation, ChatGPT’s advanced Voice Mode has started rolling out to small groups of ChatGPT Plus users. They have reacted enthusiastically, jumping at the chance to try out the new features developed by OpenAI.
The main draw of the new voice mode is more natural, real-time conversation. You can interrupt ChatGPT at any time, and it can sense and respond to your emotions. There are some limits, though: ChatGPT can’t mimic famous personalities and is limited to four preset voices.
Several users who got access to the new features eagerly posted the results of their conversations with ChatGPT and the initial results seem pretty impressive. Don’t forget to turn up the volume as you check them out for yourself.
1. It can show excitement
One user asked ChatGPT to keep dialling up its excitement as it narrated a fictitious soccer match, and ChatGPT obliged. Its first attempt was OK, but it sounded genuinely more excited when asked to have another go and push the excitement even further. It’s a great example of how users should be able to fine-tune ChatGPT’s voice outputs.
2. It can cry
ChatGPT sounded like it was about to burst into tears when asked to recite the poem “I measure every Grief I meet” by Emily Dickinson. Impressively, it managed to enunciate every word clearly while making it feel as though the waterworks were going to start any second.
3. It can beatbox
Can ChatGPT beatbox? Absolutely! Asked to create a short birthday rap, the chatbot spit out a few bars and wrapped it up with some beatboxing. The first attempt was a bit too short for this X user, who asked ChatGPT to increase the amount of beatboxing. On the second attempt, ChatGPT did as instructed. Pretty nifty!
4. It is a storyteller
In voice mode, ChatGPT responds to prompts as it normally would, except that it speaks its answers out loud rather than returning a text reply. Here ChatGPT was asked to tell a children’s story about a computer that comes alive. While it wasn’t quite able to fulfil the user’s request to emphasize certain words and vary its tone, as storytellers typically do, it was able to switch seamlessly from one language to another as it told the same story. The user interrupted it with these requests while it was speaking, but that proved to be no challenge for the AI.
5. It can create sound effects
On the same theme of storytelling, in this example ChatGPT was asked to narrate a sci-fi thriller and, in seconds, a newly created character was chasing a rogue AI and ended up in a shootout. The AI was also asked to create an atmosphere to enhance the story, particularly by using onomatopoeia, meaning words that imitate the sounds they describe. The advanced Voice Mode also inserted a couple of actual (albeit basic) sound effects for good measure.
6. It can identify chords
“Got it! Here’s a clear C minor chord,” ChatGPT said before going on to reproduce the chord. It sounds a bit off key, but that might be because the example features a phone filming another phone. The more important question is whether ChatGPT will continue on this trajectory, so that you can describe the music and sound effects you’d like to hear and have it deliver the results in seconds.
7. It can perform tongue twisters
Another user asked ChatGPT to come up with some tongue twisters. Not only did the chatbot come up with them on the fly, but it also read them out. It would be interesting to hear how it would sound if it rattled off the same example several times in a row, but the AI is unlikely to stumble since it simply has to repeat its first iteration. More generally, it is unlikely to stumble on any words unless explicitly told to do so.
8. It can count very fast
This is a fun one! ChatGPT was asked to count as fast as it could up to 10 – a task which it handled with ease. It then managed to count up to 50, stopping midway to catch its breath. Not that it needed to, of course, but it sure makes it seem as if you’re chatting with a human.
“Interestingly, the transcript has no interruptions or notations – the voice model has simply learned natural speaking patterns, which includes breathing pauses. Uncanny,” X user Cristiano Giardina wrote.
9. It can do bad impressions
Finally, it can do impressions of famous characters, just not very well. It plays to the stereotypes, such as carrots for Bugs Bunny and “D’oh!” for Homer Simpson.
Cristiano Giardina ran this test, writing on X: "ChatGPT Advanced Voice Mode doing a few impressions," including Bugs Bunny, Yoda, Homer Simpson plus a combination of Yoda + Homer.