OpenAI rolls out highly anticipated advanced Voice Mode, but there's a catch
When OpenAI held its Spring Launch event in May, one of the biggest standouts was its demo of the new Voice Mode on ChatGPT, supercharged with GPT-4o’s new video and audio capabilities. The highly anticipated new Voice Mode is finally here (kind of).
Also: The best AI chatbots of 2024: ChatGPT, Copilot, and worthy alternatives
On Tuesday, OpenAI announced via an X post that Voice Mode is being rolled out in alpha to a small group of ChatGPT Plus users, offering them a smarter voice assistant that can be interrupted and respond to users’ emotions.
Users who participate in the alpha will receive an email with instructions and a message in the mobile app, as shown in the video above. If you haven’t received a notification just yet, no worries. OpenAI shared that it will continue to add users on a rolling basis, with the plan for all ChatGPT Plus users to access it in the fall.
In the original demo at the launch event, shown below, the company showcased Voice Mode’s multimodal capabilities, including assisting with content on users’ screens and using the user’s phone camera as context for a response.
However, the alpha of Voice Mode will not have these features. OpenAI shared that “video and screen sharing capabilities will launch at a later date.” The company also said that since originally demoing the technology, it has improved the quality and safety of voice conversations.
OpenAI tested the voice capabilities with more than 100 external red teamers across 45 languages, according to the X thread. The company also trained the model to speak only in the four preset voices, built systems to block outputs that deviate from those designated voices, and implemented guardrails to block certain requests.
The company also said it will take user feedback into account to further improve the model, and that it will share a detailed report on GPT-4o's performance, including limitations and safety evaluations, in August.
Also: Google’s new gen AI tools help hyper-target your ad campaigns
You can become a ChatGPT Plus subscriber for $20 per month. Other membership perks include advanced data analysis features, image generation, priority access to GPT-4o, and more.
One week after OpenAI unveiled this feature, Google announced a similar offering called Gemini Live. However, Gemini Live is not yet available to users. That may change at the Made by Google event coming up in a few weeks.