OpenAI’s ChatGPT To Support Voice Chat and Image Prompts With Latest Update

Courtesy of khiemh/Instagram
OpenAI on Monday announced it is rolling out new voice and image capabilities for ChatGPT, giving the chatbot the ability to “see, hear, and speak.”
ChatGPT, the flagship chatbot of Microsoft-backed OpenAI, can now carry out voice conversations with users, according to the announcement. The feature is powered by a new text-to-speech model, “capable of generating human-like audio from just text and a few seconds of sample speech.” It is paired with Whisper, OpenAI’s speech recognition system, which transcribes the user’s spoken words into text.
Users can ask for directions, request a “bedtime story,” or “settle a dinner table debate” using the new voice features, the company said. Professional voice actors provided five voices that users can choose from.
“Thanksgiving 2023 is going to be a holiday to remember. Think of the opportunities from troubleshooting the turkey fryer to asking ChatGPT simple questions like how long to cook a 22-pound turkey and a variety of traditional and special recipes,” Marva Bailer, an AI advisor, said, reacting to the new features.
“It will be fun to see how it levels the playing field with a multi-generational audience,” Bailer added.
ChatGPT will also be able to view images uploaded by the user. The company said the feature will allow users to “troubleshoot why your grill won’t start, explore the contents of your fridge to plan a meal, or analyze a complex graph for work-related data.”
Users can have ChatGPT focus on a specific part of the image by using the drawing tool in the ChatGPT mobile app for Android or iOS.
To address concerns about ChatGPT’s analysis of images containing people, OpenAI said it has taken “technical measures to significantly limit ChatGPT’s ability to analyze and make direct statements about people since ChatGPT is not always accurate and these systems should respect individuals’ privacy.”
The new voice and image features will roll out to Plus and Enterprise users over the next two weeks, the company said. Voice will be available on iOS and Android, where users can opt in through the app’s settings, and image input will be available on all platforms.
“It will be interesting to see how this resource creates harmony, similar to our map applications versus the old paper map,” Bailer said.
“The roadmap of any technology is to make itself invisible to the consumer, and ChatGPT adding voice capabilities furthers that goal exponentially,” technologist Peter Swain said.
“AI’s impact has already been disruptive. This new form of engagement is only going to further the impact it can have for individuals and businesses alike.”
TMX contributed to this article.