Everyone's favorite chatbot can now see and hear and speak. On Monday, OpenAI announced new multimodal capabilities for ChatGPT. Users can now have voice conversations or share images with ChatGPT in real time.
Audio and multimodal features have become the next phase in the fierce generative AI competition. Meta recently launched AudioCraft for generating music with AI, and Google Bard and Microsoft Bing have both deployed multimodal features for their chat experiences. Just last week, Amazon previewed a revamped version of Alexa that will be powered by its own LLM (large language model), and even Apple is experimenting with AI-generated voices through Personal Voice.
Voice capabilities will be available on iOS and Android. Like Alexa or Siri, you can tap to speak to ChatGPT and it will speak back to you in one of five voice options. Unlike current voice assistants, ChatGPT is powered by more advanced LLMs, so what you'll hear is the same type of conversational and creative response that OpenAI's GPT-4 and GPT-3.5 are capable of creating with text. The example that OpenAI shared in the announcement is generating a bedtime story from a voice prompt. So, exhausted parents at the end of a long day can outsource their creativity to ChatGPT.
Multimodal recognition has been forecast for a while, and is now launching in a user-friendly fashion for ChatGPT. When GPT-4 was released last March, OpenAI showcased its ability to understand and interpret images and handwritten text. Now it will be a part of everyday ChatGPT use. Users can upload an image of something and ask ChatGPT about it, for example to identify a cloud or to make a meal plan based on a photo of the contents of their fridge. Multimodal will be available on all platforms.
As with any generative AI advancement, there are serious ethics and privacy issues to consider. To mitigate risks of audio deepfakes, OpenAI says it is only using its audio recognition technology for the specific "voice chat" use case. The voices were also created with voice actors they have "directly worked with." That said, the announcement doesn't mention whether users' voices can be used to train the model when they opt in to voice chat. For ChatGPT's multimodal capabilities, OpenAI says it has "taken technical measures to significantly limit ChatGPT’s ability to analyze and make direct statements about people since ChatGPT is not always accurate and these systems should respect individuals’ privacy." But how well these safeguards hold up against nefarious uses won't be known until the features are released into the wild.
Voice chat and images will roll out to ChatGPT Plus and Enterprise users in the next two weeks, and to all users "soon after."