Meta takes some big AI swings at Meta Connect 2024
Mark Zuckerberg took the stage at Meta Connect 2024 and came out strong in the categories of VR/AR and AI. There’s a lot of mixing of these technologies, particularly in the Meta glasses line discussed elsewhere on ZDNET.
Also: Everything announced at Meta Connect 2024: $299 Quest 3S, Orion AR glasses, and more
In this article, though, we’ll dig into several powerful and impressive announcements related to the company’s AI efforts.
Multimodal large language model
Zuckerberg announced the availability of Llama 3.2, which adds multimodal capabilities. In particular, the model can understand images.
He compared Meta’s Llama 3.2 large language models with other LLMs, saying Meta “differentiates itself in this category by offering not only state of the art models, but unlimited access to those models for free, and integrated easily into our different products and apps.”
Also: Meta inches toward open-source AI
Meta AI is Meta’s AI assistant, now based on Llama 3.2. Zuckerberg said Meta AI is on track to become the most-used AI assistant globally, with almost 500 million monthly active users.
To demonstrate the model’s understanding of images, Zuckerberg opened an image on a mobile device using the company’s image-edit capability. Meta AI was able to change the image, modifying a shirt to tie-dye or adding a helmet, all in response to simple text prompts.
Meta AI with voice
Meta’s AI assistant is now able to hold voice conversations with you from within Meta’s apps. I’ve been using a similar feature in ChatGPT and found it useful when two or more people need to hear the answer to a question.
Zuckerberg claims that AI voice interaction will be bigger than text chatbots, and I agree — with one caveat. Getting to the voice interaction has to be easy. For example, to ask Alexa a question, you simply speak into the room. But to ask ChatGPT a question on the iPhone, you have to unlock the phone, go into the ChatGPT app, and then enable the feature.
Also: AI voice generators: What they can do and how they work
Until Meta has devices that just naturally listen for speech, I fear even the most capable voice assistants will be constrained by inconvenience.
You can also give your AI assistant a celebrity voice. Choose from John Cena, Judi Dench, Kristen Bell, Keegan-Michael Key, and Awkwafina. Natural voice conversation is rolling out today in Instagram, WhatsApp, Messenger, and Facebook.
Meta AI Studio
Next up are some features Meta has added to its AI Studio chatbot creation tool. AI Studio lets you create a character (either an AI based on your interests or an AI that “is an extension of you”). Essentially, you can create a chatbot that mirrors your conversational style.
But now Meta is diving into the realm of uncanny valley deepfakes.
Until this announcement, AI Studio offered only a text-based interface. Now Meta is releasing a version that is “more natural, embodied, interactive.” And when it comes to “embodied,” they’re not kidding around.
In the demo, Zuckerberg interacted with a chatbot modeled on creator Don Allen Stevenson III. This interaction appeared to be a “live” video of Stevenson, complete with fully tracked head motion and lip animation. Basically, he could ask Robot Don a question and it looked like the real guy was answering.
Also: How Apple, Google, and Microsoft can save us from AI deepfakes
Powerful, freaky, and unnerving. Plus, the potential for creating malicious chatbots using other folks’ faces seems a distinct possibility.
AI translation
Meta seems to have artificial lip-sync and facial movement nailed down. They’ve reached the point where they can make a real person’s face move and speak generated words.
Meta has extended this capability to translation, offering automatic video dubbing on Reels in English and Spanish. That means you can record a Reel in Spanish, and the social network will play it back in English, looking as if you’re speaking English. Record in English, and it plays back in Spanish the same way.
In the above example, creator Ivan Acuña spoke in Spanish, but the dub came back in English. As with the previous example, the video was nearly perfect and it looked like Acuña had been recorded speaking English originally.
Llama 3.2
Zuckerberg came back for another dip into the Llama 3.2 model. He said the multimodal nature of the model has increased the parameter count considerably.
Another interesting part of the announcement was the much smaller 1B and 3B models optimized to run on-device. These will let developers build more secure, specialized models for custom apps, with the model living right inside the app.
Also: I’ve tested dozens of AI chatbots since ChatGPT’s stunning debut. Here’s my top pick
Both of these models are open source, and Zuckerberg was touting the idea that Llama is becoming “the Linux of the AI industry”.
Finally, a bunch more AI features were announced for Meta’s AI glasses. We have another article that goes into those features in detail.
You can follow my day-to-day project updates on social media. Be sure to subscribe to my weekly update newsletter, and follow me on Twitter/X at @DavidGewirtz, on Facebook at Facebook.com/DavidGewirtz, on Instagram at Instagram.com/DavidGewirtz, and on YouTube at YouTube.com/DavidGewirtzTV.