Lip sync battles: four generative AI avatar apps breaking new ground
Not since the 1989 Milli Vanilli scandal has talk of lip syncing been on everyone’s, well, lips.
Generative AI start-ups are bending our minds with talking, animated AI avatars. You can either choose an off-the-shelf fake avatar, or train an AI model on your own likeness and voice to create a lip-synced deepfake of yourself.
You (or your brand ambassador, or a store representative, or your CEO) can speak any script, in any language, with just a few keystrokes and a few minutes of generation time. Want to speak the lyrics to Milli Vanilli’s hit song Baby Don't Forget My Number? I did, and here it is: https://timeundertension.substack.com/p/lip-sync-battles-four-generative
(I chose a German voice, to sound more authentically MV)
The beat has been picking up, with several AI avatar start-ups launching new features in the past month:
Gen AI video company Pika announced a new lip-sync feature (up to 15 seconds)
Verbalate (Aussie start-up) released video translation and lip-syncing / dubbing
HeyGen released a beta of their Live Streaming Avatars
D-ID released their equivalent, which they call D-ID AI Agents
The most impressive thing to me is that, in addition to quickly producing pre-rendered videos, some of these tools now allow you to connect a LIVE avatar to a Large Language Model (like ChatGPT) and have real-time conversations.
These streaming avatars allow you to train not just the likeness and the voice, but also the tone of voice and the knowledge base that the avatar draws on when it speaks. This can be particularly powerful for staff training, or for in-store customer experience with retailers.
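To make that idea a little more concrete, here is a minimal sketch in Python of how a tone of voice and a knowledge base might be combined into the instructions sent to a Large Language Model. Everything in it is hypothetical: the `AvatarPersona` class, the `find_relevant_facts` and `build_prompt` functions, and the example store facts are made up for illustration, and none of it is the API of any tool mentioned in this article. Simple keyword matching stands in for whatever retrieval a real product uses.

```python
from dataclasses import dataclass, field


@dataclass
class AvatarPersona:
    """Hypothetical container for what a streaming avatar is 'trained' on."""
    name: str
    tone_of_voice: str
    # Maps a topic keyword to a fact the avatar is allowed to use.
    knowledge_base: dict[str, str] = field(default_factory=dict)


def find_relevant_facts(persona: AvatarPersona, question: str) -> list[str]:
    """Return facts whose topic keyword appears in the question (case-insensitive)."""
    lowered = question.lower()
    return [
        fact
        for topic, fact in persona.knowledge_base.items()
        if topic.lower() in lowered
    ]


def build_prompt(persona: AvatarPersona, question: str) -> str:
    """Assemble the text a real setup would pass to the language model."""
    facts = find_relevant_facts(persona, question)
    if not facts:
        # Simple guardrail: with no matching fact, tell the model not to guess.
        facts = ["No matching information. Offer to find a staff member."]
    lines = [
        f"You are {persona.name}, a virtual in-store assistant.",
        f"Tone of voice: {persona.tone_of_voice}.",
        "Answer using only these facts:",
    ]
    lines += [f"- {fact}" for fact in facts]
    lines.append(f"Customer question: {question}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up example data, purely for illustration.
    assistant = AvatarPersona(
        name="Ava",
        tone_of_voice="friendly and concise",
        knowledge_base={
            "returns": "Returns are accepted within 30 days with a receipt.",
            "delivery": "Standard delivery takes three business days.",
        },
    )
    print(build_prompt(assistant, "What is your returns policy?"))
```

In a live setup the model's answer would then be handed to the avatar platform to be spoken and lip-synced. That step is left out here because it depends entirely on the tool you choose.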
Translation is another stunningly impressive feature of many of these tools. In HeyGen you can press a button to have your script translated into another language, and then render a new video. Here is the same video from above, but now I can magically speak German! https://timeundertension.substack.com/p/lip-sync-battles-four-generative
At Time Under Tension we have been following these developments closely, and we are building demos for clients that showcase these apps across a number of use cases:
Creating safety training videos at record speed for a Government department
Training the AI model on a celebrity ambassador, to create personalised videos at scale
Giving a streaming avatar detailed product information, to create a virtual in-store assistant for customers to speak to during busy store periods
Going beyond these use cases, there are some amazing demonstrations of avatars being used as virtual influencers, such as Rae by Dentsu.
Avatar and lip sync AI technology is advancing rapidly, and the potential for misuse is coming into sharp focus. If you thought Milli Vanilli lip syncing their songs on the MTV Tour was bad, spare a thought for the Hong Kong finance worker who was “tricked into paying out $25 million to fraudsters using deepfake technology to pose as the company’s chief financial officer in a video conference call”.
In the absence of regulation, the onus is on the platforms (HeyGen do well with this, requiring authentication before creating a video avatar) and on us practitioners to use the tools responsibly.
Furthermore, there are still many limitations in the technology, such as:
The quality of lip syncing is not always perfect, and can produce an ‘uncanny valley’ effect
Live / streaming avatar apps are mostly in beta, and the chats can be a bit glitchy. Adding ‘guardrails’ to the chat is also incredibly important in this scenario
Actors and celebrities might be somewhat reluctant to have their likeness used with AI (this was a sticking point in the recent Hollywood actors’ strike). Interestingly, Soul Machines are building endorsed Digital Celebrities.
While we’re on the topic of uncanny, glitchy celebrities, you can’t close this article without watching the Milli Vanilli music video for Baby Don't Forget My Number. Gen AI can’t come close to this quality.
Work with Time Under Tension
We work with agencies, companies and brands to elevate your Customer & Employee experience with generative AI. Our advisory team help you to understand what is possible, and how it relates to your business. We provide training for you to get the most out of generative AI apps such as ChatGPT and Midjourney. Our technical team build bespoke tools to meet your needs. You can reach us here: www.timeundertension.ai/contact
A handful of Gen AI news
Here are five of the most interesting things we have seen and read in the last week:
Elon Musk is suing OpenAI; this is OpenAI’s rebuttal
A great article on AI workflows for marketers
How to not AI: “The worst AI-generated artwork we’ve seen”: Queensland Symphony Orchestra’s Facebook ad
The Shape of AI: How will patterns and experiences evolve in a world shaped by Artificial Intelligence?
Claude 3 was released to much fanfare, and it is very impressive
That’s a wrap for Issue #9. Drop us a comment and let us know what you think of overworked Milli Vanilli metaphors.