Ars Technica
A voice synthesis company based in Dubai released a fictional podcast interview between Joe Rogan and Steve Jobs using realistic voices digitally cloned from both men. It takes place during the “first episode” of a purported podcast series called “Podcast.ai,” created by Play.ht, which sells voice synthesis services.
In the interview, you first hear a replication of Rogan’s voice created by voice cloning technology similar to that which we’ve covered before on Ars. Deep learning technology has allowed AI models to replicate distinctive voices with a high degree of accuracy, such as in the case of Darth Vader in Disney’s Obi-Wan Kenobi TV series.
To achieve the effect, someone must first train the AI model on existing samples of the voice that will be cloned. Rogan is a prime target for AI voice training by deep learning models because ample quantities of his isolated voice exist on his podcasts. In fact, The Verge covered a PR stunt by an AI company called Dessa synthesizing Rogan in 2019.
Where this instance of AI tomfoolery becomes more interesting is that Play.ht also roped in the voice of deceased Apple CEO Steve Jobs. His voice, while robotically choppy at times, recalls his Apple keynotes and All Things Digital interviews from the late 2000s. And Play.ht claims that the text of the interview was generated by AI as well, possibly from a large language model (LLM) similar to GPT-3.
“Transcripts are generated with fine-tuned language models,” writes Play.ht on the Podcast.ai website. “For example, the Steve Jobs episode was trained on his biography and all recordings of him we could find online so the AI could accurately bring him back to life.”
In keeping with its LLM roots, the 19-minute interview doesn’t make much sense. After a while, parts of the fictional interview begin to sound like conceptual mashups of common Jobs talking points, including aesthetics, revolutionary products, competitors such as Google, Microsoft, and Adobe, and the triumphs of the original Macintosh.
For example, during a section of the interview, fake Jobs delves into criticism of Microsoft that is very similar to what the real Jobs said in a famous 1995 interview for Triumph of the Nerds, but it’s not a carbon copy, and you can tell the voice is synthesized if you compare the two. “That’s the problem I’ve always had with Microsoft,” fake Jobs says. “In many ways they’re good people and they’ve done good work, but they’ve never had any taste. They’ve never had any aesthetic sense.”
Whether it’s legal to use Jobs’ or Rogan’s vocal likenesses in this way, particularly to promote a commercial product, remains to be seen. And despite the PR-stunt nature of the podcast, the concept of entirely fictional celebrity podcasts got our attention. As voice synthesis becomes more widespread and potentially undetectable, we’re looking at a future where media artifacts from any era will likely be completely fluid and malleable, shapable to fit any narrative. In this particular fictional world, Jobs is a big Rogan fan.
“It’s nice to sit back in the car and listen to you rant,” he says.