15.ai was a free non-commercial web application that used artificial intelligence to generate text-to-speech voices of fictional characters from popular media. Created by an anonymous artificial intelligence researcher known as 15, who began developing the technology as a freshman during their undergraduate research at the Massachusetts Institute of Technology, the application allowed users to make characters from video games, television shows, and movies speak custom text with emotional inflections faster than real-time.[a] The platform was notable for its ability to generate convincing voice output using minimal training data—the name "15.ai" referenced the creator's claim that a voice could be cloned with just 15 seconds of audio, in contrast to contemporary deep learning speech models which typically required tens of hours of audio data. It was an early example of an application of generative artificial intelligence during the initial stages of the AI boom.
Launched in March 2020, 15.ai gained widespread attention in early 2021 when content utilizing it went viral on social media platforms like YouTube and Twitter, and quickly became popular among Internet fandoms, such as the My Little Pony: Friendship Is Magic, Team Fortress 2, and SpongeBob SquarePants fandoms. The service distinguished itself through its support for emotional context in speech generation through emojis, precise pronunciation control through phonetic transcriptions, and multi-speaker capabilities that allowed a single model to generate diverse character voices. 15.ai is credited as the first mainstream platform to popularize AI voice cloning (audio deepfakes) in memes and content creation.[1]
15.ai received varied responses from the voice acting community and broader public. Voice actors and industry professionals debated the technology's merits for fan creativity versus its potential impact on the profession, particularly following controversies over unauthorized commercial use. While many critics praised the application's uniqueness, accessibility, and emotional control, they also noted technical limitations in areas like prosody options (such as rhythm and loudness) and non-English language support. 15.ai prompted discussions about ethical implications, including concerns about reduction of employment opportunities for voice actors, voice-related fraud, and misuse in explicit content, though 15.ai maintained strict policies against replicating real people's voices.
15.ai's approach to data-efficient voice synthesis and emotional expression was influential in subsequent developments in AI text-to-speech technology. In January 2022, Voiceverse NFT sparked controversy when it was discovered that the company had misappropriated 15.ai's work for their own platform. This incident was subsequently documented in resources tracking AI ethics violations and NFT-related controversies. The service was ultimately taken offline in September 2022 due to legal issues surrounding artificial intelligence and copyright. Its shutdown was followed by the emergence of various commercial alternatives in subsequent years, with their founders acknowledging 15.ai's pioneering influence in the field of deep learning speech synthesis.
History
Background

The field of artificial speech synthesis underwent a significant transformation with the introduction of deep learning approaches. In 2016, DeepMind's publication of the seminal paper WaveNet: A Generative Model for Raw Audio marked a pivotal shift toward neural network-based speech synthesis, demonstrating unprecedented audio quality through causal convolutional neural networks. Previously, concatenative synthesis, which worked by stitching together pre-recorded segments of human speech, was the predominant method for generating artificial speech, but it often produced robotic-sounding results at the boundaries of sentences.[2] WaveNet was followed in 2018 by Google AI's Tacotron 2, which demonstrated that neural networks could produce highly natural speech synthesis but required substantial training data (typically tens of hours of audio) to achieve acceptable quality. When trained on smaller datasets, such as 2 hours of speech, the output quality degraded but remained intelligible; with just 24 minutes of training data, Tacotron 2 failed to produce intelligible speech.[3] Later work included HiFi-GAN, a generative adversarial network (GAN)-based vocoder that improved the efficiency of waveform generation while producing high-fidelity speech,[4] followed by Glow-TTS, which introduced a flow-based approach that allowed for both fast inference and voice style transfer capabilities.[5] Chinese tech companies also made significant contributions to the field, with Baidu and ByteDance developing proprietary text-to-speech frameworks that further advanced the technology, though specific technical details of their implementations remained largely undisclosed.[6]
Development, release, and operation
[...] The website has multiple purposes. It serves as a proof of concept of a platform that allows anyone to create content, even if they can't hire someone to voice their projects.
It also demonstrates the progress of my research in a far more engaging manner – by being able to use the actual model, you can discover things about it that even I wasn't aware of (such as getting characters to make gasping noises or moans by placing commas in between certain phonemes).
It also doesn't let me get away with picking and choosing the best results and showing off only the ones that work [...] Being able to interact with the model with no filter allows the user to judge exactly how good the current work is at face value.
15.ai was conceived in 2016 as a research project in deep learning speech synthesis by a developer known as "15" (at the age of 18[8]) during their freshman year at the Massachusetts Institute of Technology (MIT)[9] as part of MIT's Undergraduate Research Opportunities Program (UROP).[10] The developer was inspired by DeepMind's WaveNet paper, with development continuing through their studies as Google AI released Tacotron 2 the following year. By 2019, the developer had demonstrated at MIT their ability to replicate WaveNet and Tacotron 2's results using 75% less training data than previously required.[6] The name 15 is a reference to the creator's claim that a voice can be cloned with as little as 15 seconds of data.[11]
The developer had originally planned to pursue a doctorate based on their undergraduate research, but opted to work in the tech industry instead after their startup was accepted into the Y Combinator accelerator in 2019. After their departure in early 2020, the developer returned to their voice synthesis research, implementing it as a web application. According to a 2024 post on X from the developer, instead of using conventional voice datasets like LJSpeech that contained simple, monotone recordings, they sought out more challenging voice samples that could demonstrate the model's ability to handle complex speech patterns and emotional undertones.[tweet 1] The Pony Preservation Project—a fan initiative originating from /mlp/,[6] 4chan's My Little Pony board, that had compiled voice clips from My Little Pony: Friendship Is Magic—played a crucial role in the implementation. The project's contributors had manually trimmed, denoised, transcribed, and emotion-tagged every line from the show. This dataset provided ideal training material for 15.ai's deep learning model.[6]

15.ai was released in March 2020 with a limited selection of characters, including those from My Little Pony: Friendship Is Magic and Team Fortress 2.[12] The system was designed to function efficiently with limited training data—requiring only minutes of clean audio per character, in contrast to the 40+ hours typically needed by traditional deep learning models. To overcome data constraints, the developer employed specific data augmentation techniques to improve generalization, including deliberate introduction of spelling variations, punctuation patterns, and pronunciation distortions during training.[13]
Upon its launch, 15.ai was offered as a free and non-commercial service that did not require user registration or user accounts to operate, with the developer establishing terms of use that aligned with the project's research-oriented objectives. Users were permitted to create any content with the synthesized voices under two specific conditions: they must properly credit 15.ai by including the website URL in any posts, videos, or projects using the generated audio; and they were prohibited from mixing 15.ai outputs with other text-to-speech software in the same work, to prevent misrepresentation of the technology's capabilities.[14][15]
More voices were added to the website in the following months.[16] A significant technical advancement came in late 2020 with the implementation of a multi-speaker embedding in the deep neural network, enabling simultaneous training of multiple voices rather than requiring individual models for each character voice.[6] This not only allowed rapid expansion from eight to over fifty character voices,[8] but also let the model recognize common emotional patterns across characters, even when certain emotions were missing from some characters' training data.[17]
By May 2020, the site had served over 4.2 million audio files to users.[18] In early 2021, the application gained popularity after skits, memes, and fan content created using 15.ai went viral on Twitter, TikTok, Reddit, Twitch, Facebook, and YouTube.[19] At its peak, the platform incurred operational costs of US$12,000[6] per month from AWS infrastructure needed to handle millions of daily voice generations; despite receiving offers from companies to acquire 15.ai and its underlying technology, the website remained independent and was funded out of the developer's personal earnings from their previous startup.[6] The developer was 23 at the time.
Voiceverse NFT controversy

On January 14, 2022, a controversy ensued after it was discovered that Voiceverse NFT had misappropriated voice lines generated from 15.ai as part of their marketing campaign.[20] This came shortly after 15.ai's developer had explicitly stated in December 2021 that they had no interest in incorporating NFTs (non-fungible tokens) into their work.[21] Log files showed that Voiceverse had generated audio of characters from My Little Pony: Friendship Is Magic using 15.ai, pitched it up to make the voices unrecognizable, and used it to market their own platform, in violation of 15.ai's terms of service, which explicitly prohibited commercial use and required proper attribution.[22]
Voiceverse initially claimed their platform would allow NFT owners to possess commercial rights to AI-generated voices for content creation, in-game chats, and video calls, with plans to release 8,888 "Origins" profile pictures with unique AI voice models.[23] When confronted with evidence of their misappropriation, Voiceverse claimed that someone in their marketing team used the voice without properly crediting 15.ai and admitted in their Discord server that their marketing team had been in such a rush to create a partnership demo that they used 15.ai without waiting for their own voice technology to be ready.[24] In response to their apology, 15 tweeted "Go fuck yourself,"[25] which went viral, amassing thousands of retweets and likes on Twitter in support of the developer.[6] 15 later expressed deeper frustration, writing: "the entire field of vocal synthesis is now being misrepresented by charlatans who are only in it for the money."[26]
Troy Baker (@TroyBakerVA): "I'm partnering with @VoiceverseNFT to explore ways where together we might bring new tools to new creators to make new things, and allow everyone a chance to own & invest in the IP's they create. We all have a story to tell. You can hate. Or you can create. What'll it be?"
January 14, 2022[tweet 2]
Following continued backlash and the plagiarism revelation, voice actor Troy Baker (who had partnered with Voiceverse) faced criticism for supporting an NFT project and his confrontational announcement tone.[27] Baker had described Voiceverse's service as allowing people to "create customized audiobooks, YouTube videos, e-learning lectures, or even podcasts with your favorite voice all without the hassle of additional legal work," which critics noted raised concerns about potentially replacing professional voice actors with AI.[28] Baker subsequently acknowledged that his original announcement tweet ending with "You can hate. Or you can create. What'll it be?" may have been "antagonistic,"[29] and on January 31, announced he would discontinue his partnership with Voiceverse.[30]
The event raised concerns about NFT projects, which critics observed were frequently associated with intellectual property theft and questionable business practices.[31] The incident was later documented in the AI Incident Database (AIID), cataloging it as an example of "an AI-synthetic audio sold as an NFT on Voiceverse's platform [that] was acknowledged by the company for having been created by 15.ai, a free web app specializing in text-to-speech and AI-voice generation, and reused without proper attribution."[32] The controversy was also featured in writer and crypto skeptic Molly White's Web3 Is Going Just Great project, which documented how Baker's partnership announcement and its antagonistic tone exacerbated negative reactions to the NFT initiative. White noted the vague nature of Voiceverse's offering, described only as "provid[ing] you an ownership to a unique voice in the Metaverse,"[33] and noted how the revelation of stolen work from 15.ai further damaged Voiceverse's credibility.[34] Russian educational platform Skillbox listed the incident as an example of fraud in NFTs.[35]
In a 2024 class action lawsuit filed against LOVO, Inc., court documents alleged that the founders of LOVO also created Voiceverse, with plaintiffs claiming that Voiceverse had "already been found to have stolen technology from [15.ai]".[36]
Inactivity
In September 2022, 15.ai was taken offline[37] due to legal issues surrounding artificial intelligence and copyright.[6] The creator has suggested a potential future version that would better address copyright concerns from the outset, though the website remains inactive as of 2025.[6]
Features

The platform was non-commercial,[38] and operated without requiring user registration or accounts.[39] Users generated speech by inputting text and selecting a character voice, with optional parameters for emotional contextualizers and phonetic transcriptions. Each request produced three audio variations with distinct emotional deliveries sorted by confidence score.[40] Characters available included multiple characters from Team Fortress 2 and My Little Pony: Friendship Is Magic, including Twilight Sparkle; GLaDOS, Wheatley, and the Sentry Turret from the Portal series; SpongeBob SquarePants; Kyu Sugardust from HuniePop; Rise Kujikawa from Persona 4; Daria Morgendorffer and Jane Lane from Daria; Carl Brutananadilewski from Aqua Teen Hunger Force; Steven Universe from Steven Universe; Sans from Undertale; Madeline and multiple characters from Celeste; the Tenth Doctor from Doctor Who; the Narrator from The Stanley Parable; and HAL 9000 from 2001: A Space Odyssey.[41] Out of the over fifty[8] voices available, thirty were of characters from My Little Pony: Friendship Is Magic.[42] Certain "silent" characters like Chell and Gordon Freeman could be selected as a joke, and would emit silent audio files when any text was submitted.[43] Characters from Undertale and Celeste did not produce spoken words but instead generated their games' distinctive beeps when text was entered.[44]

15.ai generated audio at 44.1 kHz sampling rate—higher than the 16 kHz standard used by most deep learning text-to-speech systems of that period. This higher fidelity created more detailed audio spectrograms and greater audio resolution, though it also made any synthesis imperfections more noticeable. Users reported using Audacity to downsample any generated audio in order to mask apparent robotic artifacts, though this came at the cost of lower audio quality.[13] The system processed speech faster-than-real-time using customized deep neural networks combined with specialized audio synthesis algorithms.[45] While the underlying technology could produce 10 seconds of audio in less than 10 seconds of processing time (hence, faster-than-real-time), the actual user experience often involved longer waits as the servers managed thousands of simultaneous requests, sometimes taking more than a minute to deliver results.[46]
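The "faster-than-real-time" criterion and the difference between the two sampling rates can be illustrated with simple arithmetic. The following is a minimal sketch, not taken from 15.ai itself: the function names and the 4-second processing time are hypothetical, chosen only for illustration.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Ratio of compute time to audio duration; below 1.0 means faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def samples_for(duration_seconds: float, sample_rate_hz: int) -> int:
    """Number of audio samples needed for a clip of the given duration."""
    return round(duration_seconds * sample_rate_hz)


# Hypothetical timing: 10 seconds of audio synthesized in 4 seconds of compute.
rtf = real_time_factor(processing_seconds=4.0, audio_seconds=10.0)
print(f"real-time factor: {rtf:.2f} (faster than real time: {rtf < 1.0})")

# A 10-second clip at each sampling rate mentioned in the text.
print(samples_for(10.0, 44_100))  # 441000
print(samples_for(10.0, 16_000))  # 160000
```

Under these assumptions, a 44.1 kHz clip carries roughly 2.76 times as many samples as a 16 kHz clip of the same length, which is consistent with both the greater resolution and the greater exposure of synthesis artifacts described above.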
The deep learning model's nondeterministic properties produced variations in speech output, creating different intonations with each generation, similar to how voice actors produce different takes.[47] 15.ai introduced the concept of emotional contextualizers, which allowed users to specify the emotional tone of generated speech through guiding phrases.[6] The emotional contextualizer functionality utilized DeepMoji, a sentiment analysis neural network developed at the MIT Media Lab.[48] Introduced in 2017, DeepMoji processed emoji embeddings from 1.2 billion Twitter posts (from 2013 to 2017) to analyze emotional content. Testing showed the system could identify emotional elements, including sarcasm, more accurately than human evaluators.[49] If an input into 15.ai contained additional context (specified by a vertical bar), the additional context following the bar would be used as the emotional contextualizer.[50] For example, if the input was "Today is a great day!|I'm very sad.", the selected character would speak the sentence "Today is a great day!" in the emotion one would expect from someone saying the sentence "I'm very sad."[50] Certain characters, such as Twilight Sparkle from My Little Pony: Friendship Is Magic, had preset emotional modes that output text in specific emotional states such as "happy".[18]
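The vertical-bar convention described above can be sketched as a small parser. This is an illustrative reconstruction rather than 15.ai's actual code: the name `parse_input`, the choice to split at the first bar, and the behavior when no context is given are all assumptions.

```python
from typing import NamedTuple, Optional


class ParsedInput(NamedTuple):
    text: str               # the sentence the character will speak
    context: Optional[str]  # the emotional contextualizer, if one was given


def parse_input(raw: str) -> ParsedInput:
    """Split user input at the first vertical bar (assumed behavior)."""
    text, separator, context = raw.partition("|")
    text = text.strip()
    context = context.strip()
    if not separator or not context:
        # No contextualizer supplied; what the real system did in this
        # case is not documented here, so the sketch just reports None.
        return ParsedInput(text, None)
    return ParsedInput(text, context)


parsed = parse_input("Today is a great day!|I'm very sad.")
print(parsed.text)     # Today is a great day!
print(parsed.context)  # I'm very sad.
```

In such a design, only `text` would be synthesized, while `context` would be passed to the sentiment model to select the emotional delivery.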

The application used pronunciation data from Oxford Dictionaries API, Wiktionary, and CMU Pronouncing Dictionary,[51] the last of which is based on ARPABET, a set of English phonetic transcriptions originally developed by the Advanced Research Projects Agency in the 1970s. For modern and Internet-specific terminology, the system incorporated pronunciation data from user-generated content websites, including Reddit, Urban Dictionary, 4chan, and Google.[51] Inputting ARPABET transcriptions was also supported, allowing users to correct mispronunciations or specify the desired pronunciation between heteronyms (words that have the same spelling but different pronunciations). Users could invoke ARPABET transcriptions by enclosing the phoneme string in curly braces within the input box (for example, {AA1 R P AH0 B EH2 T} to specify the pronunciation of the word "ARPABET" (/ˈɑːrpəˌbɛt/ AR-pə-beht)).[17] The interface displayed parsed words with color-coding to indicate pronunciation certainty: green for words found in the existing pronunciation lookup table, blue for manually entered ARPABET pronunciations, and red for words where the pronunciation had to be algorithmically predicted.[51]
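The curly-brace syntax and the three-color scheme can be sketched as a tokenizer that labels each word by the source of its pronunciation. This is a hypothetical illustration: the `LOOKUP` table is a tiny stand-in for the real dictionaries, and the regular expression and function name are assumptions, not 15.ai's implementation.

```python
import re

# Tiny stand-in for a pronunciation lookup table (hypothetical entries).
LOOKUP = {
    "today": "T AH0 D EY1",
    "is": "IH1 Z",
    "a": "AH0",
    "great": "G R EY1 T",
    "day": "D EY1",
}

# Either a {…} phoneme string or a plain word.
TOKEN = re.compile(r"\{([^{}]*)\}|([A-Za-z']+)")


def classify(text: str, lookup: dict = LOOKUP) -> list:
    """Label each token green (in table), blue (manual ARPABET), or red (predicted)."""
    result = []
    for match in TOKEN.finditer(text):
        phonemes, word = match.group(1), match.group(2)
        if phonemes is not None:
            result.append((phonemes.strip(), "blue"))
        elif word.lower() in lookup:
            result.append((word, "green"))
        else:
            result.append((word, "red"))
    return result


for token, color in classify("Today is a great {AA1 R P AH0 B EH2 T} yeet"):
    print(f"{color:>5}: {token}")
```

A word labeled red in this sketch would be handed to whatever model predicts pronunciations for out-of-vocabulary words; that step is omitted here.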
Later versions of 15.ai introduced multi-speaker capabilities. Rather than training separate models for each voice, 15.ai used a unified model that learned multiple voices simultaneously through speaker embeddings: learned numerical representations that captured each character's unique vocal characteristics.[6] Along with the emotional context conferred by DeepMoji, this neural network architecture enabled the model to learn shared patterns across different characters' emotional expressions and speaking styles, even when individual characters lacked examples of certain emotional contexts in their training data.[17]
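The general idea of a speaker embedding can be illustrated with a toy lookup table. This is only a sketch of the conditioning step as commonly used in multi-speaker models, not 15.ai's architecture, whose details were not published; the dimension, the random initial values, and the function names are all hypothetical, and a real system would learn the vectors during training.

```python
import random

EMBED_DIM = 4  # hypothetical size; real systems use much larger vectors


def make_embedding_table(speakers, dim=EMBED_DIM, seed=0):
    """One vector per speaker; random here, learned in a real model."""
    rng = random.Random(seed)
    return {name: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for name in speakers}


def condition_on_speaker(text_frames, speaker, table):
    """Append the speaker's vector to every encoded text frame."""
    vector = table[speaker]
    return [frame + vector for frame in text_frames]


table = make_embedding_table(["GLaDOS", "Twilight Sparkle"])
frames = [[0.1, 0.2], [0.3, 0.4]]  # stand-in for text encoder output
conditioned = condition_on_speaker(frames, "GLaDOS", table)
print(len(conditioned), len(conditioned[0]))  # 2 6
```

Because every speaker shares the same downstream network and differs only in this vector, patterns learned from one voice (such as how anger affects delivery) can in principle carry over to others.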
The platform limited text input to 200 characters per generation, though users could create multiple clips for longer speech sequences.[52] The interface included technical metrics and graphs, which served to highlight the research aspect of the website. As of version v23 of 15.ai, the interface displayed comprehensive model analysis information, including word parsing results and emotional analysis data. The flow and generative adversarial network (GAN) hybrid vocoder and denoiser, introduced in an earlier version, was streamlined to remove manual parameter inputs.[8]
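The workaround of producing several clips for longer passages can be sketched as a splitter that keeps each request within the 200-character limit. This is a hypothetical helper a user might have written, not part of 15.ai; the preference for sentence boundaries and the function name are assumptions.

```python
import re
import textwrap

MAX_CHARS = 200  # per-generation limit reported for 15.ai


def split_for_generation(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters, preferring sentence boundaries."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # textwrap.wrap breaks an over-long sentence at word boundaries.
        for piece in textwrap.wrap(sentence, width=limit):
            candidate = f"{current} {piece}".strip()
            if len(candidate) <= limit:
                current = candidate
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks


script = "Hello there. " * 40
for chunk in split_for_generation(script):
    print(len(chunk), chunk[:30] + "...")
```

Each chunk would then be submitted as a separate generation and the resulting clips joined in an audio editor.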
Reception
Critical reception
Critics described 15.ai as easy to use and generally able to convincingly replicate character voices, with occasional mixed results.[53] Natalie Clayton of PC Gamer wrote that SpongeBob SquarePants' voice was replicated well, but noted challenges in mimicking the Narrator from The Stanley Parable: "the algorithm simply can't capture Kevan Brighting's whimsically droll intonation."[54] Similarly, Russian gaming website Rampaga reflected that GLaDOS performed exceptionally well since "her voice was originally created to simulate human speech by artificial intelligence," while the Narrator from The Stanley Parable was less convincing due to insufficient training data.[55] Zack Zwiezen of Kotaku reported that "[his] girlfriend was convinced it was a new voice line from GLaDOS' voice actor, Ellen McLain".[56] Calvin Rugona of gaming outlet Gamezo commented that "the ease of use of the tool is undoubtedly a major factor in its popularity as it means anyone with an internet connection can enter in their commands, request their favourite voice and then save it."[44] Taiwanese newspaper United Daily News also highlighted 15.ai's ability to recreate GLaDOS's mechanical voice, alongside its diverse range of character voice options.[57] Yahoo! News Taiwan reported that "GLaDOS in Portal can pronounce lines nearly perfectly", but also criticized that "there are still many imperfections, such as word limit and tone control, which are still a little weird in some words."[58] Chris Button of AI newsletter Byteside called the ability to clone a voice with only 15 seconds of data "freaky" but also called the tech behind it "impressive".[59] The platform's voice generation capabilities were regularly featured on Equestria Daily, a fandom news site dedicated to the show My Little Pony: Friendship Is Magic and its other generations, with documented updates, fan creations, and additions of new character voices.[60] In a post introducing new character additions to 15.ai, Equestria Daily's founder Shaun Scotellaro (also known by his online moniker "Sethisto") wrote that "some of [the voices] aren't great due to the lack of samples to draw from, but many are really impressive still anyway."[42] Chinese My Little Pony fan site EquestriaCN also documented 15.ai's development, highlighting its various updates, though they criticized some of the bugs and the long queue wait times of the application.[61]
Multiple other critics also found the word count limit, prosody options, and English-only nature of the application not entirely satisfactory.[62][58] Peter Paltridge of anime and superhero news outlet Anime Superhero News opined that "voice synthesis has evolved to the point where the more expensive efforts are nearly indistinguishable from actual human speech," but also noted that "In some ways, SAM is still more advanced than this. It was possible to affect SAM’s inflections by using special characters, as well as change his pitch at will. With 15.ai, you’re at the mercy of whatever random inflections you get."[63] Conversely, Lauren Morton of Rock, Paper, Shotgun praised the depth of pronunciation control "if you're willing to get into the nitty gritty of it".[64] Similarly, Eugenio Moto of Spanish news website Qore.com wrote that "the most experienced [users] can change parameters like the stress or the tone."[65] Takayuki Furushima of Den Fami Nico Gamer highlighted the "smooth pronunciations", and Yuki Kurosawa of AUTOMATON noted its "rich emotional expression" as a major feature; both Japanese authors noted the lack of Japanese-language support.[66][51] Renan do Prado of Brazilian gaming news outlet Arkade and José Villalobos of Spanish gaming outlet LaPS4 pointed out that while users could create amusing results in Portuguese and Spanish respectively, the generation performed best in English.[67] Chinese gaming news outlet GamerSky called the app "interesting", but also criticized the word count limit of the text and the lack of intonations.[62] South Korean video game outlet Zuntata wrote that "the surprising thing about 15.ai is that [for some characters], there's only about 30 seconds of data, but it achieves pronunciation accuracy close to 100%".[68] Machine learning professor Yongqiang Li wrote in his blog that he was surprised to see that the application was free.[69] Marco Cocomello of South African gaming and pop culture website GLITCHED remarked that despite the 200-character limitation, the results "blew [him] away" when testing the app with GLaDOS's voice.[52]
Technical publications and outlets focusing on artificial intelligence provided more in-depth analysis of 15.ai's capabilities and limitations compared to other text-to-speech technologies of the time.[70] Rionaldi Chandraseta of AI newsletter Towards Data Science observed that "characters with large training data produce more natural dialogues with clearer inflections and pauses between words, especially for longer sentences."[50] Chinese tech and AI media outlet XinZhiYuan on QQ News highlighted the technical achievement of 15.ai's high-quality output (44.1 kHz sampling rate) despite using minimal training data, remarking that this was of significantly higher quality than typical deep learning text-to-speech implementations which used 16 kHz sampling rates. The outlet also acknowledged that while some pronunciation errors occurred due to the limited training data, this was understandable given that traditional deep learning models typically required 40 or more hours of training data.[13] Similarly, Parth Mahendra of AI newsletter AI Daily observed that while the system "does a good job at accurately replicating most basic words," it struggled with more complex terms, noting that characters would "absolutely butcher the pronunciation" of certain words.[18] Augusto Dueñas of the technology blog TheLinuxCode noted how traditional text-to-speech systems had long relied on stitching together recordings, producing monotone and robotic output that lacked the fluidity of human voices. In contrast, they characterized 15.ai's ability to replicate distinctive fictional voices, "from SpongeBob's energetic squeak to GlaDOS's passive-aggressive sarcasm" as "groundbreaking."[2] Ji Yunyo of Chinese tech news website NetEase News called the technology behind 15.ai "remarkably efficient," requiring only minimal data to accurately clone numerous voices while maintaining emotional nuance and natural intonation. However, he also pointed out limitations, noting that the emotional expression was relatively "neutral" and that "extreme" emotions couldn't be properly synthesized, making it less suitable for certain applications. Ji also mentioned that while many deepfake videos required creators to extract and edit material from hours of original content for very short results, 15.ai could achieve similar or better effects with only a few dozen minutes of training data per character, though server performance issues often meant synthesis could take over a minute to complete.[71]
Reactions from voice actors of featured characters

Some voice actors whose characters appeared on 15.ai have publicly shared their thoughts about the platform. In a 2021 interview on video game voice acting podcast The VŌC, John Patrick Lowrie—who voices the Sniper in Team Fortress 2—explained that he had discovered 15.ai when a prospective intern showed him a skit she had created using AI-generated voices of the Sniper and the Spy from Team Fortress 2. Lowrie commented:
"The technology still has a long way to go before you really believe that these are just human beings, but I was impressed by how much [15.ai] could do. You certainly don't get the delivery that you get from an actual person who's analyzed the scene, [...] but I do think that as a fan source—for people wanting to put together mods and stuff like that—that it could be fun for fans to use the voice of characters they like."
He drew an analogy to synthesized music, adding:
"If you want the sound of a choir, and you want the sound of an orchestra, and you have the money, you hire a choir and an orchestra. And if you don't have the money, you have something that sounds pretty nice; but it's not the same as a choir and an orchestra."[video 1]
In a 2021 live broadcast on his Twitch channel, Nathan Vetterlein—the voice actor of the Scout from Team Fortress 2—listened to an AI recreation of his character's voice. He described the impression as "interesting" and noted that "there's some stuff in there."[video 2]
Ethical concerns
Other voice actors had mixed reactions to 15.ai's capabilities. While some industry professionals acknowledged the technical innovation, others raised concerns about the technology's implications for their profession.[72] When voice actor Troy Baker announced his partnership with Voiceverse NFT, which had misappropriated 15.ai's technology, it sparked widespread controversy within the voice acting industry.[73] Critics raised concerns about automated voice acting's potential reduction of employment opportunities for voice actors, risk of voice impersonation, and potential misuse in explicit content.[74] Ruby Innes of Kotaku Australia noted, "this practice could potentially put voice actors out of work considering you could just use their AI voice rather than getting them to voice act for a project and paying them."[75] In her coverage of the Voiceverse controversy, Edie WK of Checkpoint Gaming raised the concern that "this kind of technology has the potential to push voice actors out of work if it becomes easier and cheaper to use AI voices instead of working with the actor directly."[76]
While 15.ai limited its scope to fictional characters and did not reproduce voices of real people or celebrities,[74] computer scientist Andrew Ng noted that similar technology could be used to do so, including for nefarious purposes.[77] In his 2020 assessment of 15.ai, he wrote:
"Voice cloning could be enormously productive. In Hollywood, it could revolutionize the use of virtual actors. In cartoons and audiobooks, it could enable voice actors to participate in many more productions. In online education, kids might pay more attention to lessons delivered by the voices of favorite personalities. And how many YouTube how-to video producers would love to have a synthetic Morgan Freeman narrate their scripts?"
While discussing potential risks, he added:
"...but synthesizing a human actor's voice without consent is arguably unethical and possibly illegal. And this technology will be catnip for deepfakers, who could scrape recordings from social networks to impersonate private individuals."[77]
Legacy

15.ai was an early pioneer of audio deepfakes, leading to the emergence of AI speech synthesis-based memes during the initial stages of the AI boom in 2020.[78][79] 15.ai is credited as the first mainstream platform to popularize AI voice cloning in Internet memes and content creation,[1] particularly through its ability to generate convincing character voices in real-time without requiring extensive technical expertise.[80] The platform's impact was especially notable in fan communities, including the My Little Pony: Friendship Is Magic, Portal, Team Fortress 2, and SpongeBob SquarePants fandoms, where it enabled the creation of viral content that garnered millions of views across social media platforms like Twitter and YouTube.[81] Team Fortress 2 content creators also used the platform to produce both short-form memes and complex narrative animations using Source Filmmaker.[82] Fan creations included skits and new fan animations[83] (such as the popular Team Fortress 2 Source Filmmaker video Spy's Confession[8]), crossover content (such as Game Informer writer Liana Ruppert's demonstration combining Portal and Mass Effect dialogue in her coverage of the platform[84]), recreations of viral videos (including the infamous Big Bill Hell's Cars car dealership parody[85]), adaptations of fanfiction using AI-generated character voices (such as The Tax Breaks, a fully voiced 17-minute fan-made episode of Friendship Is Magic),[86] music videos and new musical compositions (such as the Team Fortress 2 remix Pootis Hardbass[8] and the explicit Pony Zone music video series[87]), and content where characters recited sea shanties.[88] Some fan creations gained mainstream attention, such as a viral edit replacing Donald Trump's cameo in Home Alone 2: Lost in New York with the Heavy Weapons Guy's AI-generated voice, which was featured on a daytime CNN segment in January 2021.[89][90] Some users integrated 15.ai's voice synthesis with VoiceAttack, a voice command software, to create personal assistants.[47]

Its influence has been noted in the years after it became defunct,[91] with several commercial alternatives emerging to fill the void, such as ElevenLabs[b] and Speechify.[93] Contemporary generative voice AI companies have acknowledged 15.ai's pioneering role. Y Combinator startup PlayHT called the debut of 15.ai "a breakthrough in the field of text-to-speech (TTS) and speech synthesis".[94] Cliff Weitzman, the founder and CEO of Speechify, credited 15.ai for "making AI voice cloning popular for content creation by being the first [...] to feature popular existing characters from fandoms".[95] Mati Staniszewski, co-founder and CEO of ElevenLabs, wrote that 15.ai was transformative in the field of AI text-to-speech.[96]
Prior to its shutdown, 15.ai established several technical precedents that influenced subsequent developments in AI voice synthesis. Its integration of DeepMoji for emotional analysis demonstrated the viability of incorporating sentiment-aware speech generation, while its support for ARPABET phonetic transcriptions set a standard for precise pronunciation control in public-facing voice synthesis tools.[6] The platform's unified multi-speaker model, which enabled simultaneous training of diverse character voices, proved particularly influential. This approach allowed the system to recognize emotional patterns across different voices even when certain emotions were absent from individual character training sets; for example, if one character had examples of joyful speech but no angry examples, while another had angry but no joyful samples, the system could learn to generate both emotions for both characters by understanding the common patterns of how emotions affect speech.[17]
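The cross-speaker emotion example above can be illustrated with a deliberately simplified sketch. It is not 15.ai's actual method, which relied on a neural model with learned representations; the speaker names, the single "mean pitch" feature, and all numeric values below are hypothetical and chosen only to show how an emotion effect learned from one voice can be applied to another voice that lacks that emotion in its own data.

```python
from collections import defaultdict

# Hypothetical toy data: mean pitch (Hz) for each (speaker, emotion) pair.
# Speaker "A" has no angry sample; speaker "B" has no joyful sample.
observed = {
    ("A", "neutral"): 200.0,
    ("A", "joyful"): 230.0,
    ("B", "neutral"): 120.0,
    ("B", "angry"): 170.0,
}


def emotion_offsets(data):
    """Average shift of each emotion relative to the same speaker's neutral sample."""
    shifts = defaultdict(list)
    for (speaker, emotion), value in data.items():
        if emotion != "neutral" and (speaker, "neutral") in data:
            shifts[emotion].append(value - data[(speaker, "neutral")])
    return {emotion: sum(vals) / len(vals) for emotion, vals in shifts.items()}


def predict(data, speaker, emotion):
    """Estimate a (speaker, emotion) pair, reusing offsets shared across speakers."""
    if (speaker, emotion) in data:
        return data[(speaker, emotion)]
    offsets = emotion_offsets(data)
    return data[(speaker, "neutral")] + offsets[emotion]


# Unseen combinations are filled in from the other speaker's emotional samples.
angry_a = predict(observed, "A", "angry")
joyful_b = predict(observed, "B", "joyful")
```

In this toy setting the "angry" shift is estimated only from speaker B and the "joyful" shift only from speaker A, yet both can be applied to either voice. A jointly trained multi-speaker model would capture far richer patterns than a single additive offset, so the sketch should be read as an analogy rather than a description of the system.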
15.ai also made a key contribution to reducing training data requirements for speech synthesis. Earlier systems like Google AI's Tacotron and Microsoft Research's FastSpeech required tens of hours of audio to produce acceptable results and failed to generate intelligible speech with less than 24 minutes of training data.[3][97] In contrast, 15.ai demonstrated the ability to generate speech with substantially less training data: the name "15.ai" refers to the creator's claim that a voice could be cloned with just 15 seconds of data.[98] This data efficiency influenced later developments in AI voice synthesis, and the 15-second benchmark became a reference point for subsequent systems. The original claim that only 15 seconds of data is required to clone a human's voice was corroborated by OpenAI in 2024.[99]
See also
- AI boom
- Character.ai
- Deepfake
- Ethics of artificial intelligence
- WaveNet
- My Little Pony: Friendship Is Magic fandom
Explanatory footnotes
- ^ The term "faster than real-time" in speech synthesis means that the system can generate audio more quickly than the actual duration of the speech—for example, generating 10 seconds of speech in less than 10 seconds would be considered faster than real-time.
- ^ ElevenLabs uses "11.ai" as a legal byname for its web domain.[92]
References
Notes
- ^ a b Speechify 2024; Temitope 2024; Anirudh VK 2023; Wright 2023; Abisola 2025.
- ^ a b Dueñas 2023.
- ^ a b Google 2018
- ^ Kong 2020.
- ^ Kim 2020.
- ^ a b c d e f g h i j k l m Temitope 2024.
- ^ Hacker News 2022
- ^ a b c d e f g Abisola 2025.
- ^ Chandraseta 2021; Temitope 2024.
- ^ Chandraseta 2021; Li 2021; Abisola 2025.
- ^ Chandraseta 2021; Phillips 2022b; Temitope 2024; Abisola 2025.
- ^ QQ 2020; Ng 2020.
- ^ a b c QQ 2020.
- ^ Kurosawa 2021; Beckwith 2022.
- ^ "About". 15.ai (Official website). March 2, 2020. Archived from the original on March 3, 2020. Retrieved December 23, 2024.
- ^ Scotellaro 2020a; Scotellaro 2020b; www.equestriacn.com 2021.
- ^ a b c d Kurosawa 2021; Temitope 2024.
- ^ a b c Mahendra 2020.
- ^ Zwiezen 2021; Clayton 2021; Ruppert 2021; Morton 2021; Kurosawa 2021; Yoshiyuki 2021; Rugona 2021; Abisola 2025.
- ^ Lawrence 2022; Williams 2022; Wright 2022; Carcasole 2022; Innes 2022; Toh 2022; Khavanekar 2022; Beckwith 2022; LevelUp 2022; Muropaketti 2022; Anikó 2022; Groth-Anderson 2022; Aktaş 2022; Cabibi-Wilkin 2022; Temitope 2024.
- ^ Lopez 2022; Innes 2022; Anikó 2022.
- ^ Phillips 2022b; Lopez 2022; Carcasole 2022; Innes 2022; Toh 2022; Khavanekar 2022; Kar 2022; W-K 2022; Muropaketti 2022; Anikó 2022; Aktaş 2022.
- ^ Toh 2022; Kar 2022; Anikó 2022.
- ^ Коэн 2022; Aktaş 2022.
- ^ Wright 2022; Phillips 2022b; Коэн 2022; Groth-Anderson 2022.
- ^ Beckwith 2022.
- ^ W-K 2022; White 2022.
- ^ Innes 2022; Kar 2022; W-K 2022.
- ^ Lawrence 2022; Williams 2022; White 2022; Carcasole 2022.
- ^ Lawrence 2022; Williams 2022; White 2022.
- ^ Коэн 2022.
- ^ Lam 2022.
- ^ Cabibi-Wilkin 2022; White 2022.
- ^ White 2022.
- ^ Khibchenko 2022.
- ^ Paul Lehrman and Linnea Sage, et al. v. LOVO, Inc., No. 1:24-cv-03770, 38 (S.D.N.Y. 2024) ("Separately, VoiceVerse has already been found to have stolen technology from another company. See Ule Lopez, WCCF Tech, “Voiceverse NFT Service Reportedly Uses Stolen Technology from 15ai,” (Jan. 16, 2022), https://wccftech.com/voiceverse-nft-service-usesstolen-technology-from-15ai/.").
- ^ ElevenLabs 2024a; Play.ht 2024.
- ^ Rugona 2021; Williams 2022.
- ^ Rugona 2021; Phillips 2022b.
- ^ Chandraseta 2021; Menor 2024.
- ^ Zwiezen 2021; Clayton 2021; Morton 2021; Ruppert 2021; Villalobos 2021; Yoshiyuki 2021; Kurosawa 2021; Cocomello 2021; Rugona 2021.
- ^ a b Scotellaro 2020b.
- ^ Morton 2021; 遊戲 2021; Rugona 2021.
- ^ a b Rugona 2021.
- ^ QQ 2020; Chandraseta 2021.
- ^ Chandraseta 2021; Ji 2021.
- ^ a b Yoshiyuki 2021.
- ^ Kurosawa 2021; Chandraseta 2021.
- ^ Knight 2017.
- ^ a b c Chandraseta 2021.
- ^ a b c d Kurosawa 2021.
- ^ a b Cocomello 2021.
- ^ Clayton 2021; Ruppert 2021; Moto 2021; Scotellaro 2020c; Villalobos 2021; Cocomello 2021; Rugona 2021.
- ^ Clayton 2021.
- ^ Oxton 2021.
- ^ Zwiezen 2021.
- ^ 遊戲 2021.
- ^ a b MrSun 2021.
- ^ Button 2021.
- ^ Scotellaro 2020a; Scotellaro 2020b; Scotellaro 2020c; Scotellaro 2020d; Scotellaro 2020e; Scotellaro 2020f.
- ^ www.equestriacn.com 2021.
- ^ a b GamerSky 2021.
- ^ Paltridge 2021.
- ^ Morton 2021.
- ^ Moto 2021.
- ^ Yoshiyuki 2021: 日本語入力には対応していないが、ローマ字入力でもなんとなくそれっぽい発音になる。; 15.aiはテキスト読み上げサービスだが、特筆すべきはそのなめらかな発音と、ゲームに登場するキャラクター音声を再現している点だ。 (transl. It does not support Japanese input, but even if you input using romaji, it will somehow give you a similar pronunciation.; 15.ai is a text-to-speech service, but what makes it particularly noteworthy is its smooth pronunciation and the fact that it reproduces the voices of characters that appear in games.)
- ^ do Prado 2021; Villalobos 2021.
- ^ zuntata.tistory.com 2021.
- ^ Li 2021.
- ^ QQ 2020; Mahendra 2020; Dueñas 2023; Ji 2021.
- ^ Ji 2021.
- ^ Innes 2022; Kar 2022; Temitope 2024.
- ^ Lawrence 2022; Wright 2022.
- ^ a b Menor 2024.
- ^ Innes 2022.
- ^ W-K 2022.
- ^ a b Ng 2020.
- ^ MrSun 2021: 大家是否都曾經想像過,假如能讓自己喜歡的遊戲或是動畫角色說出自己想聽的話,不論是名字、惡搞或是經典名言,都是不少人的夢想吧。不過來到 2021 年,現在這種夢想不再是想想而已,因為有一個網站通過 AI 生成的技術,讓大家可以讓不少遊戲或是動畫角色,說出任何你想要他們講出的東西,而且相似度與音調都有相當高的準確度 (transl. Have you ever imagined what it would be like if your favorite game or anime characters could say exactly what you want to hear? Whether it's names, parodies, or classic quotes, this is a dream for many. However, as we enter 2021, this dream is no longer just a fantasy, because there is a website that uses AI-generated technology, allowing users to make various game and anime characters say anything they want with impressive accuracy in both similarity and tone).
- ^ LOVO 2021; Anirudh VK 2023.
- ^ Ruppert 2021; Morton 2021.
- ^ Scotellaro 2020c; 遊戲 2021; Kurosawa 2021; Morton 2021; Temitope 2024.
- ^ Clayton 2021; Zwiezen 2021; Morton 2021.
- ^ Morton 2021; Kurosawa 2021.
- ^ Ruppert 2021.
- ^ Zwiezen 2021; Morton 2021.
- ^ Scotellaro 2020d; Abisola 2025.
- ^ Scotellaro 2020e; Abisola 2025.
- ^ Zwiezen 2021; Ruppert 2021.
- ^ Clayton 2021; CNN 2021.
- ^ Actor applauds editing out Trump's 'Home Alone 2' cameo (Report). CNN. January 15, 2021. Retrieved January 21, 2025.
- ^ Wright 2023.
- ^ ElevenLabs 2024b.
- ^ ElevenLabs 2024a; Play.ht 2024; Speechify 2024.
- ^ Play.ht 2024.
- ^ Speechify 2024.
- ^ ElevenLabs 2024a.
- ^ Ren 2019.
- ^ Chandraseta 2021; Button 2021; Temitope 2024.
- ^ OpenAI 2024; Temitope 2024.
Tweets
- ^ @fifteenai (December 7, 2024). "The past and future of 15.ai" (Tweet). Archived from the original on December 8, 2024. Retrieved December 19, 2024 – via Twitter.
- ^ Baker, Troy [@TroyBakerVA] (January 14, 2022). "I'm partnering with @VoiceverseNFT to explore ways where together we might bring new tools to new creators to make new things, and allow everyone a chance to own & invest in the IP's they create. We all have a story to tell. You can hate. Or you can create. What'll it be? https://t.co/cfDGi4q0AZ" (Tweet). Archived from the original on September 16, 2022. Retrieved December 7, 2022 – via Twitter.
Videos
- ^ The VŌC Podcast // John Patrick Lowrie & Ellen McLain Interview (The voices of GLaDOS and Sniper) (Podcast). The VŌC Podcast. April 11, 2021. Retrieved January 15, 2025.
- ^ "Nate listens to his AI self". Twitch. Retrieved January 21, 2025.
Works cited
- Abisola, Shojobi (January 3, 2025). "The MIT Project That Paved Way For Modern Voice AI". Independent. Archived from the original on February 27, 2025. Retrieved February 27, 2025.
- Aktaş, Utku (January 19, 2022). "Troy Baker-backed NFT firm admitted using voice lines from another service without permission". Mobidictum. Archived from the original on June 14, 2024. Retrieved March 1, 2025.
- Anikó, Angyal (January 17, 2022). "Voiceverse NFT Uses Stolen Technology!". theGeek. Archived from the original on September 28, 2023. Retrieved March 1, 2025.
- Beckwith, Michael (January 17, 2022). "NFT firm Voiceverse admits it stole work after announcing Troy Baker deal". Metro. Retrieved February 28, 2025.
- Button, Chris (January 19, 2021). "Make GLaDOS, SpongeBob and other friends say what you want with this AI text-to-speech tool". Byteside. Archived from the original on June 25, 2024. Retrieved December 18, 2024.
- Cabibi-Wilkin, Lily (January 26, 2022). "NFTs Are Bad. So Why Do People Keep Making Them?" (PDF). The Herald. Jonesboro, Arkansas: Arkansas State University. p. 2A. Archived (PDF) from the original on July 6, 2024. Retrieved March 4, 2025.
Baker announced a collaboration with VoiceVerse, which promised an AI-based voiceover and text-to-speech platform using celebrity voices. Purchasing their NFTs with Ethereum "provides you an ownership to a unique voice in the Metaverse." After a wave of backlash (especially after the "demo" VoiceVerse shared on their Twitter turned out to be made with 15.ai, an already existing AI text-to-speech program which has no connection to Voiceverse), Baker stated that he had "a lot to think about" and encouraged his followers to "resume the conversation."
- Carcasole, David (January 17, 2022). "Troy Baker's NFT Partner Company Caught Claiming Voice Lines From Another Service As Their Own". PlayStation Universe. Archived from the original on October 6, 2022. Retrieved February 28, 2025.
- Chandraseta, Rionaldi (January 21, 2021). "Generate Your Favourite Characters' Voice Lines using Machine Learning". Towards Data Science. Archived from the original on January 21, 2021. Retrieved December 18, 2024.
- Clayton, Natalie (January 19, 2021). "Make the cast of TF2 recite old memes with this AI text-to-speech tool". PC Gamer. Archived from the original on January 19, 2021. Retrieved December 18, 2024.
- Cocomello, Marco (January 20, 2021). "Make Portal's GLaDOS and Other Characters Say Whatever You Want With This New App". GLITCHED. Retrieved March 10, 2025.
The full list of 15.ai character is quite large. Not only can you replicate something from Portal but there's SpongeBob Squarepants, My Little Pony, Daria and many more. Using the app is also fairly simple. You select the source (the game or series you want) and then the character from the source who you want to replicate. You can then enter anything you wish under 200 characters in the dialogue box and a few seconds later, thanks to advanced algorithms and stuff I don't understand, the voice line will be ready. Users can then download the sound snippet for safe-keeping.
- "CNN Newsroom". CNN. January 15, 2021.
- Dueñas, Augusto (December 27, 2023). "Demystifying 15.ai: How AI Generates Ultra-Realistic Text-to-Speech Voices". TheLinuxCode. Retrieved February 22, 2025.
- do Prado, Renan (January 19, 2021). "Faça GLaDOS, Bob Esponja e outros personagens falarem textos escritos por você!" [Make GLaDOS, SpongeBob and other characters speak texts written by you!]. Arkade (in Brazilian Portuguese). Archived from the original on August 19, 2022. Retrieved December 22, 2024.
Obviamente o programa funciona no idioma inglês, mas dá pra gerar umas frases bem emboladas e engraças em português, estilo aqueles memes usando vozes em outros idiomas falando em português.
[Obviously, the program works in English, but you can generate some really confusing and funny sentences in Portuguese, like those memes using voices in other languages speaking Portuguese.]
- Staniszewski, Mati (2024a). "15.AI: Everything You Need to Know & Best Alternatives". ElevenLabs (Official website). Archived from the original on December 25, 2024. Retrieved December 18, 2024.
Combining speech synthesis with machine learning, deep learning, deep neural networks, and audio synthesis algorithms, 15.ai transformed how users created different voices with AI text.
- "Can I publish the content I generate on the platform?". ElevenLabs (Official website). 2024b. Archived from the original on December 23, 2024. Retrieved December 23, 2024.
- "15.ai已经重新上线,版本更新至v23" [15.ai has been re-launched, version updated to v23]. EquestriaCN (in Chinese). October 1, 2021. Archived from the original on May 19, 2024. Retrieved December 22, 2024.
目前重新上线的网站仍然存在一些bug,且排队较长,需要一定的耐心等待与重试。
[Currently, the relaunched website still has some bugs and the queue is long, so you need to be patient and wait and try again.]
- Feng, Bai (March 15, 2020). "模型参数过亿跑不动?看MIT小哥,少量数据完成高质量文本转语音!" [Model has over 100 million parameters and won't run? Check out this MIT guy who achieves high-quality text-to-speech with minimal data!]. QQ News (in Chinese). Archived from the original on February 27, 2025. Retrieved February 22, 2025.
- "这个网站可用AI生成语音 让ACG角色"说"出你输入的文本" [This Website Can Use AI to Generate Voice, Making ACG Characters "Say" the Text You Input]. GamerSky (in Chinese). January 18, 2021. Archived from the original on December 11, 2024. Retrieved December 18, 2024.
笔者就尝试了一下让紫悦咏唱"无限剑制"的实验,虽然AI的声音缺少了些抑扬顿挫,不过效果也还算有趣。
[Although the AI's voice lacks some intonation, the effect is still interesting. Currently, 15.ai provides relatively few character options. Due to the word limit of the text, the generated voice is relatively short.]
- "Audio samples from "Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis"". August 30, 2018. Archived from the original on November 11, 2020. Retrieved June 5, 2022.
- Groth-Anderson, Magnus (January 19, 2022). "Troy Baker-støttet NFT-virksomhed indrømmer at have stjålet indhold" [Troy Baker-backed NFT company admits to stealing content]. Gamereactor (in Danish). Gamereactor. Archived from the original on March 1, 2025. Retrieved March 1, 2025.
Og til det svarede 15.ai: 'Go f*ck yourself'.
[And to that, 15.ai responded: 'Go f*ck yourself.']
- "15.ai". Hacker News. June 12, 2022. Retrieved December 29, 2024.
- Innes, Ruby (January 18, 2022). "Voiceverse Is The Latest NFT Company Caught Using Someone Else's Content". Kotaku Australia. Archived from the original on July 26, 2024. Retrieved February 28, 2025.
Another criticism was that this practice could potentially put voice actors out of work considering you could just use their AI voice rather than getting them to voice act for a project and paying them.
- Ji, Yunyo (January 19, 2021). "这个国外的语音合成网站,可以让玩家操控二次元角色说话" [This foreign speech synthesis website allows players to control the two-dimensional characters to speak]. 163.com (in Chinese). NetEase News. Archived from the original on February 27, 2025. Retrieved February 26, 2025.
- Kar, Jess (January 18, 2022). "Troy Baker's Partner Voice NFT Platform Voiceverse Admits to Stealing Audio". NFTGators. Archived from the original on September 15, 2024. Retrieved February 28, 2025.
- Khavanekar, Aditya (January 18, 2022). "The NFT company that Troy Baker markets took audio clips from another service". GamingSym.in. Archived from the original on January 19, 2022. Retrieved February 28, 2025.
- Khibchenko, Pavel (February 15, 2022). "NFT в геймдеве: проблемы регулирования, гнев игроков и поспешные решения разработчиков" [NFTs in game development: regulatory problems, player anger, and developers' hasty decisions]. Skillbox (in Russian). Archived from the original on June 5, 2023. Retrieved February 28, 2025.
- Kim, Jaehyeon (2020). "Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search". arXiv:2005.11129 [eess.AS].
- Knight, Will (August 3, 2017). "An Algorithm Trained on Emoji Knows When You're Being Sarcastic on Twitter". MIT Technology Review. Archived from the original on June 2, 2022. Retrieved December 18, 2024.
- Kong, Jungil (2020). "HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis". arXiv:2010.05646 [cs.SD].
- Kurosawa, Yuki (2021). "ゲームキャラ音声読み上げソフト「15.ai」公開中。『Undertale』や『Portal』のキャラに好きなセリフを言ってもらえる" [Game Character Voice Reading Software "15.ai" Now Available. Get Characters from Undertale and Portal to Say Your Desired Lines]. AUTOMATON (in Japanese). Archived from the original on January 19, 2021. Retrieved December 18, 2024.
英語版ボイスのみなので注意。;もうひとつ15.aiの大きな特徴として挙げられるのが、豊かな感情表現だ。
[Please note that only English voices are available.;Another major feature of 15.ai is its rich emotional expression.]
- Lam, Khoa (January 14, 2022). "Incident 277: Voices Created Using Publicly Available App Stolen and Resold as NFT without Attribution". AI Incident Database. Archived from the original on January 13, 2025. Retrieved February 27, 2025.
An AI-synthetic audio sold as an NFT on Voiceverse's platform was acknowledged by the company for having been created by 15.ai, a free web app specializing in text-to-speech and AI-voice generation, and reused without proper attribution.
- Lawrence, Briana (January 19, 2022). "Shonen Jump Scare Leads to Company Reassuring Fans That They Aren't Getting Into NFTs". The Mary Sue. Archived from the original on January 13, 2025. Retrieved December 23, 2024.
- "The disappointment, brother! NFT project supported by Troy Baker used third-party technology". LevelUp.com. LevelUp. January 18, 2022. Archived from the original on May 21, 2022. Retrieved February 28, 2025.
- Li, Yongqiang (2021). "语音开源项目优选:免费配音网站15.ai" [Voice Open Source Project Selection: Free Voice Acting Website 15.ai]. Zhihu (in Chinese). Archived from the original on December 19, 2024. Retrieved December 18, 2024.
- Lopez, Ule (January 16, 2022). "Voiceverse NFT Service Reportedly Uses Stolen Technology from 15ai [UPDATE]". Wccftech. Archived from the original on January 16, 2022. Retrieved June 7, 2022.
- LOVO.AI. "The Road to General Intelligence in Synthetic Speech (Part 1)". Archived from the original on September 10, 2024. Retrieved March 1, 2025.
And of course, there is 15.ai providing the ability to create voice memes using internet's favorite cartoon and game characters.
- Mahendra, Parth (May 11, 2020). "Spongebob Can Now Narrate Your Writing". AI Daily. Archived from the original on July 1, 2021. Retrieved March 4, 2025.
- Menor, Deion (November 7, 2024). "15.ai – Natural and Emotional Text-to-Speech Using Neural Networks". HashDork. Archived from the original on February 23, 2025. Retrieved January 3, 2025.
- Morton, Lauren (January 18, 2021). "Put words in game characters' mouths with this fascinating text to speech tool". Rock, Paper, Shotgun. Archived from the original on January 18, 2021. Retrieved December 18, 2024.
- Moto, Eugenio (January 20, 2021). "15.ai, el sitio que te permite usar voces de personajes populares para que digan lo que quieras" [15.ai, the site that lets you use the voices of popular characters to say whatever you want]. Qore (in Spanish). Archived from the original on December 28, 2024. Retrieved December 21, 2024.
Incluso, los más clavados pueden cambiar algunos parámetros como la intencionalidad o el tono.
[The most experienced can change some parameters such as the intention or the tone.]
- MrSun (January 19, 2021). "讓你喜愛的ACG角色說出任何話! AI生成技術幫助你實現夢想" [Let your favorite ACG characters say anything! AI generation technology helps you realize your dreams]. Yahoo (in Chinese). Archived from the original on December 28, 2024. Retrieved December 22, 2024.
- "Troy Bakerin tukema NFT-yhtiö kärähti – Kaupitteli ääninäyttelyä luvatta" [Troy Baker-backed NFT company caught out – peddled voice acting without permission]. Muropaketti (in Finnish). January 17, 2022. Archived from the original on May 25, 2022. Retrieved March 1, 2025.
- Ng, Andrew (April 1, 2020). "Voice Cloning for the Masses". DeepLearning.AI. Archived from the original on December 28, 2024. Retrieved December 22, 2024.
- "Navigating the Challenges and Opportunities of Synthetic Voices". OpenAI. March 9, 2024. Archived from the original on November 25, 2024. Retrieved December 18, 2024.
- Oxton (January 20, 2021). "При помощи AI можно озвучить набранный текст голосами GlaDOS, Мисс Полинг и других персонажей игр" [Using AI, you can voice typed text with the voices of GlaDOS, Miss Pauling and other game characters]. Rampaga (in Russian). Archived from the original on February 26, 2021. Retrieved March 22, 2025.
Некоторые персонажи при этом звучат лучше других. Лучше всех с задачей справляется GLaDOS, так как её голос изначально создавался с целью симуляции человеческой речи искусственным интеллектом. Увы, но всех диалогов Рассказчика из The Stanley Parable недостаточно, чтобы AI научился подражать причудливой интонации Кевана Брайтинга.
[Some characters sound better than others. GLaDOS is the best at this task, as her voice was originally created with the goal of simulating human speech by an artificial intelligence. Unfortunately, all the dialogue from The Stanley Parable's Narrator isn't enough for the AI to learn to imitate Kevan Brighting's quirky intonation.]
- Ruppert, Liana (January 18, 2021). "Make Portal's GLaDOS And Other Beloved Characters Say The Weirdest Things With This App". Game Informer. Archived from the original on January 18, 2021. Retrieved December 18, 2024.
- Paltridge, Peter (January 18, 2021). "This Website Will Say Whatever You Type In Spongebob's Voice". Anime Superhero News. Archived from the original on October 17, 2021. Retrieved December 22, 2024.
In some ways, SAM is still more advanced than this. It was possible to affect SAM's inflections by using special characters, as well as change his pitch at will. With 15.ai, you're at the mercy of whatever random inflections you get. It doesn't seem to know what to do with question marks (SAM did) and delivers each line as a statement.
- Phillips, Tom (January 17, 2022). "Troy Baker-backed NFT firm admits using voice lines taken from another service without permission". Eurogamer. Archived from the original on January 17, 2022. Retrieved December 31, 2024.
- "Everything You Need to Know About 15.ai: The AI Voice Generator". Play.ht. September 12, 2024. Archived from the original on December 25, 2024. Retrieved December 18, 2024.
15.ai was a breakthrough in the field of text-to-speech (TTS) and speech synthesis, offering high-quality and emotive voices that captivated users across various platforms, especially content creators.
- Rugona, Calvin (January 22, 2021). "Make GLaDos And Other Characters Say What You Want". Gamezo. Archived from the original on March 21, 2025. Retrieved March 22, 2025.
- Ren, Yi (2019). "FastSpeech: Fast, Robust and Controllable Text to Speech". arXiv:1905.09263 [cs.CL].
- "Free 15.ai Character Voice Cloning and Alternatives". Resemble.ai. October 17, 2024. Retrieved December 31, 2024.
- Scotellaro, Shaun (2020a). "Rainbow Dash Voice Added to 15.ai". Equestria Daily. Archived from the original on December 1, 2024. Retrieved December 18, 2024.
- Scotellaro, Shaun (2020b). "15.ai Adds Tons of New Pony Voices". Equestria Daily. Archived from the original on December 26, 2024. Retrieved December 21, 2024.
- Scotellaro, Shaun (2020c). "Neat "Pony Preservation Project" Using Neural Networks to Create Pony Voices". Equestria Daily. Archived from the original on June 23, 2021. Retrieved December 18, 2024.
- Scotellaro, Shaun (2020d). "Full Simple Animated Episode - The Tax Breaks (Twilight)". Equestria Daily. Retrieved January 1, 2025.
15.ai has come a long way since we first posted it years ago. With that supreme technology and some clever editing, someone named Clipper Anon over on Youtube has put together an entire fully voiced 17 minute episode,
- Scotellaro, Shaun (2020e). "More Pony Music! We Shine Brighter Together!". Equestria Daily. Retrieved January 1, 2025.
- Scotellaro, Shaun (2020f). "New Among Us Animation Goes Viral... With Pony Voices". Equestria Daily. Retrieved January 1, 2025.
- Temitope, Yusuf (December 10, 2024). "15.ai Creator reveals journey from MIT Project to internet phenomenon". The Guardian. Archived from the original on December 28, 2024. Retrieved December 25, 2024.
- "게임 캐릭터 음성으로 영어를 읽어주는 소프트 15.ai 공개" [Software 15.ai Released That Reads English in Game Character Voices]. Tistory (in Korean). January 20, 2021. Archived from the original on December 20, 2024. Retrieved December 18, 2024.
- Toh, Brandon (January 18, 2022). "Troy Baker's NFT Partner Company Voiceverse Caught Using Voice Lines From Another Service Without Permission". Geek Culture. Archived from the original on November 30, 2022. Retrieved February 28, 2025.
- 遊戲, 遊戲角落 (January 20, 2021). "這個AI語音可以模仿《傳送門》GLaDOS講出任何對白!連《Undertale》都可以學" [This AI Voice Can Imitate Portal's GLaDOS Saying Any Dialog! It Can Even Learn Undertale]. United Daily News (in Chinese (Taiwan)). Archived from the original on December 19, 2024. Retrieved December 18, 2024.
- Villalobos, José (January 18, 2021). "Descubre 15.AI, un sitio web en el que podrás hacer que GlaDOS diga lo que quieras" [Discover 15.AI, a Website Where You Can Make GlaDOS Say What You Want]. LaPS4 (in Spanish). Archived from the original on January 18, 2021. Retrieved January 18, 2021.
La dirección es 15.AI y funciona tan fácil como parece.
[The address is 15.AI and it works as easy as it looks.]
- Anirudh VK (March 18, 2023). "Deepfakes Are Elevating Meme Culture, But At What Cost?". Analytics India Magazine. Archived from the original on December 26, 2024. Retrieved December 18, 2024.
While AI voice memes have been around in some form since '15.ai' launched in 2020, [...]
- W-K, Edie (January 15, 2022). "Troy Baker angers the internet with NFT partnership". Checkpoint Gaming. Archived from the original on December 12, 2024. Retrieved February 28, 2025.
The concern here is that this kind of technology has the potential to push voice actors out of work if it becomes easier and cheaper to use AI voices instead of working with the actor directly. With the addition of NFTs into this mix, which is an environment ripe for scams, art theft, and fraud, many are questioning Baker's association with the project. As perhaps the highest-profile voice actor in gaming, it seems like a bit of a self-own.
- Weitzman, Cliff (November 19, 2023). "15.ai: All about 15.ai and the best alternative". Speechify. Archived from the original on December 25, 2024. Retrieved December 31, 2024.
- White, Molly (January 14, 2022). "Voice actor Troy Baker announces his involvement in "voice NFT" project Voiceverse with an antagonistic tweet, shortly before it's revealed that the project stole work". Web3 Is Going Just Great. Archived from the original on July 24, 2024. Retrieved February 28, 2025.
Voiceverse is pretty vague as to what it's actually offering, but it has something to do "provid[ing] you an ownership to a unique voice in the Metaverse". Baker's announcement tweet ended, "You can hate. Or you can create. What'll it be?", which didn't seem to help with the already-negative reaction to the idea. Things were further soured when it was revealed that Voiceverse had stolen work without crediting it from a computer-generated voice project called 15.ai.
- Williams, Demi (January 18, 2022). "Voiceverse NFT admits to taking voice lines from non-commercial service". NME. Archived from the original on January 18, 2022. Retrieved December 18, 2024.
- Wright, Steve (January 17, 2022). "Troy Baker-backed NFT company admits to using content without permission". Stevivor. Archived from the original on January 17, 2022. Retrieved December 18, 2024.
- Wright, Steven (March 21, 2023). "Why Biden, Trump, and Obama Arguing Over Video Games Is YouTube's New Obsession". Inverse. Archived from the original on December 20, 2024. Retrieved December 18, 2024.
AI voice tools used to create "audio deepfakes" have existed for years in one form or another, with 15.ai being a notable example.
- Yoshiyuki, Furushima (January 18, 2021). "『Portal』のGLaDOSや『UNDERTALE』のサンズがテキストを読み上げてくれる。文章に込められた感情まで再現することを目指すサービス「15.ai」が話題に" [Portal's GLaDOS and UNDERTALE's Sans Will Read Text for You. "15.ai" Service Aims to Reproduce Even the Emotions in Text, Becomes Topic of Discussion]. Den Fami Nico Gamer (in Japanese). Archived from the original on January 18, 2021. Retrieved December 18, 2024.
- Zwiezen, Zack (January 18, 2021). "Website Lets You Make GLaDOS Say Whatever You Want". Kotaku. Archived from the original on January 17, 2021. Retrieved December 18, 2024.
- Коэн (January 15, 2022). "Создателей голосовых NFT, поддерживаемых Троем Бейкером, обвинили в воровстве голоса" [Creators of Troy Baker-backed voice NFTs accused of voice theft]. Shazoo (in Russian). Archived from the original on January 23, 2022. Retrieved February 28, 2025.