I cloned my voice to make my audiobook with AI. It’s a controversial choice, and here’s why I made it.
I love audiobooks. I love listening to something engaging and learning while I clean, exercise, drive, or sit and cry in my driveway. Audiobooks are a fast-growing way of consuming long-form content, an important accessibility feature, and, I think, a good fit for non-fiction that isn’t overly technical, like Amplify Good Work.
It’s also really easy to make a bad audiobook with a dry narrator or poor audio quality, resulting in negative reviews, costly returns, and a negative perception that bleeds over to the book in general.
Committed to making a decent audio version, I set about researching my options.
Audiobook narration options.
Realistically, you have 3, maybe 3 and a half options:
professional narrator,
narrate it yourself, or
AI narration.
With each option, there are questions of cost (in dollars and time), quality, distribution, and ethics.
Let’s start with distribution.
There are 3 really big players in the audiobook distribution space.
In the red corner, we have the old standby, the behemoth: Audible. It is still the largest distributor of audiobooks and is supported by the discoverability machine that is Amazon.com. Like it or not, you must contend with Audible. They also have very restrictive policies, including a few that border on anti-competitive practices, which we will get to in a minute. Audible subscribers get about one book per month included in their subscription (through a credit system), or you can purchase an audiobook outright, usually for $10-20. Audible also distributes audiobooks for iTunes.
In the blue corner, we have Spotify: the young upstart. Offering audiobooks since 2023, Spotify is quickly gaining market share and is the primary driver of the growth in audiobooks over the last 2 years. Spotify offers a limited amount of audiobook listening to all Spotify Premium subscribers, enough to get through at least one book a month in my experience. You can purchase additional audiobook time on top of that. Spotify is almost single-handedly supporting my participation in book club these days.
In the middle, we have Libby. Also known as the library! Libby is free with a library card, and you can get as many audiobooks as you’d like, as far as I can tell. You do often have to wait for popular titles, because your library has a limited number of licenses for each book. But borrowing from the library supports the library system, and it’s free. And I mean, come on. It’s the library! I have to be in the library.
Now, quality
Let’s get it out of the way right now: I am not a voice actor. Self-narrated audiobooks range widely in quality. Memoirs by actors, comedians, and other public figures are often much better than a hired narrator without the life experience, but honestly, a lot of untrained authors are monotone and difficult to listen to. I am not kidding myself here: I’m not on the high end of this scale.
Hired narrators offer high quality, and they often do the audiobook production themselves. They have studios or other setups to minimize extraneous noise, and professional-quality equipment. This is the gold standard of options for people who aren’t great narrators themselves. There are bad, inexperienced, and poorly matched professional narrators out there, so this does take some discretion: you may have to pay a little more for a narrator who can read your book well.
Modern AI voices have improved A LOT from monotone screen readers. They are still no match for a great human narrator, especially when it comes to fiction or narrative nonfiction, where emotions are part of the job.
And cost
Here’s where it gets tough.
Narrating your own audiobook is free. Kind of. It’s free if your time is value-less to you. And you already have a nice microphone. And a suitable place to record. And you know how to use audio editing software, both to trim out your mistakes and to meet Audible’s technical requirements, which, frankly, I don’t understand. And if you don’t have to hire an acting or voice coach to do a decent job. OK, maybe not all that free.
AI narration is the cheapest option overall. I paid less than $200 for the generation “credits” on ElevenLabs. It does take a surprising amount of time: I spent probably 50-60 hours of dedicated attention listening through and correcting the recording. (I’ll create another post about the experience of doing this! It is very, very tedious.) Still, that is much less time than it would have taken to read all the words in order, (learn how to &) edit, and (learn how to &) master the files myself.
There’s a catch to this option, though! Audible will not let you use ElevenLabs files. They have their own special AI narration service, called Audible Virtual Voice (AVV). Don’t worry, you can’t use AVV files on Spotify or Libby either, so if you want to do AI narration and be available everywhere, you have to generate and edit AI audio files twice. Ugh.
Hiring a professional narrator outright costs a few thousand dollars. I priced it out for AGW and figured I’d spend about $4,000. At a 40% royalty, this would require about 2000 sales at full price to recover the cost. But wait! To get that royalty rate, you have to agree not to sell your audiobook anywhere else. If you want to distribute to Spotify and Libby, you are dropping from a 40% royalty to 25%. Not to mention, you are not getting 1000 or even 100 audiobook sales without a LOT of money invested in advertising. Once you add that advertising spend to the cost, we are looking at 4k to 5k sales on audio alone just to break even.
Hiring a professional narrator outright is simply not a cost-effective option for a new author without an existing fan base that will buy the audiobook. You may still want to do it because of a personal commitment or because you are very optimistic about your sales, but to me, it just doesn’t make sense.
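For anyone who wants to plug in their own numbers, here is a minimal Python sketch of the break-even math above. The $5 earnings base per sale is an assumption, back-calculated from the figures in this post ($4,000 recovered over about 2,000 sales at a 40% royalty), and the function and variable names are only illustrative. Real royalty statements may be calculated differently.

```python
import math


def break_even_sales(upfront_cost: float, price: float, royalty_percent: float) -> int:
    """Return the number of sales needed to recover an upfront cost."""
    earnings_per_sale = price * royalty_percent / 100
    if earnings_per_sale <= 0:
        raise ValueError("price and royalty_percent must both be positive")
    # Round up: a fraction of a sale does not pay the bill.
    return math.ceil(upfront_cost / earnings_per_sale)


NARRATOR_COST = 4000   # my estimate for hiring a narrator, from this post
ASSUMED_PRICE = 5.00   # assumption: implied by 2,000 sales at 40% covering $4,000

# Exclusive distribution (40% royalty) vs. distributing everywhere (25% royalty)
exclusive = break_even_sales(NARRATOR_COST, ASSUMED_PRICE, 40)
non_exclusive = break_even_sales(NARRATOR_COST, ASSUMED_PRICE, 25)

print(exclusive)       # 2000
print(non_exclusive)   # 3200
```

Under these assumptions, giving up exclusivity raises the break-even point from 2,000 to 3,200 sales before any advertising spend. Adding an advertising budget to `upfront_cost` pushes it higher still.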
Wait! There’s a secret additional option with professional narrators. The Audiobook Creation Exchange (ACX), a marketplace where you can hire narrators, has some narrators who work on a pure royalty-share basis. Instead of paying up front, you split royalties with the narrator 50/50 for 7 years. BUT ACX is run by Audible. Not only can you not use ACX-narrated files outside of Audible, they also have exclusive rights to distribute any audio version of your book for 7 years. Royalty share through ACX means no library distribution. (You agree that you “will neither produce or sell, nor grant any third party the right to produce or sell, an audio version of the Book in the Language(s) for distribution in the Territory during the exclusive production period”)
There once was a non-Audible alternative (Findaway Voices), but they were recently acquired, and the new company (INaudio) seems to have temporarily(?) shut down this service to focus on audiobook distribution. (I truly thought I was losing my mind, because I kept seeing references and links to Marketplace and Voices Share, but I finally confirmed with customer service that they do not offer any narration services.)
It seems to me that you would need an intermediary platform to track and split the royalties. Otherwise, how could the narrator trust that the author is splitting royalties fairly? I am looking for an individual narrator who will do royalty share outside ACX, but I am not hopeful given what I’m finding so far.
That leaves our last consideration: ethics.
Ethics.
What values are threatened by audiobook narration?
I haven’t seen any objections to authors recording their own audiobooks or paying a professional narrator, but AI narration? Oh boy.
There are three major objections that I hear:
AI narration is taking away jobs from professional narrators (value: Good Jobs)
Cloning or training on professionals’ voices without their permission is extractive, and so is the tool training on your voice (value: Ownership and Intellectual Property)
AI in general is bad for the environment. (value: Environmental Protection & Sustainability)
We can start with the third. As I’ve outlined in my chapter on sustainability (available here for free!), the environmental costs come from data centers, not AI itself. Here, we are looking at resource use: water and electricity. The vast majority of water use happens during training (a sunk cost, not affected by any individual author’s choice to use AI narration) and in electricity generation (clean hydropower). All of that same resource use also happens with cloud recording, editing, and distribution software. (I can probably find an editing option that lives on my local machine, but having lost an hour-plus recording on QuickTime before, I consider cloud recording critical, and distribution can only happen online.) Given that I am an amateur at all three of those things and all of those tasks would take a long time, doing this myself would have had as big an environmental impact as using AI, or bigger, since AI does generation, editing, and mastering at once. This is not an easy argument to make to understandably concerned folks online, so I am not advocating that you go on a crusade about it; I am just clarifying why it didn’t play into my decision.
Ownership and Intellectual Property is an issue because ElevenLabs is not very specific about their training data. For voice cloning, it seems like they have a pre-trained model, and then you upload segments of your own speech to fine-tune it to sound like you. You can turn off the “Improve the model for everyone” setting so it doesn’t train on your voice, but what about the original training data? Speculation abounds, but there’s no confirmation. If they are scraping from others’ content without permission, it may legally be fair use, but there’s an understandable objection there. It does seem possible that there’s enough public domain audio to pre-train on, but we will never learn unless ElevenLabs deigns to tell us or it comes out in discovery in a lawsuit.
AI narration is taking, or probably will eventually take, some work from professional narrators. In particular, it threatens to take the projects for which people are on the fence: they have enough to pay a professional narrator, but they judge that AI is “good enough.” I am not one of these people, for what it’s worth. My choice was not “Use AI or pay a professional narrator.” It was “Use AI or don’t have an audiobook.” It is a fair enough point that boycotting all digitally narrated audiobooks discourages AI narration and pushes the people on the fence toward choosing a professional narrator: I put this in the category of “reasonable people can disagree.”
I’ve laid out how these values informed my choice, but regardless of how I see all these ethical concerns, it’s important for me (and other new authors in my position) to recognize that some people are going to object to digital narration regardless. Choosing AI narration means losing some listeners, and perhaps some readers who object that strongly, too.
My Decision
I would really like to have an audio version available upon release, and I am committed to having it available from the library. (I did give reading it myself a shot, but trust me when I say, nobody wants that.)
Ultimately, I decided to start with ElevenLabs to ensure that I can be available in the two most accessible audiobook venues (Libby and Spotify) right away. I am absolutely not going through and correcting AI audio files twice, so Audible and its lightly rude exclusivity policies will have to wait.
It helps that AI use is the topic of my book: it’s for people who are concerned, but open to using it, and people who want to work with these concerned folks. People who have made up their minds to be completely against AI are mostly not in my audience, so the loss of audience won’t hit me as hard as it would many other authors.
But I won’t stop here. I want an audio product available right away, but I will start looking for individual narrators who are willing to do royalty share. (If you have any recommendations for such narrators or where to find them, please reach out!)
Thanks for reading! I know this is a controversial choice and I am absolutely available for discussion about it. I spent a lot of time looking into it, but one thing I’ve realized is that there’s been a lot of change in this space: it’s difficult to make sure I am reading about current policies and narration options. Hopefully this article will find someone else in the same position and save them some legwork :) (I will update it if I find a non-exclusive royalty share option!)

