How to use Text-to-Speech in Kapwing

Image of Kit waving with the words "Hello world" in a text box and waveform

You've seen viral videos on TikTok and Instagram with an automated voice that reads text, and you want to know how it's done. On Kapwing, you can automatically add a synthetic voiceover to a project with the Text-to-Speech (TTS) feature! This tutorial explains what the feature is and how to use it, so you can create and share more stories.

What is Text-to-Speech?
How do you use Text-to-Speech?
Voice Cloning
What languages does Kapwing TTS support?


What is Text-to-Speech?

Text-to-Speech (TTS) generates audio from the text in your project and lets you choose which default voice to use for that audio (e.g., American male or female).

How do you use Text-to-Speech?

  1. Click the "AI Voice" tab in the left sidebar.
  2. Type the text you want to turn into audio. Note that each text box you use with Text-to-Speech has a 1000-character limit.
  3. Select the language and voice you want to use.
    Note: Business and Enterprise users can use their own voice clones. Learn more below.
  4. Leave the "person" icon set to none if you want audio only, without Personas.
  5. Click "Add layer".

Note: The audio does not generate any on-screen text, but you can use our Subtitle tool to create subtitles for it.
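
If your script is longer than the 1000-character limit mentioned in step 2, one option is to split it into smaller pieces before pasting each piece into its own text box. The Python sketch below is only an illustration: `split_for_tts` and `TTS_CHAR_LIMIT` are hypothetical names, not part of Kapwing, and it assumes Kapwing counts characters the same way Python's `len()` does.

```python
import textwrap

# Limit taken from step 2 above; the constant name is our own.
TTS_CHAR_LIMIT = 1000


def split_for_tts(script: str, limit: int = TTS_CHAR_LIMIT) -> list[str]:
    """Split a script into chunks of at most `limit` characters.

    Breaks fall on whitespace, so words stay intact unless a single
    word is longer than the limit. Runs of whitespace (including line
    breaks) are collapsed to single spaces.
    """
    return textwrap.wrap(script, width=limit)


if __name__ == "__main__":
    long_script = "This sentence is repeated to build a long script. " * 40
    for number, chunk in enumerate(split_for_tts(long_script), start=1):
        print(f"Text box {number}: {len(chunk)} characters")
```

Each chunk can then be added as a separate Text-to-Speech layer. Splitting on whitespace rather than at sentence boundaries is a simplification, so you may want to adjust the breaks by hand if a pause lands mid-sentence.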

Your Text-to-Speech limits renew on the first of each month, regardless of your billing date. For example, if you reach your limit by June 15th and your billing date is June 20th, your limits will renew on July 1st.
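
The renewal rule is simple date arithmetic: the next renewal is always the first day of the following month, and the billing date plays no part. A minimal Python sketch, using a hypothetical helper named `next_renewal`:

```python
from datetime import date


def next_renewal(today: date) -> date:
    """Return the first day of the month after `today`.

    The billing date is deliberately not a parameter, because the
    limits renew on the first of the month regardless of it.
    """
    if today.month == 12:
        # December rolls over into January of the next year.
        return date(today.year + 1, 1, 1)
    return date(today.year, today.month + 1, 1)


if __name__ == "__main__":
    # The example from the text: limits used up on June 15th.
    print(next_renewal(date(2024, 6, 15)))
```

The year 2024 is only a placeholder for the example; any year gives the same July 1st result for a June date.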

Voice Cloning

Kapwing lets you save a clone of your voice so that you can create a Text-to-Speech layer using your own voice model. We've enabled Voice Cloning in partnership with ElevenLabs.

To add a voice clone, you must be a Business customer. Business plan customers can save up to 2 voice clones in their Brand Kit. Once you've upgraded to the Business plan, click the "Add new Voice" button in the Text-to-Speech dropdown menu (#3 on the image above). You'll be prompted to upload an example of the speaker whose voice you want to clone. Note that customers MUST have the rights to clone a speaker's voice, as noted in Kapwing's terms of service.

Add Custom Clone Voice modal
If you click the "Create Voice Clone" option at this step, this modal will appear

To delete a voice clone, go to your Brand Kit and scroll down to the saved voice clones. Hover over a voice model icon and click the delete icon that appears in the upper corner.

What languages does Kapwing TTS support?

Kapwing supports the same 30 languages for Text-to-Speech as it does for dubbing, some of them in several regional variants. See the full list of supported languages below.

Supported Language List

English (US)
English (UK)
English (AUS)
Arabic (Multi-Region)
Chinese (Mandarin)
Czech
Danish
Dutch
Filipino (Tagalog)
French
German
Greek
Hebrew*
Hindi
Hungarian*
Indonesian
Italian
Japanese
Korean
Lithuanian*
Malay
Norwegian*
Polish
Portuguese (Brazil)
Portuguese (Portugal)
Romanian*
Russian
Slovak
Spanish (Spain)
Spanish (Mexico)
Swedish
Tamil
Ukrainian
Vietnamese*

* We do not support voice cloning in this language.
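
If you plan projects across several languages, the asterisks in the list can be captured in a small lookup. The Python sketch below is illustrative only: `NO_VOICE_CLONING` and `supports_voice_cloning` are hypothetical names, and the function assumes the name you pass in is already one of the supported languages listed above (it does not check that).

```python
# Languages marked with an asterisk in the supported language list.
NO_VOICE_CLONING = {
    "Hebrew",
    "Hungarian",
    "Lithuanian",
    "Norwegian",
    "Romanian",
    "Vietnamese",
}


def supports_voice_cloning(language: str) -> bool:
    """Return False if the list marks this language with an asterisk.

    A trailing "*" and surrounding spaces are ignored, so names copied
    straight from the list work as-is.
    """
    name = language.strip().rstrip("*").strip()
    return name not in NO_VOICE_CLONING


if __name__ == "__main__":
    for language in ("Hindi", "Hebrew*", "Spanish (Mexico)"):
        print(language, supports_voice_cloning(language))
```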


Looking for more help?

Check our Release Notes for tutorials on how to use the latest Kapwing features or contact us.