Clone Any AI Voice for FREE Locally in 1 Click! Create Custom Voices

Use AI to clone any voice locally in 1 click! Easily create custom voices from audio clips. Discover how to access thousands of pre-trained voice models and integrate them into your projects seamlessly.

September 7, 2024

Discover the power of cloning any AI voice with just a few audio clips on your computer. Unlock endless possibilities, from Morgan Freeman reading you a bedtime story to Gordon Ramsay yelling insults as you cook dinner. This blog post will show you how to use the amazing open-source program RVC to create your own voice models and convert any audio into the voice of your choice, all for free and locally on your machine.

Easily Clone Any AI Voice for Free Using RVC

To install RVC, you have two options:

  1. One-Click Installer: If you are a Patreon supporter, you can download the one-click installer and simply double-click the file to install RVC.

  2. Manual Installation:

    • Ensure you have Python and Git for Windows installed.
    • Create a new folder on your computer and open the Command Prompt (CMD) in that folder.
    • Clone the RVC repository by running git clone <repository-link> in the CMD.
    • Determine which CUDA ("CU") build of PyTorch you need by running the provided command.
    • Create a new Python environment and activate it.
    • Install the required dependencies.
    • Download the necessary models and files.
    • Launch the go_webui.bat file to start the RVC web UI.
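
PyTorch builds are tagged by CUDA version, for example cu118 for CUDA 11.8, and that tag is the value the manual installation asks you to note. A minimal sketch of the mapping, assuming you already know your CUDA version string (the function name cuda_tag is hypothetical and not part of RVC or PyTorch):

```python
def cuda_tag(cuda_version: str) -> str:
    """Turn a CUDA version such as "11.8" into a PyTorch-style tag such as "cu118"."""
    parts = cuda_version.strip().split(".")
    # Only the major and minor numbers matter; a patch number is ignored.
    if len(parts) < 2 or not all(p.isdigit() for p in parts[:2]):
        raise ValueError(f"expected a version like '11.8', got {cuda_version!r}")
    major, minor = parts[0], parts[1]
    return f"cu{major}{minor}"


print(cuda_tag("11.8"))  # cu118
print(cuda_tag("12.1"))  # cu121
```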

To clone a voice:

  1. In the "Train" tab, enter a name for your new voice clone and set the target sample rate.
  2. Provide the path to your training audio files (at least 10 minutes of high-quality audio).
  3. Configure the training settings, such as the number of training epochs, batch size, and save frequency.
  4. Click "One-Click Training" to start the training process.
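
To see how the epoch count and save frequency interact, here is a small sketch. It assumes a checkpoint is written once every save_every epochs; the class and field names are illustrative and are not RVC's own option names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TrainingSettings:
    # Illustrative names only; the web UI labels these settings differently.
    total_epochs: int
    batch_size: int
    save_every: int  # save frequency, in epochs

    def checkpoint_epochs(self) -> List[int]:
        """Epochs at which a checkpoint would be saved under the stated assumption."""
        if self.total_epochs < 1 or self.save_every < 1:
            raise ValueError("total_epochs and save_every must be positive")
        return list(range(self.save_every, self.total_epochs + 1, self.save_every))


settings = TrainingSettings(total_epochs=200, batch_size=8, save_every=50)
print(settings.checkpoint_epochs())  # [50, 100, 150, 200]
```

A smaller save frequency gives you more intermediate checkpoints to compare, at the cost of more disk space.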

Once the training is complete, you can use the cloned voice in the "Model Inference" tab. Adjust the transpose value to match the pitch of the source audio, select the path to the audio file you want to convert, and click "Convert" to generate the new audio with the cloned voice.
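
The transpose value is counted in semitones, so 12 raises the pitch by one octave and -12 lowers it by one. The sketch below shows the arithmetic behind that. The helper names are hypothetical, and the example frequencies are rough illustrations rather than measured values.

```python
import math


def pitch_ratio(semitones: int) -> float:
    """Frequency multiplier produced by shifting the pitch by `semitones`."""
    return 2.0 ** (semitones / 12.0)


def semitones_between(source_hz: float, target_hz: float) -> int:
    """Nearest whole-semitone shift that moves source_hz toward target_hz."""
    if source_hz <= 0 or target_hz <= 0:
        raise ValueError("frequencies must be positive")
    return round(12 * math.log2(target_hz / source_hz))


# A voice around 110 Hz converted to a model voiced around 220 Hz
# needs a shift of about one octave upward.
print(semitones_between(110.0, 220.0))  # 12
print(pitch_ratio(12))                  # 2.0
```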

Alternatively, you can download pre-trained voice models from the community on websites like voicemodels.com and use them directly without the need for training.

To use text-to-speech with the cloned voice, you can use the Coqui TTS extension in the Text Generation web UI. Generate the initial audio with Coqui TTS, then convert it to the cloned voice using RVC.

Remember, while RVC allows you to clone any voice, it's important to use this technology responsibly and ethically.

Manually Install RVC for Advanced Users

To manually install RVC, follow these steps:

  1. Make sure you have Python and Git for Windows installed on your computer.
  2. Create a new folder on your computer and name it as desired.
  3. Open the command prompt (CMD) by typing CMD in the folder path and pressing Enter.
  4. On the GitHub page, click on "Code", then click on the copy icon to copy the repository link.
  5. In the command prompt, type git clone and paste the copied link, then press Enter to clone the repository onto your computer.
  6. Navigate to the cloned folder by typing cd followed by the folder name and pressing Enter.
  7. Determine which CUDA build of PyTorch you need by copying and pasting the command provided in the description and pressing Enter. Note the "CU" version (for example, CU118), as you'll need it later.
  8. Create a new Python environment by typing python -m venv env and pressing Enter.
  9. Activate the environment and install PyTorch using the commands provided in the description, making sure to replace "CU118" with the "CU" version you noted earlier.
  10. Install the requirements by running the provided command.
  11. If you encounter an error related to the NumPy module, uninstall it with pip uninstall numpy, then reinstall it with version 1.23.5.
  12. Download the models by running the command python tools/download_models.py.
  13. Download the ffmpeg.exe and ffprobe.exe files from the provided link and place them in the main folder.
  14. Download the four launching files from the provided link and place them in the main folder, overwriting any existing files.
  15. Launch the go_webui.bat file to start the RVC web UI.

Now you're ready to start cloning voices using RVC!
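
The commands that the steps above name explicitly can be collected into one list, as sketched below. This is a sketch under stated assumptions, not an official installer. The function name is hypothetical, the repository link is left as a placeholder, and the PyTorch and requirements commands are omitted because they come from the description and are not reproduced in this post.

```python
from pathlib import PureWindowsPath
from typing import List

NUMPY_VERSION = "1.23.5"  # the version named in step 11


def manual_install_commands(repo_url: str) -> List[List[str]]:
    """Return the explicitly named commands in order, without running them."""
    # Standard location of the interpreter inside a Windows virtual environment.
    venv_python = str(PureWindowsPath("env") / "Scripts" / "python.exe")
    return [
        ["git", "clone", repo_url],
        # Everything below is meant to run from inside the cloned folder.
        ["python", "-m", "venv", "env"],
        # Only needed if you hit the NumPy error described in step 11.
        [venv_python, "-m", "pip", "uninstall", "-y", "numpy"],
        [venv_python, "-m", "pip", "install", f"numpy=={NUMPY_VERSION}"],
        [venv_python, "tools/download_models.py"],
    ]


for command in manual_install_commands("<repository-link>"):
    print(" ".join(command))
```

To actually execute a command from this list, you could pass it to subprocess.run with check=True and the cloned folder as the working directory.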

Train Your Own Voice Model with RVC

To train your own voice model with RVC, follow these steps:

  1. Prepare your voice data:

    • You need at least 10 minutes of high-quality, clean audio recordings of your voice.
    • If you're cloning someone else's voice, download interview videos of them and isolate their voice using a tool like Audacity.
  2. Install RVC:

    • Use the one-click installer if you're a Patreon supporter, or follow the manual installation steps.
    • Make sure you have the correct CUDA version installed.
  3. Set up the training:

    • In the RVC web UI, go to the "Train" tab.
    • Enter a name for your new voice clone and set the target sample rate.
    • Specify the path to your voice data folder.
    • Select the appropriate training settings, such as the number of training epochs.
  4. Start the training:

    • Click "One-Click Training" to begin the voice model training.
    • The training process can take around 1-1.5 hours, depending on the amount of data and your hardware.
  5. Use the trained model:

    • Once the training is complete, you can find the trained model files in the "Assets" and "Logs" folders.
    • In the "Model Inference" tab, select your trained model and adjust the transpose value to match the source audio.
    • Convert any audio file to your cloned voice by providing the audio file path and clicking "Convert".
  6. (Optional) Use pre-trained voice models:

    • Visit voicemodels.com to download pre-trained voice models created by the community.
    • Extract the model files and place them in the appropriate folders, then use them in the RVC web UI.

Remember, the quality of the final cloned voice depends on the quality and duration of the source audio data. Experiment with different settings and audio sources to achieve the best results.
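
Before training, it can help to confirm that your data folder really holds at least 10 minutes of audio. A minimal sketch using only the Python standard library, assuming your clips are uncompressed PCM .wav files (the function names are hypothetical, and other formats would need a different reader):

```python
import wave
from pathlib import Path

MIN_SECONDS = 10 * 60  # the 10-minute minimum mentioned above


def wav_seconds(path: Path) -> float:
    """Duration of one WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / float(wav.getframerate())


def total_training_seconds(folder: Path) -> float:
    """Combined duration of every .wav file directly inside `folder`."""
    return sum(wav_seconds(p) for p in sorted(folder.glob("*.wav")))


def has_enough_audio(folder: Path, minimum: float = MIN_SECONDS) -> bool:
    """True when the folder holds at least `minimum` seconds of audio."""
    return total_training_seconds(folder) >= minimum
```

For example, has_enough_audio(Path("my_voice_data")) returns False until the clips in that folder add up to 600 seconds or more.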

Use Pre-Trained Voice Models with RVC

The RVC community has a huge collection of pre-trained voice models that you can download and use directly, without having to train your own model. To find these models, you can visit the website voicemodels.com.

On this website, you can search for any voice model you want, such as a specific character or celebrity. For example, if you want to use a SpongeBob voice model, you can simply click on the link to download the pre-trained archive.

Once you have the downloaded archive, you need to extract the two files it contains: a .pth file and an index file. The .pth file needs to be placed in the assets/weights folder, and the index file needs to be placed in the logs folder.

After that, you can go back to the RVC web UI, click the "Refresh voice list" button, and then select the voice model you just added. You can then adjust the transpose (pitch) value as needed and click "Convert" to apply the voice model to your audio.

This process allows you to use pre-trained voice models without having to go through the entire training process yourself, making it much faster and easier to clone voices.
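
The extract-and-place step can be scripted. The sketch below assumes the download is a .zip archive, that the index file ends in .index, and that weights belong in assets/weights under the RVC folder. The function name is hypothetical, and both target folders are parameters so you can adjust them to your installation.

```python
import shutil
import zipfile
from pathlib import Path
from typing import List


def install_voice_model(archive: Path, rvc_root: Path,
                        weights_dir: str = "assets/weights",
                        index_dir: str = "logs") -> List[Path]:
    """Copy the .pth and .index files from a model archive into the RVC folders."""
    weights = rvc_root / weights_dir
    indexes = rvc_root / index_dir
    weights.mkdir(parents=True, exist_ok=True)
    indexes.mkdir(parents=True, exist_ok=True)

    placed = []
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            name = Path(info.filename).name  # drop any folders inside the archive
            suffix = Path(name).suffix.lower()
            if suffix == ".pth":
                target = weights / name
            elif suffix == ".index":
                target = indexes / name
            else:
                continue  # ignore readme files and anything else
            with zf.open(info) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            placed.append(target)
    return placed
```

After running it, click "Refresh voice list" in the web UI so the new model shows up.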

Combine RVC with Text-to-Speech for Seamless Conversions

To combine RVC with text-to-speech for seamless conversions, follow these steps:

  1. Use the Coqui TTS extension in the Text Generation WebUI to generate an initial audio file from the desired text.
  2. In the Coqui TTS extension, ensure the first message in the chat is the text you want to convert to audio.
  3. Once the audio file is generated, download it and use it as the input for the RVC conversion process.
  4. In the RVC web UI, select the voice model you want to use for the conversion.
  5. Adjust the pitch/transpose value as needed to match the target voice.
  6. Click "Convert" to generate the final audio file with the cloned voice.

This approach allows you to leverage the text-to-speech capabilities of the Text Generation WebUI to create the initial audio, and then use RVC to convert that audio to the desired cloned voice. This provides a seamless workflow for creating voice-cloned audio from text inputs.

Conclusion

In this guide, we have explored the capabilities of RVC (Retrieval-based Voice Conversion), an open-source program that lets you clone a voice and convert audio files into that voice. We've covered the step-by-step process of installing RVC, both through the one-click installer and the manual installation method.

You've learned how to prepare high-quality audio samples, train your own voice model, and even leverage pre-trained models from the RVC community. The ability to clone voices opens up a world of possibilities, from having Morgan Freeman read you a bedtime story to having Gordon Ramsay yell insults while you cook dinner.

Additionally, we've discussed how to integrate RVC with text-to-speech tools, enabling you to generate audio with your cloned voice without the need for extensive audio recordings. This seamless integration allows for even more creative applications, such as role-playing in virtual environments.

Remember, while the capabilities of RVC are impressive, it's important to use this technology responsibly and ethically. Respect the privacy and rights of individuals, and avoid any malicious or deceptive uses of voice cloning.

Embrace the power of RVC, and let your creativity soar. The possibilities are endless, and the future of voice technology is in your hands.
