TTS-RVC-API

Yes, we can use Coqui with RVC!

Why combine the two frameworks?

Coqui is a text-to-speech framework (vocoder and encoder), but cloning your own voice takes decades and offers no guarantee of better results. That's why we use RVC (Retrieval-Based Voice Conversion), which works only for speech-to-speech. You can train the model with just 2-3 minutes of dataset as it uses Hubert (a pre-trained model to fine-tune quickly and provide better results).

Installation

How to use Coqui + RVC api?

IMPORTANT: Make sure you are using Python 3.10 or lower.

Clone the repository

git clone https://github.com/adrien-dimitri/tts-rvc

(Recommended) Create and Activate a Virtual Environment:

On Windows:

python -m venv .venv
.venv\Scripts\activate

On macOS and Linux:

virtualenv <venv_name>
source <venv_name>/bin/activate

Install the required packages:

pip install -r requirements.txt
pip install TTS

Install the espeak backend for TTS:

Follow this guide on how to install the espeak backend for TTS: espeak backend Installation Guide

More information can be found here: espeak-ng
Install FFmpeg CLI:
- On Windows, follow this guide: FFmpeg Installation Guide
- On Linux, simply run the following command:
```
sudo apt-get install ffmpeg
```
More information can be found here: FFmpeg
Restart your PC.
Download the hubert model from one of the following links and place it in the models/hubert folder:

hubert_base.pt - Google Drive (recommended)

hubert_base.pt - Hugging Face
Download a pretrained RVC model from the following link and place it in a folder in the models/ folder:

RVC v2 model Archive

You will get the rvc_model.pth and rvc_model.index files which you need to place in the models/rvc_model_name folder.

The folder structure should look like this:
```
  /
└── models
      └── rvc_model_name
          ├── rvc_model.pth
          └── rvc_model.index
```
Run the server:
```
python -m uvicorn app.main:app
```
Run the api.py file:
```
python api.py
```

POST REQUEST

http://localhost:8000/generate

emotions : happy,sad,angry,dull
speed = 1.0 - 2.0

{
"speaker_name": "speaker3",
"input_text": "Hey there! Welcome to the world",
"emotion": "Surprise",
"speed": 1.0
}

CODE SNIPPET

import requests
import json
import time

url = "http://127.0.0.1:8000/generate"

payload = json.dumps({
  "speaker_name": "speaker3",
  "input_text": "Are you mad? The way you've betrayed me is beyond comprehension, a slap in the face that's left me boiling with an anger so intense it's as if you've thrown gasoline on a fire, utterly destroying any trust that was left.",
  "emotion": "Dull",
  "speed": 1.0
})
headers = {
  'Content-Type': 'application/json'
}

start_time = time.time()  # Start the timer

response = requests.request("POST", url, headers=headers, data=payload)

end_time = time.time()  # Stop the timer

if response.status_code == 200:
    audio_content = response.content
    
    # Save the audio to a file
    with open("generated_audio.wav", "wb") as audio_file:
        audio_file.write(audio_content)
        
    print("Audio saved successfully.")
    print("Time taken:", end_time - start_time, "seconds")
else:
    print("Error:", response.text)

Adjusting the TTS Parameters

To adjust the parameters of the TTS model, you need to locate the tts.py file which can be found at app/routers/tts.py in line 22.

Changing the TTS Model

Simply change the model_name parameter to the desired model name.

model_name	language	gender
tts_models/de/thorsten/tacotron2-DDC	de	male
tts_models/de/css10/vits-neon	de	female
tts_models/en/ljspeech/vits	en	female
tts_models/en/sam/tacotron-DDC	en	male

Activating CUDA

To activate CUDA, change the gpu parameter to True.

Example:

The file should look like this:

# Initialize TTS
tts = TTS(model_name="tts_models/en/ljspeech/vits", progress_bar=True, gpu=False)

Change it to:

# Initialize TTS
tts = TTS(model_name="tts_models/en/ljspeech/vits", progress_bar=True, gpu=True)

Name		Name	Last commit message	Last commit date
Latest commit History 40 Commits
app		app
.dockerignore		.dockerignore
.gitignore		.gitignore
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
api.py		api.py
build.sh		build.sh
config.toml		config.toml
requirements.txt		requirements.txt
trainset_preprocess_pipeline_print.py		trainset_preprocess_pipeline_print.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

TTS-RVC-API

Why combine the two frameworks?

Installation

POST REQUEST

CODE SNIPPET

Adjusting the TTS Parameters

Changing the TTS Model

Activating CUDA

Example:

About

Releases

Packages

Contributors 3

Languages

License

adrien-dimitri/tts-rvc

Folders and files

Latest commit

History

Repository files navigation

TTS-RVC-API

Why combine the two frameworks?

Installation

POST REQUEST

CODE SNIPPET

Adjusting the TTS Parameters

Changing the TTS Model

Activating CUDA

Example:

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 3

Languages

Packages