BEGANSing + RVC + AudioSuperResolution

Korean Singing Voice Synthesis + Singing Voice Conversion (SVS + SVC)

The system generates singing voice from a given text and MIDI in an end-to-end manner.

Overview of the proposed system

Prepare Dataset

The cloned repository contains a folder called ./test_datasets. Put your MIDI file and text file there, following the expected format. The MIDI and the text must always match in number. As an example, the MIDI and text for GFRIEND's "Rough" are provided. To convert the generated vocals to another voice, create a folder named after the speaker in ./datasets and put that speaker's voice data for Retrieval-based Voice Conversion (RVC) in it. The ./datasets layout is as follows.

BEGANSing
└───datasets
    ├───kss
    │   ├───1_0000.wav
    │   ├───1_0001.wav
    │   └───...
    └───{speaker_name}
        ├───1.wav
        └───2.wav

This is just an example, and it's okay to add more speakers.
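
The layout above can be checked before running a conversion. The following is a minimal sketch, assuming the speaker folders sit directly under ./datasets and hold .wav files as shown; the helper name `list_speakers` is hypothetical and is not part of this repository.

```python
from pathlib import Path


def list_speakers(datasets_dir="./datasets"):
    """Map each speaker folder name to the number of .wav files it holds.

    Hypothetical helper: assumes one sub-folder per speaker, as in the
    layout shown in this README.
    """
    root = Path(datasets_dir)
    if not root.is_dir():
        return {}
    counts = {}
    # Only direct sub-folders are treated as speakers; loose files are ignored.
    for speaker_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[speaker_dir.name] = sum(1 for _ in speaker_dir.glob("*.wav"))
    return counts


if __name__ == "__main__":
    for name, n_wavs in list_speakers().items():
        status = "ok" if n_wavs > 0 else "no .wav files found"
        print(f"{name}: {n_wavs} wav file(s) ({status})")
```

A speaker folder that reports zero .wav files is likely misnamed or empty and would be worth fixing before running main.py.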

Preprocessing & Training

The provided pre-trained model was trained for an additional 100 epochs. For preprocessing and training, see the Preprocessing and Training sections of the original repository.

Usage

python main.py {speaker_name} {song} {pitch_shift} --audiosr

If the speaker is male, setting {pitch_shift} to -12 is recommended; if the speaker is female, set it to 0.
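
The recommended value of -12 is consistent with a shift measured in semitones, where 12 semitones make one octave. The sketch below assumes that unit (this README does not state it) and 12-tone equal temperament; `semitones_to_ratio` is a hypothetical helper, not a function from this repository.

```python
def semitones_to_ratio(semitones):
    """Frequency multiplier for a shift of `semitones` (12-tone equal temperament).

    Assumption: {pitch_shift} is expressed in semitones.
    """
    return 2.0 ** (semitones / 12.0)


if __name__ == "__main__":
    for shift in (-12, 0, 12):
        print(f"{shift:+d} semitones -> x{semitones_to_ratio(shift):.3f}")
```

Under that assumption, -12 halves every frequency (one octave down) and 0 leaves the pitch unchanged.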

The --audiosr option up-samples the generated voice from 22,050 Hz to 48,000 Hz. Use it if you have a powerful graphics card or don't mind a long generation time; otherwise, omit it.
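
To give a sense of scale, the sketch below works out how many samples the up-sampled output holds. It only does the arithmetic for the rate change stated above; it does not call the audio super-resolution model, and the function name is hypothetical.

```python
from fractions import Fraction

SOURCE_RATE = 22050  # Hz, rate of the generated voice
TARGET_RATE = 48000  # Hz, rate after --audiosr


def upsampled_length(n_samples, src=SOURCE_RATE, dst=TARGET_RATE):
    """Number of samples after changing the rate from src to dst (rounded)."""
    # Fraction keeps the ratio exact, avoiding floating-point drift.
    return round(n_samples * Fraction(dst, src))


if __name__ == "__main__":
    ratio = Fraction(TARGET_RATE, SOURCE_RATE)
    print(f"rate ratio: {ratio} (~{float(ratio):.3f}x)")
    one_minute = 60 * SOURCE_RATE
    print(f"60 s: {one_minute} -> {upsampled_length(one_minute)} samples")
```

The output holds a little over twice as many samples as the input, which is part of why the option adds noticeable generation time.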

Results

Audio samples are available at https://soonbeomchoi.github.io/saebyulgan-blog/. The model was trained on an RTX 3090 (24 GB) with a batch size of 32 for 2 days.

To-Do

Change the vocoder from Griffin-Lim to HiFi-GAN

References

g2p/korean_g2p.py from https://github.com/scarletcho/KoG2P
utils/midi_utils.py from Madmom, https://madmom.readthedocs.io/en/latest/
