Creating a Wav2Vec2 model for PT-Br
- Mozilla Common Voice - PT-Br:
- 134 hours of validated audio and transcriptions
- 2,453 voices
The idea was to take a pretrained Wav2Vec2 model and fine-tune it on this dataset.
All the steps will be placed in the notebook. To reach a WER (word error rate) of 0.0141, other techniques were used in addition to fine-tuning. The model, results, and API can be found on the Hugging Face repo - Victor
On the HF repo you can try it yourself by loading a pre-recorded audio file or using the Realtime Speech Recognition! Have fun :D
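
For context on the WER figure quoted above, here is a minimal sketch of how word error rate is commonly computed: the word-level edit distance (substitutions, deletions, insertions) between the reference transcription and the model output, divided by the number of reference words. This is not the evaluation code from the notebook. The function name `word_error_rate` and the example sentences are made up for illustration, and text normalization (casing, punctuation) is assumed to have happened beforehand.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative helper (hypothetical name), not the notebook's code.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One wrong word out of five reference words.
    print(word_error_rate("o gato dorme no sofá", "o gato dormiu no sofá"))
```

Read this way, a WER of 0.0141 corresponds to roughly 1.4 word errors per 100 reference words.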