TTS Hallucinations in shorter phrases #1695
Comments
Could you describe in detail how you tried it? Do you first generate
and then you invoke a second call to generate
or
? |
in the english example:
then tried 3 times the text:
I noticed that 2 out of 3 times it adds "sir" or "si" after the "hello" ("hello sir" or "hello si"), whereas in the French example it adds extra audio on the very first generation! |
Can you reproduce it with our APK? If what you described can be reproduced only with your APK, I think there is a bug in your APK. |
You are right, it doesn't happen with your APK. For me the problem isn't just in the APK but also inside Unity, using the code I shared earlier in the other thread. I need French models in particular; the extra audio they add at the end is not normal, and some models don't work at all on short sentences (they generate only distorted audio), such as fr-FR_mls_medium.onnx. |
I just tried with your APK and I think there is a bug in your code. Please make sure you have completely overwritten the buffer from the previous call. Don't overwrite the buffer partially. |
Please don't use models containing I think I have deleted all models containing |
Or make sure you have cleared the buffer containing samples of the previous call before you play the samples of the current text. |
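To illustrate the buffer advice above, here is a minimal, library-independent sketch. It is not the reporter's Unity code and does not call sherpa-onnx; the function names and the toy sample values are made up for illustration. It only shows how reusing a playback buffer without replacing its whole content could leave the tail of a previous utterance ("hello sir") audible after a shorter one ("hello").

```python
def write_partial(buffer, samples):
    """Buggy pattern: overwrite only the first len(samples) entries.

    Anything left over from a longer previous utterance stays in the buffer.
    """
    buffer[: len(samples)] = samples
    return buffer


def write_full(buffer, samples):
    """Safe pattern: replace the entire buffer content with the new samples."""
    buffer[:] = samples
    return buffer


# Toy "audio": 1s stand for "hello", 2s stand for "sir" (hypothetical values).
hello_sir = [1, 1, 1, 2, 2, 2]
hello = [1, 1, 1]

# The buffer still holds the previous utterance when the next one is written.
stale = write_partial(list(hello_sir), hello)
clean = write_full(list(hello_sir), hello)

print(stale)  # the "sir" tail is still present
print(clean)  # only the new samples remain
```

A partial overwrite of this kind would match the "hello sir" / "hello si" symptom on a second call, but it would not explain extra audio on the very first generation.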
the buffer is cleared already:
Also, that applies only to the English APK; the French version behaves differently. Would you like me to provide an APK for French as well? |
Could you describe the differences? Does the APK for French use a different set of code from the APK for English? |
No, it's the same code, just a different model with a different tokens file. What I mean by "different" is the issue itself:
the very first time I generate audio in French, it hallucinates extra content at the end of the text, so it's not a buffer issue for French. I only mentioned the English APK because I thought it was related. |
I don't see any issues from your posted code. |
Is each sentence processed sequentially, not in parallel? |
Yes, sequentially. Since the TTS functions don't support streaming right now, it was the only option to make the generation faster. |
No, we support passing a callback to C++. Inside C++, it processes the text sentence by sentence. After processing a sentence, the callback is invoked with the generated samples for this sentence. Please try our Android APK first. You will find it plays almost immediately no matter how long the given text is. Remember to use the TTS APK, not the TTS Engine APK. |
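The callback flow described above can be sketched roughly as follows. This is a hedged, self-contained illustration and not the sherpa-onnx API: `fake_synthesize`, `generate_with_callback`, and the regex-based sentence splitter are all hypothetical stand-ins, and the real library does its own sentence splitting inside C++.

```python
import re


def fake_synthesize(sentence):
    """Hypothetical stand-in for a TTS call: one silent sample per character."""
    return [0.0] * len(sentence)


def generate_with_callback(text, callback):
    """Process text sentence by sentence, invoking callback after each one.

    The callback receives only the samples of the sentence just finished,
    so playback can start before the whole text has been synthesized.
    """
    # Naive split after ., ! or ? followed by whitespace (assumption for the demo).
    parts = re.split(r"(?<=[.!?])\s+", text)
    sentences = [p.strip() for p in parts if p.strip()]
    all_samples = []
    for sentence in sentences:
        samples = fake_synthesize(sentence)
        callback(samples)  # e.g. enqueue these samples for playback
        all_samples.extend(samples)
    return all_samples


chunks = []
total = generate_with_callback("Hello sir. How are you today?", chunks.append)
print(len(chunks), len(total))
```

In a real player the callback would append each chunk to a playback queue rather than to a list, so audio for the first sentence starts while later sentences are still being generated.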
Could you enable debug in the TTS model config and post the logs when you generate samples? See sherpa-onnx/sherpa-onnx/c-api/c-api.h, line 916 in 0cb2db3.
|
I don't get any logs; that's the weird part. Unity is not showing me any logs except the ones I wrote myself! Am I doing something wrong?
|
IIRC, you posted some error logs in your first issue in the other session. How did you get them? |
Those came from logcat, from an APK build. For some reason Unity doesn't show the errors directly. Hold tight, I will use logcat again. |
it's sherpa that's logging that yellow raw text warning, but i am unable to get its stack trace |
Please show the code for |
good morning, thank you for your reply!
|
hello @csukuangfj, any solution yet? |
The code looks correct. Can you reproduce it with our example code in the dotnet-examples folder? |
Sorry, but I wasn't able to do that; I lack experience coding outside of Unity. |
I am running sherpa-onnx TTS in Unity (C#), and I have a problem where, for shorter sentences, the generated audio tends to contain extra gibberish at the end.
example long sentence (works fine) : "Bonjour monsieur, comment allez-vous aujourd’hui ? J’espère que vous passez une excellente journée !"
audio file: long sentence example
example short sentence (adds gibberish at the end): bonjour monsieur
audio file: short sentence example
In these examples I used the upmc voice for French, but the same issue exists with other models.
For example, with the libritts_r model, generating "hello sir" works, but when you generate "hello" immediately after it, the output sometimes includes the previous text or part of it ("hello sir" or "hello si").