Implement missing speed functions along with durable speech rate / speed changer function. #239

isikhi · 2024-12-29T21:30:12Z

As far as I can see, the Coqui TTS repo continues in this repo, so I want to submit the same PR here. Thanks! Here are the details:

Added missing speed parameters to functions and ensured more durable, accurate speed adjustments with the new adjust_speech_rate function.

Base Repo Ref: coqui-ai#4115

…e durable latents. also missed tts speed implementations added.

eginhard

Thank you for the PR and sorry for the slow response! I would suggest a slightly different approach to support setting the speed for XTTS. Coqui has a bunch of different models where speed can be modified in different ways and for now I don't want to commit to a single method in api.py.

What will also work is to remove any reference to speed from api.py. If a speed argument is then specified by the user, it will then be automatically passed to models that can handle it via the **kwargs. Would you like to make these changes? Otherwise I can take care of it at some point.
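A minimal sketch of the forwarding described above, not the actual Coqui code. The class names `ToyApi`, `SpeedAwareModel` and `PlainModel` are hypothetical, and the sketch assumes that each model's synthesis method accepts `**kwargs` so that unknown options are ignored rather than raising an error:

```python
class SpeedAwareModel:
    """Hypothetical model that understands a `speed` keyword."""

    def synthesize(self, text, speed=1.0, **kwargs):
        # A real model would change the audio; here we only report the value.
        return {"text": text, "speed": speed}


class PlainModel:
    """Hypothetical model with no notion of speed."""

    def synthesize(self, text, **kwargs):
        # Extra options such as `speed` land in kwargs and are ignored.
        return {"text": text}


class ToyApi:
    """Hypothetical wrapper with no explicit `speed` parameter."""

    def __init__(self, model):
        self.model = model

    def tts(self, text, **kwargs):
        # Whatever the caller passes is handed to the model unchanged.
        return self.model.synthesize(text, **kwargs)


fast = ToyApi(SpeedAwareModel()).tts("hello", speed=1.5)
plain = ToyApi(PlainModel()).tts("hello", speed=1.5)
```

With this shape the wrapper does not need to know which models support speed; the decision stays with each model.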

isikhi · 2025-01-19T13:24:41Z

@eginhard Thank you for the feedback! I am not entirely familiar with all the Coqui TTS models, but I think I understand your suggestion. I want to clarify, though: are you recommending that, instead of passing speed directly, I should rely on **kwargs to handle it dynamically?

For example, would this be closer to what you have in mind?

wav = self.tts(
    text=text,
    speaker=speaker,
    language=language,
    speaker_wav=speaker_wav,
    split_sentences=split_sentences,
    **kwargs,  # `speed` is included here if the user specified it; the function body can then check for it
)

I guess this way, any user-specified speed would automatically propagate to models that support it, without hardcoding it in api.py. Please let me know if this aligns with your comment or if there's another approach you'd prefer.

I am not advanced in Python or Coqui, but I am happy to make further adjustments with your guidance.

eginhard · 2025-01-19T14:51:58Z

Exactly. What's currently blocking speed from being passed through the kwargs is that some functions in api.py have it as an explicit argument (although it's not actually used then), so after removing those it should just work.
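To illustrate why an explicit but unused argument blocks the value, here is a small sketch with hypothetical function names (`model_synthesize`, `tts_blocking`, `tts_forwarding`); it is not taken from api.py. In Python, a keyword that matches a named parameter is bound to that parameter and never appears in `**kwargs`:

```python
def model_synthesize(text, speed=1.0, **kwargs):
    """Hypothetical model entry point; returns the speed it received."""
    return speed


def tts_blocking(text, speed=1.0, **kwargs):
    # `speed` is bound to the named parameter above, so it is absent
    # from kwargs and the model falls back to its own default.
    return model_synthesize(text, **kwargs)


def tts_forwarding(text, **kwargs):
    # With no named `speed` parameter, the value stays in kwargs
    # and reaches the model.
    return model_synthesize(text, **kwargs)


blocked = tts_blocking("hi", speed=2.0)
forwarded = tts_forwarding("hi", speed=2.0)
```

Here `blocked` is the model's default and `forwarded` is the caller's value, which matches the behaviour described in the comment above.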

isikhi · 2025-01-20T19:29:47Z

I was going to remove them directly, but then I realized that for users who do not pin a strict version, systems that call any function containing speed and use it in its current form might encounter issues (there are no private accessors defined, so every function is accessible). To avoid a BC (backward compatibility) break, I believe it would be more appropriate to leave speed as it is. However, if a BC break is not a concern and we are planning to release a new major version, I can proceed with the changes.

eginhard · 2025-01-21T11:36:08Z

Well, before you couldn't set speed and after that change you can, so that is an improvement and doesn't really break anything :)

isikhi and others added 2 commits December 28, 2024 23:08

feat: add adjust_speech_rate function to modify speech speed with mor…

26128be

…e durable latents. also missed tts speed implementations added.

Merge branch 'dev' into fix-improvements/adjust-speech-rate-or-speed

ed1563b

isikhi mentioned this pull request Dec 29, 2024

Implement missing speed functions along with durable speech rate / speed changer function. coqui-ai/TTS#4115

Closed

eginhard requested changes Jan 15, 2025


Merge branch 'dev' into fix-improvements/adjust-speech-rate-or-speed

c868390

update: speed from kwargs at synthesize

e6084ae
