I have used pipecat for several months, stability is always a problem #976

Closed
fatwang2 opened this issue Jan 13, 2025 · 9 comments

Comments

@fatwang2

Description

Is this reporting a bug or feature request?
bug

If reporting a bug, please fill out the following:

Environment

  • pipecat-ai version: 0.0.52
  • python version: 3.12
  • OS: Docker

Issue description

I have developed a bot using the SimpleChatBot template. Initially, it was easy to set up and performed impressively. However, after deploying it into production, I began to encounter numerous bugs after approximately 10 minutes of usage, and they seem impossible to resolve.

Encountered Errors:

  1. Gemini: Received error code 1011 (internal error).
  2. ElevenLabs: Received error code 1008 (policy violation) or sent error code 1009 (message too big).
  3. Cartesia: Encountered error closing websocket: 'NoneType' object has no attribute 'finish'.

...and many more.

I understand that the demo you provided looks impressive, but how can we ensure stability when using it over an extended period?

Repro steps

List the steps to reproduce the issue.
Real long time usage

Expected behavior

Stable

Actual behavior

Not stable

Logs

As is shown in the description

@fatwang2
Author

I have spent a significant amount of time in the Pipecat Discord community, and I have found that many users are experiencing the same issues as I am. I believe that Pipecat has the potential to deliver more than just a demo; however, I urge you to prioritize stability in your development efforts.

@markbackman
Contributor

@fatwang2 I understand the frustration and we're committed to helping developers build and ship reliable, performant voice and multimodal apps.

It would be more helpful if you could break this one large issue into the specifics you've encountered. That would help to make things more actionable. For example, if you're getting 1008 and 1009 errors from ElevenLabs, we can't help on 1008 (policy violations)—you'll have to take that up with ElevenLabs on why your content was flagged. But, for the 1009 errors (message too big), we can definitely help.

Let's approach this from an engineering perspective. Since you've invested time in building something out and you've learned a lot, can you please help enumerate the issues you've encountered? From there, we can better understand the impediments you've run into and work on making Pipecat better.

@fatwang2
Author

Thanks for the reply. As mentioned in the issue, all of these happen from time to time. No unusual words are involved, just a 10-minute conversation.

Gemini: Received error code 1011 (internal error).
ElevenLabs: Received error code 1008 (policy violation) or sent error code 1009 (message too big).
Cartesia: Encountered error closing websocket: 'NoneType' object has no attribute 'finish'.

@markbackman
Contributor

markbackman commented Jan 14, 2025

Are those the only issues you see? In looking at our Daily Bots service, I do see errors like these logged. Here's more information.

Gemini

I see two types of 1011 (internal server errors). These are both Google errors that are out of our control to handle. They are:

  1. Internal Google resources exhausted:
gemini_live exception: received 1011 (internal error) Request trace id: a7393b0b090c3a69, [ORIGINAL ERROR] generic::resource_exhausted: com.google.cloud.privacy.dlp.common.excep; then sent 1011 (internal error) Request trace id: a7393b0b090c3a69, [ORIGINAL ERROR] generic::resource_exhausted: com.google.cloud.privacy.dlp.common.excep
  2. Rate limiting:
gemini_live exception: received 1011 (internal error) Request trace id: 6efe366d692c029f, [ORIGINAL ERROR] throttling::THROTTLED_CLIENT: Request throttled at the client by Adapt; then sent 1011 (internal error) Request trace id: 6efe366d692c029f, [ORIGINAL ERROR] throttling::THROTTLED_CLIENT: Request throttled at the client by Adapt

I've also seen instances of 429 rate limiting occurring.

Also, the Gemini Multimodal launch docs state a session limit:

Maximum session duration
Session duration is limited to up to 15 minutes for audio or up to 2 minutes of audio and video. When the session duration exceeds the limit, the connection is terminated.

The model is also limited by the context size. Sending large chunks of content alongside the video and audio streams may result in earlier session termination.

These are both out of our control. This is not surprising because Gemini Multimodal Live is still an experimental API. I would not recommend it for production yet, but I'm very excited for when it's ready because it's so capable.
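Because these closes originate on the provider's side, about the only thing an application can do is decide which ones are worth a reconnect and pace the attempts. A minimal sketch of that idea follows. It assumes a hypothetical `connect` callable and a stand-in `ConnectionClosed` exception carrying the WebSocket close code. Neither is a Pipecat or Google API, and treating only 1011 as retryable is an assumption for illustration.

```python
import random
import time


class ConnectionClosed(Exception):
    """Stand-in for whatever your WebSocket client raises on close (hypothetical)."""

    def __init__(self, code, reason=""):
        super().__init__(f"{code} {reason}")
        self.code = code
        self.reason = reason


# Assumption for this sketch: 1011 (internal error) is transient and worth
# retrying, while 1008 (policy violation) points at configuration and is not.
RETRYABLE_CODES = {1011}


def backoff_delays(count, base=1.0, cap=30.0, rng=random.random):
    """Yield `count` exponentially growing delays, capped and scaled by jitter."""
    for n in range(count):
        yield min(cap, base * 2 ** n) * rng()


def run_with_reconnect(connect, attempts=5, sleep=time.sleep, rng=random.random):
    """Call connect() until it returns, retrying only transient closes."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delays = list(backoff_delays(attempts - 1, rng=rng))
    for n in range(attempts):
        try:
            return connect()
        except ConnectionClosed as exc:
            # Give up on non-retryable codes and on the final attempt.
            if exc.code not in RETRYABLE_CODES or n == attempts - 1:
                raise
            sleep(delays[n])
```

This only paces reconnects. It does not lift the documented session duration limit or the provider's rate limits, so a new session still starts without the previous session's state unless the application restores it.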

ElevenLabs

You flagged 1008 and 1009 errors.

  1. 1008 policy violation: I misspoke earlier. They are categorizing 1008 as service errors that result from your app's configuration being non-compliant with their service. In the Daily Bots logs, I see developers making this type of mistake:
elevenlabs exception: received 1008 (policy violation) A voice with voice_id SX63CSn3KLQKVC49nRLy does not exist.; then sent 1008 (policy violation) A voice with voice_id SX63CSn3KLQKVC49nRLy does not exist.

and this one:

elevenlabs exception: received 1008 (policy violation) Requested output format pcm_44100 (PCM at 44100hz) is only allowed for Pro tier and above.; then sent 1008 (policy violation) Requested output format pcm_44100 (PCM at 44100hz) is only allowed for Pro tier and above.

This is something you should be able to see in the console logs and correct. Check to make sure you have access to the voices and sample rates you're trying to use.

  2. 1009 error: message too big.
elevenlabs exception: sent 1009 (message too big); no close frame received

We're tracking it here: #983
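The log lines quoted above share a regular shape: service name, whether the close was received or sent first, the close code, its label, and an optional reason. A small parser can pull those fields out so that the reason text (the part that names the missing voice or the disallowed output format) is captured whenever an error occurs. This is a sketch based only on the log strings quoted in this thread. The `CloseEvent` type and `parse_close` function are hypothetical names, not part of Pipecat.

```python
import re
from typing import NamedTuple, Optional


class CloseEvent(NamedTuple):
    service: str
    direction: str  # "received" or "sent", whichever the log line reports first
    code: int
    label: str
    reason: str


# Matches lines such as:
#   elevenlabs exception: received 1008 (policy violation) <reason>; then sent ...
#   elevenlabs exception: sent 1009 (message too big); no close frame received
_PATTERN = re.compile(
    r"^(?P<service>\w+) exception: "
    r"(?P<direction>received|sent) (?P<code>\d{4}) \((?P<label>[^)]*)\)\s*"
    r"(?P<reason>.*?)"
    r"(?:; (?:then sent|no close frame received).*)?$"
)


def parse_close(line: str) -> Optional[CloseEvent]:
    """Return the parsed close event, or None if the line has another shape."""
    m = _PATTERN.match(line.strip())
    if m is None:
        return None
    return CloseEvent(
        m["service"], m["direction"], int(m["code"]), m["label"], m["reason"]
    )
```

For a 1008, the `reason` field is the actionable part. For the 1009 line there is no reason text, and `direction` is `"sent"`, which matches the log's own wording that the close was sent from the local side with no close frame received.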

Cartesia

I don't see any reference to finish() in the code. Are you attempting to call finish() to disconnect the websocket? Can you share repro steps for this specific one?
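For context, the message itself is the generic Python error raised when a method is called on a variable that currently holds `None`. One common way this shows up during teardown is a close path that runs before a connection was established, or runs twice. The sketch below illustrates that pattern and a guard against it. The `Stream` and `Client` classes are purely hypothetical and are not Pipecat or Cartesia code, so this shows the class of error rather than a diagnosis of this report.

```python
class Stream:
    """Hypothetical handle that must be finished before closing."""

    def __init__(self):
        self.finished = False

    def finish(self):
        self.finished = True


class Client:
    """Hypothetical client whose stream handle exists only while connected."""

    def __init__(self):
        self._stream = None  # set by connect(), cleared on close

    def connect(self):
        self._stream = Stream()

    def close_unsafe(self):
        # Raises AttributeError when connect() never ran or close already happened.
        self._stream.finish()
        self._stream = None

    def close(self):
        # Detach the handle first so a repeated or early close is a no-op.
        stream, self._stream = self._stream, None
        if stream is not None:
            stream.finish()
```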


So, stepping back, I see:

  • Gemini: Two issues related to Gemini being experimental, out of our control. If this is an issue for your service, I would recommend using something that is not experimental.
  • ElevenLabs: One possible configuration issue in your code and one issue for the Pipecat team to improve.
  • Cartesia: An open question about where finish() is being called. It does not appear to be in the Pipecat code. Is this your code?

Given this, I would say that it's unfair to categorize Pipecat as unstable. Instead, I would encourage that we approach this as an engineering problem by breaking it into discrete problems and understanding and solving them. With that, we can get your app to be stable and performant.

Does this help take a step in that direction?

Pending your response on the Cartesia issue, we can either file an issue against Pipecat or fix an issue in your code. Either way, I'm going to leave this issue open until we figure out how to resolve that final question mark. Please feel free to share more, if there are other issues to surface.

@fatwang2
Author

fatwang2 commented Jan 14, 2025

Sorry for complaining about Pipecat. What you've made really inspires me, and that is why I felt so discouraged after putting it into production.

Gemini

Nothing else we can do to improve it? 😂

ElevenLabs

I am not sure which configuration is wrong. The 1008 (policy violation) error happens from time to time. Here is how I use it:

        # Excerpt: the imports and surrounding error handling are not shown.
        tts = ElevenLabsTTSService(
            api_key=os.getenv("ELEVENLABS_API_KEY"),
            voice_id="21m00Tcm4TlvDq8ikWAM",
            model="eleven_flash_v2_5",
            auto_mode=True,
            metrics=SentryMetrics(),
            params=ElevenLabsTTSService.InputParams(
                stability=0.7,
                similarity_boost=0.5,
                use_speaker_boost=True,
            ),
            text_filter=MarkdownTextFilter(
                params=MarkdownTextFilter.InputParams(
                    enable_text_filter=True,
                    filter_code=True,
                    filter_tables=True,
                )
            ),
        )

Cartesia

I will share more information after one day of usage.

@markbackman
Contributor

For Gemini, unfortunately not. We can't control the rate limits that Google applies or the issues their service may encounter.

For ElevenLabs, I'm not sure. That looks fine to me. When you do get a 1008 error, please capture the contextual information and share it.

@markbackman
Contributor

@fatwang2 I'm tracking the 1009 issue separately.

Did you figure out the Cartesia issue? If not, I'm planning to close out this issue since we have separate issues tracking the things that can be done.

@fatwang2
Author

Cartesia TTS sometimes repeats words and the speed is very unstable, but I think this belongs to Cartesia, not Pipecat.

@markbackman
Contributor

@fatwang2 yes, that's another service provider issue that we don't control. I'm going to close this issue out.
