You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
model = HfApiModel(model_id=model_id, timeout=300)
When creating a new model, the timeout setting seems ineffective. When running the agent, I frequently get:
HfHubHTTPError: 500 Server Error: Internal Server Error for url: https://api-inference.huggingface.co/models/Qwen/Qwen2.5-72B-Instruct/v1/chat/completions (Request ID: 21DsMRrush2AMn6er2XMD)
Model too busy, unable to get response in less than 120 second(s)```
The text was updated successfully, but these errors were encountered:
Since the class HfApiModel uses self.client = InferenceClient(self.model_id, token=token, timeout=timeout) under the hood, maybe this could be due to huggingface_hub.InferenceClient not respecting timeouts. Did you try calling directly the InferenceClient with the same Qwen model and a long message list?
Hello,
First of all, congrats for your work.
model = HfApiModel(model_id=model_id, timeout=300)
When creating a new model, the timeout setting seems ineffective. When running the agent, I frequently get:
The text was updated successfully, but these errors were encountered: