model = HfApiModel(model_id=model_id, timeout=300) - Timeout parameter seems ineffective #61

joaopauloschuler · 2025-01-04T06:00:23Z

Hello,
First of all, congrats for your work.

model = HfApiModel(model_id=model_id, timeout=300)

When creating a new model, the timeout setting seems ineffective. When running the agent, I frequently get:

HfHubHTTPError: 500 Server Error: Internal Server Error for url: https://api-inference.huggingface.co/models/Qwen/Qwen2.5-72B-Instruct/v1/chat/completions (Request ID: 21DsMRrush2AMn6er2XMD)

Model too busy, unable to get response in less than 120 second(s)```

The text was updated successfully, but these errors were encountered:

aymeric-roucher · 2025-01-06T12:33:44Z

Since the class HfApiModel uses self.client = InferenceClient(self.model_id, token=token, timeout=timeout) under the hood, maybe this could be due to huggingface_hub.InferenceClient not respecting timeouts. Did you try calling directly the InferenceClient with the same Qwen model and a long message list?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

model = HfApiModel(model_id=model_id, timeout=300) - Timeout parameter seems ineffective #61

model = HfApiModel(model_id=model_id, timeout=300) - Timeout parameter seems ineffective #61

joaopauloschuler commented Jan 4, 2025

aymeric-roucher commented Jan 6, 2025

model = HfApiModel(model_id=model_id, timeout=300) - Timeout parameter seems ineffective #61

model = HfApiModel(model_id=model_id, timeout=300) - Timeout parameter seems ineffective #61

Comments

joaopauloschuler commented Jan 4, 2025

aymeric-roucher commented Jan 6, 2025