[Bug]: Infinite Loop and Timeout Issue in SWE-Bench Evaluation Due to Context Overflow Handling in OpenHands Framework #6357

yanrui27 · 2025-01-20T03:37:16Z

Is there an existing issue for the same bug?

I have checked the existing issues.

Describe the bug and reproduction steps

Description:

While using the OpenHands framework to evaluate SWE-Bench, I encountered an issue where the program enters an infinite loop and eventually times out when the conversation context exceeds the model’s maximum window size.

Problem:

This issue arises from a previously introduced fix that truncates the agent's history roughly in half when the context window limit is exceeded. That approach does not correctly manage the agent's internal state after truncation, so the agent enters an infinite loop that persists until the program times out.
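
To make the truncation step concrete, here is a minimal sketch of halving a history and recording the ID of the first surviving event. The `Event` dataclass and `truncate_history_in_half` function are hypothetical stand-ins, not the OpenHands implementation; only the `start_id` bookkeeping mirrors the controller snippet quoted further down in this issue.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Event:
    """Hypothetical stand-in for a history event; only the ID matters here."""
    id: int
    content: str


def truncate_history_in_half(history: List[Event]) -> Tuple[List[Event], Optional[int]]:
    """Keep roughly the newer half of the history.

    Returns the truncated history and the ID of its first event
    (None if nothing is left), mirroring the start_id bookkeeping.
    """
    midpoint = len(history) // 2
    kept = history[midpoint:]
    start_id = kept[0].id if kept else None
    return kept, start_id


# Seven events with IDs 0..6; the older events before the midpoint are dropped.
history = [Event(id=i, content=f"event {i}") for i in range(7)]
kept, start_id = truncate_history_in_half(history)
```

Note that this sketch only shrinks the list. Whether the shortened history actually fits the model's window, and what else in the agent state still refers to the dropped events, is exactly what this report is about.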

Steps to Reproduce:

Use the OpenHands framework to evaluate SWE-Bench.
Provide a conversation history that exceeds the model’s context window limit.
The framework attempts to truncate the history as per the proposed fix.
After truncation, the agent enters an infinite loop due to improper state handling, eventually causing a timeout.

Expected Behavior:

When the context window is exceeded, the conversation history should be truncated as expected.
The agent’s state should be properly reset or adjusted after truncating the history to avoid entering an infinite loop.
The program should terminate correctly without triggering a timeout.

Additional Information:

This issue is specific to the SWE-Bench evaluation and does not appear in all use cases.
The previous fix (which truncates history) does not properly reset the agent's internal state, leading to the infinite loop.
A timeout occurs because the system continues looping endlessly without terminating.

Links to Relevant Issues:

Related to [Issue Context Window Exceeded fix #4977], which addresses context window overflow errors but doesn't resolve the infinite loop problem.
Reference to an [error encountered with context overflow in another case].

Suggested Fix:

Ensure that after truncating the conversation history, the agent's internal state is properly reset to prevent the infinite loop.
Consider handling agent states more robustly after truncation, such as saving and restoring the state correctly, to avoid this issue.
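
As one illustration of the "terminate correctly" requirement, the sketch below bounds the number of truncate-and-retry attempts so that a context overflow cannot loop forever. Everything in it is assumed for illustration: `ContextWindowExceeded`, `call_model`, the character-count limit, and `max_truncations` are hypothetical and do not correspond to OpenHands or to any LLM client API.

```python
from typing import List, Tuple


class ContextWindowExceeded(Exception):
    """Hypothetical error raised when the prompt is too large for the model."""


def call_model(history: List[str], limit: int) -> str:
    """Toy model call: fails when the total characters in history exceed `limit`."""
    size = sum(len(message) for message in history)
    if size > limit:
        raise ContextWindowExceeded(f"{size} > {limit}")
    return f"ok with {len(history)} messages"


def step_with_bounded_truncation(
    history: List[str], limit: int, max_truncations: int = 5
) -> Tuple[str, List[str]]:
    """Retry the call, halving the history on overflow, at most `max_truncations` times."""
    truncations = 0
    while True:
        try:
            return call_model(history, limit), history
        except ContextWindowExceeded:
            # Nothing left to drop, or the retry budget is spent:
            # surface the error instead of spinning until a timeout.
            if len(history) <= 1 or truncations >= max_truncations:
                raise
            history = history[len(history) // 2:]
            truncations += 1


# Eight 10-character messages against a 25-character limit.
result, remaining = step_with_bounded_truncation(["x" * 10] * 8, limit=25)
```

The loop always ends: either a call succeeds, or the error is re-raised once the history cannot shrink further or the truncation budget is used up.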

OpenHands Installation

Development workflow

OpenHands Version

0.20.0

Operating System

Linux

Logs, Errors, Screenshots, and Additional Context

No response

yanrui27 · 2025-01-20T04:13:09Z

Proposed Solution

To address the issue of infinite loops caused by context window exceeding errors, we suggest adding proper exception handling and state management when exceptions are raised. Specifically, after handling the exception, we need to ensure the agent’s state is properly updated to prevent the loop from continuing.

The following solution outlines the steps:

Catch the exception: after catching it, call await self._react_to_exception(e) to handle it as currently implemented.

Code Example

OpenHands/openhands/controller/agent_controller.py

Lines 693 to 698 in 1b6e444

                    # Save the ID of the first event in our truncated history for future reloading
                    if self.state.history:
                        self.state.start_id = self.state.history[0].id
                    # Don't add error event - let the agent retry with reduced context
                    return
                raise

                    # Save the ID of the first event in our truncated history for future reloading
                    if self.state.history:
                        self.state.start_id = self.state.history[0].id
                    # Don't add error event - let the agent retry with reduced context

                    await self._react_to_exception(e) # <-- Adding state handling here to prevent infinite loop
                    return
                raise

enyst · 2025-01-20T04:27:52Z

Which parts of the agent state do you think need to be modified when a context window error happens? Which attributes?

I'm not sure what, in the agent state, could lead to an infinite loop. Maybe the missing information was relevant? But then resetting state wouldn't fix that.

The proposed solution, to deal with the exception, puts the agent in ERROR state (which is not necessary IMO, and I think would end the swe_bench run right there), and it calls reset, which would zero out the metrics, leading to inaccurate reporting at the end of the run.
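
The concern above can be illustrated with a toy model. The classes below are hypothetical and deliberately simplified; they do not reproduce the real controller, and `react_to_exception` here (set an error state, then reset) simply encodes the behaviour described in this comment. The `accumulated_cost` field is an assumed example of a metric.

```python
from dataclasses import dataclass, field
from enum import Enum


class AgentState(Enum):
    RUNNING = "running"
    ERROR = "error"


@dataclass
class Metrics:
    # Assumed example metric; a real run may track more than this.
    accumulated_cost: float = 0.0


@dataclass
class ToyController:
    """Hypothetical controller that only models state and metrics."""
    state: AgentState = AgentState.RUNNING
    metrics: Metrics = field(default_factory=Metrics)

    def reset(self) -> None:
        # Resetting replaces the metrics object, discarding what was recorded.
        self.metrics = Metrics()

    def react_to_exception(self, error: Exception) -> None:
        # As described above: move to ERROR, then reset.
        self.state = AgentState.ERROR
        self.reset()


controller = ToyController()
controller.metrics.accumulated_cost = 1.25
controller.react_to_exception(RuntimeError("context window exceeded"))
```

In this toy model the run ends in the ERROR state and the cost recorded before the overflow is gone, which is the inaccurate end-of-run reporting the comment warns about.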

Can you tell us exactly how you ran it to get this issue? The LLM, number of iterations, and instance_id would be useful to replicate it. The full run_infer command would be great. Any logs you may have would be useful too!

yanrui27 added the bug Something isn't working label Jan 20, 2025

yanrui27 closed this as completed Jan 20, 2025

yanrui27 reopened this Jan 20, 2025

kevin-support-bot bot mentioned this issue Jan 20, 2025

[Bug]: Infinite Loop and Timeout Issue in SWE-Bench Evaluation Due to Context Overflow Handling in OpenHands Framework SmartManoj/Kevin#221

Open
