-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add CodeActSWEAgent to remove browsing & github + improvements on agentskills #2105
Conversation
use minimal prompt for codeact;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most LGTM, left some question.
def get_in_context_example() -> str: | ||
return EXAMPLES | ||
|
||
|
||
class CodeActAgent(Agent): | ||
VERSION = '1.5' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whether should we bump the version 1.6
? The prompts keep the same, but change some logic.
IMPORTANT: Whenever possible, execute the code for the user using <execute_ipython> or <execute_bash> or <execute_browse> instead of providing it. | ||
""" | ||
|
||
SWE_EXAMPLE = """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we use SWE_EXAMPLE
any where? Just according to this case, it seems spend lots of token on edit file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. SWE_EXAMPLE is used in the CodeActSWEAgent
, which does not use the same ICL for CodeActAgent
.
Co-authored-by: Yufan Song <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This use of SWE agent is interesting and, IMHO, most welcome. We just deprecated the original, a replacement like this seems to me like an even better solution for it.
You may also want to see #2103 CC: @li-boxuan
|
||
SYSTEM_SUFFIX = """The assistant's response should be concise. | ||
The assistant should include ONLY ONE <execute_ipython> or <execute_bash> or <execute_browse> in every one of the responses, unless the assistant is finished with the task or need more input or action from the user in order to proceed. | ||
IMPORTANT: Whenever possible, execute the code for the user using <execute_ipython> or <execute_bash> or <execute_browse> instead of providing it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the <execute_browse>
relevant?
IMPORTANT: Whenever possible, execute the code for the user using <execute_ipython> or <execute_bash> or <execute_browse> instead of providing it. | |
IMPORTANT: Whenever possible, execute the code for the user using <execute_ipython> or <execute_bash> instead of providing it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch... It is not relevant. But following the same reason above, I have already run a couple of experiments with this prompt - so hoping to keep it as is for now.
How about we start a separate PR to correct these typos, and at the same time bump the CodeActSWEAgent's version from v1.5 to v1.6 for better reproducibility.
""" | ||
|
||
SYSTEM_SUFFIX = """The assistant's response should be concise. | ||
The assistant should include ONLY ONE <execute_ipython> or <execute_bash> or <execute_browse> in every one of the responses, unless the assistant is finished with the task or need more input or action from the user in order to proceed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The assistant should include ONLY ONE <execute_ipython> or <execute_bash> or <execute_browse> in every one of the responses, unless the assistant is finished with the task or need more input or action from the user in order to proceed. | |
The assistant should include ONLY ONE <execute_ipython> or <execute_bash> in every one of the responses, unless the assistant is finished with the task or needs more input or action from the user in order to proceed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
return f'{action.thought}\n<execute_bash>\n{action.command}\n</execute_bash>' | ||
elif isinstance(action, IPythonRunCellAction): | ||
return f'{action.thought}\n<execute_ipython>\n{action.code}\n</execute_ipython>' | ||
elif isinstance(action, BrowseInteractiveAction): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly the BrowseInteractiveAction should be remove, is this the case or I simply misunderstood?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my understanding, this agent, CodeAct-SWE, is supposed to know how to browse and use github. It takes over those responsibilities, 'freeing' CodeAct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep. I think this should be removed. CodeActSWE does not have the ability to browse.
Good catch... It is not relevant. But following the same reason above, I have already run a couple of experiments with this prompt - so hoping to keep it as is for now.
How about we start a separate PR to correct these typos, and at the same time bump the CodeActSWEAgent's version from v1.5 to v1.6 for better reproducibility.
^ For this reason, how about we starts a separate PR once this is merged to correct all these issues and bump the version?
len(message['content']) for message in messages | ||
) + len(action_str) | ||
|
||
if finish_command := re.search(r'<finish>.*</finish>', action_str, re.DOTALL): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this finish block in use? I cannot see instructions for the model to use <finish>...</finish>
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope! Let's remove it
Integration test captures a regression for CodeActAgent: UPDATE: I found the culprit. Looking into fixing it... |
@xingyaoww Looks like this is ready to go once conflicts are cleared. |
I spoke to Xingyao and we decided to merge it. |
No description provided.