
Assistant Archival Cleanup #1011

Open · wants to merge 5 commits into main

Conversation

stephherbers (Contributor)

Description

#990
Follow-up logic changes for assistant versioning and archiving.

Now, when an assistant that is a working version is archived, all of its versions are also archived and deleted from OpenAI.

Also, from a UI standpoint: when checking whether an assistant can be archived, if it is the working version the check now covers the dependencies of all versions, not just the working version, and displays everything that needs to be archived first. This way, a version won't be archived (and its OpenAI assistant deleted) while something (an experiment or pipeline) still references it.
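To make the described flow concrete, here is a rough sketch only; method and helper names (in particular `delete_openai_assistant`) are illustrative stand-ins, not the exact code in this PR:

```python
# Illustrative sketch of the archive flow described above, not the actual implementation.
def archive(self):
    # When archiving the working version, include every version in the checks and cleanup.
    versions = [self] + list(self.versions.all()) if self.is_working_version else [self]

    # Refuse to archive while any version is still referenced by an experiment or pipeline node.
    for version in versions:
        if (
            version.get_related_experiments_queryset().exists()
            or version.get_related_pipeline_node_queryset().exists()
        ):
            return False

    # Archive each version and remove its assistant on the OpenAI side.
    for version in versions:
        version.is_archived = True
        version.save()
        delete_openai_assistant(version)  # assumed helper for the OpenAI deletion
    return True
```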

User Impact

Demo

Docs

@codecov-commenter commented Dec 19, 2024

Codecov Report

Attention: Patch coverage is 97.47899% with 3 lines in your changes missing coverage. Please review.

✅ All tests successful. No failed tests found.

Files with missing lines     Patch %    Lines
apps/assistants/views.py     25.00%     3 Missing ⚠️

version_query = None
if assistant.is_working_version:
    version_query = list(
        map(
Collaborator

I might be wrong, but I don't think we have to map the ids to strings?

         return self.experiment_set.filter(is_archived=False)

-    def get_related_pipeline_node_queryset(self):
+    def get_related_pipeline_node_queryset(self, query=None):
@SmittieC (Collaborator) · Jan 6, 2025


Since the query param is a list of ids, I suggest that we name and type it accordingly:

def get_related_pipeline_node_queryset(self, assistant_ids: list):

We could even pass in a queryset that returns the ids. We did something similar here for example. This wouldn't require parsing the query results to a list, which is nice, but up to you on which approach you'd like to take.

Also, looks like this parameter will come in useful to optimize the archive method. So instead of iterating through all assistant versions (in the case where we're archiving the working version), we can simply fetch the version ids and call this method with it. Same with get_related_experiments_queryset.
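A minimal sketch of that suggestion (Django-style; `Node` and the params-based lookup are assumptions inferred from the NodeFactory calls elsewhere in this PR):

```python
def get_related_pipeline_node_queryset(self, assistant_ids: list):
    # Accept the IDs directly; a values_list queryset also works here, letting
    # Django run it as a subquery instead of materializing a Python list first.
    # Depending on how assistant_id is stored in params, the IDs may need to be strings.
    return Node.objects.filter(type="AssistantNode", params__assistant_id__in=assistant_ids)


# Caller side: fetch the version IDs once and reuse them for both dependency checks.
version_ids = assistant.versions.values_list("id", flat=True)
related_nodes = assistant.get_related_pipeline_node_queryset(version_ids)
```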

@snopoke (Collaborator) left a comment


After reviewing this and testing it locally I think we need to take a step back and look more holistically at how we trace references to objects and make that visible in the UI.

I've started a doc to make it easier to collaborate on the thoughts and ideas: https://docs.google.com/document/d/1Z09GNpO17izoVcRPGSmppNthFruyRJ0033T-RYvapoE/edit?tab=t.0

Comment on lines +95 to +96
experiment = ExperimentFactory(pipeline=pipeline)
experiment.assistant = v2_assistant
Collaborator

this isn't a valid state - an experiment should have either an assistant or a pipeline, but not both
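For example (assuming ExperimentFactory accepts an assistant kwarg), keeping the two references on separate experiments avoids the invalid state:

```python
# One experiment references the assistant through the pipeline node,
# a second one references it directly - never both on the same experiment.
pipeline_experiment = ExperimentFactory(pipeline=pipeline)
direct_experiment = ExperimentFactory(assistant=v2_assistant)
```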

        assert assistant.is_archived is True  # archiving successful

    @patch("apps.assistants.sync.push_assistant_to_openai", Mock())
    def test_archive_versioned_assistant_with_still_exisiting_experiment_and_pipeline(self):
Collaborator

This test seems to be the same as the previous one, since the presence of the pipeline has no effect because it doesn't reference the assistant

pipeline = PipelineFactory()
NodeFactory(type="AssistantNode", pipeline=pipeline, params={"assistant_id": str(assistant.id)})
exp = ExperimentFactory(pipeline=pipeline)
exp.assistant = assistant
Collaborator

you shouldn't set this property since you're testing the reference through the pipeline

Comment on lines +120 to +123
NodeFactory(type="AssistantNode", pipeline=v2_exp.pipeline, params={"assistant_id": str(v2_assistant.id)})
NodeFactory(type="AssistantNode", pipeline=v2_exp.pipeline, params={"assistant_id": str(v2_assistant.id)})
v2_exp.assistant = v2_assistant
v2_exp.save()
Collaborator

I don't follow what's happening here. Creating a new version should take care of creating versioned nodes as well.

        or version.get_related_pipeline_node_queryset().exists()
    ):
        return False
for version in self.versions.all():
Collaborator

rather than re-iterating over the queryset you could accumulate the IDs in the previous loop
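A sketch of that idea (variable names illustrative):

```python
# Accumulate the version IDs during the dependency checks so the later step
# can reuse them instead of querying self.versions.all() a second time.
version_ids = []
for version in self.versions.all():
    if (
        version.get_related_experiments_queryset().exists()
        or version.get_related_pipeline_node_queryset().exists()
    ):
        return False
    version_ids.append(version.id)
```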

@stephherbers (Contributor, Author)

Noting that we are moving forward with changes from the doc, so those commits will be coming in later and I will re-request reviews.
