Add support for cleaning up data from releases_json/release_assets #3013

Merged
merged 1 commit into from
Oct 16, 2023
36 changes: 26 additions & 10 deletions scripts/manage-db.py
@@ -19,21 +19,21 @@
 sys.path.append(path.join(path.dirname(__file__), path.join("..", "vendor", "lib", "python")))


+RELEASES_CLEANUP_CONDITION = """
+LEFT JOIN rules rules_mapping ON (name=rules_mapping.mapping)
+WHERE name LIKE '%%nightly%%'
+AND name NOT LIKE '%%latest'
+AND rules_mapping.mapping IS NULL
+AND (STR_TO_DATE(RIGHT(name, 14), "%%Y%%m%%d%%H%%i%%S") < NOW() - INTERVAL {nightly_age} DAY);
+"""
+
+
 def cleanup_releases(trans, nightly_age, dryrun=True):
     # This and the subsequent queries use "%%%%%" because we end up going
     # through two levels of Python string formatting. The first is here,
     # and the second happens at a low level of SQLAlchemy when the transaction
     # is being executed.
-    query = (
-        """
-LEFT JOIN rules rules_mapping ON (name=rules_mapping.mapping)
-WHERE name LIKE '%%%%nightly%%%%'
-AND name NOT LIKE '%%%%latest'
-AND rules_mapping.mapping IS NULL
-AND (STR_TO_DATE(RIGHT(name, 14), "%%%%Y%%%%m%%%%d%%%%H%%%%i%%%%S") < NOW() - INTERVAL %s DAY);
-"""
-        % nightly_age
-    )
+    query = RELEASES_CLEANUP_CONDITION.format(nightly_age=nightly_age)
     if dryrun:
         todelete = trans.execute("SELECT name FROM releases" + query).fetchall()
         print("Releases rows to be deleted:")
@@ -45,6 +45,20 @@ def cleanup_releases(trans, nightly_age, dryrun=True):
         trans.execute("DELETE releases FROM releases" + query)


+def cleanup_releases_json(trans, nightly_age, dryrun=True):
+    query = RELEASES_CLEANUP_CONDITION.format(nightly_age=nightly_age)
+    if dryrun:
+        todelete = trans.execute("SELECT name FROM releases_json" + query).fetchall()
+        print("Releases JSON rows to be deleted:")
+        if todelete:
+            print("\n".join(itertools.chain(*todelete)))
+        else:
+            print(" - None")
+    else:
+        trans.execute("DELETE releases_json FROM releases_json" + query)
+        trans.execute("DELETE release_assets FROM release_assets" + query)

Member

Does `DELETE FROM table_name WHERE condition` work, instead of `DELETE table_name FROM table_name WHERE condition`?

Contributor Author

As I understand it, when you're joining in a delete you must include the table name that you want to delete from (https://www.dofactory.com/sql/delete-join).

(I had to look this up when I was writing this, heh.)

Contributor

@jcristau, Oct 16, 2023

Looks like it's either `DELETE tbl_name FROM tbl_name LEFT JOIN ...` or `DELETE FROM tbl_name USING tbl_name LEFT JOIN ...`, when multiple tables are involved.
(https://dev.mysql.com/doc/refman/8.0/en/delete.html#idm45611665301776)

[Edit: link to mysql 8.0 doc instead of 5.7]
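
To make the two multi-table forms from this thread concrete, here is a minimal sketch that only assembles the SQL text and never runs it against MySQL. `build_join_delete` and its `style` parameter are hypothetical names used for illustration; they are not part of manage-db.py, and the shortened join condition is a stand-in for the real cleanup condition.

```python
def build_join_delete(table, join_and_where, style="from"):
    """Return a MySQL multi-table DELETE statement as text (hypothetical helper)."""
    if style == "from":
        # Form used in this PR: DELETE tbl_name FROM tbl_name LEFT JOIN ...
        return f"DELETE {table} FROM {table} {join_and_where}"
    if style == "using":
        # Alternative form: DELETE FROM tbl_name USING tbl_name LEFT JOIN ...
        return f"DELETE FROM {table} USING {table} {join_and_where}"
    raise ValueError(f"unknown style: {style!r}")


# Shortened stand-in for the join and filter used by the cleanup queries.
condition = (
    "LEFT JOIN rules rules_mapping ON (name=rules_mapping.mapping) "
    "WHERE rules_mapping.mapping IS NULL"
)

from_form = build_join_delete("releases_json", condition, style="from")
using_form = build_join_delete("releases_json", condition, style="using")
print(from_form)
print(using_form)
```

Both strings name the table to delete from before the join, which is the requirement discussed above; a plain `DELETE FROM tbl_name` followed directly by a join names no target table.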



+
+
 def chunk_list(list_object, n):
     """
     Yield successive n-sized chunks from list_object.
@@ -203,5 +217,7 @@ def _strip_multiple_spaces(string):
         with db.begin() as trans:
             if action == "cleanup":
                 cleanup_releases(trans, nightly_age, dryrun=False)
+                cleanup_releases_json(trans, nightly_age, dryrun=False)
             else:
                 cleanup_releases(trans, nightly_age, dryrun=True)
+                cleanup_releases_json(trans, nightly_age, dryrun=True)
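
The code comment in `cleanup_releases` refers to two levels of string formatting. The sketch below illustrates why the number of percent signs per literal `%` drops from four to two once the first level uses `str.format` instead of the `%` operator. The second level, which the comment attributes to SQLAlchemy at execution time, is only simulated here with a plain `%` substitution on an empty tuple; that stand-in, the shortened query, and the value `90` are assumptions made for illustration.

```python
# Old approach: the template first goes through Python's % operator,
# so each literal percent sign is written four times.
old_template = "WHERE name LIKE '%%%%nightly%%%%' AND age > %s"
after_percent = old_template % 90  # halves each run of percent signs, fills %s

# New approach: str.format leaves percent signs alone, so only the
# second level needs escaping and each literal percent is written twice.
new_template = "WHERE name LIKE '%%nightly%%' AND age > {nightly_age}"
after_format = new_template.format(nightly_age=90)

# Stand-in for the second, lower-level pass (an assumption for this sketch):
# a %-substitution with no parameters collapses "%%" into "%".
final_old = after_percent % ()
final_new = after_format % ()

print(after_percent)
print(after_format)
print(final_new)
```

After the first level both variants hold the same text, and the simulated second pass leaves a single `%` on each side of `nightly` in the `LIKE` pattern.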