[PR #6064/0a5ac4aa backport][3.63] [SAT-29018] Fix/corrupted RA blocks content streaming #6161
base: 3.63
Conversation
Can't backport because it contains migrations.
On a request for on-demand content in the content app, a corrupted Remote that contains the wrong binary (for that content) prevented other Remotes from being attempted on future requests. Now the last failed Remotes are temporarily ignored and others may be picked. Closes pulp#5725 (cherry picked from commit 0a5ac4a)
.github/workflows/scripts/script.sh (outdated)

```shell
# See pulpcore.app.util.ENABLE_6064_BACKPORT_WORKAROUND for context.
# This needs to be set here because it relies on service init.
# It's being tested in only one scenario to have both cases covered.
if [[ "$TEST" == "s3" ]]; then
  cmd_prefix pulpcore-manager backport-patch-6064
fi
```
This file is managed by the plugin template.
Can you maybe do it as a post_before_script hook?
Oh, I missed that. Yes, thanks.
```python
if pulpcore.app.util.failed_at_exists(connection, RemoteArtifact):
    pulpcore.app.util.ENABLE_6064_BACKPORT_WORKAROUND = True
    RemoteArtifact.add_to_class("failed_at", models.DateTimeField(null=True))
```
What exactly does this do?
I don't know the implementation details, but the effect is like adding the field dynamically.
For example, Django will then be able to evaluate the filter RemoteArtifact.objects.exclude(failed_at__gte=Y). If the field really exists in the database, the query succeeds; otherwise it raises a ProgrammingError.
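To illustrate the effect described above without pulling in Django, here is a minimal sketch of the dynamic-attribute pattern in plain Python. The names below are illustrative stand-ins, not the actual pulpcore code; Django's `Model.add_to_class` does considerably more (it wires the field into the model's `_meta`), but the observable effect on instances is similar:

```python
# Sketch of the dynamic-attribute idea, using plain Python instead of
# Django's Model.add_to_class (all names below are illustrative only).

class RemoteArtifact:
    """Stand-in for the real model; starts without a failed_at attribute."""
    pass

def add_to_class(cls, name, default=None):
    # Roughly the simple-attribute case: bind the value onto the class
    # so all existing and future instances see it.
    setattr(cls, name, default)

add_to_class(RemoteArtifact, "failed_at", None)

ra = RemoteArtifact()
print(ra.failed_at)  # None: the attribute now exists on every instance
```

The point of the check in the snippet above is that the attribute is only attached when the column actually exists in the database, so queries referencing it cannot hit a ProgrammingError.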
And something in this PR is altering the actual db table, so this can be used?
That feels like reinventing the whole db migrations framework without the safeguards. A subsequent upgrade is then probably going to fail. When we said "You cannot backport a migration.", that meant you cannot add db-altering code to a release branch, on the assumption that all db alteration would be done by a migration in the Django framework. My gut feeling is this is way too dangerous.
Can you think of a solution that does not require changing the db schema? We are lucky in that this is kind of ephemeral data.
- Would it help to keep a per-worker list (maybe a bloom filter) in memory?
- Can we repurpose another datetime field that we don't rely on there?
- Can we use redis?
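The first alternative suggested above can be sketched as a small per-worker cooldown cache: remotes that recently served bad data are temporarily skipped, and the entry expires on its own. This is only an illustration of the idea under discussion, not code from the PR; the class name, TTL value, and method names are all assumptions:

```python
# Hedged sketch of a per-worker in-memory cooldown cache for failed
# remotes. All names and the default TTL are illustrative assumptions.
import time

class CooldownCache:
    def __init__(self, ttl_seconds=300.0):
        self.ttl = ttl_seconds
        self._failed = {}  # remote_pk -> timestamp of last failure

    def mark_failed(self, remote_pk, now=None):
        # Record when this remote last served corrupted content.
        self._failed[remote_pk] = time.monotonic() if now is None else now

    def is_cooling_down(self, remote_pk, now=None):
        # True while the remote is inside its cooldown window.
        now = time.monotonic() if now is None else now
        failed_at = self._failed.get(remote_pk)
        if failed_at is None:
            return False
        if now - failed_at > self.ttl:
            del self._failed[remote_pk]  # cooldown expired; forget it
            return False
        return True

cache = CooldownCache(ttl_seconds=300.0)
cache.mark_failed("remote-a", now=0.0)
print(cache.is_cooling_down("remote-a", now=10.0))   # True
print(cache.is_cooling_down("remote-a", now=301.0))  # False
```

As noted later in the thread, a per-worker cache only solves the problem halfway: each content-app worker has to discover a bad remote independently, whereas a shared store (db or redis) would let one worker's failure benefit all of them.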
> A subsequent upgrade is then probably going to fail.

My assumption was that a field addition (one without any other couplings) would be safe. But I can see this is a sensitive area; I'll explore those alternatives.
(I had thought of a per-worker cache, but concluded it would be simpler to use the db, before I knew about the backport problem.)
Thanks for understanding my concerns.
At this point I think Postgres may even refuse to apply the migration on top of this out-of-band change.
If I could choose, I'd prefer the per-worker in-memory caching solution, even if it only solves the problem halfway.
About the idea of repurposing another field: there are pulp_created and pulp_last_updated.
But I'm afraid of unexpected side effects, like pulp_last_updated being updated by something else and cooling down a good remote. Or some other surprise, because those fields have been in the system for so long.

```
pulp_created      | timestamp with time zone | not null
pulp_last_updated | timestamp with time zone |
```
This is not about a Remote, but the RemoteArtifact, right? I'm not so concerned, as this class is only used internally and is never visible to the user. I highly doubt that we have any logic depending on it.