The current import process is entirely serial and runs record-by-record. When there’s a lot of downloading & archiving to do, that can make it pretty slow. Instead, we could make multiple passes that would potentially let us do a lot of the work in parallel:
Find and add all new tags and maintainers (so we aren’t checking the DB for whether they exist on every record)
Group records by URL and capture time
Execute each of the groups above in parallel (or at least in a sizable pool of threads). I’m reasonably sure nothing other than tags and maintainers should be shared across URLs right now.
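As a rough illustration of the three passes above, here is a minimal Python sketch. It is not the importer's actual code: the record shape (dicts with `url`, `capture_time`, `tags`, `maintainers`), the function names, and the `import_group` stand-in are all hypothetical. It also assumes "group by URL and capture time" means one group per URL, sorted by capture time within the group.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def collect_shared_names(records):
    """Pass 1: gather every tag and maintainer name up front."""
    tags, maintainers = set(), set()
    for record in records:
        tags.update(record.get("tags", []))
        maintainers.update(record.get("maintainers", []))
    return tags, maintainers


def group_by_url(records):
    """Pass 2: bucket records by URL, ordered by capture time in each bucket."""
    groups = defaultdict(list)
    for record in records:
        groups[record["url"]].append(record)
    for group in groups.values():
        group.sort(key=lambda r: r["capture_time"])
    return dict(groups)


def import_group(url, group):
    """Hypothetical stand-in for the per-URL download/archive/save work."""
    return url, [r["capture_time"] for r in group]


def run_import(records, max_workers=8):
    tags, maintainers = collect_shared_names(records)
    # A real importer would create any missing tags/maintainers in the DB
    # here, once, so the parallel workers never have to check for them.
    groups = group_by_url(records)
    # Pass 3: each URL's group runs independently in a thread pool.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(import_group, url, group)
            for url, group in groups.items()
        ]
        results = dict(f.result() for f in futures)
    return tags, maintainers, results
```

Keeping one group per URL means records for the same URL are still processed in order by a single worker, so only the tag and maintainer setup needs to happen before the parallel step.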
Not high priority right now — even if our imports are slow, they are plenty fast enough for our current needs.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in seven days if no further activity occurs. If it should not be closed, please comment! Thank you for your contributions.