Add memory to load calc #1336
🦋 Changeset detected. Latest commit: c29ed22. The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
PTAL @davidzhao @theomonnom
@classmethod
def get_load(cls, worker: Worker) -> float:
    if cls._instance is None:
        cls._instance = _DefaultLoadCalc()

    return cls._instance._m_avg.get_avg()
    return max(
we could be more clever here (e.g. if we wanted different thresholds for memory and CPU usage)
with open("/sys/fs/cgroup/memory.max", "r") as f:
    max_memory = f.read().strip()
    if max_memory == "max":
        return psutil.virtual_memory().total
I think this makes sense; alternatively, we can just treat it as unlimited.
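For reference, the two options discussed here (fall back to the host's total memory, or treat `max` as unlimited) could be sketched roughly as below. This is only an illustration, not the code in this PR: `parse_memory_max`, `get_memory_limit`, and the `host_total` parameter are hypothetical names. The caller would pass something like `psutil.virtual_memory().total` for the fallback, or `None` to mean "unlimited".

```python
from pathlib import Path
from typing import Optional

# cgroup v2 memory limit file, as read in the diff above.
CGROUP_V2_MEMORY_MAX = Path("/sys/fs/cgroup/memory.max")


def parse_memory_max(raw: str, host_total: Optional[int]) -> Optional[int]:
    """Interpret the contents of a cgroup v2 memory.max file.

    Returns the limit in bytes. If the file contains "max" (no limit set),
    returns host_total instead; passing host_total=None therefore means
    "treat the cgroup as unlimited".
    """
    value = raw.strip()
    if value == "max":
        return host_total
    return int(value)


def get_memory_limit(
    host_total: Optional[int],
    path: Path = CGROUP_V2_MEMORY_MAX,
) -> Optional[int]:
    """Read the memory limit, falling back to host_total if the file is unreadable."""
    try:
        raw = path.read_text()
    except OSError:
        # Assumption: a missing file (e.g. cgroup v1 or a non-Linux host)
        # is handled the same way as an unlimited cgroup.
        return host_total
    return parse_memory_max(raw, host_total)
```

With `host_total=None`, a caller can skip the memory check entirely when no limit applies; with a concrete total, memory usage can still be expressed as a fraction.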
thanks for adding this!
I think the memory monitor should work a bit differently compared to CPU due to the type of resources.
For memory, I think it's rather binary: as long as we have x% or xMB of memory free, we should be able to take jobs. For CPU, we'd like to stop earlier because jobs can have bursty usage patterns, so leaving plenty of headroom (no longer taking jobs at 65% utilization) would reduce the likelihood that we get CPU-throttled.
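A rough sketch of that policy, with memory as a binary gate and CPU as an early cutoff, might look like the following. The names `LoadPolicy` and `can_take_job` are hypothetical. The 65% CPU threshold comes from the comment above, while the 10% free-memory floor is only a placeholder for the unspecified "x%".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoadPolicy:
    # Stop taking jobs once CPU load reaches this fraction (65% per the comment).
    cpu_threshold: float = 0.65
    # Hypothetical placeholder for "x%": minimum fraction of memory that must be free.
    min_free_memory_fraction: float = 0.10


def can_take_job(
    cpu_load: float,
    memory_used_fraction: float,
    policy: LoadPolicy = LoadPolicy(),
) -> bool:
    """Return True if the worker should accept another job.

    Memory is a binary gate: any amount of usage is fine as long as the
    configured fraction stays free. CPU is cut off early to leave headroom
    for bursty jobs.
    """
    memory_ok = (1.0 - memory_used_fraction) >= policy.min_free_memory_fraction
    cpu_ok = cpu_load < policy.cpu_threshold
    return memory_ok and cpu_ok
```

Keeping the two checks separate, rather than folding both into a single `max(...)` load value, would allow different thresholds for memory and CPU.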
Got it, I'll take a pass at that.
With the new turn detector, our processes are now memory-bound, so this PR incorporates memory usage into the load check.