new(scap,pman): add new per-CPU driver metrics #1998
Conversation
Signed-off-by: Andrea Terzolo <[email protected]>
Please double check the driver/API_VERSION file. See versioning.
/hold
While I was there, I also tried to unify the metric collection among drivers at the scap level.
[Perf and heap diffs from master (unit tests and scap file): collapsed bot report]
Codecov Report: All modified and coverable lines are covered by tests ✅
Additional details and impacted files:

@@           Coverage Diff           @@
##           master    #1998   +/-   ##
=======================================
  Coverage   74.08%   74.08%
=======================================
  Files         253      253
  Lines       30766    30766
  Branches     5395     5388     -7
=======================================
+ Hits        22793    22794     +1
+ Misses       7949     7944     -5
- Partials       24       28     +4

Flags with carried forward coverage won't be shown.
This is very nice @Andreagit97 🚀. Looking forward to gathering more insights into issues caused by possible bursts of events that lead to higher drops. In such cases I would expect higher spikes of drops on a subset of CPUs, i.e. not a uniform distribution.
@@ -48,18 +47,6 @@ static void pman_save_attached_progs()
g_state.attached_progs_fds[7] = bpf_program__fd(g_state.skel->progs.pf_kernel);
#endif
g_state.attached_progs_fds[8] = bpf_program__fd(g_state.skel->progs.signal_deliver);

for(int j = 0; j < MODERN_BPF_PROG_ATTACHED_MAX; j++)
@Andreagit97 mind getting me up to speed wrt the reason for changing the logic above to
for(int j = 0; j < MODERN_BPF_PROG_ATTACHED_MAX; j++)
{
g_state.attached_progs_fds[j] = -1;
}
and below we have
for(int j = 0; j < MODERN_BPF_PROG_ATTACHED_MAX; j++)
{
if(g_state.attached_progs_fds[j] != -1)
{
nprogs_attached++;
}
}
Besides this question, LGTM!
Hey! The idea here was to move all the logic inside pman_get_metrics_v2: in this way we can avoid creating a global variable g_state.n_attached_progs and just use a local one. In the end, the only place where we need this information about the attached progs is inside pman_get_metrics_v2. Now the modern eBPF probe and the legacy one do the same loop in the same place, so it should be easy to maintain in the future.
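To illustrate the "local variable" point, here is a minimal sketch of how the counting can live entirely inside pman_get_metrics_v2; only the names quoted in this review are real, and the surrounding function body is omitted:

/* Sketch: the count is a local inside pman_get_metrics_v2, so no
 * g_state.n_attached_progs global is needed. The fds were initialized
 * to -1 at attach time, so any fd != -1 is an attached program. */
uint32_t nprogs_attached = 0;
for(int j = 0; j < MODERN_BPF_PROG_ATTACHED_MAX; j++)
{
	if(g_state.attached_progs_fds[j] != -1)
	{
		nprogs_attached++;
	}
}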
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: Andreagit97, incertum. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files.
Approvers can indicate their approval by writing /approve in a comment.
LGTM label has been added. Git tree hash: 16dee91caa9c259317c0279508cc70080cf0afed
/milestone 0.18.0
What type of PR is this?
/kind feature
Any specific area of the project related to this PR?
/area libscap-engine-bpf
/area libscap-engine-kmod
/area libscap-engine-modern-bpf
/area libscap
/area libpman
/area tests
Does this PR require a change in the driver versions?
No
What this PR does / why we need it:
This PR introduces new per-CPU stats for our drivers. When collecting metrics about drops in our drivers, it is useful to understand where we are dropping: knowing how drops and events are distributed across CPUs tells us whether a single CPU is under pressure or the whole system is having a hard time.
This is an example of the output on 8 CPUs with scap-open:
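As a rough sketch of how a consumer could walk such per-CPU counters through libscap's stats v2 interface — the flag name, the metrics_v2 fields, and the metric name pattern below are assumptions for illustration, not verbatim from this PR:

/* Hypothetical consumer loop: scap_get_stats_v2(), METRICS_V2_KERNEL_COUNTERS
 * and the metrics_v2 fields used here are assumptions about the API shape. */
void print_per_cpu_counters(scap_t* h)
{
	uint32_t nstats = 0;
	int32_t rc = 0;
	const metrics_v2* stats = scap_get_stats_v2(h, METRICS_V2_KERNEL_COUNTERS, &nstats, &rc);
	if(rc != SCAP_SUCCESS || stats == NULL)
	{
		return;
	}
	for(uint32_t i = 0; i < nstats; i++)
	{
		/* e.g. "n_evts_cpu_3" / "n_drops_cpu_3": a skewed distribution
		 * across CPUs points at a single CPU under pressure. */
		printf("%s: %lu\n", stats[i].name, (unsigned long)stats[i].value.u64);
	}
}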
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?: