new(scap,pman): add new per-CPU driver metrics #1998

Merged
merged 2 commits into from
Aug 19, 2024


Andreagit97
Member

What type of PR is this?

/kind feature

Any specific area of the project related to this PR?

/area libscap-engine-bpf

/area libscap-engine-kmod

/area libscap-engine-modern-bpf

/area libscap

/area libpman

/area tests

Does this PR require a change in the driver versions?

No

What this PR does / why we need it:

This PR introduces new per-CPU stats for our drivers. When collecting metrics about drops in our drivers, it is useful to understand where the drops happen: how drops and events are distributed across our CPUs, and whether a single CPU is under pressure or the whole system is having a hard time.

This is an example of the output on 8 CPUs with scap-open:

[1] n_evts: 88939
[1] n_drops: 0
[1] n_evts_cpu_0: 10326
[1] n_drops_cpu_0: 0
[1] n_evts_cpu_1: 12517
[1] n_drops_cpu_1: 0
[1] n_evts_cpu_2: 11418
[1] n_drops_cpu_2: 0
[1] n_evts_cpu_3: 11937
[1] n_drops_cpu_3: 0
[1] n_evts_cpu_4: 10960
[1] n_drops_cpu_4: 0
[1] n_evts_cpu_5: 6675
[1] n_drops_cpu_5: 0
[1] n_evts_cpu_6: 11195
[1] n_drops_cpu_6: 0
[1] n_evts_cpu_7: 13911
[1] n_drops_cpu_7: 0
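
In the output above, the per-CPU event counts add up to the global `n_evts` value (88939). A minimal sketch of that relationship follows, assuming a hypothetical `struct cpu_counters` and hypothetical helper names; the actual driver and libscap structures are not shown in this PR description and may differ.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical per-CPU counters; the real driver structures may differ. */
struct cpu_counters
{
	uint64_t n_evts;
	uint64_t n_drops;
};

/* Sum the per-CPU counters into system-wide totals. */
static void sum_counters(const struct cpu_counters *cpus, int n_cpus,
                         uint64_t *total_evts, uint64_t *total_drops)
{
	assert(cpus != NULL && total_evts != NULL && total_drops != NULL);
	*total_evts = 0;
	*total_drops = 0;
	for(int i = 0; i < n_cpus; i++)
	{
		*total_evts += cpus[i].n_evts;
		*total_drops += cpus[i].n_drops;
	}
}

/* Build a metric name such as "n_evts_cpu_3"; returns snprintf's result. */
static int cpu_metric_name(char *buf, size_t len, const char *prefix, int cpu)
{
	return snprintf(buf, len, "%s_cpu_%d", prefix, cpu);
}
```

Calling `sum_counters` on the eight per-CPU values listed above yields the reported totals, which is a quick consistency check when reading these metrics.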

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE


github-actions bot commented Aug 7, 2024

Please double check driver/API_VERSION file. See versioning.

/hold

@Andreagit97
Member Author

While I was there, I tried to unify metric collection across drivers at the scap level.


github-actions bot commented Aug 7, 2024

Perf diff from master - unit tests

     1.94%     -1.10%  [.] sinsp_evt::get_ts
     1.27%     +1.04%  [.] std::_Hashtable<long, std::pair<long const, std::shared_ptr<sinsp_threadinfo> >, std::allocator<std::pair<long const, std::shared_ptr<sinsp_threadinfo> > >, std::__detail::_Select1st, std::equal_to<long>, std::hash<long>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::_M_find_before_node
     4.91%     -0.88%  [.] sinsp_parser::process_event
     5.86%     -0.80%  [.] next
     0.05%     +0.75%  [.] std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, libsinsp::state::dynamic_struct::field_info>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, libsinsp::state::dynamic_struct::field_info> >, std::__detail::_Select1st, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::_M_find_before_node
     3.15%     -0.54%  [.] sinsp_thread_manager::get_thread_ref
     5.85%     -0.53%  [.] sinsp_evt::get_type
     1.82%     -0.44%  [.] sinsp::fetch_next_event
     0.91%     +0.44%  [.] 0x00000000000e8380
     0.72%     -0.40%  [.] sinsp_parser::parse_clone_exit_child

Perf diff from master - scap file

    15.02%     -7.55%  [.] sinsp_filter_check::extract_nocache
    12.52%     -4.48%  [.] sinsp_evt_formatter::tostring_withformat
     7.32%     -3.65%  [.] sinsp_filter_check_thread::extract_single
     7.33%     -3.65%  [.] std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char const*>
     2.57%     +3.63%  [.] sinsp_parser::reset
     2.56%     +2.32%  [.] rawstring_check::extract_single
     2.56%     +2.28%  [.] main
     5.08%     -2.11%  [.] sinsp_evt::load_params
     5.01%     -2.06%  [.] sinsp_filter_check::tostring
     5.01%     -2.06%  [.] sinsp_evt::get_type

Heap diff from master - unit tests

peak heap memory consumption: 0B
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

Heap diff from master - scap file

peak heap memory consumption: 0B
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B


codecov bot commented Aug 7, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 74.08%. Comparing base (5fa87bb) to head (f1d52cc).
Report is 13 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1998   +/-   ##
=======================================
  Coverage   74.08%   74.08%           
=======================================
  Files         253      253           
  Lines       30766    30766           
  Branches     5395     5388    -7     
=======================================
+ Hits        22793    22794    +1     
+ Misses       7949     7944    -5     
- Partials       24       28    +4     
Flag Coverage Δ
libsinsp 74.08% <ø> (+<0.01%) ⬆️


@incertum
Contributor

incertum commented Aug 7, 2024

This is very nice @Andreagit97 🚀. Looking forward to gathering more insights into issues caused by bursts of events that lead to higher drops. In such cases I would expect drop spikes on a subset of CPUs, i.e. not a uniform distribution.
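
One way to check for the non-uniform distribution described above is to find the CPU with the most drops and compute its share of all drops. The sketch below is illustrative only: the function name `busiest_drop_cpu` and the plain array of per-CPU drop counts are assumptions, not part of this PR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns the index of the CPU with the most drops and
 * stores its share of all drops (0.0 to 1.0) in *share.
 * Returns -1 (and a share of 0.0) when no CPU reported drops.
 */
static int busiest_drop_cpu(const uint64_t *drops, int n_cpus, double *share)
{
	assert(drops != NULL && share != NULL);
	uint64_t total = 0;
	int max_idx = -1;
	for(int i = 0; i < n_cpus; i++)
	{
		total += drops[i];
		if(drops[i] > 0 && (max_idx < 0 || drops[i] > drops[max_idx]))
		{
			max_idx = i;
		}
	}
	*share = (max_idx < 0) ? 0.0 : (double)drops[max_idx] / (double)total;
	return max_idx;
}
```

A share close to 1.0 suggests a single CPU under pressure, while a share close to `1.0 / n_cpus` suggests the drops are spread evenly across the system.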

@@ -48,18 +47,6 @@ static void pman_save_attached_progs()
g_state.attached_progs_fds[7] = bpf_program__fd(g_state.skel->progs.pf_kernel);
#endif
g_state.attached_progs_fds[8] = bpf_program__fd(g_state.skel->progs.signal_deliver);

for(int j = 0; j < MODERN_BPF_PROG_ATTACHED_MAX; j++)
Contributor


@Andreagit97 mind getting me up to speed wrt the reason for changing the logic above to

for(int j = 0; j < MODERN_BPF_PROG_ATTACHED_MAX; j++)
{
      g_state.attached_progs_fds[j] = -1;
}

and below we have

for(int j = 0; j < MODERN_BPF_PROG_ATTACHED_MAX; j++)
{
	if(g_state.attached_progs_fds[j] != -1)
	{
		nprogs_attached++;
	}
}

Besides this question, LGTM!

Member Author


Hey! The idea here was to move all the logic inside pman_get_metrics_v2. This way we avoid creating a global variable g_state.n_attached_progs and can just use a local variable. In the end, the only place where we need this information about attached_progs is inside pman_get_metrics_v2. Now the modern eBPF engine and the legacy one do the same loop in the same place, so it should be easy to maintain in the future.
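
The pattern discussed in this thread can be sketched as two small helpers: one resets every file descriptor slot to -1, the other counts the attached programs on demand so that no global counter is needed. The helper names and the value of `PROG_ATTACHED_MAX` are hypothetical stand-ins; the real code operates on `g_state.attached_progs_fds` and `MODERN_BPF_PROG_ATTACHED_MAX` and may be structured differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for MODERN_BPF_PROG_ATTACHED_MAX; the value is assumed. */
#define PROG_ATTACHED_MAX 9

/* Mark every slot as "not attached" before any fd is saved. */
static void reset_attached_fds(int *fds, size_t n)
{
	assert(fds != NULL);
	for(size_t j = 0; j < n; j++)
	{
		fds[j] = -1;
	}
}

/* Count attached programs locally instead of caching the count in global state. */
static int count_attached_progs(const int *fds, size_t n)
{
	assert(fds != NULL);
	int nprogs_attached = 0;
	for(size_t j = 0; j < n; j++)
	{
		if(fds[j] != -1)
		{
			nprogs_attached++;
		}
	}
	return nprogs_attached;
}
```

Because the count is derived from the array each time, it cannot drift out of sync with the saved file descriptors, at the cost of one short loop per metrics call.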

Contributor

@incertum incertum left a comment


/approve

@poiana
Contributor

poiana commented Aug 8, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Andreagit97, incertum

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [Andreagit97,incertum]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@poiana
Contributor

poiana commented Aug 8, 2024

LGTM label has been added.

Git tree hash: 16dee91caa9c259317c0279508cc70080cf0afed

@incertum
Contributor

incertum commented Aug 8, 2024

/milestone 0.18.0

@poiana poiana added this to the 0.18.0 milestone Aug 8, 2024
@poiana poiana merged commit 18de8ce into falcosecurity:master Aug 19, 2024
46 of 49 checks passed