fix(modern_ebpf): address verifier issues on kernel versions >=6.12.0 #2172

Merged: 5 commits into master from fix_modern_6_12_0, Nov 26, 2024

Andreagit97
Member

What type of PR is this?

/kind bug

Any specific area of the project related to this PR?

/area driver-modern-bpf

Does this PR require a change in the driver versions?

No

What this PR does / why we need it:

Kernel 6.12.0 changes tail call management once again. In 6.11.y, sys_enter and sys_exit programs could share tail calls; starting from 6.12-rc1 this is no longer possible.

Running the following bpftrace command shows that the attach_proto of the two programs was identical in 6.11.y but differs in 6.12.y:

sudo /usr/local/bin/bpftrace -e 'fentry:bpf_prog_map_compatible /comm == "main" / { printf("map_attach_proto: %p, func attach proto: %p\n", args->map->owner.attach_func_proto, args->fp->aux->attach_func_proto ); }'

6.11.10

map_attach_proto: 0xffffffff885e3170, func attach proto: 0xffffffff885e3170
map_attach_proto: 0xffffffff885e3170, func attach proto: 0xffffffff885e3170

6.12.1

map_attach_proto: 0xffffffffb04e16e4, func attach proto: 0xffffffffb04e1714
map_attach_proto: 0xffffffffb04e16e4, func attach proto: 0xffffffffb04e1714

It is not clear whether this is intentional or just a side effect of other patches; I need to dig into it further. On our side the fix is quite simple: avoid sharing a tail table among eBPF programs with different attach_proto.

This patch moves the HOTPLUG, DROP_E, and DROP_X tail calls into the sys_exit flow only and dedicates a tail table to them.
Hotplug was already handled only on exit after #2150.
The only functional change in this PR is that DROP_E and DROP_X are now sent only by sys_exit and no longer by sys_enter, but this should change nothing for userspace.
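
The rule behind the fix can be sketched with a small userspace model. This is not kernel or libs code: `tail_table_model`, `prog_model`, and `table_accepts` are hypothetical names, and the logic only assumes what the bpftrace output above suggests, namely that a tail table remembers the attach_func_proto of its first program and expects later programs to match it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace model, NOT kernel code: a tail table records the
 * attach prototype of the first program that uses it (its "owner"). */
struct tail_table_model {
    bool has_owner;
    const void *owner_attach_proto;
};

/* Hypothetical stand-in for a BPF program and its attach prototype. */
struct prog_model {
    const char *name;
    const void *attach_proto;
};

/* Returns true if `prog` may use `table`. The first program becomes the
 * owner; every later program must have the same attach prototype. */
bool table_accepts(struct tail_table_model *table, const struct prog_model *prog)
{
    assert(table != NULL && prog != NULL);
    if (!table->has_owner) {
        table->has_owner = true;
        table->owner_attach_proto = prog->attach_proto;
        return true;
    }
    return table->owner_attach_proto == prog->attach_proto;
}
```

In this model, a table first used by sys_enter rejects sys_exit as soon as their prototypes differ, while a table dedicated to sys_exit never sees a mismatch, which is the situation the dedicated tail table aims for.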

To easily recognize this issue, this is the verifier error:

2024-11-25 08:57:00.799, 1800.1769, Warning, libbpf: prog 'sys_exit': BPF program load failed: Invalid argument
2024-11-25 08:57:00.799, 1800.1769, Warning, libbpf: prog 'sys_exit': -- BEGIN PROG LOAD LOG --
processed 590 insns (limit 1000000) max_states_per_insn 5 total_states 57 peak_states 49 mark_read 10
-- END PROG LOAD LOG --
2024-11-25 08:57:00.799, 1800.1769, Warning, libbpf: prog 'sys_exit': failed to load: -22
2024-11-25 08:57:00.801, 1800.1769, Warning, libbpf: failed to load object 'bpf_probe'
2024-11-25 08:57:00.802, 1800.1769, Warning, libbpf: failed to load BPF skeleton 'bpf_probe': -22

Which issue(s) this PR fixes:

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

github-actions bot commented Nov 26, 2024

Please double check driver/API_VERSION file. See versioning.

/hold

@Andreagit97
Member Author

Please note: all kernel versions >=6.12.0 are affected, so the modern eBPF won't work on them without this fix.
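
To check whether a given host falls in the affected range, a release string can be compared against 6.12.0. The helper below is a minimal sketch and not part of libs: `kernel_release_at_least` is a hypothetical name, and it assumes the release string starts with `major.minor[.patch]`, as the `release` field filled in by `uname(2)` normally does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: returns true if `release` (e.g. "6.12.1" or
 * "6.12-rc1") is at least major.minor.patch. A missing patch level is
 * treated as 0, so "6.12-rc1" counts as 6.12.0. */
bool kernel_release_at_least(const char *release, int major, int minor, int patch)
{
    int k_major = 0, k_minor = 0, k_patch = 0;

    assert(release != NULL);
    /* At least "major.minor" must parse; anything else is rejected. */
    if (sscanf(release, "%d.%d.%d", &k_major, &k_minor, &k_patch) < 2)
        return false;
    if (k_major != major)
        return k_major > major;
    if (k_minor != minor)
        return k_minor > minor;
    return k_patch >= patch;
}
```

Calling `kernel_release_at_least(release, 6, 12, 0)` with the running kernel's release string would flag the kernels on which an unpatched modern eBPF fails to load.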

github-actions bot commented Nov 26, 2024

Perf diff from master - unit tests

     4.74%     +1.17%  [.] sinsp_evt::get_type
     7.80%     -0.89%  [.] sinsp::next
     4.63%     -0.79%  [.] next
     9.43%     +0.77%  [.] sinsp_parser::reset
     3.48%     -0.68%  [.] sinsp_thread_manager::get_thread_ref
     1.69%     +0.53%  [.] std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release
     1.04%     +0.49%  [.] libsinsp::sinsp_suppress::process_event
     0.57%     +0.40%  [.] scap_event_has_large_payload
     0.62%     +0.40%  [.] sinsp_threadinfo::~sinsp_threadinfo
     0.57%     -0.39%  [.] sinsp_parser::parse_open_openat_creat_exit

Heap diff from master - unit tests

peak heap memory consumption: 0B
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

Heap diff from master - scap file

peak heap memory consumption: 0B
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

Benchmarks diff from master

Comparing gbench_data.json to /root/actions-runner/_work/libs/libs/build/gbench_data.json
Benchmark                                                         Time             CPU      Time Old      Time New       CPU Old       CPU New
----------------------------------------------------------------------------------------------------------------------------------------------
BM_sinsp_split_mean                                            -0.0256         -0.0256           149           145           149           145
BM_sinsp_split_median                                          -0.0233         -0.0233           149           146           149           146
BM_sinsp_split_stddev                                          +0.4602         +0.4593             1             1             1             1
BM_sinsp_split_cv                                              +0.4985         +0.4976             0             0             0             0
BM_sinsp_concatenate_paths_relative_path_mean                  -0.0362         -0.0362            56            54            56            54
BM_sinsp_concatenate_paths_relative_path_median                -0.0558         -0.0558            55            52            55            52
BM_sinsp_concatenate_paths_relative_path_stddev                +3.4681         +3.4723             0             2             0             2
BM_sinsp_concatenate_paths_relative_path_cv                    +3.6357         +3.6401             0             0             0             0
BM_sinsp_concatenate_paths_empty_path_mean                     +0.0452         +0.0452            24            25            24            25
BM_sinsp_concatenate_paths_empty_path_median                   +0.0415         +0.0415            24            25            24            25
BM_sinsp_concatenate_paths_empty_path_stddev                   +3.0975         +3.1028             0             0             0             0
BM_sinsp_concatenate_paths_empty_path_cv                       +2.9203         +2.9254             0             0             0             0
BM_sinsp_concatenate_paths_absolute_path_mean                  -0.0197         -0.0197            57            56            57            56
BM_sinsp_concatenate_paths_absolute_path_median                -0.0267         -0.0267            57            55            57            55
BM_sinsp_concatenate_paths_absolute_path_stddev                +1.2721         +1.2716             1             1             1             1
BM_sinsp_concatenate_paths_absolute_path_cv                    +1.3177         +1.3172             0             0             0             0
BM_sinsp_split_container_image_mean                            -0.0445         -0.0445           392           375           392           375
BM_sinsp_split_container_image_median                          -0.0445         -0.0445           392           375           392           375
BM_sinsp_split_container_image_stddev                          -0.3090         -0.3098             3             2             3             2
BM_sinsp_split_container_image_cv                              -0.2768         -0.2777             0             0             0             0

codecov bot commented Nov 26, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 74.82%. Comparing base (55ff79f) to head (fefd513).
Report is 7 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #2172   +/-   ##
=======================================
  Coverage   74.82%   74.82%           
=======================================
  Files         254      254           
  Lines       33510    33510           
  Branches     5746     5745    -1     
=======================================
+ Hits        25073    25074    +1     
+ Misses       8437     8436    -1     
Flag Coverage Δ
libsinsp 74.82% <ø> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.


@Andreagit97
Member Author

Andreagit97 commented Nov 26, 2024

This PR revealed that the build-libs-linux-amd64-static 🎃 job uses a bpftool version with a libbpf version that is too old; we will try to bump it.

2024-11-26T13:13:08.4791400Z [ 59%] [MODERN BPF] Building BPF skeleton: /__w/libs/libs/build/skel_dir/bpf_probe.skel.h
2024-11-26T13:13:08.4908897Z libbpf: map 'custom_sys_exit_calls': should be map-in-map.

libbpf/libbpf@f327194

github-actions bot commented Nov 26, 2024

X64 kernel testing matrix

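| KERNEL | CMAKE-CONFIGURE | KMOD BUILD | KMOD SCAP-OPEN | BPF-PROBE BUILD | BPF-PROBE SCAP-OPEN | MODERN-BPF SCAP-OPEN |
|---|---|---|---|---|---|---|
| amazonlinux2-4.19 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟡 |
| amazonlinux2-5.10 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| amazonlinux2-5.15 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| amazonlinux2-5.4 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟡 |
| amazonlinux2022-5.15 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| amazonlinux2023-6.1 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| archlinux-6.0 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| archlinux-6.7 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| centos-3.10 | 🟢 | 🟢 | 🟢 | 🟡 | 🟡 | 🟡 |
| centos-4.18 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| centos-5.14 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| fedora-5.17 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| fedora-5.8 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| fedora-6.2 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| oraclelinux-3.10 | 🟢 | 🟢 | 🟢 | 🟡 | 🟡 | 🟡 |
| oraclelinux-4.14 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟡 |
| oraclelinux-5.15 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| oraclelinux-5.4 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟡 |
| ubuntu-4.15 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟡 |
| ubuntu-5.8 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟡 |
| ubuntu-6.5 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |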

ARM64 kernel testing matrix

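| KERNEL | CMAKE-CONFIGURE | KMOD BUILD | KMOD SCAP-OPEN | BPF-PROBE BUILD | BPF-PROBE SCAP-OPEN | MODERN-BPF SCAP-OPEN |
|---|---|---|---|---|---|---|
| amazonlinux2-5.4 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟡 |
| amazonlinux2022-5.15 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| fedora-6.2 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| oraclelinux-4.14 | 🟢 | 🟢 | 🟢 | 🟡 | 🟡 | 🟡 |
| oraclelinux-5.15 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| ubuntu-6.5 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |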

Signed-off-by: Andrea Terzolo <[email protected]>
Contributor

@FedeDP FedeDP left a comment

/approve

@poiana
Contributor

poiana commented Nov 26, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Andreagit97, FedeDP

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@poiana
Contributor

poiana commented Nov 26, 2024

LGTM label has been added.

Git tree hash: 365ac9f3ee2cccf9f1daadef2b89e5026f27c57f

@poiana poiana merged commit a99a365 into master Nov 26, 2024
60 of 62 checks passed
@poiana poiana deleted the fix_modern_6_12_0 branch November 26, 2024 21:57