Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(driver): simplify exe_upper_layer extraction #1960

Merged
merged 3 commits into from
Jul 23, 2024

Conversation

Andreagit97
Copy link
Member

What type of PR is this?

/kind bug

/kind cleanup

Any specific area of the project related to this PR?

/area driver-kmod

/area driver-bpf

/area driver-modern-bpf

Does this PR require a change in the driver versions?

No

What this PR does / why we need it:

This PR provides a fix for falcosecurity/falco#3276.

The page fault was due to a backport in RHEL9.

More in detail this commit torvalds/linux@0af950f was backported to kernel versions older than 6.5.

So when our drivers run this code:

#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 16, 0)
			has_upper = oe->has_upper;
#elif LINUX_VERSION_CODE < KERNEL_VERSION(6, 5, 0)
			has_upper = test_bit(OVL_E_UPPER_ALIAS, &(oe->flags));
#else
			has_upper = test_bit(OVL_E_UPPER_ALIAS, (unsigned long*)&oe);
#endif

they enter in the #elif LINUX_VERSION_CODE < KERNEL_VERSION(6, 5, 0), they try to dereference oe->flags but the field flags is no more there. This is bad for every driver because we are reading random memory but it is even worse for the kmod because it could lead to undefined behavior (and so a host crash) if the address is not aligned.

Even with the new approach proposed with this patch, we are still subject to possible crashes... if the layout of the struct ovl_inode changes, and the __upperdentry field is not immediately after the vfs_inode field, the following code could cause undefined behaviors:

        // Pointer arithmetics due to unexported ovl_inode struct
	// warning: this works if and only if the dentry pointer
	// is placed right after the inode struct
	upper_dentry = (struct dentry *)(vfs_inode + sizeof(struct inode));

Just for reference, this is the actual layout of the struct ovl_inode

struct ovl_inode {
  ...
  struct inode vfs_inode;
  struct dentry *__upperdentry;
  ...
}

Which issue(s) this PR fixes:

Ref falcosecurity/falco#3276

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

fix(driver): simplify exe_upper_layer extraction

Copy link

Perf diff from master - unit tests

     5.48%     -1.13%  [.] sinsp_thread_manager::find_thread
     7.18%     -1.01%  [.] sinsp::next
     4.08%     +0.94%  [.] sinsp_parser::process_event
     4.23%     +0.86%  [.] sinsp_evt::get_type
     9.38%     +0.79%  [.] sinsp_parser::reset
     2.57%     -0.63%  [.] std::_Hashtable<long, std::pair<long const, std::shared_ptr<sinsp_threadinfo> >, std::allocator<std::pair<long const, std::shared_ptr<sinsp_threadinfo> > >, std::__detail::_Select1st, std::equal_to<long>, std::hash<long>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::_M_find_before_node
     0.99%     +0.44%  [.] sinsp_evt::get_ts
     0.94%     -0.44%  [.] libsinsp::events::is_unknown_event
     0.84%     -0.44%  [.] sinsp_threadinfo::sinsp_threadinfo
     0.58%     +0.43%  [.] std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_weak_release

Perf diff from master - scap file

    12.13%     +7.67%  [.] sinsp_filter_check::extract_nocache
    11.46%     -5.58%  [.] sinsp_thread_manager::find_thread
    10.19%     -4.71%  [.] 0x00000000000a70d0
    12.51%     -4.30%  [.] sinsp_filter_check::tostring
     9.22%     -4.07%  [.] sinsp_filter_check_thread::extract_single
    12.34%     -2.18%  [.] sinsp_filter_check_event::extract_single
    15.18%     -2.08%  [.] sinsp_evt_formatter::tostring_withformat
     6.17%     -1.41%  [.] sinsp::next
     4.74%     -1.15%  [.] rawstring_check::extract_single
     3.02%     -0.17%  [.] std::_Hashtable<long, std::pair<long const, std::shared_ptr<sinsp_threadinfo> >, std::allocator<std::pair<long const, std::shared_ptr<sinsp_threadinfo> > >, std::__detail::_Select1st, std::equal_to<long>, std::hash<long>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::_M_find_before_node

Heap diff from master - unit tests

total runtime: 0.03s.
calls to allocation functions: -435 (-16730/s)
temporary memory allocations: 362 (13923/s)
peak heap memory consumption: 0B
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

Heap diff from master - scap file

total runtime: -0.03s.
calls to allocation functions: 0 (0/s)
temporary memory allocations: 1 (-40/s)
peak heap memory consumption: 0B
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

Copy link

Please double check driver/SCHEMA_VERSION file. See versioning.

/hold

@Andreagit97
Copy link
Member Author

/hold

we still have an issue in the kmod, it seems like upper_dentry = (struct dentry *)(vfs_inode + sizeof(struct inode)); is pointing to random memory... so we cause segmentation fault accessing it. The curious thing is that in ebpf we use exactly the same code but we are able to retrieve the correct data.

#endif
return false;

// WARNING: this could cause undefined behavior if the upper dentry is not immediately after the vfs_inode
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is exactly what is happening, not clear why...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @FedeDP for the help!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it was a random access memory due to a wrong pointer management

@Andreagit97 Andreagit97 marked this pull request as draft July 17, 2024 07:25
Signed-off-by: Andrea Terzolo <[email protected]>
Co-authored-by: Federico Di Pierro <[email protected]>
@poiana poiana added size/XXL and removed size/XL labels Jul 22, 2024
Copy link

Perf diff from master - unit tests

    10.24%     +1.42%  [.] sinsp_parser::reset
     4.92%     +0.72%  [.] sinsp_evt::get_type
     1.14%     -0.60%  [.] sinsp_evt::get_ts
     1.13%     -0.59%  [.] sinsp_parser::event_cleanup
     1.32%     -0.58%  [.] sinsp_threadinfo::~sinsp_threadinfo
     4.03%     +0.54%  [.] sinsp_evt::load_params
     1.47%     -0.53%  [.] std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>
     2.76%     +0.49%  [.] sinsp_thread_manager::get_thread_ref
     5.17%     -0.46%  [.] next
     1.56%     -0.39%  [.] 0x00000000000e83b0

Perf diff from master - scap file

     9.87%     -3.38%  [.] 0x00000000000a76c4
    12.58%     -2.63%  [.] sinsp::next
     8.07%     -2.34%  [.] sinsp_filter_check::apply_transformers
    16.51%     -2.25%  [.] sinsp_filter_check::extract_nocache
    11.48%     +1.58%  [.] sinsp_filter_check::tostring
    11.06%     +1.34%  [.] std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char const*>
     4.77%     +1.31%  [.] sinsp_parser::process_event
     3.82%     +1.02%  [.] sinsp_evt::get_type
     3.72%     -0.26%  [.] gzfile_read
     3.82%     -0.14%  [.] next

Heap diff from master - unit tests

total runtime: -0.04s.
calls to allocation functions: 740 (-17209/s)
temporary memory allocations: 397 (-9232/s)
peak heap memory consumption: 0B
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

Heap diff from master - scap file

total runtime: -0.00s.
calls to allocation functions: 0 (0/s)
temporary memory allocations: 6 (-2000/s)
peak heap memory consumption: 0B
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

Copy link

codecov bot commented Jul 22, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Please upload report for BASE (master@0a9ede3). Learn more about missing BASE report.

Additional details and impacted files
@@            Coverage Diff            @@
##             master    #1960   +/-   ##
=========================================
  Coverage          ?   51.00%           
=========================================
  Files             ?      310           
  Lines             ?    39570           
  Branches          ?    17714           
=========================================
  Hits              ?    20183           
  Misses            ?    14366           
  Partials          ?     5021           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member Author

@Andreagit97 Andreagit97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, now it should be ready for review!

Comment on lines -2873 to -2877
#if LINUX_VERSION_CODE < KERNEL_VERSION(4, 18, 0)
struct super_block* sb = _READ(dentry->d_sb);
#else
struct super_block *sb = _READ(inode->i_sb);
#endif
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@therealbobo @loresuso It's unclear why we need this distinction...why cannot we take the superblock always from the dentry?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR LGTM! Thanks andre!
Once ☝️ gets an answer, i am ready to approve :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember it was a bit of trial and error. Something changed in OverlayFS in kernel 4.18 but I can't recall at the moment, trying to look around and I'll keep you posted.
Do we have tests to run if we try to get it always from the dentry? We could try that in the meantime in my opinion, to check whether it is working as expected

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes in our current CI, we have 2 tests for it that are running on a kernel version > 4.18... Moreover, I tested it on at least 3 different machines during the implementation and this seems fine 🤔 I expect the dentry always to have a reference to the superblock, am I missing something?

Copy link
Member

@loresuso loresuso Jul 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes the dentry should always have a reference to the superblock. Anyway, is it hard to add another kernel version for testing that is less than 4.18?
Coverage for this would be really helpful

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unfortunately at the moment, we don't have a setup for these machines. But in this case, I touched the ifdef for kernels >= 4.18 so the kernels < 4.18 should be untouched

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember the exact reason and it was the outcome of some trial and error. BTW I tested (with the driver_tests framework) your implementation with 4.14, 4.17, 4.20, 5.4, 6.2,6.9 e 6.10 kernels and it seems to work just fine 😄

@Andreagit97 Andreagit97 marked this pull request as ready for review July 22, 2024 15:20
@Andreagit97 Andreagit97 changed the title [WIP] fix(driver): simplify exe_upper_layer extraction fix(driver): simplify exe_upper_layer extraction Jul 22, 2024
@poiana poiana requested a review from FedeDP July 22, 2024 15:20
@deepskyblue86
Copy link
Member

Mentioning #1409 to have github link them, and have more context.

@Andreagit97
Copy link
Member Author

/hold

Copy link

Perf diff from master - unit tests

     3.04%     +1.33%  [.] sinsp_thread_manager::get_thread_ref
     3.48%     -0.97%  [.] sinsp_thread_manager::find_thread
     1.25%     +0.84%  [.] std::_Hashtable<long, std::pair<long const, std::shared_ptr<sinsp_threadinfo> >, std::allocator<std::pair<long const, std::shared_ptr<sinsp_threadinfo> > >, std::__detail::_Select1st, std::equal_to<long>, std::hash<long>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::_M_find_before_node
     7.71%     +0.81%  [.] sinsp::next
     1.79%     -0.71%  [.] libsinsp::events::is_unknown_event
     1.42%     -0.66%  [.] std::_Hashtable<long, std::pair<long const, std::shared_ptr<sinsp_threadinfo> >, std::allocator<std::pair<long const, std::shared_ptr<sinsp_threadinfo> > >, std::__detail::_Select1st, std::equal_to<long>, std::hash<long>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::find
     5.22%     -0.62%  [.] next
     0.16%     +0.55%  [.] libsinsp::runc::match_container_id
     3.52%     +0.49%  [.] std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release
     0.71%     -0.46%  [.] sinsp_container_manager::resolve_container

Perf diff from master - scap file

    30.94%     -5.41%  [.] sinsp_evt_formatter::tostring_withformat
    19.37%     +4.26%  [.] sinsp_filter_check::extract_nocache
     3.87%     +3.38%  [.] sinsp_filter_check::tostring
     3.34%     +1.67%  [.] std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char const*>
     3.87%     +1.55%  [.] sinsp_parser::event_cleanup
     3.87%     -0.47%  [.] gzfile_read
     3.85%     -0.45%  [.] std::_Hashtable<long, std::pair<long const, std::shared_ptr<sinsp_threadinfo> >, std::allocator<std::pair<long const, std::shared_ptr<sinsp_threadinfo> > >, std::__detail::_Select1st, std::equal_to<long>, std::hash<long>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::_M_find_before_node
     7.74%     -0.42%  [.] sinsp_parser::reset
     3.88%     -0.26%  [.] sinsp_filter_check_thread::extract_single
     3.88%     -0.22%  [.] 0x00000000000a7fe0

Heap diff from master - unit tests

total runtime: -0.04s.
calls to allocation functions: 163 (-3622/s)
temporary memory allocations: -2 (44/s)
peak heap memory consumption: 0B
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

Heap diff from master - scap file

total runtime: 0.01s.
calls to allocation functions: 0 (0/s)
temporary memory allocations: -8 (-1000/s)
peak heap memory consumption: 0B
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

@Andreagit97
Copy link
Member Author

Driver test results: https://github.com/falcosecurity/libs/actions/runs/10056427915

x86

KERNEL CMAKE-CONFIGURE KMOD BUILD KMOD SCAP-OPEN BPF-PROBE BUILD BPF-PROBE SCAP-OPEN MODERN-BPF SCAP-OPEN
amazonlinux2-4.19 🟢 🟢 🟢 🟢 🟢 🟡
amazonlinux2-5.10 🟢 🟢 🟢 🟢 🟢 🟢
amazonlinux2-5.15 🟢 🟢 🟢 🟢 🟢 🟢
amazonlinux2-5.4 🟢 🟢 🟢 🟢 🟢 🟡
amazonlinux2022-5.15 🟢 🟢 🟢 🟢 🟢 🟢
amazonlinux2023-6.1 🟢 🟢 🟢 🟢 🟢 🟢
archlinux-6.0 🟢 🟢 🟢 🟢 🟢 🟢
archlinux-6.7 🟢 🟢 🟢 🟢 🟢 🟢
centos-3.10 🟢 🟢 🟢 🟡 🟡 🟡
centos-4.18 🟢 🟢 🟢 🟢 🟢 🟢
centos-5.14 🟢 🟢 🟢 🟢 🟢 🟢
fedora-5.17 🟢 🟢 🟢 🟢 🟢 🟢
fedora-5.8 🟢 🟢 🟢 🟢 🟢 🟢
fedora-6.2 🟢 🟢 🟢 🟢 🟢 🟢
oraclelinux-3.10 🟢 🟢 🟢 🟡 🟡 🟡
oraclelinux-4.14 🟢 🟢 🟢 🟢 🟢 🟡
oraclelinux-5.15 🟢 🟢 🟢 🟢 🟢 🟢
oraclelinux-5.4 🟢 🟢 🟢 🟢 🟢 🟡
ubuntu-4.15 🟢 🟢 🟢 🟢 🟢 🟡
ubuntu-5.8 🟢 🟢 🟢 🟢 🟢 🟡
ubuntu-6.5 🟢 🟢 🟢 🟢 🟢 🟢

arm64

KERNEL CMAKE-CONFIGURE KMOD BUILD KMOD SCAP-OPEN BPF-PROBE BUILD BPF-PROBE SCAP-OPEN MODERN-BPF SCAP-OPEN
amazonlinux2-5.4 🟢 🟢 🟢 🟢 🟢 🟡
amazonlinux2022-5.15 🟢 🟢 🟢 🟢 🟢 🟢
fedora-6.2 🟢 🟢 🟢 🟢 🟢 🟢
oraclelinux-4.14 🟢 🟢 🟢 🟡 🟡 🟡
oraclelinux-5.15 🟢 🟢 🟢 🟢 🟢 🟢
ubuntu-6.5 🟢 🟢 🟢 🟢 🟢 🟢

/unhold

Copy link
Contributor

@FedeDP FedeDP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

@poiana
Copy link
Contributor

poiana commented Jul 23, 2024

LGTM label has been added.

Git tree hash: b00a255627fc70caed6b0fd5733b3cadc6e620d5

@poiana
Copy link
Contributor

poiana commented Jul 23, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Andreagit97, FedeDP

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@FedeDP
Copy link
Contributor

FedeDP commented Jul 31, 2024

/milestone 0.17.3

@poiana poiana added this to the 0.17.3 milestone Jul 31, 2024
@FedeDP
Copy link
Contributor

FedeDP commented Jul 31, 2024

/milestone 7.2.1+driver

@poiana poiana modified the milestones: 0.17.3, 7.2.1+driver Jul 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants