fix(sinsp): fix fs.path filterchecks for relative paths (add dirfd concept) #1993

Merged
merged 8 commits into from
Sep 6, 2024

Conversation

@incertum incertum commented Aug 7, 2024

What type of PR is this?

Uncomment one (or more) /kind <> lines:

/kind bug

/kind cleanup

/kind design

/kind documentation

/kind failing-test

/kind feature

Any specific area of the project related to this PR?

Uncomment one (or more) /area <> lines:

/area API-version

/area build

/area CI

/area driver-kmod

/area driver-bpf

/area driver-modern-bpf

/area libscap-engine-bpf

/area libscap-engine-gvisor

/area libscap-engine-kmod

/area libscap-engine-modern-bpf

/area libscap-engine-nodriver

/area libscap-engine-noop

/area libscap-engine-source-plugin

/area libscap-engine-savefile

/area libscap

/area libpman

/area libsinsp

/area tests

/area proposals

Does this PR require a change in the driver versions?

/version driver-API-version-major

/version driver-API-version-minor

/version driver-API-version-patch

/version driver-SCHEMA-version-major

/version driver-SCHEMA-version-minor

/version driver-SCHEMA-version-patch

What this PR does / why we need it:

While working on the anomalydetection plugin I was looking into the fd fallbacks and noticed that a few cases are off in libs.

  • fs.path fields incorrectly do not incorporate the concept of dirfd and as such they differ from the more correct fd.name value if applicable.
  • Missing dirfd handling for PPME_SYSCALL_OPEN_BY_HANDLE_AT_X throughout.
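
To make the first bullet concrete, here is a minimal sketch of how a path passed to an `*at()` syscall resolves against a dirfd rather than against the cwd. It is not the libsinsp implementation: `ThreadState`, `resolve_at`, and `kAtFdCwd` are hypothetical names introduced for illustration, and `std::filesystem` stands in for `sinsp_utils::concatenate_paths`.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <map>
#include <string>

// Hypothetical stand-in for the "relative to cwd" sentinel (AT_FDCWD is -100 on Linux).
constexpr int64_t kAtFdCwd = -100;

// Hypothetical, simplified per-thread state: the cwd plus open fds mapped to absolute paths.
struct ThreadState {
    std::string cwd;
    std::map<int64_t, std::string> fd_paths;
};

// Resolve `name` the way the *at() syscalls do:
//   absolute name   -> dirfd is ignored
//   dirfd == cwd    -> relative to the thread cwd
//   any other dirfd -> relative to the directory that fd refers to
std::string resolve_at(const ThreadState& t, int64_t dirfd, const std::string& name) {
    namespace fs = std::filesystem;
    const fs::path p(name);
    if (p.is_absolute()) {
        return p.lexically_normal().string();
    }
    std::string base;
    if (dirfd == kAtFdCwd) {
        base = t.cwd;
    } else {
        const auto it = t.fd_paths.find(dirfd);
        if (it == t.fd_paths.end()) {
            return "<UNKNOWN>";  // dirfd not tracked: nothing sensible to prepend
        }
        base = it->second;
    }
    return (fs::path(base) / p).lexically_normal().string();
}

// Usage example (POSIX-style paths assumed); call it from main() or a test.
inline void example() {
    const ThreadState t{"/home/user", {{3, "/tmp/dir"}}};
    assert(resolve_at(t, 3, "name.txt") == "/tmp/dir/name.txt");
    assert(resolve_at(t, kAtFdCwd, "name.txt") == "/home/user/name.txt");
    assert(resolve_at(t, 3, "/etc/hosts") == "/etc/hosts");
}
```

Resolving against the cwd only, as the fs.path fields did before this PR, would return `/home/user/name.txt` for the first call as well, which is the mismatch with fd.name described above.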

@LucaGuerra and @FedeDP (when you are back), could you help take a look? We all were involved in recent fixes in these areas. Thank you.
CC @mstemm

@leogr these fixes will result in subtle semantic changes to the affected filtercheck fields. Please note that these changes are corrections, meaning the current values are off in the cases addressed here.

Currently still in WIP as I need more time to seriously improve our test cases throughout.

Which issue(s) this PR fixes:

Fixes #2039

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

@incertum
Contributor Author

incertum commented Aug 7, 2024

/milestone 0.18.0

@poiana poiana added this to the 0.18.0 milestone Aug 7, 2024
@@ -355,10 +355,100 @@ uint8_t* sinsp_filter_check_fspath::extract_single(sinsp_evt* evt, uint32_t* len

if(!std::filesystem::path(m_tstr).is_absolute())
{
m_tstr = sinsp_utils::concatenate_paths(tinfo->get_cwd(), m_tstr);
std::string sdir; // init
// Compare to `sinsp_filter_check_fd::extract_fdname_from_creator` logic

Contributor Author

Sharing the relevant event I grepped out for easier review if the param number is correct

[PPME_SYSCALL_UNLINKAT_2_X] = {"unlinkat", EC_FILE | EC_SYSCALL, EF_NONE, 4, {{"res", PT_ERRNO, PF_DEC}, {"dirfd", PT_FD, PF_DEC}, {"name", PT_FSRELPATH, PF_NA, DIRFD_PARAM(1)}, {"flags", PT_FLAGS32, PF_HEX, unlinkat_flags} } },
[PPME_SYSCALL_MKDIRAT_X] = {"mkdirat", EC_FILE | EC_SYSCALL, EF_NONE, 4, {{"res", PT_ERRNO, PF_DEC}, {"dirfd", PT_FD, PF_DEC}, {"path", PT_FSRELPATH, PF_NA, DIRFD_PARAM(1)}, {"mode", PT_UINT32, PF_HEX} } },
[PPME_SYSCALL_OPENAT_2_X] = {"openat", EC_FILE | EC_SYSCALL, EF_CREATES_FD | EF_MODIFIES_STATE, 7, {{"fd", PT_FD, PF_DEC}, {"dirfd", PT_FD, PF_DEC}, {"name", PT_FSRELPATH, PF_NA, DIRFD_PARAM(1)}, {"flags", PT_FLAGS32, PF_HEX, file_flags}, {"mode", PT_UINT32, PF_OCT}, {"dev", PT_UINT32, PF_HEX}, {"ino", PT_UINT64, PF_DEC} } },
[PPME_SYSCALL_FCHMODAT_X] = {"fchmodat", EC_FILE | EC_SYSCALL, EF_NONE, 4, {{"res", PT_ERRNO, PF_DEC}, {"dirfd", PT_FD, PF_DEC}, {"filename", PT_FSRELPATH, PF_NA, DIRFD_PARAM(1)}, {"mode", PT_MODE, PF_OCT, chmod_mode} } },
[PPME_SYSCALL_OPENAT2_X] = {"openat2", EC_FILE | EC_SYSCALL, EF_CREATES_FD | EF_MODIFIES_STATE, 8, {{"fd", PT_FD, PF_DEC}, {"dirfd", PT_FD, PF_DEC}, {"name", PT_FSRELPATH, PF_NA, DIRFD_PARAM(1)}, {"flags", PT_FLAGS32, PF_HEX, file_flags}, {"mode", PT_UINT32, PF_OCT}, {"resolve", PT_FLAGS32, PF_HEX, openat2_flags}, {"dev", PT_UINT32, PF_HEX}, {"ino", PT_UINT64, PF_DEC} } },
[PPME_SYSCALL_FCHOWNAT_X] = {"fchownat", EC_FILE | EC_SYSCALL, EF_NONE, 6, {{"res", PT_ERRNO, PF_DEC}, {"dirfd", PT_FD, PF_DEC}, {"pathname", PT_FSRELPATH, PF_NA, DIRFD_PARAM(1)}, {"uid", PT_UINT32, PF_DEC}, {"gid", PT_UINT32, PF_DEC}, {"flags", PT_FLAGS32, PF_HEX, fchownat_flags}} },
[PPME_SYSCALL_MKNODAT_X] = {"mknodat", EC_OTHER | EC_SYSCALL, EF_USES_FD, 5, {{"res", PT_ERRNO, PF_DEC}, {"dirfd", PT_FD, PF_DEC}, {"path", PT_FSRELPATH, PF_NA, DIRFD_PARAM(1)},{"mode", PT_MODE, PF_OCT, mknod_mode},{"dev", PT_UINT32, PF_DEC}}},
[PPME_SYSCALL_NEWFSTATAT_X] = {"newfstatat", EC_FILE | EC_SYSCALL, EF_USES_FD, 4, {{"res", PT_ERRNO, PF_DEC}, {"dirfd", PT_FD, PF_DEC}, {"path", PT_FSRELPATH, PF_NA, DIRFD_PARAM(1)}, {"flags", PT_FLAGS32, PF_HEX, newfstatat_flags}}},


[PPME_SYSCALL_OPENAT_2_E] = {"openat", EC_FILE | EC_SYSCALL, EF_CREATES_FD | EF_MODIFIES_STATE, 4, {{"dirfd", PT_FD, PF_DEC}, {"name", PT_FSRELPATH, PF_NA, DIRFD_PARAM(0)}, {"flags", PT_FLAGS32, PF_HEX, file_flags}, {"mode", PT_UINT32, PF_OCT} } },
[PPME_SYSCALL_OPENAT2_E] = {"openat2", EC_FILE | EC_SYSCALL, EF_CREATES_FD | EF_MODIFIES_STATE, 5, {{"dirfd", PT_FD, PF_DEC}, {"name", PT_FSRELPATH, PF_NA, DIRFD_PARAM(1)}, {"flags", PT_FLAGS32, PF_HEX, file_flags}, {"mode", PT_UINT32, PF_OCT}, {"resolve", PT_FLAGS32, PF_HEX, openat2_flags} } },


[PPME_SYSCALL_RENAMEAT2_X] = {"renameat2", EC_FILE | EC_SYSCALL, EF_NONE, 6, {{"res", PT_ERRNO, PF_DEC}, {"olddirfd", PT_FD, PF_DEC}, {"oldpath", PT_FSRELPATH, PF_NA, DIRFD_PARAM(1)}, {"newdirfd", PT_FD, PF_DEC}, {"newpath", PT_FSRELPATH, PF_NA, DIRFD_PARAM(3)}, {"flags", PT_FLAGS32, PF_HEX, renameat2_flags} } },

[PPME_SYSCALL_SYMLINKAT_X] = {"symlinkat", EC_FILE | EC_SYSCALL, EF_NONE, 4, {{"res", PT_ERRNO, PF_DEC}, {"target", PT_CHARBUF, PF_NA}, {"linkdirfd", PT_FD, PF_DEC}, {"linkpath", PT_FSRELPATH, PF_NA, DIRFD_PARAM(2)} } },

[PPME_SYSCALL_OPEN_BY_HANDLE_AT_X] = {"open_by_handle_at", EC_FILE | EC_SYSCALL, EF_CREATES_FD | EF_MODIFIES_STATE, 6, {{"fd", PT_FD, PF_DEC}, {"mountfd", PT_FD, PF_DEC}, {"flags", PT_FLAGS32, PF_HEX, file_flags}, {"path", PT_FSPATH, PF_NA}, {"dev", PT_UINT32, PF_HEX}, {"ino", PT_UINT64, PF_DEC} } },
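
For anyone checking the indices above: the sketch below illustrates one way to read a `DIRFD_PARAM(n)`-style annotation, where a relative-path parameter names the index of the dirfd parameter it is resolved against. `ParamInfo`, `dirfd_index_for`, and `renameat2_params` are hypothetical simplifications for illustration, not the real event table types.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified parameter descriptor; the real table entries carry more fields.
struct ParamInfo {
    std::string name;
    bool is_relative_path;              // stands in for PT_FSRELPATH
    std::optional<uint32_t> dirfd_idx;  // stands in for DIRFD_PARAM(n)
};

// Return the index of the dirfd parameter that the path at `path_idx` is relative to,
// or std::nullopt when the parameter is not a relative path or the index is out of range.
std::optional<uint32_t> dirfd_index_for(const std::vector<ParamInfo>& params, uint32_t path_idx) {
    if (path_idx >= params.size()) {
        return std::nullopt;
    }
    const ParamInfo& p = params[path_idx];
    if (!p.is_relative_path || !p.dirfd_idx || *p.dirfd_idx >= params.size()) {
        return std::nullopt;
    }
    return p.dirfd_idx;
}

// Parameter layout of the renameat2 exit event, copied from the table above.
inline std::vector<ParamInfo> renameat2_params() {
    return {
        {"res", false, std::nullopt},
        {"olddirfd", false, std::nullopt},
        {"oldpath", true, 1u},
        {"newdirfd", false, std::nullopt},
        {"newpath", true, 3u},
        {"flags", false, std::nullopt},
    };
}

// Usage example; call it from main() or a test.
inline void example() {
    const auto params = renameat2_params();
    assert(dirfd_index_for(params, 2).value() == 1u);  // oldpath -> olddirfd
    assert(dirfd_index_for(params, 4).value() == 3u);  // newpath -> newdirfd
    assert(!dirfd_index_for(params, 0).has_value());   // res is not a path
}
```

renameat2 is the useful case to check because it carries two relative paths, each with its own dirfd index.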

As mentioned in the PR all still WIP since I need to expand the unit tests ...

github-actions bot commented Aug 7, 2024

Perf diff from master - unit tests

   100.00%    -99.40%  [.] 0x00000000000e93c0

Heap diff from master - unit tests

peak heap memory consumption: -1.53K
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

Heap diff from master - scap file

peak heap memory consumption: 0B
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

Benchmarks diff from master

Comparing gbench_data.json to /root/actions-runner/_work/libs/libs/build/gbench_data.json
Benchmark                                                         Time             CPU      Time Old      Time New       CPU Old       CPU New
----------------------------------------------------------------------------------------------------------------------------------------------
BM_sinsp_split_mean                                            +0.0666         +0.0666           143           153           143           153
BM_sinsp_split_median                                          +0.0682         +0.0681           143           153           143           153
BM_sinsp_split_stddev                                          +0.2155         +0.2141             1             1             1             1
BM_sinsp_split_cv                                              +0.1396         +0.1382             0             0             0             0
BM_sinsp_concatenate_paths_relative_path_mean                  +0.0974         +0.0974            43            47            43            47
BM_sinsp_concatenate_paths_relative_path_median                +0.1167         +0.1167            42            47            42            47
BM_sinsp_concatenate_paths_relative_path_stddev                -0.8632         -0.8631             2             0             2             0
BM_sinsp_concatenate_paths_relative_path_cv                    -0.8753         -0.8752             0             0             0             0
BM_sinsp_concatenate_paths_empty_path_mean                     +0.0218         +0.0218            17            17            17            17
BM_sinsp_concatenate_paths_empty_path_median                   +0.0162         +0.0161            17            17            17            17
BM_sinsp_concatenate_paths_empty_path_stddev                   +1.1413         +1.1426             0             0             0             0
BM_sinsp_concatenate_paths_empty_path_cv                       +1.0955         +1.0969             0             0             0             0
BM_sinsp_concatenate_paths_absolute_path_mean                  +0.0310         +0.0310            47            48            47            48
BM_sinsp_concatenate_paths_absolute_path_median                +0.0302         +0.0302            47            48            47            48
BM_sinsp_concatenate_paths_absolute_path_stddev                +0.2470         +0.2469             1             1             1             1
BM_sinsp_concatenate_paths_absolute_path_cv                    +0.2095         +0.2094             0             0             0             0
BM_sinsp_split_container_image_mean                            +0.0103         +0.0103           348           352           348           352
BM_sinsp_split_container_image_median                          +0.0097         +0.0096           348           352           348           352
BM_sinsp_split_container_image_stddev                          -0.5911         -0.5908             5             2             5             2
BM_sinsp_split_container_image_cv                              -0.5953         -0.5950             0             0             0             0

codecov bot commented Aug 7, 2024

Codecov Report

Attention: Patch coverage is 97.14286% with 4 lines in your changes missing coverage. Please review.

Project coverage is 74.10%. Comparing base (5ed00b2) to head (3312f37).
Report is 38 commits behind head on master.

Files with missing lines Patch % Lines
userspace/libsinsp/sinsp_filtercheck_fspath.cpp 92.68% 3 Missing ⚠️
userspace/libsinsp/parsers.cpp 96.29% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1993      +/-   ##
==========================================
- Coverage   74.30%   74.10%   -0.21%     
==========================================
  Files         253      254       +1     
  Lines       30966    31213     +247     
  Branches     5414     5442      +28     
==========================================
+ Hits        23010    23130     +120     
- Misses       7951     8082     +131     
+ Partials        5        1       -4     
Flag Coverage Δ
libsinsp 74.10% <97.14%> (-0.21%) ⬇️


Member

@Andreagit97 Andreagit97 left a comment

Ei! Just an early feedback on open_by_handle_at syscall management

userspace/libsinsp/parsers.cpp
@@ -2732,8 +2732,8 @@ void sinsp_parser::parse_open_openat_creat_exit(sinsp_evt *evt)
}
}

// since open_by_handle_at returns an absolute path we will always start at /
sdir = "";
int64_t dirfd_mountfd = evt->get_param(1)->as<int64_t>(); // mountfd

Member

Here in name we always return the full path from our drivers, so open_by_handle_at behaves differently, it doesn't use dirfd + relative path

name = evt->get_param(3)->as<std::string_view>();

Contributor Author

Hah of course why not ... I wasn't aware how fragmented our open* approaches truly are and that we even do dpath traversals for open by handle at in the kernel ... one of these days I'll try to compile a few things in an issue, but it's not urgent for the upcoming release.

userspace/libsinsp/sinsp_filtercheck_fd.cpp (outdated)

Perf diff from master - unit tests

    11.48%     -1.83%  [.] sinsp_parser::reset
     5.84%     +1.81%  [.] sinsp::next
     5.01%     +0.87%  [.] sinsp_evt::get_type
     1.04%     -0.65%  [.] sinsp_threadinfo::~sinsp_threadinfo
     0.47%     +0.64%  [.] std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>
     2.13%     +0.63%  [.] scap_event_decode_params
     3.35%     +0.61%  [.] sinsp_thread_manager::get_thread_ref
     0.06%     +0.57%  [.] std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::_M_realloc_insert<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >
     2.13%     -0.56%  [.] std::_Hashtable<long, std::pair<long const, std::shared_ptr<sinsp_threadinfo> >, std::allocator<std::pair<long const, std::shared_ptr<sinsp_threadinfo> > >, std::__detail::_Select1st, std::equal_to<long>, std::hash<long>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::_M_find_before_node
     2.02%     -0.51%  [.] std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release

Perf diff from master - scap file

     6.65%     +9.24%  [.] sinsp_evt_formatter::tostring_withformat
    13.41%     -4.40%  [.] sinsp_filter_check_event::extract_single
    10.03%     -3.94%  [.] sinsp_evt::get_type
     3.35%     +2.00%  [.] std::_Hashtable<long, std::pair<long const, std::shared_ptr<sinsp_threadinfo> >, std::allocator<std::pair<long const, std::shared_ptr<sinsp_threadinfo> > >, std::__detail::_Select1st, std::equal_to<long>, std::hash<long>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::_M_find_before_node
    10.04%     -1.17%  [.] sinsp_filter_check::rawval_to_string
     3.20%     +1.11%  [.] sinsp_filter_check::get_transformed_field_info
     3.31%     +0.86%  [.] std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char const*>
     6.72%     +0.49%  [.] sinsp_filter_check_thread::extract_single
     3.12%     +0.46%  [.] sinsp_evt::get_ts
     3.35%     -0.45%  [.] std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release

Heap diff from master - unit tests

peak heap memory consumption: -1.15K
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

Heap diff from master - scap file

peak heap memory consumption: 0B
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

{
// todo the cached fd.name might be subject to some std::string_view lifetime issues as the second path of the path concatenation
// is missing when retrieving fd.name from the cache. Let's check if string_view and std::string handling is done correctly throughout.
// ASSERT_STREQ(get_field_as_string(evt, fs_path_name).c_str(), get_field_as_string(evt, "fd.name").c_str());

Contributor Author

@falcosecurity/libs-maintainers and also @federico-sysdig would appreciate some help to resolve some mystery (again) ....

So these tests (comparing the fd.name from the cache to the fs.path.name) are failing, possibly due to some std::string_view lifetime issue or something else. Regardless, I am stuck debugging.

I sprinkled a few print statements along the relevant code paths, and we only lose the second part of the concatenated path when retrieving the name from the cache in the fd filterchecks. Before that it's all correct within the parsers logic.

[ RUN      ] fspath.openat_2_relative
parser name exit tmp/random/dir...///../../name.txt
parser name enter event tmp/random/dir...///../../name.txt
parser name fullpath /tmp/dirfd1/dirfd2/dirfd3/dirfd4/dirfd5/dirfd6/dirfd7/dirfd8/tmp/name.txt
parser name fullpath from fdi /tmp/dirfd1/dirfd2/dirfd3/dirfd4/dirfd5/dirfd6/dirfd7/dirfd8/tmp/name.txt
parser name fullpath from table cache fdi /tmp/dirfd1/dirfd2/dirfd3/dirfd4/dirfd5/dirfd6/dirfd7/dirfd8/tmp/name.txt
FDNAME fd 3 CACHE FILTERCHECK /tmp/dirfd1/dirfd2/dirfd3/dirfd4/dirfd5/dirfd6/dirfd7/dirfd8

[Btw the fallback in the filtercheck extract_from_null_fd is working as expected and these tests would pass. Also note that today we do not test fd.name for many scenarios ...]

Contributor

I don't understand. Are you saying that you get a full path when using get_field_as_string(evt, fs_path_name) and only the first part of it when using get_field_as_string(evt, "fd.name")? Should they really be equal? Wouldn't it depend on the value of fs_path_name?
If there is an issue with a string view pointing at a string whose lifetime is over, that is indeed a serious issue, but it would be a problem within the machinery behind get_field_as_string.

Nit: The check can be a bit simpler:

ASSERT_EQ(get_field_as_string(evt, fs_path_name), get_field_as_string(evt, "fd.name"));

Contributor Author

(1) Incorporated your nit suggestion and also made sure the list of events where these 2 should be same is complete. Re-pushed the last commit.

(2) Plus let me share the 2 locations in the source code where the cached string that represents the fd full file path is not equal.

See below a print output showing what these two locations, `RETURN_EXTRACT_STRING(m_tstr);` and `m_tstr = m_fdinfo->m_name;`, look like at the moment.

[ RUN      ] fspath.open

FILTERCHECK FDNAME FROM FD CACHE /tmp/name

[       OK ] fspath.open (0 ms)
[ RUN      ] fspath.openat
[       OK ] fspath.openat (0 ms)
[ RUN      ] fspath.openat_2
FILTERCHECK FDNAME FROM FD CACHE /tmp/name
[       OK ] fspath.openat_2 (0 ms)
[ RUN      ] fspath.openat_2_relative

FILTERCHECK FDNAME FROM FD CACHE /tmp/dirfd1/dirfd2/dirfd3/dirfd4/dirfd5/dirfd6/dirfd7/dirfd8

/home/m/Documents/OSS/oss-dev-master/libs/userspace/libsinsp/test/events_fspath.ut.cpp:114: Failure
Expected equality of these values:
  get_field_as_string(evt, fs_path_name)
    Which is: "/tmp/dirfd1/dirfd2/dirfd3/dirfd4/dirfd5/dirfd6/dirfd7/dirfd8/tmp/name.txt"
  get_field_as_string(evt, "fd.name")
    Which is: "/tmp/dirfd1/dirfd2/dirfd3/dirfd4/dirfd5/dirfd6/dirfd7/dirfd8"
[  FAILED  ] fspath.openat_2_relative (0 ms)
[ RUN      ] fspath.openat2

FILTERCHECK FDNAME FROM FD CACHE /tmp/name

[       OK ] fspath.openat2 (0 ms)
[ RUN      ] fspath.openat2_relative

FILTERCHECK FDNAME FROM FD CACHE /tmp/dirfd1/dirfd2/dirfd3/dirfd4/dirfd5/dirfd6/dirfd7/dirfd8

/home/m/Documents/OSS/oss-dev-master/libs/userspace/libsinsp/test/events_fspath.ut.cpp:114: Failure
Expected equality of these values:
  get_field_as_string(evt, fs_path_name)
    Which is: "/tmp/dirfd1/dirfd2/dirfd3/dirfd4/dirfd5/dirfd6/dirfd7/dirfd8/tmp/name.txt"
  get_field_as_string(evt, "fd.name")
    Which is: "/tmp/dirfd1/dirfd2/dirfd3/dirfd4/dirfd5/dirfd6/dirfd7/dirfd8"
[  FAILED  ] fspath.openat2_relative (0 ms)
[ RUN      ] fspath.openat2_relative_dirfd_cwd

Contributor

I'm taking a look at this. I tried pulling your branch, rebasing it and running the tests but this one doesn't seem to be failing. Did you happen to remove the failing one?

Contributor Author

The failing test is commented out for now.

Contributor

So I took a deeper look at this. I believe the problem lies in the use of parse_dirfd(). Within that function it calls set_fd_info, replacing the current event fd with the dirfd one. By calling that function in the extract function it means that the thread state is affected at extract time, which is not what we want:

evt->set_fd_info(evt->get_tinfo()->get_fd(dirfd));

In fact, if you replace the verify_fields function with:

	void verify_fields(ppm_event_code event_type, sinsp_evt *evt,
			   const char *expected_name,
			   const char *expected_nameraw,
			   const char *expected_source,
			   const char *expected_sourceraw,
			   const char *expected_target,
			   const char *expected_targetraw)
	{
		switch (event_type)
		{
			case PPME_SYSCALL_OPENAT_2_X:
			case PPME_SYSCALL_OPENAT2_X:
			case PPME_SYSCALL_OPEN_X:
			case PPME_SYSCALL_OPEN_BY_HANDLE_AT_X:
				{
					std::string fd_name_value = get_field_as_string(evt, "fd.name");
					std::string fs_path_name_value = get_field_as_string(evt, fs_path_name);
					ASSERT_EQ(fs_path_name_value, fd_name_value);
				}
				break;	
			default:
				break;
		}
	}

You will notice it passes. But if you swap the order of the two statements std::string fd_name_value = get_field_as_string(evt, "fd.name"); and std::string fs_path_name_value = get_field_as_string(evt, fs_path_name); the test will fail.

So I think the approach we need is slightly different: most likely we will need to avoid reparsing the event in the extractor if possible, or change the reparsing so that it does not affect the thread state.
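
The order dependence described above can be reproduced in a toy model. This is only a sketch under simplifying assumptions: `Event`, `Thread`, `resolve_stateful`, `resolve_stateless`, and `read_fd_name` are hypothetical names, with `resolve_stateful` standing in for an extractor that repoints the event's fd info at the dirfd entry as a side effect.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical, simplified thread state: open fds mapped to their names.
struct Thread {
    std::map<int64_t, std::string> fds;
};

// Hypothetical, simplified event: caches a pointer to the fd entry it refers to.
struct Event {
    const std::string* fd_name = nullptr;
};

// Stateful variant: resolves the path AND repoints the event's cached fd at the dirfd entry.
std::string resolve_stateful(Event& evt, const Thread& t, int64_t dirfd, const std::string& name) {
    const std::string& dir = t.fds.at(dirfd);
    evt.fd_name = &dir;  // side effect at extraction time
    return dir + "/" + name;
}

// Stateless variant: same result, no side effect on the event.
std::string resolve_stateless(const Thread& t, int64_t dirfd, const std::string& name) {
    return t.fds.at(dirfd) + "/" + name;
}

// An fd.name-like read that trusts whatever the event currently points at.
std::string read_fd_name(const Event& evt) {
    return evt.fd_name ? *evt.fd_name : std::string();
}

// Usage example; call it from main() or a test.
inline void example() {
    Thread t;
    t.fds[3] = "/tmp/dir";           // the dirfd
    t.fds[4] = "/tmp/dir/name.txt";  // the fd created by the openat-like call
    Event evt;
    evt.fd_name = &t.fds.at(4);

    // Stateless extraction leaves the fd.name-like read intact.
    assert(resolve_stateless(t, 3, "name.txt") == "/tmp/dir/name.txt");
    assert(read_fd_name(evt) == "/tmp/dir/name.txt");

    // Stateful extraction returns the same path but changes what is read afterwards.
    assert(resolve_stateful(evt, t, 3, "name.txt") == "/tmp/dir/name.txt");
    assert(read_fd_name(evt) == "/tmp/dir");
}
```

In this model, reading the fd name before the stateful call gives the full path and reading it afterwards gives only the directory, which matches the truncated fd.name seen in the failing tests.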

Contributor Author

Thanks so much 🚀 @LucaGuerra. Pushed a co-authored commit. Feel free to overwrite it if you have a better idea.
Also created a follow-up ticket here: #2039

The macos test is currently failing ...

@incertum incertum changed the title from "wip: fix(sinsp): fix fs.path filterchecks for relative paths (add dirfd concept) + fix PPME_SYSCALL_OPEN_BY_HANDLE_AT_X to use mount_fd as dirfd" to "wip: fix(sinsp): fix fs.path filterchecks for relative paths (add dirfd concept)" Aug 13, 2024
@incertum incertum force-pushed the fix-fs-path-openat branch from bfe211d to d9b0e8d Compare August 13, 2024 05:40
Perf diff from master - unit tests

    11.37%     -2.25%  [.] sinsp_parser::reset
     5.78%     +1.34%  [.] sinsp::next
     2.65%     +1.03%  [.] sinsp_thread_manager::find_thread
     5.21%     -0.73%  [.] next
     0.46%     +0.58%  [.] std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>
     1.05%     +0.54%  [.] sinsp_parser::event_cleanup
     1.26%     +0.45%  [.] std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char const*>
     1.16%     -0.42%  [.] std::_Hashtable<long, std::pair<long const, std::shared_ptr<sinsp_threadinfo> >, std::allocator<std::pair<long const, std::shared_ptr<sinsp_threadinfo> > >, std::__detail::_Select1st, std::equal_to<long>, std::hash<long>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::find
     3.32%     +0.40%  [.] sinsp_thread_manager::get_thread_ref
     5.00%     +0.39%  [.] sinsp_parser::process_event

Perf diff from master - scap file

    17.38%     -4.92%  [.] sinsp_filter_check::tostring
    10.43%     +4.22%  [.] sinsp_filter_check::rawval_to_string
    10.42%     -3.52%  [.] sinsp_evt::get_type
     6.91%     +2.22%  [.] sinsp_evt_formatter::tostring_withformat
     3.45%     +2.21%  [.] main
     3.48%     +1.93%  [.] std::_Hashtable<long, std::pair<long const, std::shared_ptr<sinsp_threadinfo> >, std::allocator<std::pair<long const, std::shared_ptr<sinsp_threadinfo> > >, std::__detail::_Select1st, std::equal_to<long>, std::hash<long>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::_M_find_before_node
     6.98%     -1.71%  [.] sinsp_filter_check_thread::extract_single
     6.84%     -1.67%  [.] libsinsp::runc::match_container_id
    13.93%     +0.50%  [.] sinsp_filter_check_event::extract_single
     3.33%     +0.20%  [.] sinsp_filter_check::get_transformed_field_info

Heap diff from master - unit tests

peak heap memory consumption: -1.15K
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

Heap diff from master - scap file

peak heap memory consumption: 0B
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

@incertum incertum force-pushed the fix-fs-path-openat branch from d9b0e8d to cf83174 Compare August 13, 2024 19:09
Perf diff from master - unit tests

     5.77%     +0.96%  [.] sinsp::next
     4.99%     +0.89%  [.] sinsp_parser::process_event
    11.34%     -0.83%  [.] sinsp_parser::reset
     0.46%     +0.81%  [.] std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>
     2.10%     -0.76%  [.] std::_Hashtable<long, std::pair<long const, std::shared_ptr<sinsp_threadinfo> >, std::allocator<std::pair<long const, std::shared_ptr<sinsp_threadinfo> > >, std::__detail::_Select1st, std::equal_to<long>, std::hash<long>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::_M_find_before_node
     0.92%     +0.52%  [.] 0x00000000000e83c0
     5.20%     -0.52%  [.] next
     4.14%     -0.51%  [.] sinsp_evt::load_params
     0.70%     -0.50%  [.] libsinsp::runc::match_container_id
     4.95%     +0.45%  [.] sinsp_evt::get_type

Perf diff from master - scap file

     8.61%     -3.82%  [.] sinsp_filter_check::rawval_to_string
     2.85%     +3.05%  [.] main
     2.88%     +2.75%  [.] gzfile_read
    14.35%     -2.68%  [.] sinsp_filter_check::tostring
     8.63%     -2.20%  [.] sinsp_filter_check::get_field_info
     5.76%     -2.04%  [.] sinsp_filter_check_thread::extract_single
     5.71%     -2.00%  [.] sinsp_thread_manager::get_thread_ref
     5.71%     +1.23%  [.] sinsp_evt_formatter::tostring_withformat
    11.50%     +0.82%  [.] sinsp_filter_check_event::extract_single
     5.65%     -0.45%  [.] libsinsp::runc::match_container_id

Heap diff from master - unit tests

peak heap memory consumption: -1.15K
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

Heap diff from master - scap file

peak heap memory consumption: 0B
peak RSS (including heaptrack overhead): 0B
total memory leaked: 0B

@incertum incertum changed the title from "wip: fix(sinsp): fix fs.path filterchecks for relative paths (add dirfd concept)" to "fix(sinsp): fix fs.path filterchecks for relative paths (add dirfd concept)" Aug 20, 2024
incertum and others added 2 commits September 1, 2024 17:35
….path.* dirfd use cases

Co-authored-by: Luca Guerra <[email protected]>
Signed-off-by: Melissa Kilby <[email protected]>
@@ -240,6 +240,56 @@ std::unique_ptr<sinsp_filter_check> sinsp_filter_check_fspath::allocate_new()
return ret;
}

std::string sinsp_filter_check_fspath::parse_dirfd_stateless(sinsp_evt *evt, std::string_view name, int64_t dirfd)

Contributor

If I understand correctly, all we need is to skip the evt->set_fd_info(evt->get_tinfo()->get_fd(dirfd)); call in parse_dirfd, right?
I think we can just add an optional boolean parameter to parse_dirfd (that of course defaults to true), like:

std::string sinsp_parser::parse_dirfd(sinsp_evt *evt, std::string_view name, int64_t dirfd, bool update_tinfo=true)

and pass false when called from the filterchecks.

Contributor Author

Correct, can do that. Are we sure we want one more if statement in the parsers logic?

@FedeDP
Contributor

FedeDP commented Sep 5, 2024

@incertum pushed some changes, let me know wdyt (we can rework/remove the commit in case you don't really like it :D )

  • move parsing logic to parsers.cpp instead of parsing dirfd on the fly in filterchecks
  • made parse_dirfd completely stateless, it does not touch anything now
  • added the PPM_O_DIRECTORY flag in events_fspath when we create a dirfd folder, since that flag gets enforced by our drivers
  • another notable fix (that was not originally part of this PR): do not store an fdinfo in the event in sinsp_parser::parse_open_openat_creat_exit when the syscall failed; previously, the fdinfo related to the dirfd was stored even when the syscall failed. That was wrong.

Also cc @LucaGuerra that helped me!

…k_fspath to parsers.

Simplified a bit the whole logic.
Updated events_fspath tests adding the `PPM_O_DIRECTORY` flag as needed.

Signed-off-by: Federico Di Pierro <[email protected]>

Co-authored-by: Luca Guerra <[email protected]>
@FedeDP FedeDP force-pushed the fix-fs-path-openat branch from d2153c2 to ba835da Compare September 5, 2024 10:44
@@ -442,6 +442,33 @@ void sinsp_parser::process_event(sinsp_evt *evt)
case PPME_SYSCALL_PRCTL_X:
parse_prctl_exit_event(evt);
break;
case PPME_SYSCALL_NEWFSTATAT_X:

Contributor

New parsers. I didn't add new methods for these small parsers.
They only parse when the event is successful, and they store the dirfd-related fdinfo in the event fdinfo (it is already present in the thread table, since it was previously added by an open-related syscall).

@@ -2307,6 +2334,12 @@ void sinsp_parser::parse_execve_exit(sinsp_evt *evt)
*/
std::string sdir = parse_dirfd(evt, pathname, dirfd);

// Update event fdinfo since parse_dirfd is stateless
if (sdir != "." && sdir != "<UNKNOWN>")

Contributor

Since parse_dirfd is now stateless, honor the previous behavior by setting the fdinfo when needed (i.e., when the previous logic set it).
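
A minimal sketch of that split, assuming a toy model rather than the real parser: a pure function computes the directory, and the caller decides whether to attach an fdinfo to the event, skipping failed syscalls and the two sentinel results checked in the diff above. `FdInfo`, `Thread`, `Event`, `dir_for`, and `on_exit_event` are hypothetical names, and returning "." for absolute names is this sketch's own convention.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical, simplified types for illustration only.
struct FdInfo {
    std::string name;
};

struct Thread {
    std::map<int64_t, FdInfo> fds;
};

struct Event {
    int64_t res = 0;                 // syscall return value; negative means failure
    const FdInfo* fdinfo = nullptr;  // fdinfo attached to the event, if any
};

// Stateless: computes the directory a relative name is resolved against, touches nothing.
std::string dir_for(const Thread& t, const std::string& name, int64_t dirfd) {
    if (!name.empty() && name[0] == '/') {
        return ".";  // absolute name: the dirfd is not used
    }
    const auto it = t.fds.find(dirfd);
    if (it == t.fds.end()) {
        return "<UNKNOWN>";  // dirfd not tracked in the thread table
    }
    return it->second.name;
}

// Caller side: only a successful event with a usable dirfd gets an fdinfo attached.
void on_exit_event(Event& evt, const Thread& t, const std::string& name, int64_t dirfd) {
    if (evt.res < 0) {
        return;  // failed syscall: leave the event fdinfo untouched
    }
    const std::string sdir = dir_for(t, name, dirfd);
    if (sdir != "." && sdir != "<UNKNOWN>") {
        evt.fdinfo = &t.fds.at(dirfd);
    }
}

// Usage example; call it from main() or a test.
inline void example() {
    Thread t;
    t.fds[3] = FdInfo{"/tmp/dir"};

    Event ok;
    on_exit_event(ok, t, "name.txt", 3);
    assert(ok.fdinfo != nullptr && ok.fdinfo->name == "/tmp/dir");

    Event failed;
    failed.res = -2;
    on_exit_event(failed, t, "name.txt", 3);
    assert(failed.fdinfo == nullptr);
}
```

Keeping the state update in the caller means the same pure helper can be reused from code paths that must not modify the event.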

{
add_event_advance_ts(increasing_ts(), 1, PPME_SYSCALL_OPENAT2_E, 2, evt_dirfd, dirfd_path);
// pass PPM_O_DIRECTORY since we are creating a folder!
add_event_advance_ts(increasing_ts(), 1, PPME_SYSCALL_OPENAT2_X, 5, evt_dirfd, evt_dirfd, dirfd_path, open_flags | PPM_O_DIRECTORY, mode);

Contributor

We are creating a folder, enforce PPM_O_DIRECTORY otherwise our fdinfo will be off.

case PPME_SYSCALL_SYMLINKAT_X:
{
add_event_advance_ts(increasing_ts(), 1, PPME_SYSCALL_OPENAT2_E, 2, evt_dirfd, dirfd_path);
add_event_advance_ts(increasing_ts(), 1, PPME_SYSCALL_OPENAT2_X, 5, evt_dirfd, evt_dirfd, dirfd_path, open_flags | PPM_O_DIRECTORY, mode);

Contributor

We are creating a folder, enforce PPM_O_DIRECTORY otherwise our fdinfo will be off.

…ostic `PPM_AT_FDCWD` value instead of the platform dependent one.

Signed-off-by: Federico Di Pierro <[email protected]>
@incertum
Contributor Author

incertum commented Sep 5, 2024

> @incertum pushed some changes, let me know wdyt (we can rework/remove the commit in case you don't really like it :D )
>
>   • move parsing logic to parsers.cpp instead of parsing dirfd on the fly in filterchecks
>   • made parse_dirfd completely stateless, it does not touch anything now
>   • added the PPM_O_DIRECTORY flag in events_fspath when we create a dirfd folder, since that flag gets enforced by our drivers
>   • another notable fix (that was not originally part of this PR): do not store an fdinfo in the event in sinsp_parser::parse_open_openat_creat_exit when the syscall failed; previously, the fdinfo related to the dirfd was stored even when the syscall failed. That was wrong.
>
> Also cc @LucaGuerra that helped me!

Incredible work @FedeDP and @LucaGuerra, also thanks a lot for jumping in as we need this wrapped up soon.
Still up for discussion if we want to address #2039 here or in another PR or later.

Changes LGTM!

@FedeDP
Contributor

FedeDP commented Sep 5, 2024

> Still up for discussion if we want to address #2039 here or in another PR or later.

Answered over there, imho there is no issue anymore since parse_dirfd is now stateless.

@incertum
Contributor Author

incertum commented Sep 5, 2024

Nice @LucaGuerra thanks for adding more tests!

Contributor

@FedeDP FedeDP left a comment

/approve

@poiana
Contributor

poiana commented Sep 5, 2024

LGTM label has been added.

Git tree hash: eedd25e41b16114378fcadd9c364bc8388189cd3

@poiana
Contributor

poiana commented Sep 5, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: FedeDP, incertum

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@poiana poiana merged commit efa1df9 into falcosecurity:master Sep 6, 2024
44 of 46 checks passed
Development

Successfully merging this pull request may close these issues.

Fix fd filtercheck fallback in case of openat* events involving dirfd parsing
6 participants