Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return Some for only one thread when calling remove() #1143

Merged
merged 5 commits into from
Dec 20, 2024

Conversation

Coder-256
Copy link
Contributor

I've observed that SkipSet::remove() sometimes appears to behave non-atomically: when multiple threads concurrently remove the same key, sometimes multiple threads get a return value of Some.

The issue appears to be inside SkipList::remove(), as the call to self.search_position() races with the call to n.mark_tower(): if another thread marks n immediately before the call to n.mark_tower(), both threads will still return Some. This PR ensures that remove() returns None if the node was already marked.

I can confirm that I've been unable to reproduce the issue after applying this PR.

@taiki-e
Copy link
Member

taiki-e commented Dec 8, 2024

Thanks! Is it possible to add a test for the cases this would fix?

@taiki-e taiki-e added crossbeam-skiplist S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author labels Dec 8, 2024
@Coder-256
Copy link
Contributor Author

I've added a test. Unfortunately, since this is a race condition, it is highly architecture-dependent and flaky: it should never spuriously fail, but it might spuriously succeed (ie. fail to detect a regression).

On my machine (aarch64, Apple M1 Pro), without the changes in this PR, a typical result is:

$ cargo test -p crossbeam-skiplist --test base -- remove_race                                                07:43:16 PM
    Finished `test` profile [unoptimized + debuginfo] target(s) in 0.01s
     Running tests/base.rs (target/debug/deps/base-0efb01f4b8dd4af9)

running 1 test
test remove_race ... FAILED

failures:

---- remove_race stdout ----
thread 'remove_race' panicked at crossbeam-skiplist/tests/base.rs:951:5:
assertion `left == right` failed
  left: 139447
 right: 100000
stack backtrace:
   0: rust_begin_unwind
             at /rustc/a4cb3c831823d9baa56c3d90514b75b2660116fa/library/std/src/panicking.rs:681:5
   1: core::panicking::panic_fmt
             at /rustc/a4cb3c831823d9baa56c3d90514b75b2660116fa/library/core/src/panicking.rs:75:14
   2: core::panicking::assert_failed_inner
   3: core::panicking::assert_failed
             at /Users/jacobgreenfield/.rustup/toolchains/nightly-aarch64-apple-darwin/lib/rustlib/src/rust/library/core/src/panicking.rs:364:5
   4: base::remove_race
             at ./tests/base.rs:951:5
   5: base::remove_race::{{closure}}
             at ./tests/base.rs:907:17
   6: core::ops::function::FnOnce::call_once
             at /Users/jacobgreenfield/.rustup/toolchains/nightly-aarch64-apple-darwin/lib/rustlib/src/rust/library/core/src/ops/function.rs:250:5
   7: core::ops::function::FnOnce::call_once
             at /rustc/a4cb3c831823d9baa56c3d90514b75b2660116fa/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.


failures:
    remove_race

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 23 filtered out; finished in 0.38s

error: test failed, to rerun pass `-p crossbeam-skiplist --test base`

Copy link
Member

@taiki-e taiki-e left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks again!

@taiki-e taiki-e merged commit 7121fbd into crossbeam-rs:master Dec 20, 2024
25 checks passed
@taiki-e taiki-e removed the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author label Dec 20, 2024
@Coder-256 Coder-256 deleted the remove-race branch December 20, 2024 13:56
al8n pushed a commit to al8n/crossbeam that referenced this pull request Dec 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

2 participants