fix(ledger): fuzz testing #452
Conversation
Update: Only running the
Update: This is related to
bcf058d to 06644b4
One thing that's great about this is that it performs many actions in parallel, and if something errors it will break and catch it. However, there's no consistency check that the values being read are what they should be, which I think we care about. Since we have multiple database backends, this should be simple to do: perform the same operations against each backend and, after each 'step', ensure they return equivalent values. The accountsdb fuzzer does this and could be a reference.
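For illustration, here is a minimal sketch of the suggested cross-backend consistency check, in the same Zig style as the code discussed later in this thread. The `checkBackendsAgree`, `applyStep`, and `readValue` names are hypothetical placeholders, not code from this PR:

```zig
const std = @import("std");

/// Hypothetical helper: apply the same steps to every database backend and
/// verify that reads return equivalent values after each step.
fn checkBackendsAgree(backends: anytype, steps: anytype) !void {
    for (steps) |step| {
        // Perform the same operation on every backend.
        for (backends) |*backend| {
            try applyStep(backend, step);
        }
        // After the step, all backends should return an equivalent value
        // for the key that the step touched.
        const expected = try readValue(&backends[0], step.key);
        for (backends[1..]) |*backend| {
            const actual = try readValue(backend, step.key);
            try std.testing.expectEqualSlices(u8, expected, actual);
        }
    }
}
```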
Ideally a fuzzer would be deterministic and reproducible. When a fuzz test fails you should be able to use the same seed to trigger the same error and debug what went wrong. But in this case, the multithreaded behavior is random and not reproducible by using the same seed. I do see the value of identifying issues with concurrency though, and I'm not sure how to test that in a reproducible way. Any thoughts?
Also I noticed stdout is flooded very fast with a ton of text when running these fuzz tests. It might be good to slow that down a bit. Normally, for fuzz tests, I would use stdout to print one line for each discrete test case that can be individually reproduced using a particular seed that is included on that line. Something on the order of 1 log message per second seems reasonable.
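One simple way to get roughly one log line per second is to throttle output with the standard library's `std.time.Timer`; the `runIteration` call and surrounding loop here are placeholders, not the fuzzer's actual code:

```zig
// Throttle progress output to roughly one line per second instead of
// flooding stdout on every iteration.
var count: u64 = 0;
var timer = try std.time.Timer.start();
while (true) {
    runIteration(random);
    count += 1;
    if (timer.read() >= std.time.ns_per_s) {
        std.debug.print("completed {} iterations\n", .{count});
        timer.reset();
    }
}
```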
This test only deals with the database implementation which is a pretty small part of the ledger. In the future we should write some fuzz tests for the ShredInserter and BlockstoreReader, which is where I would expect to see more problems than RocksDB for example.
lgtm
I think randomizing the order the methods are called in is a good middle ground. It achieves the same objective that motivated the multithreading.
This should be addressed now. I got the idea of logging every 1000 iterations from another fuzzer in the codebase, but I've dropped it now.
That is indeed true, and it was partially intentional, since the mis-compilation issue was reproducible with only the ledger implementation. But I do agree other parts of the ledger should be included... and as you suggested, that can be done in follow-up PRs.
You've removed all printing. I was suggesting that there should be discrete test cases that are actually printed. Here's the problem I'm trying to address. I just ran the fuzzer for a long time. Maybe 20 minutes. The fuzzer eventually crashed due to an error in a test. I'd like to reproduce this, but the only way to do that is by running the fuzzer again for 20+ minutes. Identifying the cause of this failure will not be easy. If instead the fuzzer ran many distinct and isolated test cases, and it printed the seed for each case, it would be easy to rerun a single one of those cases when they fail. You can look at the allocator fuzzer to see what I mean.
The allocator fuzzer seems to have a different structure. It allows fuzzing two different implementations (disk or batch), or both, and provides the ability to select which to run. This differs from the structure of the ledger fuzzing, which fuzzes various methods on the same ledger implementation. The pattern used for it is similar to what is used for the accounts db and gossip fuzzers.
I am not sure it is possible to shortcut this, as it may simply take that much time for the random interaction of the fuzzed methods to trigger the issue.
Checked again. This was from
This is not the relevant part of the fuzzer that I'm talking about. You can say the same thing about the ledger fuzzer allowing multiple databases. It just happens to deal with that abstraction at comptime whereas the allocator fuzzer does it at runtime. But this is a separate concern.
This is exactly the same pattern followed by the allocator fuzzer. But that's also not relevant. You can structure any fuzzing approach as a sequence of discrete test cases.
I agree it's not going to be possible to shortcut that specific scenario. The idea is to create different scenarios... scenarios that are easily reproducible. Currently you have something like this:

```zig
const random = rng(seed).random();
while (true) {
    runIteration(random);
}
```

Instead, it could look something like this:

```zig
while (true) {
    seed += 1;
    print("using seed: {}", .{seed});
    const random = rng(seed).random();
    for (0..test_case_size) |_| {
        runIteration(random);
    }
}
```
I am not sure I follow the intended benefit here, given that the number of runs depends on

If the intention is to get to the seed that caused the issue faster, I am also not sure the proposal solves the original problem statement, given that if an error is triggered at

Also, to be sure I am on the same page with the proposed improvement: is the current ledger fuzzer similar in structure to existing fuzzers like accounts_db and gossip? And is the proposed modification something that would also need to be applied to the existing fuzzers? If that is the case, what about discussing the proposed improvement separately (as I am still not sure how it solves the identified problem) and then applying it to all the other fuzzers if it improves things?
It solves the problem if you reset the state for each test case. I was just giving a basic pseudocode example to express the general idea; I wasn't trying to write the code that solves the problem in its entirety. At that point I think it's more efficient to just make the change in the code, so I committed it already. Regarding gossip and accountsdb, I do think those fuzzers are flawed and need to be reworked, but that's not in scope right now. I just think we should not be adding more fuzzers that have the same flaw.
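For illustration, the pseudocode above could be extended so that each seed starts from fresh state; `freshDatabase` and the other names here are placeholders rather than the change that was actually committed:

```zig
while (true) {
    seed += 1;
    print("using seed: {}", .{seed});
    const random = rng(seed).random();

    // Reset state for this test case: each seed starts from an empty
    // database, so a failing case can be reproduced from its seed alone.
    var db = try freshDatabase(allocator);
    defer db.deinit();

    for (0..test_case_size) |_| {
        try runIteration(&db, random);
    }
}
```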
didn't realize old feedback was still not addressed
@dnut I think the recent changes might not be working as intended. Running
continually prints:
Until I kill the process. Also the
My mistake. I thought you added this to CI. I'll open a PR to add it to CI and fix the loop issue.
```zig
// the method calls being fuzzed return expected data.
var data_map = std.AutoHashMap(u32, Data).init(allocator);
defer data_map.deinit();
for (0..1_000) |_| {
```
@dnut The benefit of this change is that the seed used is not just the one passed in. The observation, though, is that it fixes the iteration count for each seed at 1000, which may seem arbitrary.
It would probably be an improvement to either make this configurable (which would differ from the other fuzzers), or to not change the seed value like this but instead have a random value generated by CI.
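As a sketch of the configurable option, the per-seed iteration count could be read from the command line, falling back to the current value of 1_000; this argument handling is illustrative and not how the fuzzer currently parses its options:

```zig
// Read the per-seed iteration count from the first CLI argument,
// defaulting to 1_000 when no argument is given.
const iterations_per_seed: usize = blk: {
    var args = try std.process.argsWithAllocator(allocator);
    defer args.deinit();
    _ = args.next(); // skip the program name
    const arg = args.next() orelse break :blk 1_000;
    break :blk try std.fmt.parseInt(usize, arg, 10);
};
```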