
fix(ledger): fuzz testing #452

Merged
merged 28 commits into main on Jan 15, 2025

Conversation

@dadepo (Contributor) commented Dec 18, 2024

No description provided.

@dadepo (Contributor, Author) commented Dec 19, 2024

```console
$ zig build fuzz -- ledger_rocksdb 10 1_000_000 > output.txt 2>&1
$ grep -c 'Put actions' output.txt
890
$ grep -c 'Delete actions' output.txt
889
$ grep -c 'Delete Files in Range actions' output.txt
925
$ grep -c 'getBytes actions' output.txt
10
$ grep -c 'get actions' output.txt
11
$ grep -c 'count actions' output.txt
997
$ grep -c 'contains actions' output.txt
11
$ grep -c 'Batch actions' output.txt
904
```

get, getBytes, and contains are an order of magnitude slower. Need to investigate.

Update: Running only get and getBytes leads to:

```console
$ grep -c 'get actions' output.txt
988
$ grep -c 'getBytes actions' output.txt
978
```

Update: This is related to `try batch.deleteRange(cf1, start, end);`. It seems the deleteRange call negatively impacts the get calls. This will be investigated in a follow-up PR.

@dadepo dadepo marked this pull request as ready for review December 20, 2024 17:13
@dadepo dadepo force-pushed the dade/ledger-fuzz-test branch from bcf058d to 06644b4 Compare December 20, 2024 17:20
@dadepo dadepo requested a review from dnut December 20, 2024 17:40
@dnut dnut linked an issue Jan 2, 2025 that may be closed by this pull request
@0xNineteen (Contributor) left a comment

One thing that's great about this is that it performs many actions in parallel, and if something errors it will break and catch it. However, there's no consistency check that the values which are read are what they should be, which I think we care about. Since we have multiple database backends, I think this should be simple to do: perform the same operations, and after each 'step' ensure they are returning equivalent values. The accountsdb fuzzer does this, which could be a reference.
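The equivalence check described above could look something like the following Python sketch. This is a toy illustration only: the `DictStore` class and its `put`/`delete`/`get` names are hypothetical stand-ins for the real database backends, not the sig API.

```python
import random

class DictStore:
    """Toy in-memory backend standing in for one database implementation."""
    def __init__(self):
        self.data = {}
    def put(self, k, v):
        self.data[k] = v
    def delete(self, k):
        self.data.pop(k, None)
    def get(self, k):
        return self.data.get(k)

def differential_fuzz(backends, seed, steps=1000):
    """Apply the same random action to every backend; after each read,
    assert that all backends returned equivalent values."""
    rng = random.Random(seed)
    for _ in range(steps):
        action = rng.choice(["put", "delete", "get"])
        key = rng.randrange(16)
        if action == "put":
            value = rng.randrange(1000)
            for b in backends:
                b.put(key, value)
        elif action == "delete":
            for b in backends:
                b.delete(key)
        else:
            results = [b.get(key) for b in backends]
            assert all(r == results[0] for r in results), \
                f"backends diverged on key {key}: {results}"

backends = [DictStore(), DictStore()]
differential_fuzz(backends, seed=42)
```

Because every backend receives the identical operation stream, any divergence on a read points at a bug in one of the implementations rather than in the fuzzer itself.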

(6 review-comment threads on src/ledger/fuzz.zig, outdated and resolved)
@dnut (Contributor) left a comment

Ideally a fuzzer would be deterministic and reproducible. When a fuzz test fails you should be able to use the same seed to trigger the same error and debug what went wrong. But in this case, the multithreaded behavior is random and not reproducible by using the same seed. I do see the value of identifying issues with concurrency though, and I'm not sure how to test that in a reproducible way. Any thoughts?

Also I noticed stdout is flooded very fast with a ton of text when running these fuzz tests. It might be good to slow that down a bit. Normally, for fuzz tests, I would use stdout to print one line for each discrete test case that can be individually reproduced using a particular seed that is included on that line. Something on the order of 1 log message per second seems reasonable.

This test only deals with the database implementation which is a pretty small part of the ledger. In the future we should write some fuzz tests for the ShredInserter and BlockstoreReader, which is where I would expect to see more problems than RocksDB for example.

(9 review-comment threads on src/ledger/fuzz.zig, resolved)
@dadepo dadepo requested review from 0xNineteen and dnut January 10, 2025 14:48
(4 review-comment threads on src/ledger/fuzz.zig, outdated and resolved)
0xNineteen previously approved these changes Jan 13, 2025
@0xNineteen (Contributor) left a comment

lgtm :shipit:

@dadepo (Contributor, Author) commented Jan 14, 2025

> Ideally a fuzzer would be deterministic and reproducible. When a fuzz test fails you should be able to use the same seed to trigger the same error and debug what went wrong. But in this case, the multithreaded behavior is random and not reproducible by using the same seed. I do see the value of identifying issues with concurrency though, and I'm not sure how to test that in a reproducible way. Any thoughts?

I think randomizing the order in which the methods are called is a good middle ground. It achieves the same objective as the multithreading.
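A seeded shuffle is one way to randomize the call order while staying reproducible. A minimal Python sketch (the action names here are placeholders, not the real method set):

```python
import random

# Hypothetical action names; in the real fuzzer these would be the
# database methods being exercised (put, get, deleteRange, ...).
ACTIONS = ["put", "get", "getBytes", "delete", "deleteRange", "batch"]

def iteration_order(seed):
    """Shuffle the call order deterministically: the same seed always
    yields the same order, so a failing run can be replayed exactly."""
    rng = random.Random(seed)
    order = ACTIONS[:]
    rng.shuffle(order)
    return order
```

Unlike thread scheduling, the shuffle is fully determined by the seed, so the interleaving that triggered a failure can be recovered from the log.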

> Also I noticed stdout is flooded very fast with a ton of text when running these fuzz tests. It might be good to slow that down a bit. Normally, for fuzz tests, I would use stdout to print one line for each discrete test case that can be individually reproduced using a particular seed that is included on that line. Something on the order of 1 log message per second seems reasonable.

This should be addressed now. I got the idea of logging every 1000 iterations from another fuzzer in the codebase, but I've dropped it now.

> This test only deals with the database implementation which is a pretty small part of the ledger. In the future we should write some fuzz tests for the ShredInserter and BlockstoreReader, which is where I would expect to see more problems than RocksDB for example.

That is indeed true, and it was partially intentional, since the mis-compilation issue was reproducible with only the ledger implementation. But I do agree other parts of the ledger should be included, and as you suggested that can be done in follow-up PRs.

@dnut (Contributor) commented Jan 14, 2025

> Also I noticed stdout is flooded very fast with a ton of text when running these fuzz tests. It might be good to slow that down a bit. Normally, for fuzz tests, I would use stdout to print one line for each discrete test case that can be individually reproduced using a particular seed that is included on that line. Something on the order of 1 log message per second seems reasonable.
>
> This should be addressed now. I got the idea for logging every 1000 count from another fuzzer in the codebase. But I've dropped it now.

You've removed all printing. I was suggesting that there should be discrete test cases that are actually printed.

Here's the problem I'm trying to address. I just ran the fuzzer for a long time. Maybe 20 minutes. The fuzzer eventually crashed due to an error in a test. I'd like to reproduce this, but the only way to do that is by running the fuzzer again for 20+ minutes. Identifying the cause of this failure will not be easy.

If instead the fuzzer ran many distinct and isolated test cases, and it printed the seed for each case, it would be easy to rerun a single one of those cases when they fail. You can look at the allocator fuzzer to see what I mean.

@dadepo (Contributor, Author) commented Jan 15, 2025

> If instead the fuzzer ran many distinct and isolated test cases, and it printed the seed for each case, it would be easy to rerun a single one of those cases when they fail. You can look at the allocator fuzzer to see what I mean.

The allocator fuzzer seems to have a different structure. It allows fuzzing two different implementations (disk or batch), or both, and provides the ability to select which to run.

This is different from the structure of the ledger fuzzing, which fuzzes various methods on the same ledger implementation. The pattern used for it is similar to what is used for the accounts db and gossip fuzzers.

> Here's the problem I'm trying to address. I just ran the fuzzer for a long time. Maybe 20 minutes. The fuzzer eventually crashed due to an error in a test. I'd like to reproduce this, but the only way to do that is by running the fuzzer again for 20+ minutes. Identifying the cause of this failure will not be easy.

I am not sure it is possible to shortcut this, as it may require that much time for the random interaction of the fuzzed methods to trigger the issue.

@dadepo (Contributor, Author) commented Jan 15, 2025

> I got the idea for logging every 1000 count from another fuzzer in the codebase. But I've dropped it now.

Checked again. This was from gossip_fuzz_service.

@dnut (Contributor) commented Jan 15, 2025

> The allocator fuzzer seems to have a different structure. It allows fuzzing two different implementations (disk or batch) or both and provides ability to select which to run.

This is not the relevant part of the fuzzer that I'm talking about. You can say the same thing about the ledger fuzzer allowing multiple databases. It just happens to deal with that abstraction at comptime whereas the allocator fuzzer does it at runtime. But this is a separate concern.

> This is different from the structure for the ledger fuzzing as it fuzzes various methods on the same ledger implementation. And the pattern used for it is similar to what is used for the accounts db, and gossip fuzzer.

This is exactly the same pattern followed by the allocator fuzzer. But that's also not relevant. You can structure any fuzzing approach as a sequence of discrete test cases.

> Here's the problem I'm trying to address. I just ran the fuzzer for a long time. Maybe 20 minutes. The fuzzer eventually crashed due to an error in a test. I'd like to reproduce this, but the only way to do that is by running the fuzzer again for 20+ minutes. Identifying the cause of this failure will not be easy.
>
> I am not sure if it is possible to shortcut this, as it may be that it requires that much time for the random interaction of the methods being fuzzed to trigger the issue.

I agree it's not going to be possible to shortcut that specific scenario. The idea is to create different scenarios: scenarios that are easily reproducible.


Currently you have something like this:

```zig
const random = rng(seed).random();
while (true) {
    runIteration(random);
}
```

Instead, it could look something like this:

```zig
while (true) {
    seed += 1;
    print("using seed: {}", .{seed});
    const random = rng(seed).random();
    for (0..test_case_size) |_| {
        runIteration(random);
    }
}
```

@dadepo (Contributor, Author) commented Jan 15, 2025

> Instead, it could look something like this:
>
> ```zig
> while (true) {
>     seed += 1;
>     print("using seed: {}", .{seed});
>     const random = rng(seed).random();
>     for (0..test_case_size) |_| {
>         runIteration(random);
>     }
> }
> ```

I am not sure I follow the intended benefit here. Given that the number of runs depends on test_case_size, what this modification does is add 1 to the provided seed and use that to run the iterations.

If the intention is to get to the seed that caused the issue faster, I am also not sure the proposal solves the original problem statement. If an error is triggered at seed 10, that does not make it easier to reproduce using seed 10 directly. This is because the fuzzer is stateful, and the error case that appeared at seed 10 might be caused by the nine previous seed runs.

Also, to be sure I am on the same page with the proposed improvement: is the current ledger fuzzer implementation similar in structure to existing fuzzers like accounts_db and gossip? And is the proposed modification something that would also need to be applied to the existing fuzzers? If so, what about discussing the proposed improvement separately (as I am still not sure I follow how it solves the identified problem), and then applying it to all the other fuzzers if it improves things?

@dnut (Contributor) commented Jan 15, 2025

It solves the problem if you reset the state for each test case. I was just giving a basic pseudocode example to express the general idea. I wasn't trying to write the code that solves the problem in its entirety. At that point I think it's more efficient to just make the change in the code, so I committed it already.

Regarding gossip and accountsdb, I do think those fuzzers are flawed and need to be reworked, but that's not in scope right now. I just think we should not be adding more fuzzers that have the same flaw.
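The reset-per-case idea (a fresh state plus a seeded RNG for each discrete case) is what makes a single seed sufficient to replay a failure. A Python sketch, illustrative only and not the sig code:

```python
import random

def run_test_case(seed, steps=100):
    """One discrete test case: fresh state plus a seeded RNG, so the
    seed alone is enough to replay the exact same case later."""
    rng = random.Random(seed)
    state = {}  # reset per case; nothing carries over between seeds
    for _ in range(steps):
        key = rng.randrange(8)
        if rng.random() < 0.5:
            state[key] = rng.randrange(100)  # simulate a put
        else:
            state.pop(key, None)             # simulate a delete
    return state

# Replaying a seed reproduces the same final state:
assert run_test_case(1234) == run_test_case(1234)
```

Because nothing carries over between cases, a failure logged under seed N can be reproduced by running that one case alone, without replaying the N-1 cases before it.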

dnut previously approved these changes Jan 15, 2025
@dnut dnut dismissed their stale review January 15, 2025 15:35

didn't realize old feedback was still not addressed

@dnut dnut enabled auto-merge (squash) January 15, 2025 15:43
@dadepo (Contributor, Author) commented Jan 15, 2025

@dnut I think the recent changes might not be working as intended.

Running

```console
$ zig build fuzz -Dblockstore=rocksdb -- ledger 10 10 -freference-traced
```

continually prints

```
action_name reached max actions: 10
action_name reached max actions: 10
action_name reached max actions: 10
```

until I kill the process.

Also the action_name bit is not providing the intended information on which action is being executed.

@dnut dnut merged commit fbc554e into main Jan 15, 2025
8 checks passed
@dnut dnut deleted the dade/ledger-fuzz-test branch January 15, 2025 15:57
@dnut (Contributor) commented Jan 15, 2025

My mistake. I thought you added this to CI. I'll open a pr to add it to CI and fix the loop issue.

On the diff in src/ledger/fuzz.zig:

```zig
// the method calls being fuzzed return expected data.
var data_map = std.AutoHashMap(u32, Data).init(allocator);
defer data_map.deinit();
for (0..1_000) |_| {
```

@dadepo (Contributor, Author) commented:

@dnut The benefit of this change is that the seed used is not just the one passed in. The observation, though, is that this fixes the iteration count for each seed at 1000, which may seem arbitrary.

It would probably be an improvement to either make this configurable (which would differ from the other fuzzers), or to not change the seed value like this but have a random value generated by CI.


Successfully merging this pull request may close these issues.

test(ledger): add fuzz tests