Add Support for Compressible Data Generation #45
Summary
This pull request adds a compress-percent option that lets ACT send data with a specified target compression ratio for storage workloads. The compress-percent option works similarly to the buffer_compress_percentage option in FIO. For example, setting compress-percent to 40 configures ACT to send data that should compress to 40% of its original size. The default compress-percent value is 100, which corresponds to fully random (incompressible) data. Values greater than 100 are not permitted. As in FIO, the data is made compressible by adding runs of zeros to random data. The runs of zeros are placed at 512-byte intervals in the buffer to prevent de-duplication or zero truncation from skewing the compression ratio. Because run-length encoding provides virtually all of the compression for the generated data pattern, most common algorithms (gzip, LZ4, Snappy, etc.) should yield similar compression ratios.
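The buffer layout described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code in this pull request: the function name `make_compressible`, the `seed` parameter, the placement of the zero run at the end of each 512-byte interval, and the rejection of negative values are all assumptions made for the example. It requires Python 3.9 or later for `random.Random.randbytes`.

```python
import random
import zlib

INTERVAL = 512  # bytes; spacing of the zero runs, as stated in the summary


def make_compressible(size, compress_percent, seed=0):
    """Return `size` bytes that should compress to roughly compress_percent %.

    Hypothetical helper: each 512-byte interval starts with random bytes and
    ends with a run of zeros. Putting the zeros at the end is an assumption.
    """
    if not 0 <= compress_percent <= 100:
        raise ValueError("compress_percent must be between 0 and 100")
    rng = random.Random(seed)
    buf = bytearray()
    for offset in range(0, size, INTERVAL):
        chunk_len = min(INTERVAL, size - offset)
        # Portion of this interval that stays random (incompressible).
        random_len = chunk_len * compress_percent // 100
        buf += rng.randbytes(random_len)
        # The remainder is a run of zeros, which compresses to almost nothing.
        buf += bytes(chunk_len - random_len)
    return bytes(buf)


if __name__ == "__main__":
    data = make_compressible(1 << 20, 40)
    ratio = len(zlib.compress(data)) / len(data)
    print(f"compressed to {ratio:.1%} of original size")
```

Running the script compresses a 1 MiB buffer with zlib and prints the achieved ratio, which gives a quick check that the pattern lands near the requested percentage.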
Motivation
Some SSDs support in-line compression. In-line compression can reduce tail latency by lowering the amount of write-to-read interference within the SSD. By providing an option within ACT to send compressible data, one can measure the effects of compression on tail latency under different workload assumptions.