-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-45045: [C++][Parquet] Add a benchmark for size_statistics_level #45085
Conversation
This comment was marked as outdated.
This comment was marked as outdated.
The slowdown on lists is a bit surprising, is it because of levels histograms? Does the non-list case have nulls? |
Did you mean the slowdown from which one?
For 1, I think it is due to explosion of element sizes. The data size is large enough that most iteration numbers are 1, which may affect judgement.
Yes, the null probability is hard-coded to 50%. |
There must be some unstable test since the None encoding is slower...🤔 |
Yes, the iteration is 1 in most cases which is not stable. |
I have changed kBenchmarkSize from 10_000_000 to 1_000_000 and below is the result:
|
Ok, some observations:
|
I was to observe the page index size overhead compared to the file size. Now it seems to be pretty trivial and thus compression can be removed. |
Number of leaf values: 1024 * 1024
It is counter-intuitive that |
Thanks. By the way, can you disable dictionary encoding? Does it use PLAIN encoding? |
After disabling dictionary, the result looks more reasonable.
|
Do you want me to change anything? @pitrou From the result, I don't think we should enable size stats by default. cc @emkornfield |
Thanks for the benchmarks. am I reading this correct there is a about 30% regression for plain encoded integers but otherwise it does not add substantial overhead? If something like lz4 raw is used for compression do we still see such a large regression for integers? (I'd assume most people are doing at least some form of light-weight compression) |
@emkornfield I did an experiment with zstd before: #45085 (comment) |
|
||
// This should result in multiple pages for most primitive types | ||
constexpr int64_t kBenchmarkSize = 1024 * 1024; | ||
constexpr double kNullProbability = 0.5; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably a worst case for definition levels encoding. Perhaps 0.9 or 0.95 would exhibit the size statistics overhead even more?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kNullProbability = 0.95
------------------------------------------------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
------------------------------------------------------------------------------------------------------------------------------------------------
BM_WritePrimitiveColumn<SizeStatisticsLevel::None, ::arrow::Int64Type> 61819253 ns 61819750 ns 20 bytes_per_second=131.43Mi/s items_per_second=16.9618M/s output_size=546.091k page_index_size=33
BM_WritePrimitiveColumn<SizeStatisticsLevel::ColumnChunk, ::arrow::Int64Type> 55140742 ns 54834400 ns 10 bytes_per_second=148.173Mi/s items_per_second=19.1226M/s output_size=546.107k page_index_size=33
BM_WritePrimitiveColumn<SizeStatisticsLevel::PageAndColumnChunk, ::arrow::Int64Type> 53924121 ns 53887400 ns 10 bytes_per_second=150.777Mi/s items_per_second=19.4586M/s output_size=546.121k page_index_size=47
BM_WritePrimitiveColumn<SizeStatisticsLevel::None, ::arrow::StringType> 51773791 ns 51774500 ns 10 bytes_per_second=89.4236Mi/s items_per_second=20.2527M/s output_size=864.083k page_index_size=30
BM_WritePrimitiveColumn<SizeStatisticsLevel::ColumnChunk, ::arrow::StringType> 65489058 ns 65488500 ns 10 bytes_per_second=70.6973Mi/s items_per_second=16.0116M/s output_size=864.103k page_index_size=30
BM_WritePrimitiveColumn<SizeStatisticsLevel::PageAndColumnChunk, ::arrow::StringType> 65241288 ns 65241300 ns 10 bytes_per_second=70.9652Mi/s items_per_second=16.0723M/s output_size=864.122k page_index_size=44
BM_WriteListColumn<SizeStatisticsLevel::None, ::arrow::Int64Type> 72174783 ns 72174900 ns 10 bytes_per_second=118.289Mi/s items_per_second=14.5283M/s output_size=625.915k page_index_size=34
BM_WriteListColumn<SizeStatisticsLevel::ColumnChunk, ::arrow::Int64Type> 102759675 ns 102760300 ns 10 bytes_per_second=83.0817Mi/s items_per_second=10.2041M/s output_size=625.937k page_index_size=34
BM_WriteListColumn<SizeStatisticsLevel::PageAndColumnChunk, ::arrow::Int64Type> 105034546 ns 105034000 ns 10 bytes_per_second=81.2832Mi/s items_per_second=9.98321M/s output_size=625.957k page_index_size=54
BM_WriteListColumn<SizeStatisticsLevel::None, ::arrow::StringType> 92049333 ns 92049200 ns 10 bytes_per_second=54.779Mi/s items_per_second=11.3915M/s output_size=944.123k page_index_size=31
BM_WriteListColumn<SizeStatisticsLevel::ColumnChunk, ::arrow::StringType> 122477704 ns 122478100 ns 10 bytes_per_second=41.1695Mi/s items_per_second=8.56133M/s output_size=944.149k page_index_size=31
BM_WriteListColumn<SizeStatisticsLevel::PageAndColumnChunk, ::arrow::StringType> 121217775 ns 121217000 ns 10 bytes_per_second=41.5978Mi/s items_per_second=8.6504M/s output_size=944.174k page_index_size=51
Sorry for the delay @wgtmac . I posted a small suggestion but this looks good to me. Later, it would be good to investigate the cause of the overhead. It is surprising that there is a larger overhead for Int64 than for String. The numbers you posted in #45085 (comment) let me compute the following overheads (in ns/item):
|
Oh, wait, the benchmark is broken. Will push a fix to get more reasonable numbers. |
59c73a9
to
c34501e
Compare
Ok, I now get these numbers locally:
|
And now the size statistics overhead in ns/item is more consistent (and smaller!):
|
What about the null probability? 0.5 or 0.95? |
My latest push makes it 0.95. Can you check the changes look ok to you? |
Yes, it looks good. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, will merge. Thanks @wgtmac !
…stics (#45202) ### Rationale for this change We found out in #45085 that there is a non-trivial overhead when writing size statistics is enabled. ### What changes are included in this PR? Dramatically reduce overhead by speeding up def/rep levels histogram updates. Performance results on the author's machine: ``` ------------------------------------------------------------------------------------------------------------------------------------------------ Benchmark Time CPU Iterations UserCounters... ------------------------------------------------------------------------------------------------------------------------------------------------ BM_WritePrimitiveColumn<SizeStatisticsLevel::None, ::arrow::Int64Type> 8103053 ns 8098569 ns 86 bytes_per_second=1003.26Mi/s items_per_second=129.477M/s output_size=537.472k page_index_size=33 BM_WritePrimitiveColumn<SizeStatisticsLevel::ColumnChunk, ::arrow::Int64Type> 8153499 ns 8148492 ns 86 bytes_per_second=997.117Mi/s items_per_second=128.683M/s output_size=537.488k page_index_size=33 BM_WritePrimitiveColumn<SizeStatisticsLevel::PageAndColumnChunk, ::arrow::Int64Type> 8212560 ns 8207754 ns 83 bytes_per_second=989.918Mi/s items_per_second=127.754M/s output_size=537.502k page_index_size=47 BM_WritePrimitiveColumn<SizeStatisticsLevel::None, ::arrow::StringType> 10405020 ns 10400775 ns 67 bytes_per_second=444.142Mi/s items_per_second=100.817M/s output_size=848.305k page_index_size=34 BM_WritePrimitiveColumn<SizeStatisticsLevel::ColumnChunk, ::arrow::StringType> 10464784 ns 10460778 ns 66 bytes_per_second=441.594Mi/s items_per_second=100.239M/s output_size=848.325k page_index_size=34 BM_WritePrimitiveColumn<SizeStatisticsLevel::PageAndColumnChunk, ::arrow::StringType> 10469832 ns 10465739 ns 67 bytes_per_second=441.385Mi/s items_per_second=100.191M/s output_size=848.344k page_index_size=48 BM_WriteListColumn<SizeStatisticsLevel::None, ::arrow::Int64Type> 13004962 ns 12992678 ns 52 bytes_per_second=657.101Mi/s items_per_second=80.7052M/s output_size=617.464k page_index_size=34 BM_WriteListColumn<SizeStatisticsLevel::ColumnChunk, ::arrow::Int64Type> 13718352 ns 13705599 ns 50 bytes_per_second=622.921Mi/s items_per_second=76.5071M/s output_size=617.486k page_index_size=34 BM_WriteListColumn<SizeStatisticsLevel::PageAndColumnChunk, ::arrow::Int64Type> 13845553 ns 13832138 ns 52 bytes_per_second=617.222Mi/s items_per_second=75.8072M/s output_size=617.506k page_index_size=54 BM_WriteListColumn<SizeStatisticsLevel::None, ::arrow::StringType> 15715263 ns 15702707 ns 44 bytes_per_second=320.449Mi/s items_per_second=66.7768M/s output_size=927.326k page_index_size=35 BM_WriteListColumn<SizeStatisticsLevel::ColumnChunk, ::arrow::StringType> 16507328 ns 16493800 ns 43 bytes_per_second=305.079Mi/s items_per_second=63.5739M/s output_size=927.352k page_index_size=35 BM_WriteListColumn<SizeStatisticsLevel::PageAndColumnChunk, ::arrow::StringType> 16575359 ns 16561311 ns 42 bytes_per_second=303.836Mi/s items_per_second=63.3148M/s output_size=927.377k page_index_size=55 ``` Performance results without this PR: ``` ------------------------------------------------------------------------------------------------------------------------------------------------ Benchmark Time CPU Iterations UserCounters... ------------------------------------------------------------------------------------------------------------------------------------------------ BM_WritePrimitiveColumn<SizeStatisticsLevel::None, ::arrow::Int64Type> 8042576 ns 8037678 ns 87 bytes_per_second=1010.86Mi/s items_per_second=130.458M/s output_size=537.472k page_index_size=33 BM_WritePrimitiveColumn<SizeStatisticsLevel::ColumnChunk, ::arrow::Int64Type> 9576627 ns 9571279 ns 73 bytes_per_second=848.894Mi/s items_per_second=109.554M/s output_size=537.488k page_index_size=33 BM_WritePrimitiveColumn<SizeStatisticsLevel::PageAndColumnChunk, ::arrow::Int64Type> 9570204 ns 9563595 ns 73 bytes_per_second=849.576Mi/s items_per_second=109.642M/s output_size=537.502k page_index_size=47 BM_WritePrimitiveColumn<SizeStatisticsLevel::None, ::arrow::StringType> 10165397 ns 10160868 ns 69 bytes_per_second=454.628Mi/s items_per_second=103.197M/s output_size=848.305k page_index_size=34 BM_WritePrimitiveColumn<SizeStatisticsLevel::ColumnChunk, ::arrow::StringType> 11662568 ns 11657396 ns 60 bytes_per_second=396.265Mi/s items_per_second=89.9494M/s output_size=848.325k page_index_size=34 BM_WritePrimitiveColumn<SizeStatisticsLevel::PageAndColumnChunk, ::arrow::StringType> 11657135 ns 11653063 ns 60 bytes_per_second=396.412Mi/s items_per_second=89.9829M/s output_size=848.344k page_index_size=48 BM_WriteListColumn<SizeStatisticsLevel::None, ::arrow::Int64Type> 13182006 ns 13168704 ns 51 bytes_per_second=648.318Mi/s items_per_second=79.6264M/s output_size=617.464k page_index_size=34 BM_WriteListColumn<SizeStatisticsLevel::ColumnChunk, ::arrow::Int64Type> 16438205 ns 16421762 ns 43 bytes_per_second=519.89Mi/s items_per_second=63.8528M/s output_size=617.486k page_index_size=34 BM_WriteListColumn<SizeStatisticsLevel::PageAndColumnChunk, ::arrow::Int64Type> 16424615 ns 16409032 ns 42 bytes_per_second=520.293Mi/s items_per_second=63.9024M/s output_size=617.506k page_index_size=54 BM_WriteListColumn<SizeStatisticsLevel::None, ::arrow::StringType> 15387808 ns 15373086 ns 46 bytes_per_second=327.32Mi/s items_per_second=68.2086M/s output_size=927.326k page_index_size=35 BM_WriteListColumn<SizeStatisticsLevel::ColumnChunk, ::arrow::StringType> 18319628 ns 18302938 ns 37 bytes_per_second=274.924Mi/s items_per_second=57.29M/s output_size=927.352k page_index_size=35 BM_WriteListColumn<SizeStatisticsLevel::PageAndColumnChunk, ::arrow::StringType> 18346665 ns 18329336 ns 37 bytes_per_second=274.528Mi/s items_per_second=57.2075M/s output_size=927.377k page_index_size=55 ``` ### Are these changes tested? Tested by existing tests, validated by existing benchmarks. ### Are there any user-facing changes? No. * GitHub Issue: #45201 Authored-by: Antoine Pitrou <[email protected]> Signed-off-by: Antoine Pitrou <[email protected]>
After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 0804ba6. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about 3 possible false positives for unstable benchmarks that are known to sometimes produce them. |
Rationale for this change
Add a benchmark to know the performance of writing different size stats levels.
What changes are included in this PR?
Add a size_stats_benchmark for parquet.
Are these changes tested?
No
Are there any user-facing changes?
No