Commit

modified the benchmark writeup
gavalian committed Jan 15, 2024
1 parent 9122a65 commit fe6cf91
Showing 2 changed files with 20 additions and 6 deletions.
Binary file modified documents/benchmark/hipobench.pdf
Binary file not shown.
26 changes: 20 additions & 6 deletions documents/benchmark/hipobench.tex
@@ -41,9 +41,11 @@

\title{HIPO Data Format Performance Compared to ROOT}
\author[1]{Gagik Gavalian}
\author[2]{Derek Glazier}
%\affil[1]{Jefferson Lab, Newport News, VA, USA}
% \affiliation[1]{organization={CRTC, Department of Computer Science, Old Dominion University}, city={Norfolk, VA}, country={USA}}
\address[1]{Jefferson Lab, Newport News, VA, USA}
\address[2]{University of Glasgow, Glasgow, UK}

\begin{document}
%\begin{titlepage}
@@ -200,7 +202,7 @@ \section{Reading Benchmarks}
The results show that the MacBook M1 has the best writing speeds due to its very fast SSD. However, in all of the tests
the HIPO C++ and HIPO Java writers significantly outperform the ROOT writers. The surprise of this benchmark is that the writing
speeds of ROOT files on the AMD and Intel Xeon architectures do not differ much, unlike the reading speeds.
Both data formats use same LZ4 compression algorithms, and in general writing speed will depend on level of compression requested from the library. For both samples we used highest level of compression and we end up similar file size for both data formats, ROOT file being slightly lagrger 2,119 MB vs 2,033 MB in HiPO data format.
Both data formats use the same LZ4 compression algorithm, and in general the writing speed depends on the level of compression requested from the library. For both samples, we used the highest level of compression and ended up with similar file sizes for both data formats, the ROOT file being slightly larger (2,119 MB vs 2,033 MB in the HIPO data format).
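As a rough check on how close the two sizes are, the relative difference can be worked out from the two file sizes quoted above (this arithmetic is added for illustration and uses no other measurements):
\[
  \frac{2119 - 2033}{2033} = \frac{86}{2033} \approx 0.042 ,
\]
so the ROOT file is roughly $4\%$ larger than the HIPO file.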

%\section{CLAS12 ROOT Tree Implementation}
%In the previous sections, the performance of the ROOT data format is shown for the new ROOT tree reader. The legacy code used in
@@ -225,14 +227,26 @@ \section{Reading Benchmarks}

\section{Discussion}

In this article, we tested the analysis performance when reading data from the ROOT tree and HIPO data format.
It was shown that HIPO outperforms ROOT on every tested platform and due to ROOT Tree architecture dependence
on Intel architecture, the difference in performance is almost double. The writing tests show a significant (5-7 times) difference in the file writing times.
Moreover, analysis done in Java using the Java implementation of HIPO data format outperformed analysis done using ROOT data format on all platforms
and in all ROOT tree implementations. Writing files in Java is also significantly faster than writing ROOT trees.
In this article, we compared the performance of the ROOT tree and the HIPO data format. In our study, we read the entire bank (all branches in ROOT) from both files and
used all the variables to perform some calculations. In this scenario, HIPO outperforms the ROOT tree by about $30\%$ in reading speed. The Java implementation
of HIPO also outperforms the ROOT tree in reading speed. The HIPO data format is also $\sim 8$ times faster in writing speed in both the C++ and Java implementations.
At the early stages of CLAS12 software development, the data was converted to ROOT for final analysis, which led to significant
wasted computation cycles and produced data structures that are slower to process. With the development of the CLAS12ROOT package (in C++),
HIPO files can be easily read into the ROOT environment and analyzed using the data analysis and visualization tools available in ROOT.

It is important to note that for data visualization, when only one variable is read from the tuple-like structure and plotted, the ROOT tree outperforms
HIPO. The HIPO data format is primarily designed for reading a collection of data and analyzing it, and for that purpose its performance is adequate
for CLAS12 experimental needs at all stages of data processing, from data acquisition and reconstruction to final physics analysis.

% tested the analysis performance when reading data from the ROOT tree and HIPO data format.

%It was shown that HIPO outperforms ROOT on every tested platform and due to ROOT Tree architecture dependence
%on Intel architecture, the difference in performance is almost double. The writing tests show a significant (5-7 times) difference in the file writing times.
%Moreover, analysis done in Java using the Java implementation of HIPO data format outperformed analysis done using ROOT data format on all platforms
%and in all ROOT tree implementations. Writing files in Java is also significantly faster than writing ROOT trees.
%At the early stages of CLAS12 software development, the data was converted to ROOT for final analysis, which leads to significant
%empty computation cycles, and leads to data structures that are slower to process. With the development of CLAS12ROOT package (in C++)
%the HIPO files can be easily read into the ROOT environment and analyzed using data analysis and visualization tools available in ROOT.


\newpage