diff --git a/documents/benchmark/hipobench.pdf b/documents/benchmark/hipobench.pdf
index 46dac07..79c38c8 100644
Binary files a/documents/benchmark/hipobench.pdf and b/documents/benchmark/hipobench.pdf differ
diff --git a/documents/benchmark/hipobench.tex b/documents/benchmark/hipobench.tex
index 1ab81bf..76b4020 100644
--- a/documents/benchmark/hipobench.tex
+++ b/documents/benchmark/hipobench.tex
@@ -41,9 +41,11 @@
 \title{HIPO Data Format Performance Compared to ROOT}
 \author[1]{Gagik Gavalian}
+\author[2]{Derek Glazier}
 %\affil[1]{Jefferson Lab, Newport News, VA, USA}
 % \affiliation[1]{organization={CRTC, Department of Computer Science, Old Dominion University}, city={Norfolk, VA}, country={USA}}
 \address[1]{Jefferson Lab, Newport News, VA, USA}
+\address[2]{University of Glasgow, Glasgow, UK}
 \begin{document}
 %\begin{titlepage}
@@ -200,7 +202,7 @@ \section{Reading Benchmarks}
 The results show that the MacBook M1 has the best writing speeds due to its very fast SSD. However, in all of the tests HIPO C++ and HIPO Java writers outperform ROOT writers significantly. The surprise of this benchmark is that the writing speeds of ROOT files on AMD and Intel Xeon architectures are not very different, as it was for reading speeds.
-Both data formats use same LZ4 compression algorithms, and in general writing speed will depend on level of compression requested from the library. For both samples we used highest level of compression and we end up similar file size for both data formats, ROOT file being slightly lagrger 2,119 MB vs 2,033 MB in HiPO data format.
+Both data formats use the same LZ4 compression algorithms, and in general writing speed will depend on the level of compression requested from the library. For both samples, we used highest level of compression and we ended up with similar file sizes for both data formats, the ROOT file being slightly larger 2,119 MB vs 2,033 MB in the HiPO data format.
 %\section{CLAS12 ROOT Tree Implementation}
 %In the previous sections, the performance of the ROOT data format is shown for the new ROOT tree reader. The legacy code used in
@@ -225,14 +227,26 @@ \section{Reading Benchmarks}
 \section{Discussion}
- In this article, we tested the analysis performance when reading data from the ROOT tree and HIPO data format.
- It was shown that HIPO outperforms ROOT on every tested platform and due to ROOT Tree architecture dependence
- on Intel architecture, the difference in performance is almost double. The writing tests show a significant (5-7 times) difference in the file writing times.
- Moreover, analysis done in Java using the Java implementation of HIPO data format outperformed analysis done using ROOT data format on all platforms
- and in all ROOT tree implementations. Writing files in Java is also significantly faster than writing ROOT trees.
+ In this article, we compared the performance of the ROOT tree and HIPO data format. In our study, we read the entire bank (all branches in ROOT) from both files and
+ used all the variables to perform some calculations. In this scenario, the HIPO outperforms the ROOT tree by about $30\%$ in reading speed. The Java implementation
+ of HIPO also outperforms the ROOT tree in reading speed. The HIPO data format is also $\sim 8$ times faster in writing speed in both implementations of C++ and Java.
 At the early stages of CLAS12 software development, the data was converted to ROOT for final analysis, which leads to significant
 empty computation cycles, and leads to data structures that are slower to process. With the development of CLAS12ROOT package (in C++)
 the HIPO files can be easily read into the ROOT environment and analyzed using data analysis and visualization tools available in ROOT.
+
+It is important to note that for data visualization when only one variable is read from the tuple-like structure and plotted, the ROOT tree outperforms the HIPO
+tree. The HIPO data format is primarily designed to read a collection of data and do analysis and for that purpose its performance is adequate to be used
+for CLAS12 experimental needs in all stages of data processing, starting from data acquisition, reconstruction code, and final physics analysis.
+
+% tested the analysis performance when reading data from the ROOT tree and HIPO data format.
+
+ %It was shown that HIPO outperforms ROOT on every tested platform and due to ROOT Tree architecture dependence
+ %on Intel architecture, the difference in performance is almost double. The writing tests show a significant (5-7 times) difference in the file writing times.
+ %Moreover, analysis done in Java using the Java implementation of HIPO data format outperformed analysis done using ROOT data format on all platforms
+ %and in all ROOT tree implementations. Writing files in Java is also significantly faster than writing ROOT trees.
+ %At the early stages of CLAS12 software development, the data was converted to ROOT for final analysis, which leads to significant
+ %empty computation cycles, and leads to data structures that are slower to process. With the development of CLAS12ROOT package (in C++)
+ %the HIPO files can be easily read into the ROOT environment and analyzed using data analysis and visualization tools available in ROOT.
 \newpage