What's the difference of this compiler and HexagonSDK's? #26
Comments
The Hexagon compiler in Compiler Explorer is the same one produced by the scripts in this repo, but it is different from the one in the Hexagon SDK. They differ in several ways: for example, the SDK compiler provides different passes. It might also be built on a different baseline LLVM/Clang version. For example, 8.7.06 is based on llvm+clang 15.0.0:
But ultimately they're similar in that they both produce executable code for Hexagon DSPs. |
@androm3da Thank you for the reply.
OK, so I can use Compiler Explorer for general-purpose assembly analysis, such as counting instruction packets as an estimate for the program. Is that correct? I also wonder whether the two use the same stack size. I found it to be about 14000 bytes in a unit-test program on a v66 cDSP, built with Hexagon SDK 5.5.0.1, which is far less than on x86-64 Linux (~8192 KB). This repo's hexagon-clang uses musl libc, and it seems musl libc uses a smaller stack size. |
You should expect the codegen performance of these two compilers to be different - at least with the current releases of each. This would mean that if you want to count the number of packets emitted for a given C program, you should expect differences in this count between the two.
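As a rough illustration of the packet counting discussed here, the sketch below counts packets in assembly text pasted from either compiler. It assumes each packet is opened by a line whose first non-blank character is `{`, which is how LLVM-style Hexagon assembly is usually printed; the function name `count_packets` is hypothetical, and any instruction printed outside braces would not be counted.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Counts Hexagon instruction packets in a block of assembly text.
// Assumption: every packet starts on a line whose first non-blank
// character is '{'. Instructions outside braces are ignored.
std::size_t count_packets(const std::string& asm_text) {
    std::istringstream in(asm_text);
    std::string line;
    std::size_t packets = 0;
    while (std::getline(in, line)) {
        const std::size_t first = line.find_first_not_of(" \t");
        if (first != std::string::npos && line[first] == '{') {
            ++packets;
        }
    }
    return packets;
}
```

A static count like this ignores loops, branches, and stalls, so it is only a coarse proxy, and the count for the same C source will likely differ between the two compilers.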
It's important to note -- there are two targets usable with the toolchain built in this repo: the baremetal one
Your question regarding stack size: are you asking about the typical size of an individual frame, or the size of the entire stack allocation? Linux programs grow their stack dynamically. I don't recall the stack allocation size or behavior for QuRT, but I might be able to look up this information.
Deciding when to use the stack and how much of it to use is an aspect of the compiler's codegen, and that would differ between the Hexagon SDK compiler and this toolchain's compiler. |
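For the Linux side of the stack-size question, a minimal sketch of how a program can ask the OS for the main thread's stack limit is shown below. It uses the POSIX `getrlimit` call with `RLIMIT_STACK`; the `StackLimit` struct and `query_stack_limit` name are hypothetical, and none of this applies to QuRT, where the stack is configured differently.

```cpp
#include <sys/resource.h>

// Result of querying the soft stack limit of the current process.
struct StackLimit {
    bool ok;                   // false if getrlimit failed
    bool unlimited;            // true if the soft limit is RLIM_INFINITY
    unsigned long long bytes;  // soft limit in bytes when not unlimited
};

// Queries the soft RLIMIT_STACK value (POSIX only). This is the ceiling
// up to which the main thread's stack may grow, not its current size.
StackLimit query_stack_limit() {
    rlimit rl{};
    if (getrlimit(RLIMIT_STACK, &rl) != 0) {
        return {false, false, 0};
    }
    if (rl.rlim_cur == RLIM_INFINITY) {
        return {true, true, 0};
    }
    return {true, false, static_cast<unsigned long long>(rl.rlim_cur)};
}
```

On a typical x86-64 Linux shell this reports the same value as `ulimit -s`, which is where a figure like 8192 KB usually comes from.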
OK, the two compilers are different and use one compiler for
Yes, I am asking about the size of the entire stack allocation. The frame chain looks like this:
// cv::AutoBuffer is part of OpenCV
// whole class: https://github.com/opencv/opencv/blob/4.x/modules/core/include/opencv2/core/utility.hpp#L71-L151
// default fixed_size : https://github.com/opencv/opencv/blob/4.x/modules/core/include/opencv2/core/utility.hpp#L100
// stack allocated buffer: https://github.com/opencv/opencv/blob/4.x/modules/core/include/opencv2/core/utility.hpp#L150
/*
template<typename _Tp, size_t fixed_size = 1024/sizeof(_Tp)+8> class AutoBuffer
{
public:
    ...
    _Tp buf[(fixed_size > 0) ? fixed_size : 1];
};
*/
void gemm()
{
    cv::AutoBuffer buf1;
    ...
    gemm_internal();
    ...
}
void gemm_internal()
{
    cv::AutoBuffer buf2;
    ...
}
int main()
{
    float a[200*200];
    float b[200];
    randomize(a, b);
    float c[200];
    gemm(a, b, c, 200, 200, 200, 1);
}
As illustrated, both |
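The frame chain above can be turned into a back-of-the-envelope lower bound at compile time. The sketch below is only an estimate under stated assumptions: the `AutoBuffer` element type is taken to be `float` (the snippet does not show the template argument), `float` is 4 bytes, and saved registers, padding, spills, and the elided `...` locals are ignored. The helper names are hypothetical.

```cpp
#include <cstddef>

static_assert(sizeof(float) == 4, "estimate assumes a 4-byte float");

// Element count of AutoBuffer's in-object array, mirroring the quoted
// default template argument: fixed_size = 1024/sizeof(_Tp)+8.
template <typename T>
constexpr std::size_t auto_buffer_elems() {
    return 1024 / sizeof(T) + 8;
}

// Bytes occupied by that in-object array alone.
template <typename T>
constexpr std::size_t auto_buffer_bytes() {
    return auto_buffer_elems<T>() * sizeof(T);
}

// Locals declared in main(): a[200*200], b[200], c[200].
constexpr std::size_t kMainLocalsBytes = (200 * 200 + 200 + 200) * sizeof(float);

// main() plus one AutoBuffer<float> (assumed type) in gemm() and one in
// gemm_internal(), all live at the deepest point of the call chain.
constexpr std::size_t kChainLowerBound =
    kMainLocalsBytes + 2 * auto_buffer_bytes<float>();

static_assert(kMainLocalsBytes == 161600, "arrays in main()");
static_assert(kChainLowerBound == 163712, "main() arrays plus two buffers");
```

Under these assumptions the arrays in `main()` account for 161600 bytes and each buffer adds 1056 bytes, so the large local arrays dominate the total regardless of which compiler lays out the frames.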
There is also a difference in integer types between the two compilers. I used the following snippet for compile-time testing and got different output:
#include <stdint.h>
#include <stdio.h>
#include <type_traits>
template<typename T> static inline T saturate_cast(uint32_t v) { return T(v); }
template<typename T> static inline T saturate_cast(int32_t v) { return T(v); }
int main()
{
    //int a = 233;
    //saturate_cast<uint8_t>(a);
    static_assert(std::is_same<int, int32_t>::value, "int is not int32_t");
    static_assert(std::is_same<int, long>::value, "int is not long");
    static_assert(!std::is_same<int32_t, long>::value, "int32_t is same as long");
    return 0;
}
Output from Hexagon SDK 5.5.0.1's hexagon-clang:
Output from Compiler Explorer's hexagon-clang 16.0.5: (https://godbolt.org/z/v3nYrqnhq)
|
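One way to probe the same question without stopping the build is to report which fundamental type `int32_t` aliases. The sketch below uses hypothetical names (`Int32Alias`, `int32_alias`, `int32_alias_name`). In C++, `int` and `long` are always distinct types even when they have the same size, so `std::is_same<int, long>` is false on every conforming compiler. The informative check is which of the two `int32_t` maps to.

```cpp
#include <cstdint>
#include <type_traits>

// Which fundamental type the toolchain's headers use for int32_t.
enum class Int32Alias { Int, Long, Other };

// Classifies int32_t at compile time instead of failing a static_assert.
constexpr Int32Alias int32_alias() {
    return std::is_same<std::int32_t, int>::value    ? Int32Alias::Int
           : std::is_same<std::int32_t, long>::value ? Int32Alias::Long
                                                     : Int32Alias::Other;
}

// Human-readable form, e.g. for printing from a test program.
constexpr const char* int32_alias_name() {
    return int32_alias() == Int32Alias::Int    ? "int"
           : int32_alias() == Int32Alias::Long ? "long"
                                               : "other";
}

// int and long never compare equal under is_same, whatever their sizes.
static_assert(!std::is_same<int, long>::value, "int and long are distinct types");
```

If `int32_t` turns out to be `long` on one toolchain, a call such as `saturate_cast<uint8_t>(a)` with an `int` argument has no exact-match overload there. Converting to `int32_t` and to `uint32_t` would then rank equally, so the call is ambiguous, which is one plausible way the two compilers could disagree on the snippet above.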
Okay, I see -- so you're trying to do some static analysis of the maximum stack depth? To compare with the OS limitation(s) on stack size?
Incidentally I had looked into this recently. Some differences between the Hexagon SDK compiler and this open source toolchain are expected. But this one may not be - I'll do a bit of digging. |
Hi, QUIC toolchain maintainers:
I installed Hexagon SDK 5.5.0.1 (via QPM) which contains hexagon-clang 8.7.06:
zz@localhost:~/soft/Qualcomm/Hexagon_SDK/5.5.0.1$ ./tools/HEXAGON_Tools/8.7.06/Tools/bin/hexagon-clang --version
QuIC LLVM Hexagon Clang version 8.7.06
Target: hexagon
Thread model: posix
InstalledDir: /home/zz/soft/Qualcomm/Hexagon_SDK/5.5.0.1/./tools/HEXAGON_Tools/8.7.06/Tools/bin
I would like to analyze some performance issues in C/C++ code and its disassembly. I noticed there is a hexagon-clang compiler in Compiler Explorer (https://godbolt.org/):
What I am confused about is whether they are the same compiler, or at least similar ones.