Skip to content
This repository has been archived by the owner on May 14, 2024. It is now read-only.

comgr-objdump.cpp causes crash if loaded alongside mesa #52

Closed
xytovl opened this issue Dec 20, 2022 · 15 comments
Closed

comgr-objdump.cpp causes crash if loaded alongside mesa #52

xytovl opened this issue Dec 20, 2022 · 15 comments

Comments

@xytovl
Copy link

xytovl commented Dec 20, 2022

When using both libamd_comgr.so and mesa with libvulkan_radeon, application (blender-3.3 in my case) crashes with

mesa: CommandLine Error: Option 'h' registered more than once!
LLVM ERROR: inconsistency in registered CommandLine options
Abandon (core dumped)

the following patch fixes this for me:

diff -ru ROCm-CompilerSupport-rocm-5.3.3.orig/lib/comgr/src/comgr-objdump.cpp ROCm-CompilerSupport-rocm-5.3.3/lib/comgr/src/comgr-objdump.cpp
--- ROCm-CompilerSupport-rocm-5.3.3.orig/lib/comgr/src/comgr-objdump.cpp	2022-07-28 17:06:14.000000000 +0200
+++ ROCm-CompilerSupport-rocm-5.3.3/lib/comgr/src/comgr-objdump.cpp	2022-12-20 20:03:53.357398486 +0100
@@ -175,9 +175,6 @@
 static cl::alias SectionHeadersShort("headers",
                                      cl::desc("Alias for --section-headers"),
                                      cl::aliasopt(SectionHeaders));
-static cl::alias SectionHeadersShorter("h",
-                                       cl::desc("Alias for --section-headers"),
-                                       cl::aliasopt(SectionHeaders));
 
 cl::list<std::string>
     FilterSections("section",

If I understand correctly, none of the command line options and aliases defined here are used, but this one alone fixes the issue.

@dbenedb
Copy link

dbenedb commented Jan 4, 2023

Here is Blender 3.4 bug report with more details about the same issue:

https://developer.blender.org/T102018

@cgmb
Copy link

cgmb commented Jan 5, 2023

I still don't understand why either mesa or comgr register these command line options. Does anyone know what their purpose is?

@xytovl
Copy link
Author

xytovl commented Jan 5, 2023

comgr-objdump seems to be a derivate from an old version of https://github.com/llvm/llvm-project/blob/main/llvm/tools/llvm-objdump/llvm-objdump.cpp, the original file is intended as a command-line utility and it makes sense to define command-line options.

The conflicting definition seems to be in https://github.com/llvm/llvm-project/blob/239a174d92dd2bd99ecb1308dc8937040895b04d/llvm/lib/Support/CommandLine.cpp#L2618
Where "help" and "h" are defined, I think this line should also not end up in llvm as a library but this is probably a larger change as it could probably break "help" for llvm command line tools.

@kzhuravl
Copy link
Collaborator

kzhuravl commented Jan 6, 2023

cc @lamb-j

@lamb-j
Copy link
Collaborator

lamb-j commented Jan 6, 2023

Ideally we could have two separate threads/components both manage their own options without clashing, but this is a known shortcoming of the way LLVM options are currently implemented. We have ambitions to work on this in the long term, but it requires big upstream LLVM changes.

In the short term, I think it may be safe to remove some alias options from comgr-objdump that have a higher probability of clashing with other libraries (like "-h" here). @kzhuravl Maybe we could remove all of the short/shorter options (-d, -p, -h, -s, etc.) and document that comgr-objdump requires full-length option names?

FYI these options can be set by a user via amd_comgr_action_info_set_option_list(...) before calling something like amd_comgr_do_action(AMD_COMGR_ACTION_DISASSEMBLE_RELOCATABLE_TO_SOURCE, ...)

@RadoslavL
Copy link

How did you install the patch? I tried to do the same thing, but got these error messages:

 * Applying user patches from /etc/portage/patches ...
 * Applying llvm-fix.patch ...
can't find file to patch at input line 4
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
|diff -ru ROCm-CompilerSupport-rocm-5.3.3.orig/lib/comgr/src/comgr-objdump.cpp ROCm-CompilerSupport-rocm-5.3.3/lib/comgr/src/comgr-objdump.cpp
|--- ROCm-CompilerSupport-rocm-5.3.3.orig/lib/comgr/src/comgr-objdump.cpp	2022-07-28 17:06:14.000000000 +0200
|+++ ROCm-CompilerSupport-rocm-5.3.3/lib/comgr/src/comgr-objdump.cpp	2022-12-20 20:03:53.357398486 +0100
--------------------------
No file to patch.  Skipping patch.
1 out of 1 hunk ignored                                                                                                                                                                                      [ !! ]
 * ERROR: dev-libs/rocm-comgr-5.3.3::gentoo failed (prepare phase):
 *   patch -p1  failed with /etc/portage/patches/dev-libs/rocm-comgr/llvm-fix.patch
 * 
 * Call stack:
 *               ebuild.sh, line  136:  Called src_prepare
 *             environment, line 2332:  Called cmake_src_prepare
 *             environment, line 1067:  Called default_src_prepare
 *      phase-functions.sh, line  948:  Called __eapi8_src_prepare
 *             environment, line  323:  Called eapply_user
 *             environment, line 1260:  Called eapply '/etc/portage/patches/dev-libs/rocm-comgr/llvm-fix.patch'
 *             environment, line 1223:  Called _eapply_patch '/etc/portage/patches/dev-libs/rocm-comgr/llvm-fix.patch'
 *             environment, line 1161:  Called __helpers_die 'patch -p1  failed with /etc/portage/patches/dev-libs/rocm-comgr/llvm-fix.patch'
 *   isolated-functions.sh, line  112:  Called die
 * The specific snippet of code:
 *   		die "$@"
 * 
 * If you need support, post the output of `emerge --info '=dev-libs/rocm-comgr-5.3.3::gentoo'`,
 * the complete build log and the output of `emerge -pqv '=dev-libs/rocm-comgr-5.3.3::gentoo'`.
 * The complete build log is located at '/var/tmp/portage/dev-libs/rocm-comgr-5.3.3/temp/build.log'.
 * The ebuild environment file is located at '/var/tmp/portage/dev-libs/rocm-comgr-5.3.3/temp/environment'.
 * Working directory: '/var/tmp/portage/dev-libs/rocm-comgr-5.3.3/work/ROCm-CompilerSupport-rocm-5.3.3/lib/comgr'
 * S: '/var/tmp/portage/dev-libs/rocm-comgr-5.3.3/work/ROCm-CompilerSupport-rocm-5.3.3/lib/comgr'

>>> Failed to emerge dev-libs/rocm-comgr-5.3.3, Log file:

>>>  '/var/tmp/portage/dev-libs/rocm-comgr-5.3.3/temp/build.log'

 * Messages for package dev-libs/rocm-comgr-5.3.3:

 * ERROR: dev-libs/rocm-comgr-5.3.3::gentoo failed (prepare phase):
 *   patch -p1  failed with /etc/portage/patches/dev-libs/rocm-comgr/llvm-fix.patch
 * 
 * Call stack:
 *               ebuild.sh, line  136:  Called src_prepare
 *             environment, line 2332:  Called cmake_src_prepare
 *             environment, line 1067:  Called default_src_prepare
 *      phase-functions.sh, line  948:  Called __eapi8_src_prepare
 *             environment, line  323:  Called eapply_user
 *             environment, line 1260:  Called eapply '/etc/portage/patches/dev-libs/rocm-comgr/llvm-fix.patch'
 *             environment, line 1223:  Called _eapply_patch '/etc/portage/patches/dev-libs/rocm-comgr/llvm-fix.patch'
 *             environment, line 1161:  Called __helpers_die 'patch -p1  failed with /etc/portage/patches/dev-libs/rocm-comgr/llvm-fix.patch'
 *   isolated-functions.sh, line  112:  Called die
 * The specific snippet of code:
 *   		die "$@"
 * 
 * If you need support, post the output of `emerge --info '=dev-libs/rocm-comgr-5.3.3::gentoo'`,
 * the complete build log and the output of `emerge -pqv '=dev-libs/rocm-comgr-5.3.3::gentoo'`.
 * The complete build log is located at '/var/tmp/portage/dev-libs/rocm-comgr-5.3.3/temp/build.log'.
 * The ebuild environment file is located at '/var/tmp/portage/dev-libs/rocm-comgr-5.3.3/temp/environment'.
 * Working directory: '/var/tmp/portage/dev-libs/rocm-comgr-5.3.3/work/ROCm-CompilerSupport-rocm-5.3.3/lib/comgr'
 * S: '/var/tmp/portage/dev-libs/rocm-comgr-5.3.3/work/ROCm-CompilerSupport-rocm-5.3.3/lib/comgr'

@xytovl
Copy link
Author

xytovl commented Jan 12, 2023

How did you install the patch? I tried to do the same thing, but got these error messages:

 * Applying user patches from /etc/portage/patches ...
 * Applying llvm-fix.patch ...
can't find file to patch at input line 4
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
|diff -ru ROCm-CompilerSupport-rocm-5.3.3.orig/lib/comgr/src/comgr-objdump.cpp ROCm-CompilerSupport-rocm-5.3.3/lib/comgr/src/comgr-objdump.cpp
|--- ROCm-CompilerSupport-rocm-5.3.3.orig/lib/comgr/src/comgr-objdump.cpp	2022-07-28 17:06:14.000000000 +0200
|+++ ROCm-CompilerSupport-rocm-5.3.3/lib/comgr/src/comgr-objdump.cpp	2022-12-20 20:03:53.357398486 +0100
--------------------------
No file to patch.  Skipping patch.
1 out of 1 hunk ignored                                                                                                                                                                                      [ !! ]
 * ERROR: dev-libs/rocm-comgr-5.3.3::gentoo failed (prepare phase):
 *   patch -p1  failed with /etc/portage/patches/dev-libs/rocm-comgr/llvm-fix.patch
 * 
 * Call stack:
 *               ebuild.sh, line  136:  Called src_prepare
 *             environment, line 2332:  Called cmake_src_prepare
 *             environment, line 1067:  Called default_src_prepare
 *      phase-functions.sh, line  948:  Called __eapi8_src_prepare
 *             environment, line  323:  Called eapply_user
 *             environment, line 1260:  Called eapply '/etc/portage/patches/dev-libs/rocm-comgr/llvm-fix.patch'
 *             environment, line 1223:  Called _eapply_patch '/etc/portage/patches/dev-libs/rocm-comgr/llvm-fix.patch'
 *             environment, line 1161:  Called __helpers_die 'patch -p1  failed with /etc/portage/patches/dev-libs/rocm-comgr/llvm-fix.patch'
 *   isolated-functions.sh, line  112:  Called die
 * The specific snippet of code:
 *   		die "$@"
 * 
 * If you need support, post the output of `emerge --info '=dev-libs/rocm-comgr-5.3.3::gentoo'`,
 * the complete build log and the output of `emerge -pqv '=dev-libs/rocm-comgr-5.3.3::gentoo'`.
 * The complete build log is located at '/var/tmp/portage/dev-libs/rocm-comgr-5.3.3/temp/build.log'.
 * The ebuild environment file is located at '/var/tmp/portage/dev-libs/rocm-comgr-5.3.3/temp/environment'.
 * Working directory: '/var/tmp/portage/dev-libs/rocm-comgr-5.3.3/work/ROCm-CompilerSupport-rocm-5.3.3/lib/comgr'
 * S: '/var/tmp/portage/dev-libs/rocm-comgr-5.3.3/work/ROCm-CompilerSupport-rocm-5.3.3/lib/comgr'

>>> Failed to emerge dev-libs/rocm-comgr-5.3.3, Log file:

>>>  '/var/tmp/portage/dev-libs/rocm-comgr-5.3.3/temp/build.log'

 * Messages for package dev-libs/rocm-comgr-5.3.3:

 * ERROR: dev-libs/rocm-comgr-5.3.3::gentoo failed (prepare phase):
 *   patch -p1  failed with /etc/portage/patches/dev-libs/rocm-comgr/llvm-fix.patch
 * 
 * Call stack:
 *               ebuild.sh, line  136:  Called src_prepare
 *             environment, line 2332:  Called cmake_src_prepare
 *             environment, line 1067:  Called default_src_prepare
 *      phase-functions.sh, line  948:  Called __eapi8_src_prepare
 *             environment, line  323:  Called eapply_user
 *             environment, line 1260:  Called eapply '/etc/portage/patches/dev-libs/rocm-comgr/llvm-fix.patch'
 *             environment, line 1223:  Called _eapply_patch '/etc/portage/patches/dev-libs/rocm-comgr/llvm-fix.patch'
 *             environment, line 1161:  Called __helpers_die 'patch -p1  failed with /etc/portage/patches/dev-libs/rocm-comgr/llvm-fix.patch'
 *   isolated-functions.sh, line  112:  Called die
 * The specific snippet of code:
 *   		die "$@"
 * 
 * If you need support, post the output of `emerge --info '=dev-libs/rocm-comgr-5.3.3::gentoo'`,
 * the complete build log and the output of `emerge -pqv '=dev-libs/rocm-comgr-5.3.3::gentoo'`.
 * The complete build log is located at '/var/tmp/portage/dev-libs/rocm-comgr-5.3.3/temp/build.log'.
 * The ebuild environment file is located at '/var/tmp/portage/dev-libs/rocm-comgr-5.3.3/temp/environment'.
 * Working directory: '/var/tmp/portage/dev-libs/rocm-comgr-5.3.3/work/ROCm-CompilerSupport-rocm-5.3.3/lib/comgr'
 * S: '/var/tmp/portage/dev-libs/rocm-comgr-5.3.3/work/ROCm-CompilerSupport-rocm-5.3.3/lib/comgr'

It looks like on Gentoo the archive does not have the same layout. The patch should be

diff -ru ROCm-CompilerSupport-rocm-5.3.3.orig/src/comgr-objdump.cpp ROCm-CompilerSupport-rocm-5.3.3/src/comgr-objdump.cpp
--- ROCm-CompilerSupport-rocm-5.3.3.orig/src/comgr-objdump.cpp	2022-07-28 17:06:14.000000000 +0200
+++ ROCm-CompilerSupport-rocm-5.3.3/src/comgr-objdump.cpp	2022-12-20 20:03:53.357398486 +0100
@@ -175,9 +175,6 @@
 static cl::alias SectionHeadersShort("headers",
                                      cl::desc("Alias for --section-headers"),
                                      cl::aliasopt(SectionHeaders));
-static cl::alias SectionHeadersShorter("h",
-                                       cl::desc("Alias for --section-headers"),
-                                       cl::aliasopt(SectionHeaders));
 
 cl::list<std::string>
     FilterSections("section",

The difference is only in the file path (lib/comgr is not present for Gentoo).

@RadoslavL
Copy link

RadoslavL commented Jan 12, 2023

@xytovl Thanks for the help!

@labre
Copy link

labre commented Jan 12, 2023

but it requires big upstream LLVM changes.

Which changes might that be and are they addressed by the following reviews?
https://reviews.llvm.org/D129129
https://reviews.llvm.org/D129134

I happen to have approached this issue from the LLVM side with
https://reviews.llvm.org/D141019
eventually leading me back to RocM after I threw gdb at it. My “solution” was just a quick hack however to test, whether that error can be ignored in this case.

@lamb-j
Copy link
Collaborator

lamb-j commented Jan 17, 2023

@xytovl I just committed 2d05f9e

Thanks for tracking this down!

@labre Nicolai's patches and the removal of ManagedStatic definitely addresses some of the issues. I haven't sat down with them yet to see what still needs to happen. Besides examples like this where we have Comgr and Mesa running at the same time, we also would like the option of having multiple instances of Comgr (and thus multiple LLVMs) running without hitting issues with the option parser.

I'll let you know if I figure anything out on this front, and I'd love to be kept in the loop on your progress!

@labre
Copy link

labre commented Jan 18, 2023

I'll let you know if I figure anything out on this front, and I'd love to be kept in the loop on your progress!

That would be taking false credit for others work. I’m just a Gentoo user, who simultaneously tried to track down this in LLVM instead of RocM. I just tested a quick hack in LLVM and then opened a RFC on this to increase awareness for the CommandLine parser issues in practice.

However I eventually was hinted towards this bug by gdb and the ongoing progress in LLVM by another discourse user, so basically I was a tad too late for investigating this one. ;)

Since @nhaehnle is the author of the Revisions, he is the better contact for staying in the loop. Thanks for merging the fix. :)

@lamb-j lamb-j closed this as completed Jan 23, 2023
@LordOfDragons
Copy link

I'll let you know if I figure anything out on this front, and I'd love to be kept in the loop on your progress!

That would be taking false credit for others work. I’m just a Gentoo user, who simultaneously tried to track down this in LLVM instead of RocM. I just tested a quick hack in LLVM and then opened a RFC on this to increase awareness for the CommandLine parser issues in practice.

However I eventually was hinted towards this bug by gdb and the ongoing progress in LLVM by another discourse user, so basically I was a tad too late for investigating this one. ;)

Since @nhaehnle is the author of the Revisions, he is the better contact for staying in the loop. Thanks for merging the fix. :)

You said you come from Gentoo. Have you been able to test this somehow? I'm hitting a similar problem with Blender crashing due to ROCm getting in conflict with MESA and the ROCm version 5.3.3 seems not to be new enough.

@labre
Copy link

labre commented Feb 20, 2023

You said you come from Gentoo. Have you been able to test this somehow? I'm hitting a similar problem with Blender crashing due to ROCm getting in conflict with MESA and the ROCm version 5.3.3 seems not to be new enough.

Well, I used the line deletions from the commit as a user patch and forgot about it. I updated it to be the output of git format-patch and adapted the path like outlined above:

The difference is only in the file path (lib/comgr is not present for Gentoo).

You can use the user patch attached with this post and put it into /etc/portage/patches/dev-libs/rocm-comgr as follows and as root, but please compare it with the commit diff. Do not ever trust a random patch offered to you on the Internet and double check the link targets of this post:

# Github did not like the patch extension, albeit claiming to do so
mv 0001-Remove-h-option-from-comgr-objdump.{md,patch}
mkdir -p /etc/portage/patches/dev-libs/rocm-comgr
cp 0001-Remove-h-option-from-comgr-objdump.patch /etc/portage/patches/dev-libs/rocm-comgr
# Remove rocm-opencl-runtime and rebuild runtime dependencies
emerge -C rocm-opencl-runtime
emerge --depclean
emerge rocm-opencl-runtime

0001-Remove-h-option-from-comgr-objdump.md

Let me know, whether this works for you. If you have any questions feel free to ask. Otherwise you’re welcome. :) Oh, and since you seem to not know the user patches feature, I suppose you’re relatively new to the Gentoo world. In that case, welcome to the community! :)

@LordOfDragons
Copy link

LordOfDragons commented Feb 20, 2023

Actually I use Gentoo since nearly 20 years but I never got in contact with user patches so far.

@wolfiestyle
Copy link

wolfiestyle commented Feb 21, 2023

This is the modified version of the patch i use myself on Gentoo with the path changes done. Drop in /etc/portage/patches/dev-libs/rocm-comgr-5.3.3-r1/ (you will need to mkdir -p this path) and create a .patch file with this content

diff -ru comgr.orig/src/comgr-objdump.cpp comgr/src/comgr-objdump.cpp
--- comgr.orig/src/comgr-objdump.cpp	2022-07-28 17:06:14.000000000 +0200
+++ comgr/src/comgr-objdump.cpp	2022-12-20 20:03:53.357398486 +0100
@@ -175,9 +175,6 @@
 static cl::alias SectionHeadersShort("headers",
                                      cl::desc("Alias for --section-headers"),
                                      cl::aliasopt(SectionHeaders));
-static cl::alias SectionHeadersShorter("h",
-                                       cl::desc("Alias for --section-headers"),
-                                       cl::aliasopt(SectionHeaders));
 
 cl::list<std::string>
     FilterSections("section",

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants