Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"[./profiler.h:111] Thread state not found" when trying to profile openalpr #54

Open
petterreinholdtsen opened this issue Jul 11, 2016 · 7 comments
Labels
bug This is a bug

Comments

@petterreinholdtsen
Copy link
Contributor

To test coz-profiler, I decided to try to profile the openalpr command line tool, available from http://www.openalpr.com/ . It is a image processing system written in C++, used for extracting car license plates from images and video.

I modified the code like this to try benchmark how quickly it could process images.

diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
index 95d4e18..a970c39 100644
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@ -110,7 +110,7 @@ IF (WIN32)
 ELSE()

        SET (Extra_LIBS
-               pthread
+               pthread dl
        )

 ENDIF()
diff --git a/src/main.cpp b/src/main.cpp
index 190072f..f32d923 100644
--- a/src/main.cpp
+++ b/src/main.cpp
@@ -22,6 +22,7 @@
 #include <iostream>
 #include <iterator>
 #include <algorithm>
+#include <coz.h>

 #include "opencv2/highgui/highgui.hpp"
 #include "opencv2/imgproc/imgproc.hpp"
@@ -360,7 +361,7 @@ bool detectandshow( Alpr* alpr, cv::Mat frame, std::string region, bool writeJso
     }
   }

-
+  COZ_PROGRESS;

   return results.plates.size() > 0;
 }

After building, I ran it with a random MP4 vidoe file I had lying around after generating a simple openalpr.conf file:

coz run --- ./alpr --config ../openalpr.conf ~/tmp/input.mp4

Instead of getting a useful profiling run done, I got the error message "[./profiler.h:111] Thread state not found" instead. Am I using coz the wrong way?

Looking at the code, I suspect the problem is that openalpr is bootstraping its threads the wrong way, but I am unable to pinpoint exactly what is wrong. Also, I suspect coz should handle any code, and thus decided to report the problem as an issue.

@ccurtsinger
Copy link
Member

You may be right about coz missing thread creation, but the one other possibility is that openalpr is forking a new process. Could you verify that alpr is running in entirely in a single process?

One other possibility is that something in a module-level constructor is accessing thread states before coz is fully initialized. I will look into this.

@petterreinholdtsen
Copy link
Contributor Author

[Charlie Curtsinger]

You may be right about coz missing thread creation, but the one other
possibility is that openalpr is forking a new process. Could you
verify that alpr is running in entirely in a single process?

Not quite sure how to verify it, but running 'ltrace alpr ... 2>&1 | grep
fork' did not find any fork()s.

One other possibility is that something in a module-level constructor
is accessing thread states before coz is fully initialized. I will
look into this.

Great. You can find the openalpr code using
<URL: https://github.com/openalpr/openalpr.git >.

Happy hacking
Petter Reinholdtsen

@petterreinholdtsen
Copy link
Contributor Author

I investigated some more, and suspect the issue at hand is that threads are created before main() is called, in shared library initializers. Here is the gdb backtrace for the first call to pthread_create():

(gdb) bt
#0  __pthread_create_2_1 (newthread=0x7fffe6666538, attr=0x0, start_routine=0x7fffe645df50, 
    arg=0x0) at pthread_create.c:461
#1  0x00007fffe645e16f in ?? () from /lib/x86_64-linux-gnu/libusb-1.0.so.0
#2  0x00007fffe645c3c5 in ?? () from /lib/x86_64-linux-gnu/libusb-1.0.so.0
#3  0x00007fffe6454f5f in libusb_init () from /lib/x86_64-linux-gnu/libusb-1.0.so.0
#4  0x00007fffecb3fa4c in ?? () from /usr/lib/x86_64-linux-gnu/libdc1394.so.22
#5  0x00007fffecb2e015 in dc1394_new () from /usr/lib/x86_64-linux-gnu/libdc1394.so.22
#6  0x00007ffff57b51d9 in CvDC1394::CvDC1394() ()
   from /usr/lib/x86_64-linux-gnu/libopencv_highgui.so.2.4
#7  0x00007ffff579fc40 in ?? () from /usr/lib/x86_64-linux-gnu/libopencv_highgui.so.2.4
#8  0x00007ffff7dea9ba in call_init (l=<optimized out>, argc=argc@entry=2, 
    argv=argv@entry=0x7fffffffdf48, env=env@entry=0x7fffffffdf60) at dl-init.c:78
#9  0x00007ffff7deaaa3 in call_init (env=0x7fffffffdf60, argv=0x7fffffffdf48, argc=2, 
    l=<optimized out>) at dl-init.c:36
#10 _dl_init (main_map=0x7ffff7ffe1a8, argc=2, argv=0x7fffffffdf48, env=0x7fffffffdf60)
    at dl-init.c:126
#11 0x00007ffff7ddd1ca in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
#12 0x0000000000000002 in ?? ()
#13 0x00007fffffffe289 in ?? ()
#14 0x00007fffffffe297 in ?? ()
#15 0x0000000000000000 in ?? ()
(gdb)

I guess coz need to hook into the program startup code a bit earlier than wrapped_main() is doing it.

@petterreinholdtsen
Copy link
Contributor Author

I was able to avoid the crash using the following patch:

diff --git a/libcoz/profiler.h b/libcoz/profiler.h
index 515d998..032e2ea 100644
--- a/libcoz/profiler.h
+++ b/libcoz/profiler.h
@@ -108,6 +108,8 @@ public:
     thread_start_arg* new_arg;

     thread_state* state = get_thread_state();
+    if (NULL == state)
+      state = add_thread();
     REQUIRE(state) << "Thread state not found";

     // Allocate a struct to pass as an argument to the new thread

The idea is to not depend on wrapped_main() being called to set up the state pointer. I suspect it is not a safe solution, but just wanted to let you know that it got rid of the crash. Unfortunately I did not get much information in the profile.coz file:

startup time=1468705859216684133
runtime time=223636086700

I suspect this is because I patched the wrong function in openalpr.

@petterreinholdtsen
Copy link
Contributor Author

Hi. Is there more debugging information I can provide to help you with this?

@ccurtsinger
Copy link
Member

I'm not sure there's a good way to fix this issue at the moment. My attempts to initialize the profiler in the replaced __libc_start_main lead to segfaults (probably because static initializers have not yet run), but deferring initialization until after static initializers run means we'll get the same error you're seeing. We could silently ignore missing thread state, but that means the profiler wouldn't be collecting samples for the thread created before main runs.

I'll have to put this issue on hold for a bit.

@ccurtsinger
Copy link
Member

Implementing #60 should fix this issue, although this feature is in very early stages.

petterreinholdtsen added a commit to petterreinholdtsen/coz that referenced this issue Aug 14, 2016
…nitialized.

Make sure to initialize coz if pthread_create() is called before main().

Fixes issue plasma-umass#54
petterreinholdtsen added a commit to petterreinholdtsen/coz that referenced this issue Aug 14, 2016
Make sure to initialize coz if pthread_create() is called before main().
This happend with openalpr when a shared library initialized a thread
in the library initialization code.

Fixes issue plasma-umass#54
@ccurtsinger ccurtsinger added the bug This is a bug label Oct 8, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug This is a bug
Projects
None yet
Development

No branches or pull requests

2 participants