Under memory stress JVM builds an invalid native jobject, potentially causing segfault during GC #34

Open
Stewori opened this issue Sep 8, 2018 · 0 comments

Stewori commented Sep 8, 2018

This issue was split off from #32; see #32 (comment) and #32 (comment).
Reproducible by this code:

import sys

sys.path.insert(0, '/data/workspace/linux/numpy/1.13.3')

import numpy as np
from java.lang import System
import time

for i in range(2000):
	print i
	np.array(range(7))

System.gc()
time.sleep(2)

It may or may not be possible to reproduce this without NumPy, or even without JyNI. When the issue triggers, it produces the following log (excerpt), which points to JyAttribute.delAttr:

Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
J 2143 C1 org.python.core.JyAttribute.delAttr(Lorg/python/core/PyObject;B)V (169 bytes) @ 0x00007f19297499bb [0x00007f19297496c0+0x2fb]
J 2148 C1 JyNI.JyNI.clearNativeHandle(Lorg/python/core/PyObject;)V (37 bytes) @ 0x00007f192974d02c [0x00007f192974cd60+0x2cc]
v  ~StubRoutines::call_stub
V  [libjvm.so+0x666a0b]
V  [libjvm.so+0x6886fe]
V  [libjvm.so+0x68b02f]
C  [libJyNI.so+0x46f7e]  JyNI_CleanUp_JyObject+0xfc
C  [libJyNI.so+0xf42fe]  meth_dealloc+0x240
C  [libJyNI.so+0xfc87b]  JyGC_clearNativeReferences+0xefa
C  [libJyNI-Loader.so+0x561c]  Java_JyNI_JyNI_JyGC_1clearNativeReferences+0x34
j  JyNI.JyNI.JyGC_clearNativeReferences([JJ)Z+0
j  JyNI.gc.JyWeakReferenceGC$GCReaperThread.run()V+251
v  ~StubRoutines::call_stub
V  [libjvm.so+0x666a0b]
V  [libjvm.so+0x663fe4]
V  [libjvm.so+0x6645c7]
V  [libjvm.so+0x6a78a4]
V  [libjvm.so+0xa15547]
V  [libjvm.so+0xa15a0c]
V  [libjvm.so+0x8bae12]
C  [libpthread.so.0+0x76ba]  start_thread+0xca

The cause is a native jobject that should represent an org.python.core.PyObject but actually does not. Still, GetObjectRefType from the JNI API reports the object as a valid reference of type JNIWeakGlobalRefType. The object is neither NULL on the native side nor null on the Java side. Calling getClass on the object on the Java side yields java.lang.Object. Apparently no Java-side exception is pending; if there was one, it has been cleared.

The symptom can be suppressed by changing JyNI.clearNativeHandle to the following:

void clearNativeHandle(PyObject object) {
	if (object == null) {
		System.err.println("JyNI-Warning: clearNativeHandle called with null!");
		return;
	}
	if (object instanceof PyCPeer)
		((PyCPeer) object).objectHandle = 0;
	else {
		// The reference arrives via JNI, so its runtime type is not guaranteed.
		if (!(object instanceof PyObject))
			System.err.println("JyNI-Warning: clearNativeHandle received Non-PyObject!");
		else
			JyAttribute.delAttr(object, JyAttribute.JYNI_HANDLE_ATTR);
	}
}

However, we should investigate under which circumstances the invalid jobject is created. I am fairly confident that native JyNI code never creates a plain java.lang.Object. Is it possible that the JVM creates faulty native jobjects under heavy memory stress, or if it runs out of native references (cf. EnsureLocalCapacity in the JNI API)? This suspicion is supported by the observation that the issue can be avoided by calling System.gc right after import numpy:

import numpy as np
System.gc()
time.sleep(2)

The JVM does not take natively allocated memory into account when deciding when to run GC. Maybe JyNI should trigger GC explicitly from time to time, e.g. after importing a native module or if the number of JyGCHeads exceeds some limit. A proper strategy still needs to be worked out.
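
To make the idea concrete, here is a minimal sketch of such a trigger. It is only an illustration under assumptions: the class NativeGcTrigger, its methods and the threshold are hypothetical and not part of JyNI, and System.gc() is merely a request that the JVM is free to ignore or delay.

```java
/**
 * Hypothetical helper (not part of JyNI): requests a JVM collection once
 * a given number of native allocations has been recorded since the last
 * request. The GC action is injectable so the logic can be exercised
 * without actually running a collection.
 */
final class NativeGcTrigger {
    private final long threshold;
    private final Runnable gcAction;
    private long sinceLastGc = 0;

    NativeGcTrigger(long threshold, Runnable gcAction) {
        if (threshold <= 0) {
            throw new IllegalArgumentException("threshold must be positive");
        }
        this.threshold = threshold;
        this.gcAction = gcAction;
    }

    /** Convenience factory whose action is a plain System.gc() request. */
    static NativeGcTrigger usingSystemGc(long threshold) {
        return new NativeGcTrigger(threshold, System::gc);
    }

    /**
     * Records one native allocation (e.g. one newly created GC head).
     * Returns true if this call reached the threshold and requested a GC.
     */
    synchronized boolean recordNativeAllocation() {
        sinceLastGc++;
        if (sinceLastGc >= threshold) {
            sinceLastGc = 0;
            gcAction.run();
            return true;
        }
        return false;
    }

    /** Requests a GC unconditionally, e.g. right after a native import. */
    synchronized void forceGc() {
        sinceLastGc = 0;
        gcAction.run();
    }
}
```

Under these assumptions, JyNI could call recordNativeAllocation() wherever a JyGCHead is created and forceGc() after a native module has been imported. A suitable threshold would have to be found by experiment.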
