jgroups.stack.Protocol.down, the application needs to be restarted because it is experiencing thread starvation. #13540
mihazoubek
started this conversation in
General
Do you have thread dumps from all the nodes in your cluster? The thread in your post is flushing data to the network; there is no issue with it. The only known scenario of thread starvation is when all the JGroups threads are blocked by flow control and none is left to handle requests. The thread dumps should help identify whether that is the case. The following tuning may help reduce it:
Those options are valid for Infinispan 15 and for the default stacks.
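As a rough aid for reading the thread dumps, the sketch below counts the threads of the current JVM by state, restricted to a name prefix. It is only an illustration under stated assumptions: it assumes the JGroups threads are named with the `jgroups-` prefix (as in the dump posted in this thread) and that the code can run inside the application's JVM, for example from a diagnostic hook. The class `JGroupsThreadSummary` and the method `countByState` are hypothetical names, not part of Infinispan or JGroups. For a JVM you cannot modify, a regular thread dump taken on each node gives the same information.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

public class JGroupsThreadSummary {

    // Assumption: JGroups threads use the "jgroups-" name prefix,
    // as seen in the thread dump posted in this discussion.
    static final String PREFIX = "jgroups-";

    /** Counts live threads whose name starts with the prefix, grouped by state. */
    static Map<Thread.State, Integer> countByState(String prefix) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        // false, false: skip locked-monitor and synchronizer details to keep the call cheap.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info != null && info.getThreadName().startsWith(prefix)) {
                counts.merge(info.getThreadState(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints an empty map when no thread matches the prefix.
        System.out.println("jgroups threads by state: " + countByState(PREFIX));
    }
}
```

If nearly every matching thread is in `WAITING` or `TIMED_WAITING` and none is `RUNNABLE`, that may be worth checking against the full dump to see whether the waits are inside the flow control protocols; the counts alone do not prove it.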
Hi
We have an application that uses Infinispan and runs in a Docker container. We are experiencing a problem where the application needs to be restarted approximately every 14 days due to thread starvation (before that, it ran for 5 months without any issues). In the thread dump, I noticed that this issue is related to Infinispan (JGroups).
What could be causing this? We have already upgraded to the latest version. Could it be a CPU resource limitation within the Docker container? Previously, I had the CPU limit set to 4 cores, but I have now increased it to 6 cores to see if that helps. The application is running in host network mode.
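To see whether a container CPU limit is actually visible to the JVM, a small check like the one below can be run inside the same container. This is only a sketch: `ContainerCpuCheck` is a hypothetical class name, and the reported value depends on the JVM version and on how the limit is configured, so treat it as a data point and not as a diagnosis of the starvation.

```java
public class ContainerCpuCheck {

    /** Number of processors the JVM believes it can use. */
    static int visibleCpus() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        // Compare this number with the CPU limit configured for the container.
        System.out.println("CPUs visible to the JVM: " + visibleCpus());
    }
}
```

Many JVM and library thread pools are sized from this value by default, so it is useful to know what the JVM sees before and after changing the limit.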
If more information is needed, please let me know.
"jgroups-3659,plyp-be3-53137" #26962 prio=5 os_prio=0 cpu=927.60ms elapsed=32.13s tid=0x000074047c108800 nid=0x55acc runnable [0x00007405a210c000]
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketOutputStream.socketWrite0([email protected]/Native Method)
    at java.net.SocketOutputStream.socketWrite([email protected]/Unknown Source)
    at java.net.SocketOutputStream.write([email protected]/Unknown Source)
    at java.io.BufferedOutputStream.flushBuffer([email protected]/Unknown Source)
    at java.io.BufferedOutputStream.write([email protected]/Unknown Source)
    - locked <0x00000004e38f3bd0> (a java.io.BufferedOutputStream)
    at java.io.DataOutputStream.write([email protected]/Unknown Source)
    - locked <0x00000004e38f3ba8> (a java.io.DataOutputStream)
    at org.jgroups.blocks.cs.TcpConnection.doSend(TcpConnection.java:166)
    at org.jgroups.blocks.cs.TcpConnection.send(TcpConnection.java:136)
    at org.jgroups.blocks.cs.BaseServer.send(BaseServer.java:209)
    at org.jgroups.protocols.TCP.send(TCP.java:91)
    at org.jgroups.protocols.BasicTCP.sendUnicast(BasicTCP.java:146)
    at org.jgroups.protocols.TP.sendToSingleMember(TP.java:1650)
    at org.jgroups.protocols.TP.doSend(TP.java:1638)
    at org.jgroups.protocols.NoBundler.sendSingleMessage(NoBundler.java:38)
    at org.jgroups.protocols.NoBundler.send(NoBundler.java:30)
    at org.jgroups.protocols.TP.send(TP.java:1626)
    at org.jgroups.protocols.TP._send(TP.java:1359)
    at org.jgroups.protocols.TP.down(TP.java:1268)
    at org.jgroups.stack.Protocol.down(Protocol.java:287)
    at org.jgroups.stack.Protocol.down(Protocol.java:287)
    at org.jgroups.stack.Protocol.down(Protocol.java:287)
    at org.jgroups.protocols.FailureDetection.down(FailureDetection.java:171)
    at org.jgroups.stack.Protocol.down(Protocol.java:287)
    at org.jgroups.protocols.pbcast.NAKACK2.down(NAKACK2.java:567)
    at org.jgroups.protocols.UNICAST3.down(UNICAST3.java:656)
    at org.jgroups.protocols.pbcast.STABLE.down(STABLE.java:298)
    at org.jgroups.stack.Protocol.down(Protocol.java:287)
    at org.jgroups.protocols.UFC_NB.lambda$new$0(UFC_NB.java:28)
    at org.jgroups.protocols.UFC_NB$$Lambda$744/0x0000000840c51c40.accept(Unknown Source)
    at java.util.ArrayList.forEach([email protected]/Unknown Source)
    at org.jgroups.util.NonBlockingCredit.increment(NonBlockingCredit.java:90)
    at org.jgroups.protocols.UFC.handleCredit(UFC.java:163)
    at org.jgroups.protocols.FlowControl.handleUpEvent(FlowControl.java:380)
    at org.jgroups.protocols.FlowControl.up(FlowControl.java:358)
    at org.jgroups.protocols.pbcast.GMS.up(GMS.java:876)
    at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:254)
    at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1055)
    at org.jgroups.protocols.UNICAST3.addMessage(UNICAST3.java:778)
    at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:759)
    at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:412)
    at org.jgroups.protocols.pbcast.NAKACK2.up(NAKACK2.java:598)
    at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:132)
    at org.jgroups.protocols.FailureDetection.up(FailureDetection.java:186)
    at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:254)
    at org.jgroups.protocols.MERGE3.up(MERGE3.java:281)
    at org.jgroups.protocols.Discovery.up(Discovery.java:300)
    at org.jgroups.protocols.TP.passMessageUp(TP.java:1410)
    at org.jgroups.util.SubmitToThreadPool$SingleMessageHandler.run(SubmitToThreadPool.java:98)
    at java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/Unknown Source)
    at java.lang.Thread.run([email protected]/Unknown Source)