Again, THIS is why I blog.
One of my colleagues saw issues with IBM Business Monitor 8.5.6 on a Red Hat Enterprise Linux 6.6 VM, including core dumps and OutOfMemory exceptions.
Given that I'd created the base VM, and also installed all of the IBM middleware components ( this is for an enablement event that we're co-delivering next week ), I wanted to get to the bottom of it.
I took his build notes, and went through the same process.
And I saw much the same thing, specifically: -
[4/14/15 20:06:37:172 BST] 00000001 ContainerHelp E WSVR0102E: An error occurred stopping, com.ibm.ws.xs.httpsession.component.SessionListenerComponentImpl@907b28b4
java.lang.OutOfMemoryError: Failed to create a thread: retVal -1073741830, errno 11
java.lang.OutOfMemoryError: Failed to create a thread: retVal -1073741830, errno 11
[14/04/15 20:30:24:939 BST] 00000001 CommandMgr E ADMF0008E: Command Framework failed to initialize or cannot create CommandMgr in server mode. Root cause is java.lang.OutOfMemoryError: Failed to create a thread: retVal -1073741830, errno 11
in the AppClusterMember's SystemOut.log and: -
JVMDUMP039I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError" at 2015/04/14 20:29:55 - please wait.
JVMDUMP032I JVM requested System dump using '/opt/IBM/WebSphere/AppServer/profiles/BAMCell1AppSrv01/core.20150414.202955.3420.0001.dmp' in response to an event
JVMDUMP012E Error in System dump: insufficient system resources to generate dump, errno=11 "Resource temporarily unavailable"
JVMDUMP032I JVM requested Heap dump using '/opt/IBM/WebSphere/AppServer/profiles/BAMCell1AppSrv01/heapdump.20150414.202955.3420.0002.phd' in response to an event
JVMDUMP010I Heap dump written to /opt/IBM/WebSphere/AppServer/profiles/BAMCell1AppSrv01/heapdump.20150414.202955.3420.0002.phd
JVMDUMP032I JVM requested Java dump using '/opt/IBM/WebSphere/AppServer/profiles/BAMCell1AppSrv01/javacore.20150414.202955.3420.0003.txt' in response to an event
JVMDUMP010I Java dump written to /opt/IBM/WebSphere/AppServer/profiles/BAMCell1AppSrv01/javacore.20150414.202955.3420.0003.txt
JVMDUMP032I JVM requested Snap dump using '/opt/IBM/WebSphere/AppServer/profiles/BAMCell1AppSrv01/Snap.20150414.202955.3420.0004.trc' in response to an event
UTE001: Error starting trace thread for "Snap Dump Thread": -1
JVMDUMP010I Snap dump written to /opt/IBM/WebSphere/AppServer/profiles/BAMCell1AppSrv01/Snap.20150414.202955.3420.0004.trc
in the Node Agent's native_stderr.log.
Interestingly, I also saw this: -
tail -f SystemOut.log
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable
-bash: fork: retry: Resource temporarily unavailable
bash: fork: retry: Resource temporarily unavailable
-bash: fork: Resource temporarily unavailable
which immediately reminded me of not one BUT two blog posts from last year, at which point I thought ... ah, yes, about those ulimits :-)
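For what it's worth, errno 11 on Linux is EAGAIN ( "Resource temporarily unavailable" ), which is what both the JVM and bash report when a user runs into its max user processes limit, and threads count towards that limit. A rough sketch of how one might compare the two numbers follows; the helper names count_threads and headroom are made up for illustration, and it assumes a procps-style ps: -

```shell
#!/bin/bash
# Sketch only: compare the current user's live thread count with its
# "max user processes" limit. Helper names are hypothetical.

# Count every thread (LWP) owned by a user; assumes a procps-style ps.
count_threads() {
    ps -L -u "$1" -o lwp= 2>/dev/null | wc -l | tr -d ' '
}

# Report how close a thread count is to a limit.
# Args: current count, limit (a number, or "unlimited")
headroom() {
    case "$2" in
        unlimited) echo "unlimited" ;;
        ''|*[!0-9]*) echo "unknown" ;;
        *)
            if [ "$1" -ge "$2" ]; then
                echo "exhausted"
            else
                echo "$(( $2 - $1 )) left"
            fi
            ;;
    esac
}

user=$(id -un)
limit=$(ulimit -u 2>/dev/null || echo unknown)
used=$(count_threads "$user")
echo "$user: $used threads, limit $limit, $(headroom "$used" "$limit")"
```

Run as the user that owns the WebSphere processes, an "exhausted" ( or a very small number left ) would point the finger at the process limit rather than at the Java heap.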
I referred back to an older set of notes, and found this: -
Need to ensure ulimits are set: -
Increase: -
open files (-n) 10240
max user processes (-u) 1024
to: -
open files (-n) 65536
max user processes (-u) 16384
as follows: -
Add: -
# - nofile - max number of open files
wasadmin soft nofile 65536
wasadmin hard nofile 65536
# - nproc - max number of processes
wasadmin soft nproc 16384
wasadmin hard nproc 16384
to: -
/etc/security/limits.d/90-nproc.conf
Once I did this, and rebooted, everything came up ( not smelling of roses, but working A-OK ), which is nice.
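To check that the new values really are in effect for a running process, rather than just sitting in the file, one option is to read /proc/<pid>/limits. This is a minimal sketch, assuming Linux; the soft_limit helper is a made-up name, and the script looks at its own process unless you pass it a PID: -

```shell
#!/bin/bash
# Sketch only: print the soft limits a process is actually running with.
# soft_limit is a hypothetical helper, not part of any IBM or Red Hat tooling.

# Args: path to a limits file, limit name as shown in that file
soft_limit() {
    # Strip the limit name from the matching line; the first remaining
    # field is the soft limit, the second would be the hard limit.
    awk -v name="$2" 'index($0, name) == 1 { sub(name, ""); print $1 }' "$1"
}

# Inspect the given PID, or this process if none is given
limits_file="/proc/${1:-self}/limits"

if [ -r "$limits_file" ]; then
    echo "open files:         $(soft_limit "$limits_file" "Max open files")"
    echo "max user processes: $(soft_limit "$limits_file" "Max processes")"
fi
```

Passing the PID of one of the WebSphere JVMs as the first argument should show whether that JVM picked up 65536 and 16384, or is still running with the old values.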
Guess what I've added to my VM template ready for next week ??