My large OH system running on Debian has started crashing several times a week. To try to alleviate this I’ve updated OH, the OS and the JVM, which has if anything made it worse. The underlying operating system survives the crash, and the messages file says something along the lines of “the main process has exited, restarting.” Because there was a recent power cut, I booted another copy of the OS and checked the disks, but this hasn’t fixed the problem. The last crash was in the early hours of the morning, so user interaction can be ruled out.
What information is needed to help with debugging this? I have a suspicion I’ve hit some sort of resource problem as it seems to have worsened with the last couple of rules I’ve enabled. The crash information taken from syslog is:
Aug 9 03:41:36 MySystem karaf[17630]: # A fatal error has been detected by the Java Runtime Environment:
Aug 9 03:41:36 MySystem karaf[17630]: #
Aug 9 03:41:36 MySystem karaf[17630]: # SIGSEGV (0xb) at pc=0x00007f0381cf34e3, pid=17630, tid=17762
Aug 9 03:41:36 MySystem karaf[17630]: #
Aug 9 03:41:36 MySystem karaf[17630]: # JRE version: OpenJDK Runtime Environment Zulu11.58+15-CA (11.0.16+8) (build 11.0.16+8-LTS)
Aug 9 03:41:36 MySystem karaf[17630]: # Java VM: OpenJDK 64-Bit Server VM Zulu11.58+15-CA (11.0.16+8-LTS, mixed mode, tiered, compressed oops, g1 gc, linux-amd64)
Aug 9 03:41:36 MySystem karaf[17630]: # Problematic frame:
Aug 9 03:41:36 MySystem karaf[17630]: # V [libjvm.so+0xa5a4e3] LoaderConstraintTable::purge_loader_constraints()+0xc3
Aug 9 03:41:36 MySystem karaf[17630]: #
Aug 9 03:41:36 MySystem karaf[17630]: # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
Aug 9 03:41:36 MySystem karaf[17630]: #
Aug 9 03:41:36 MySystem karaf[17630]: # An error report file with more information is saved as:
Aug 9 03:41:36 MySystem karaf[17630]: # /home/pi/openhab/var/lib/openhab/hs_err_pid17630.log
Aug 9 03:41:36 MySystem karaf[17630]: #
Aug 9 03:41:36 MySystem karaf[17630]: # If you would like to submit a bug report, please visit:
Aug 9 03:41:36 MySystem karaf[17630]: # http://www.azul.com/support/
Aug 9 03:41:36 MySystem karaf[17630]: #
Aug 9 03:41:36 MySystem systemd[1]: openhab.service: Main process exited, code=killed, status=6/ABRT
Aug 9 03:41:36 MySystem systemd[1]: openhab.service: Failed with result 'signal'.
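To compare crashes with each other, it may help to check whether every crash reports the same signal and problematic frame. Below is a minimal sketch, assuming the syslog lines keep the format shown above; the function name `summarize_crash` and the dictionary keys are made up for illustration and are not part of openHAB or the JVM.

```python
import re

# Matches the JVM fatal-error summary line, e.g.
# "# SIGSEGV (0xb) at pc=0x00007f0381cf34e3, pid=17630, tid=17762"
SIGNAL_RE = re.compile(
    r"#\s+(SIG[A-Z]+) \((0x[0-9a-f]+)\) at pc=(0x[0-9a-f]+), pid=(\d+), tid=(\d+)"
)

# Matches the line after "# Problematic frame:", e.g.
# "# V [libjvm.so+0xa5a4e3] LoaderConstraintTable::purge_loader_constraints()+0xc3"
FRAME_RE = re.compile(r"#\s+([A-Za-z])\s+\[([^\]+]+)\+(0x[0-9a-f]+)\]\s+(.*)$")


def summarize_crash(lines):
    """Collect signal, pid, tid and problematic frame from JVM crash lines.

    Hypothetical helper: keys that are not found are simply left out.
    """
    summary = {}
    for line in lines:
        match = SIGNAL_RE.search(line)
        if match:
            summary["signal"] = match.group(1)
            summary["pid"] = int(match.group(4))
            summary["tid"] = int(match.group(5))
            continue
        match = FRAME_RE.search(line)
        if match:
            summary["library"] = match.group(2)
            summary["frame"] = match.group(4).strip()
    return summary


if __name__ == "__main__":
    sample = [
        "Aug 9 03:41:36 MySystem karaf[17630]: # SIGSEGV (0xb) at pc=0x00007f0381cf34e3, pid=17630, tid=17762",
        "Aug 9 03:41:36 MySystem karaf[17630]: # Problematic frame:",
        "Aug 9 03:41:36 MySystem karaf[17630]: # V [libjvm.so+0xa5a4e3] LoaderConstraintTable::purge_loader_constraints()+0xc3",
    ]
    print(summarize_crash(sample))
```

If the frame differs from crash to crash, that would point in a different direction than one repeating frame inside libjvm.so, although neither result identifies the cause on its own.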
System details:
X86 Intel Based Fanless PC with M2 SSD and 8GB RAM running:
Linux 4.19.0-18-amd64 #1 SMP Debian 4.19.208-1 (2021-09-29) x86_64
OH 3.3 release build, 177 things, 1320 items, 344 rules.
The problem was already present on one of the recent milestones; unfortunately I didn’t note which one before I updated. The JVM seems to be the latest Java 11 version.
Released Bindings loaded:
Astro, Automower, Daikin, Exec, HTTP, Hue, ipcamera, Modbus, MQTT, Network, OpenWeatherMap, Pushover, RFXCOM, TR-064, Wemo, Z-Wave
Marketplace bindings:
Samsung TV Beta
Manually loaded Bindings:
iRobot, LG Thinq
Saved file from the crash is attached.
I have only serial port settings in /etc/default/openhab and have added nothing to /usr/lib/systemd/system/openhab.service. The system has 8GB RAM, and currently, after 12 1/2 hours of uptime, top shows this:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
26643 openhab 20 0 6657672 2.0g 28952 S 28.4 26.3 178:24.51 java
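One way to check the resource suspicion would be to log the resident memory of the java process at intervals and see whether it keeps growing before a crash. This is only a rough sketch, assuming a Linux /proc filesystem; the helper names `parse_vmrss_kb` and `sample_rss` are made up, and the PID has to be replaced with the current one.

```python
import time
from pathlib import Path


def parse_vmrss_kb(status_text):
    """Return the VmRSS value in kB from the text of /proc/<pid>/status, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:    2097152 kB"
            return int(line.split()[1])
    return None


def sample_rss(pid, interval_s=60, samples=5):
    """Print a timestamped resident-memory reading for the given PID."""
    for _ in range(samples):
        text = Path(f"/proc/{pid}/status").read_text()
        print(time.strftime("%Y-%m-%d %H:%M:%S"), parse_vmrss_kb(text), "kB")
        time.sleep(interval_s)


if __name__ == "__main__":
    # 26643 is the PID from the top output above; it changes on every restart.
    sample_rss(26643, interval_s=60, samples=5)
```

A steadily rising figure would support the resource theory; a flat one would not rule out a JVM or native-library problem.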
hs_err_pid17630.log (238.2 KB)
Hopefully someone can point me in the right direction to fixing this.
Thanks in advance!
In the meantime I’ve removed the marketplace Samsung TV binding, as it’s the last one I added, to see if anything changes.