- Platform information:
- Hardware: Odroid HC2 (Samsung Exynos5422 octa-core CPU with 2 GHz Cortex-A15 and Cortex-A7 cores, 2 GB RAM, boot from SD card, root partition on 2.7 TB HDD)
- OS: Armbian 20.05.2
- Java Runtime Environment: openjdk version “1.8.0_252” (Zulu 8.46.0.225-CA-linux_aarch32hf) (build 1.8.0_252-b225) / openjdk version “1.8.0_252” (build 1.8.0_252-8u252-b09-1~deb9u1-b09)
- openHAB version: 2.5.5-1
- Issue of the topic: Java crashes regularly with SIGSEGV
I’ve been running openHAB (by way of the openHABian SD card image) on a Raspberry Pi 3B+ for a few months now, without many problems. On the same Raspberry Pi, I also run Mosquitto and dnsmasq, the latter to provide DHCP/DNS for my LAN.
I’ve migrated the entire setup to an Odroid HC2 with a 2.7 TB HDD to get rid of the SD card in the setup. I installed Armbian as a base, then installed openHABian from git as per the instructions at https://www.openhab.org/docs/installation/openhabian.html#other-linux-systems-add-openhabian-just-like-any-other-software. Mosquitto also runs on the same HC2, and dnsmasq has been moved over as well.
However, openHAB is unstable on the new setup. It crashes regularly with a SIGSEGV in Java:
Jun 06 18:19:20 house.ow.sono systemd[1]: Started openHAB 2 - empowering the smart home.
Jun 06 18:52:20 house.ow.sono karaf[24085]: Exception in thread "items-4" java.lang.IncompatibleClassChangeError: vtable stub
Jun 06 18:52:20 house.ow.sono karaf[24085]: at org.eclipse.smarthome.core.items.GroupItem.collectStateMembers(GroupItem.java:409)
Jun 06 18:52:20 house.ow.sono karaf[24085]: at org.eclipse.smarthome.core.items.GroupItem.getStateMembers(GroupItem.java:402)
Jun 06 18:52:20 house.ow.sono karaf[24085]: at org.eclipse.smarthome.core.items.GroupItem.stateUpdated(GroupItem.java:372)
Jun 06 18:52:20 house.ow.sono karaf[24085]: at org.eclipse.smarthome.core.items.GenericItem$1.run(GenericItem.java:259)
Jun 06 18:52:20 house.ow.sono karaf[24085]: at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
Jun 06 18:52:20 house.ow.sono karaf[24085]: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
Jun 06 18:52:20 house.ow.sono karaf[24085]: at java.lang.Thread.run(Thread.java:748)
Jun 06 20:59:19 house.ow.sono karaf[24085]: #
Jun 06 20:59:19 house.ow.sono karaf[24085]: # A fatal error has been detected by the Java Runtime Environment:
Jun 06 20:59:19 house.ow.sono karaf[24085]: #
Jun 06 20:59:19 house.ow.sono karaf[24085]: # SIGSEGV (0xb) at pc=0x00000000, pid=24085, tid=0x9b23f470
Jun 06 20:59:19 house.ow.sono karaf[24085]: #
Jun 06 20:59:19 house.ow.sono karaf[24085]: # JRE version: OpenJDK Runtime Environment (8.0_252-b225) (build 1.8.0_252-b225)
Jun 06 20:59:19 house.ow.sono karaf[24085]: # Java VM: OpenJDK Client VM (25.252-b225 mixed mode, Evaluation linux-aarch32 )
Jun 06 20:59:19 house.ow.sono karaf[24085]: # Problematic frame:
Jun 06 20:59:19 house.ow.sono karaf[24085]: # C 0x00000000
Jun 06 20:59:19 house.ow.sono karaf[24085]: #
Jun 06 20:59:19 house.ow.sono karaf[24085]: # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Jav
Jun 06 20:59:19 house.ow.sono karaf[24085]: #
Jun 06 20:59:19 house.ow.sono karaf[24085]: # An error report file with more information is saved as:
Jun 06 20:59:19 house.ow.sono karaf[24085]: # /var/lib/openhab2/hs_err_pid24085.log
Jun 06 20:59:19 house.ow.sono karaf[24085]: #
Jun 06 20:59:19 house.ow.sono systemd[1]: openhab2.service: Main process exited, code=killed, status=6/ABRT
Jun 06 20:59:20 house.ow.sono karaf[2349]: Can't connect to the container. The container is not running.
Jun 06 20:59:20 house.ow.sono systemd[1]: openhab2.service: Control process exited, code=exited status=1
Jun 06 20:59:20 house.ow.sono systemd[1]: openhab2.service: Unit entered failed state.
Jun 06 20:59:20 house.ow.sono systemd[1]: openhab2.service: Failed with result 'signal'.
Jun 06 20:59:25 house.ow.sono systemd[1]: openhab2.service: Service hold-off time over, scheduling restart.
Jun 06 20:59:25 house.ow.sono systemd[1]: Stopped openHAB 2 - empowering the smart home.
/var/lib/openhab2/hs_err_pid24085.log
I’ve compared a few of these hs_err files, and they seem to be all over the place. Sometimes the stack trace is empty, sometimes the stack trace has a bunch of entries, but never the same. In all cases, it seems to be at pc=0 though.
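In case it helps anyone compare their own crash reports, here is a rough sketch of how the signal line and the problematic frame can be pulled out of each hs_err file side by side. The function name `summarize_hs_err` is made up for this post, and the default directory is simply the path shown in the log above; the patterns assume the usual `# SIGSEGV ...` and `# Problematic frame:` header lines, so adjust them if your files look different.

```shell
# Print the signal line and the frame line that follows "# Problematic frame:"
# for every hs_err file in a directory.
# summarize_hs_err is a hypothetical helper name, not an openHAB or JDK tool.
summarize_hs_err() {
  dir="${1:-/var/lib/openhab2}"   # default taken from the journal output above
  for f in "$dir"/hs_err_pid*.log; do
    [ -e "$f" ] || continue       # skip if the glob matched nothing
    printf '%s\n' "== $f"
    # Signal line, e.g. "#  SIGSEGV (0xb) at pc=0x00000000, pid=..., tid=..."
    grep -E '^# +SIG[A-Z]+ ' "$f"
    # The line directly after "# Problematic frame:" names the frame
    grep -A1 '^# Problematic frame:' "$f" | tail -n 1
  done
}
```

Calling `summarize_hs_err` without arguments reads from `/var/lib/openhab2`; pass another directory as the first argument if the reports were copied elsewhere.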
I have a few bindings running:
openhab> bundle:list | grep Binding
136 x Active x 80 x 2.5.0 x openHAB Core :: Bundles :: Binding XML
227 x Active x 80 x 2.5.5 x openHAB Add-ons :: Bundles :: Astro Binding
228 x Active x 80 x 2.5.5 x openHAB Add-ons :: Bundles :: Chromecast Binding
229 x Active x 80 x 2.5.5 x openHAB Add-ons :: Bundles :: MQTT Broker Binding
233 x Active x 80 x 2.5.5 x openHAB Add-ons :: Bundles :: Network Binding
234 x Active x 80 x 2.5.5 x openHAB Add-ons :: Bundles :: Onkyo Binding
235 x Active x 80 x 1.14.0 x openHAB PanasonicTV Binding
236 x Active x 80 x 2.5.5 x openHAB Add-ons :: Bundles :: Volvo On Call Binding
openhab>
Sometimes, it will run fine for a few hours, then crash. Sometimes it will crash within seconds of start-up or during start-up. I have yet to find a pattern.
I tried disabling all bindings, then enabling them one by one. At first, it would crash immediately after enabling the astro binding, but after two times, it would not crash anymore after re-enabling the astro binding.
I’ve tried replacing the openhabian-provided jdk with the armbian stock openjdk from apt. No difference.
I’m now trying to force the Java process onto the ‘big cores’ (apparently, the CPU used in the HC2 has four Cortex-A15s and four Cortex-A7s) as per https://www.j-dimension.com/java-process-crashing-on-odroid-hc1-xu4/ (the XU4 is basically the same hardware as the HC2). It has been running for about 15 minutes now, so the jury is still out.
Has anyone experienced this before? I’ve searched the forum here, and there seem to be similar cases:
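For reference, a minimal sketch of one way to do the pinning persistently through a systemd drop-in, rather than with a one-off `taskset`. This is not taken from the linked article. It assumes the Cortex-A15 cores report `CPU part : 0xc0f` in `/proc/cpuinfo` and that the unit is called `openhab2.service` as in the journal output above; the helper names `big_cores` and `affinity_dropin` and the drop-in file name are made up for illustration.

```shell
# List the CPU indices whose "CPU part" is 0xc0f (Cortex-A15) in a cpuinfo file.
# big_cores is a hypothetical helper; it reads /proc/cpuinfo by default.
big_cores() {
  awk -F': *' '
    /^processor/ { cpu = $2 }
    /^CPU part/  { if ($2 == "0xc0f") { printf "%s%s", sep, cpu; sep = " " } }
  ' "${1:-/proc/cpuinfo}"
  echo
}

# Print a systemd drop-in that restricts a service to the given CPU indices.
# affinity_dropin is a hypothetical helper; CPUAffinity= is a standard systemd setting.
affinity_dropin() {
  printf '[Service]\nCPUAffinity=%s\n' "$1"
}

# Possible usage (needs root; the drop-in file name is an arbitrary choice):
#   sudo mkdir -p /etc/systemd/system/openhab2.service.d
#   affinity_dropin "$(big_cores)" | sudo tee /etc/systemd/system/openhab2.service.d/cpu-affinity.conf
#   sudo systemctl daemon-reload
#   sudo systemctl restart openhab2
```

After a restart, `taskset -p <pid>` on the Java process should show the affinity mask actually in effect, which is worth checking before drawing conclusions about stability.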
- Here it was suggested to pin the process to the big cores, which seemed to help at least one person
- SD card corruption, but I run from an HDD, not an SD card
- From 2015, doesn’t seem overly relevant and suggests downgrading to JDK 7
There are more, but they seem to pertain to specific, consistent crashes or are so old that they are no longer relevant.
Anyway, I don’t expect anyone to be able to provide a magic solution off the cuff. I’m just posting in the hope that someone somewhere has had this exact same issue and found a solution, or has a hint on how to debug it further. I’d like to continue using openHAB, since migrating to something else is going to be a big pain. But if I can’t solve this issue, then this setup is not usable.