[smartmeter] Openhabian karaf[...]: *** stack smashing detected ***: terminated

I recently decided to delve into the world of ZigBee. Rather than going to something a little old, I decided to upgrade my R-Pi3 to OH4 as a good start. After resolving the memory issues by going 32-bit (Memory usage with Openhabian OH4, Raspberry Pi 3 Model B), I almost got the ZigBee binding working, but not completely ([zigbee] Sonoff Zigbee 3.0 USB Dongle "ZBDongle-E" not working) and was working on implementing zigbee2mqtt when I noticed this really cool message in the daemon.log file:

openhabian karaf[...]: *** stack smashing detected ***: terminated

The … is a 3-, 4- or 5-digit number; I’m assuming it’s the karaf process ID. This happens every 10-15 minutes or so.

Every time this message occurs, the java process restarts. The context around this message in the daemon.log looks like this:

Nov 18 11:49:21 openhabian karaf[685]: DEBUG: Sending {“request message”: {“device address”: “”}}
Nov 18 11:49:22 openhabian karaf[685]: DEBUG: Received {“identification message”: {“manufacturer ID”: “LGZ”, “protocol mode”: “C”, “baud rate”: 4800, “meter ID”: “ZMF100AC.M29”, “enhanced ID/capability”: “”}}
Nov 18 11:49:22 openhabian karaf[685]: DEBUG: Sending {“acknowledge message”: {“protocol control character”: “NORMAL”, “baud rate”: 4800, “acknowledge mode”: “DATA_READOUT”}}
Nov 18 11:49:22 openhabian karaf[685]: DEBUG: Sleeping for : 250ms before changing the baud rate
Nov 18 11:49:23 openhabian karaf[685]: DEBUG: Changing baud rate from 300 to 4800
Nov 18 11:49:24 openhabian karaf[685]: get_java_var: invalid file descriptor
Nov 18 11:49:24 openhabian karaf[685]: *** stack smashing detected ***: terminated
Nov 18 11:49:25 openhabian systemd[1]: openhab.service: Main process exited, code=killed, status=6/ABRT
Nov 18 11:49:25 openhabian systemd[1]: openhab.service: Failed with result ‘signal’.
Nov 18 11:49:25 openhabian systemd[1]: openhab.service: Consumed 8min 45.733s CPU time.
Nov 18 11:49:30 openhabian systemd[1]: openhab.service: Scheduled restart job, restart counter is at 1.
Nov 18 11:49:30 openhabian systemd[1]: Stopping Frontail openHAB instance, reachable at http://openhabian:9001
Nov 18 11:49:30 openhabian systemd[1]: frontail.service: Succeeded.
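To check how regular the crashes are and what the process logged just before each one, something like the following could be run over daemon.log. This is only a rough sketch: the function names are made up for illustration, the regular expression assumes the syslog layout shown in the excerpt above, and the year has to be supplied by hand because these timestamps do not carry one.

```python
import re
from datetime import datetime

CRASH_MARKER = "*** stack smashing detected ***"
# Assumed syslog layout: "Nov 18 11:49:24 host process[pid]: message"
LINE_RE = re.compile(r"^(\w{3} +\d+ \d\d:\d\d:\d\d) (\S+) ([^\[:]+)\[(\d+)\]: (.*)$")


def find_crashes(lines, year):
    """Return (timestamp, pid, previous message from that pid) per crash line.

    `year` is supplied by the caller because syslog timestamps omit it.
    """
    crashes = []
    last_msg = {}  # pid -> most recent message logged by that pid
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if not m:
            continue  # skip lines that do not follow the assumed layout
        stamp, _host, _proc, pid, msg = m.groups()
        if CRASH_MARKER in msg:
            ts = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
            crashes.append((ts, int(pid), last_msg.get(pid)))
        last_msg[pid] = msg
    return crashes


def minutes_between(crashes):
    """Minutes elapsed between consecutive crashes."""
    times = [ts for ts, _pid, _prev in crashes]
    return [(b - a).total_seconds() / 60 for a, b in zip(times, times[1:])]
```

Calling `find_crashes(open("daemon.log", errors="replace"), year)` and passing the result to `minutes_between` would show whether the 10-15 minute spacing holds, and whether the “get_java_var: invalid file descriptor” line precedes every crash or only some of them.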

Thinking that this might be caused by the new Sonoff ZigBee USB device and/or binding fighting with the long-running smartmeter binding, I removed both the Sonoff device and the ZigBee binding from OH, but the smashing continues.

I discovered a bunch of hs_err_…log files in the /srv/openhab-userdata directory, like this one:
hs_err_pid18052.log (144.0 KB)
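With a bunch of these files around, it can help to pull the same detail out of each one and compare. HotSpot crash logs normally carry a “# Problematic frame:” header followed by the frame itself on the next line; the sketch below prints that line for every hs_err file in a directory. The function names are hypothetical, and a file lacking that header simply yields None.

```python
from pathlib import Path


def problematic_frame(text):
    """Return the line after '# Problematic frame:', or None if absent."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if line.startswith("# Problematic frame:") and i + 1 < len(lines):
            # Drop the leading "# " comment prefix from the frame line.
            return lines[i + 1].lstrip("# ").strip()
    return None


def summarize(directory):
    """Map each hs_err_*.log file name in `directory` to its problematic frame."""
    return {
        path.name: problematic_frame(path.read_text(errors="replace"))
        for path in sorted(Path(directory).glob("hs_err_*.log"))
    }


if __name__ == "__main__":
    for name, frame in summarize("/srv/openhab-userdata").items():
        print(name, "->", frame)
```

If every file names the same native library, that points at one culprit rather than several unrelated crashes.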

The contents of this file point me towards org.openmuc:j62056, version 2.2.0 according to the smartmeter/pom.xml. Despite its name, this artefact does not appear to be open source (any longer?): the website https://www.openmuc.org/ points to GitHub, but the project there is not open.

I found this reference, Crashes caused by libNRJavaSerial · Issue #136 · NeuronRobotics/nrjavaserial · GitHub, to the error message “get_java_var: invalid file descriptor”. The (same?) issue was known about in 2020 and fixed in version 3.19.0, around the same time as the code was duplicated into the OH smartmeter binding by @Kai.

Not having access to any of the openmuc source, I can’t see what the fix was or if it may have been applied already. Any help would be most welcome :slight_smile:

I can confirm that the issue remains in the way jrxtx/librxtx handles things. I observed the same issue with the unofficial wmbus binding made by Kugu Home back in 2018: restarting the binding (reopening the serial port) caused the JVM to crash.
Recently I made an attempt to bind openmuc’s jrxtx with the nrjavaserial native library to avoid that trouble. I haven’t tested it in the wild yet, but at least communication with the serial port works fine. :wink:

Edit: following up on the nrjavaserial fix, it applies to the native part. So if smartmeter uses jrxtx with nrjavaserial, it should be fine.
