I recently decided to delve into the world of ZigBee. Rather than starting out on something a little old, I decided to upgrade my R-Pi3 to OH4 first. After resolving the memory issues by going 32-bit (Memory usage with Openhabian OH4, Raspberry Pi 3 Model B), I almost got the ZigBee binding working, but not completely ([zigbee] Sonoff Zigbee 3.0 USB Dongle "ZBDongle-E" not working). I was working on implementing zigbee2mqtt when I noticed this really cool message in the daemon.log file:
openhabian karaf[...]: *** stack smashing detected ***: terminated
The ... is a 3-, 4- or 5-digit number. I'm assuming it's probably the karaf process id. This happens every 10-15 minutes or so.
Every time this message occurs, the java process restarts. The context around this message in the daemon.log looks like this:
Nov 18 11:49:21 openhabian karaf[685]: DEBUG: Sending {"request message": {"device address": ""}}
Nov 18 11:49:22 openhabian karaf[685]: DEBUG: Received {"identification message": {"manufacturer ID": "LGZ", "protocol mode": "C", "baud rate": 4800, "meter ID": "ZMF100AC.M29", "enhanced ID/capability": ""}}
Nov 18 11:49:22 openhabian karaf[685]: DEBUG: Sending {"acknowledge message": {"protocol control character": "NORMAL", "baud rate": 4800, "acknowledge mode": "DATA_READOUT"}}
Nov 18 11:49:22 openhabian karaf[685]: DEBUG: Sleeping for : 250ms before changing the baud rate
Nov 18 11:49:23 openhabian karaf[685]: DEBUG: Changing baud rate from 300 to 4800
Nov 18 11:49:24 openhabian karaf[685]: get_java_var: invalid file descriptor
Nov 18 11:49:24 openhabian karaf[685]: *** stack smashing detected ***: terminated
Nov 18 11:49:25 openhabian systemd[1]: openhab.service: Main process exited, code=killed, status=6/ABRT
Nov 18 11:49:25 openhabian systemd[1]: openhab.service: Failed with result 'signal'.
Nov 18 11:49:25 openhabian systemd[1]: openhab.service: Consumed 8min 45.733s CPU time.
Nov 18 11:49:30 openhabian systemd[1]: openhab.service: Scheduled restart job, restart counter is at 1.
Nov 18 11:49:30 openhabian systemd[1]: Stopping Frontail openHAB instance, reachable at http://openhabian:9001...
Nov 18 11:49:30 openhabian systemd[1]: frontail.service: Succeeded.
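To put a number on the "every 10-15 minutes" impression, the crash timestamps can be pulled out of daemon.log and the gaps between them computed. The following is only a minimal Python sketch: the function names are made up for illustration, the `year` default is an assumption (syslog timestamps carry no year), and month names are assumed to be in English.

```python
from datetime import datetime

# Text that glibc prints when the stack protector aborts the process.
MARKER = "*** stack smashing detected ***"


def crash_times(lines, year=2023):
    """Return the timestamp of every log line that contains MARKER.

    `year` is an assumption: syslog lines such as "Nov 18 11:49:24 ..."
    do not include one, so it has to be supplied from outside.
    """
    times = []
    for line in lines:
        if MARKER in line:
            stamp = line[:15]  # "Nov 18 11:49:24" is always 15 characters
            times.append(datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S"))
    return times


def crash_intervals_minutes(times):
    """Return the minutes elapsed between each pair of consecutive crashes."""
    return [(b - a).total_seconds() / 60 for a, b in zip(times, times[1:])]
```

Typical use would be something like `crash_intervals_minutes(crash_times(open("/var/log/daemon.log", errors="replace")))`, where the path is a guess at the usual location on openHABian and may need adjusting.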
Thinking that this may be caused by the new Sonoff ZigBee USB device and/or binding fighting with the long running smartmeter binding, I both removed the Sonoff device and removed the ZigBee binding from OH, but the smashing continues.
I discovered a bunch of hs_err_...log files in the /srv/openhab-userdata directory, like this one:
hs_err_pid18052.log (144.0 KB)
The contents of this file point me towards org.openmuc:j62056, version 2.2.0 according to the smartmeter/pom.xml. Despite its name, this artefact does not appear to be open source (any longer?): the website https://www.openmuc.org/ points to GitHub, but the project there is not open.
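For anyone wanting to check several of these crash reports quickly, the line of interest can be extracted with a few lines of Python. This is a sketch that assumes the usual HotSpot layout, where a `# Problematic frame:` header is followed by the frame itself on the next line; the function name is hypothetical, and a given hs_err file may not contain that header at all, in which case `None` is returned.

```python
def problematic_frame(lines):
    """Return the line following '# Problematic frame:', or None if absent.

    Assumes the usual HotSpot hs_err layout; the leading '#' and
    surrounding whitespace are stripped from the returned frame.
    """
    it = iter(lines)
    for line in it:
        if line.startswith("# Problematic frame:"):
            frame = next(it, None)  # the frame is on the next line
            if frame is None:
                return None
            return frame.lstrip("# ").strip()
    return None
```

Calling it as `problematic_frame(open("hs_err_pid18052.log", errors="replace"))` on each file shows at a glance whether they all blame the same native library.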
I found this reference, "Crashes caused by libNRJavaSerial · Issue #136 · NeuronRobotics/nrjavaserial · GitHub", to the error message "get_java_var: invalid file descriptor". The (same?) issue was known about in 2020 and fixed in version 3.19.0, around the same time as the code was duplicated into the OH smartmeter binding by @Kai.
Not having access to any of the openmuc source, I can't see what the fix was or if it may have been applied already. Any help would be most welcome.