Aeotec Z-Wave Gen5+ Stick stays offline after container restart / Raspi reboot

Platform information:

  • Hardware: Raspi 3 B+ (Raspbian OS 10)
  • Attached Hardware: Amber Wireless AMB8465 (to read out Wireless M-Bus from water & heat meter), Aeotec Z-Wave Gen5+ (to read out power meter)
  • Docker / Portainer 2.14.0
  • openHAB Software: 3.3.0 Release Build
  • openHAB Bindings: Homematic Binding (logging the heating, controlling lights and blinds), Gardena Binding for Gardena Gateway / smart irrigation control (logging soil humidity), Alexa Binding (controlling the Homematic lights)
  • Homematic IP Hardware: CCU2 (2.59.7), multiple Homematic IP devices

Problem description:
From day 1 onwards, my Z-Wave Stick Gen5+ stays offline sometimes after Raspi Reboot or Container restart, with the error message “Controller is offline”:

2022-06-29 15:27:08.643 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'zwave:serial_zstick:35e12a8479' changed from UNINITIALIZED (DISABLED) to INITIALIZING
2022-06-29 15:27:08.655 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'zwave:serial_zstick:35e12a8479' changed from INITIALIZING to OFFLINE (BRIDGE_OFFLINE): Controller is offline
==> /var/log/openhab/openhab.log <==
2022-06-29 15:27:08.590 [INFO ] [zwave.handler.ZWaveControllerHandler] - Attempting to add listener when controller is null
2022-06-29 15:27:13.659 [DEBUG] [ort.serial.internal.RxTxPortProvider] - No SerialPortIdentifier found for: /dev/ttyACM0

In Portainer, the log contains the following entry (when the stick works), which looks odd to me?
RXTX Warning: Removing stale lock file. /var/lock/LCK..ttyACM0

I did read through all the threads on this topic I could find…

… but am not really sure whether they relate to the same problem (since, in my case, it sometimes does not work).

Doing a reboot sometimes works, sometimes not. Sometimes the stick gets online after a while without doing anything, sometimes it doesn’t.

The fact that it sometimes works and sometimes doesn’t makes me think I’m looking at a bug rather than a configuration problem?

Does anyone have an idea? Needless to say that this is rather annoying.

The “most common” problem with the Aeotec Z-Stick Gen5 is with the first two revisions in combination with a Raspberry Pi 4. Since you’re using the Gen5+ stick, it should already have the (hardware) fix, and since you’re using an RPi 3 it shouldn’t be a problem in the first place. So, I think it’s pretty safe to say that you haven’t got this particular issue.

I recently had the RPi 4/Gen5 stick problem, and while it wasn’t detected most of the time, I did manage to “see” it at least once. So, I know that hardware can cause it to work intermittently. That’s all I have to add, unfortunately. I would try to check if it appears and disappears from ls /dev/tty* when it goes offline in openHAB - it could be a clue to whether the problem is with openHAB or somewhere else.
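
One way to follow this suggestion is to log the state of the device node at intervals and compare the timestamps with openhab.log. This is only a minimal sketch: the function name port_state is made up for illustration, and /dev/ttyACM0 is assumed to be the stick's device node, as in the log above.

```shell
#!/bin/bash
# Report whether a serial device node currently exists.
# Prints "present" if the path is a character device, "missing" otherwise.
port_state() {
  if [ -c "$1" ]; then
    echo "present"
  else
    echo "missing"
  fi
}

# Usage (assumed device path; stop with Ctrl-C):
#   while true; do echo "$(date '+%F %T') $(port_state /dev/ttyACM0)"; sleep 5; done
```

If the node stays "present" while openHAB reports the controller as offline, the problem is more likely above the OS level than in USB detection.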

Thanks, good idea.

Here are my findings.

Facts:

  • The Z-Wave USB stick is listed as ttyACM0 under ls /dev/ at all times (even when not recognized by openHAB). I therefore believe it’s not a matter of “is-the-device-recognized-by-the-OS-on-a-low-level” (it appears to be; there are also plenty of good-looking entries in the syslog):
Jul  2 20:14:34 raspberrypi kernel: [ 4206.367128] usb 1-1.3: USB disconnect, device number 5
Jul  2 20:14:39 raspberrypi kernel: [ 4211.784142] usb 1-1.3: new full-speed USB device number 7 using dwc_otg
Jul  2 20:14:39 raspberrypi kernel: [ 4211.917443] usb 1-1.3: New USB device found, idVendor=0658, idProduct=0200, bcdDevice= 0.00
Jul  2 20:14:39 raspberrypi kernel: [ 4211.917485] usb 1-1.3: New USB device strings: Mfr=0, Product=0, SerialNumber=0
Jul  2 20:14:39 raspberrypi kernel: [ 4211.919956] cdc_acm 1-1.3:1.0: ttyACM0: USB ACM device
  • When it loads correctly after a container restart / Raspi reboot, the log looks like this:
2022-07-03 13:53:08.359 [INFO ] [ve.internal.protocol.ZWaveController] - Starting ZWave controller
2022-07-03 13:53:08.362 [INFO ] [ve.internal.protocol.ZWaveController] - ZWave timeout is set to 5000ms. Soft reset is false.
==> /var/log/openhab/events.log <==
2022-07-03 13:53:20.498 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'zwave:serial_zstick:35e12a8479' changed from OFFLINE (BRIDGE_OFFLINE): Controller is offline to ONLINE
2022-07-03 13:53:20.518 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'zwave:device:35e12a8479:node5' changed from OFFLINE (BRIDGE_OFFLINE): Controller is offline to ONLINE
2022-07-03 13:53:21.154 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'zwave:device:35e12a8479:node5' changed from ONLINE to ONLINE: Node initialising: REQUEST_NIF
2022-07-03 13:53:22.439 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'zwave:device:35e12a8479:node5' changed from ONLINE: Node initialising: REQUEST_NIF to ONLINE
  • I also believe it has nothing to do with the USB problem reported in the threads above. If that were the case, the syslog entries would look different.

Hypothesis:

  • I believe it does have to do with something I’d describe as “routing-the-serial-connection-through to openHAB”.
  • Alternatively it might have to do with what I found in the syslog?
    dockerd[564]: time="2022-05-27T09:45:13.620500774+02:00" level=warning msg="path in container /dev/ttyACM0 already exists in privileged mode" container=9410337d32adb2f10d4a49180971c7fafaea26cd9864dbb59d201d03919c8744

Btw: If anyone has an idea on how to test the stick in Raspberry OS (when it is not being recognized by openHAB in Docker), that’d be great. Because if it is working in Raspberry OS, it would mean it has something to do with the “routing to openHAB”.

Update: I found a workaround which always (reproducibly!) solves the problem, though the workaround confirms my thought that there has to be a neat solution to the problem: Every time when I just re-create the entire stack in docker (without changing anything) the Z-Wave stick gets recognized again.
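
Until there is a proper fix, the re-create step could in principle be triggered by a small check on events.log. The following is only a sketch: the function name controller_online is made up, the Thing UID and log path are taken from the log excerpts above, and the compose service name openhab in the usage comment is an assumption about the stack definition.

```shell
#!/bin/bash
# Return 0 if the most recent status change logged for the given Thing
# ended in ONLINE, and 1 otherwise (including when no entry is found).
controller_online() {
  local thing="$1" log="$2" last
  last="$(grep -F "Thing '$thing' changed" "$log" 2>/dev/null | tail -n 1)"
  case "$last" in
    *" to ONLINE" | *" to ONLINE:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (run on the host some time after a restart; service name is assumed):
#   controller_online 'zwave:serial_zstick:35e12a8479' /var/log/openhab/events.log \
#     || docker compose up -d --force-recreate openhab
```

Note that a freshly rotated or empty log also counts as "not online" here, so a check like this should only run once openHAB has had enough time to initialise the controller.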

you might find more info if you follow the two links in this post

Thanks, @Andrew_Rowe.

The way I read the issue “Serial ports getting blocked after some re-connecting · Issue #1842 · openhab/openhab-core” on GitHub, this could be a probable cause of my problem as well. This also means we might just have to wait for an update of NRJavaSerial 5.3, correct?

I read of various workarounds (e.g. here and here), but the best option for a semi-experienced user like me is maybe to just wait and deal with it and, in the meantime, re-create the openHAB container every time the problem occurs, correct?

I’m pretty sure this could be the case. It’s been a recurring problem I’ve seen pop up in the forum numerous times over time.

There is another binding that also used a serial connection and the author of that binding made a change to the serial library used in the binding and it cured the problem. I think it is the Modbus binding maintained by ssalonen ???
Dig a little if you want more info. Search nrjavaserial on this forum’s search and you will see a pile of threads about it. The author of the zwave binding has been super busy with other stuff and may not have the time to change out the serial library the binding uses. Maybe someone else can contribute a fix.

The second thread I linked in that post is not an openHAB issue; it is an issue raised on the NRJavaserial git, and the authors of the library explain there that they can’t figure out where the file leak is coming from. As such, a fix for NRJavaserial may not be imminent.

Oddly, this problem only seems to affect users of the Aeotec stick. I recently saw a thread in the forum stating there are 4 versions of this stick, with a link to another forum with details and even a hardware fix that involved adding a jumper or some such.

Well, I have had this problem for a long time now (whenever the change was), and it is really annoying. But like you say, sometimes it just works. I avoid rebooting the server that hosts oh at all costs because of this. Sometimes it takes me hours of tedious tries to get the controller online again.

Having said that, last time I rebooted, oh zwave started to work at first go, which meant serious happiness and joy for the rest of the day. Wine bottle opened and instant cheering! :slight_smile:

OK, I found the link to the post on the Home Assistant community forum about Aeotec Gen5 sticks, with a lot of info about the versions of this stick, why some work, and a very fiddly-looking hardware fix… check it out

Wild! From what I can tell looking at the link, this is a different problem (which I also read about), related to the Gen5 and USB detection, which results in the stick not working “at all” (USB) instead of “only sometimes” (like in my case, probably related to the NRJavaserial), correct? Discussions on this were the reason why I paid attention to getting the Gen5+ (it even says so on my stick).

Hahahaha! :slight_smile: Same for me: it came up fine when I rebooted my machine earlier on, just when I wanted to reproduce the problem to test another fix idea I had. Not sure if you’re running on Docker, but in my case re-deploying the container always helps, so maybe it would for you too.

Aha, no, unfortunately I run oh as a ‘native’ service on a Gentoo/Asus PN50 box. But I have, for a while, been thinking about running oh on/in Docker instead. Maybe the next upgrade to 3.3 is a good time to actually move over to Docker.
Thanks for reminding me! :slight_smile:

That is correct
I added the link because there is firmware version info on the different versions and how to figure out which version you may have for future reference

the nrjavaserial issue is kind of an edge case and so has not been addressed. In your case I’m guessing because of running in a docker??? Since it is a Pi, have you considered not running docker and maybe just flash a card with openhabian and see if problem goes away

I’m wondering, if the Aeotec Z-Wave stick (independent of Gen5 or Gen5+) causes so many problems: Is there a reliable alternative around? Or is it that every USB stick will have the same problems?
Interestingly, with my Amber Wireless AMB8465 I’m not experiencing these problems, but then it’s also not running in the openHAB container but in a separate wmbusmeters container, with 100% reliability so far.

Apart from this annoying Aeotec Z-Wave-nonsense, I’m really happy with my docker setup. I wrote myself a short list of instructions on how to come from “empty SD card” to “fully restored system based on automatic daily updates” in < 30 minutes, in case my SD card breaks or I break my system beyond repair. I did a fire drill one month back and it worked pretty well. Let me know if there’s something I can share on that end.

Funny that you say that “OH on docker” is an edge case. I thought it’s the most predominant setup here in this forum. :wink:
Testing things with openHABian could be an option, thanks for the hint. However, I’m running lot of stuff on docker (wmbusmeters for heating and water meter readouts, influxdb instead of RRD4J, mosquitto for Nous plug readings, duplicati for automatic backup) and “testing it for a couple of weeks instead” would unfortunately require lots and lots of upfront work.

I’ve had problems with my Aeotec Stick (recent model) under a docker environment as well. The main problem was a stale lock file, preventing the stick from getting accessible.
And the poor man’s solution was adding an init file under /etc/cont-init.d/10_remove_zwave_lock:

#!/bin/bash -ex
# Runs at container start: remove a lock file left behind by a previous run.

ZWAVE_LOCK="/var/run/lock/LCK..zwave"

if [ -f "$ZWAVE_LOCK" ]; then
  echo "Removing stale ZWave lock file $ZWAVE_LOCK..."
  rm -f -- "$ZWAVE_LOCK"
fi

Problem solved for me, the lock file is getting removed at each start of the container.
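
For setups without a container (where removing the lock unconditionally could pull it out from under a running process), a more cautious variant might check whether the owner of the lock is still alive. This is a sketch under assumptions: the function name remove_if_stale is made up, and it assumes the lock file stores the owner's PID as ASCII text (UUCP-style), which should be verified against an actual lock file such as /var/lock/LCK..ttyACM0.

```shell
#!/bin/bash
# Remove a serial lock file only if the process that created it is gone.
# Assumption: the file holds the owner's PID as ASCII text (UUCP-style lock).
remove_if_stale() {
  local lock="$1" pid
  [ -f "$lock" ] || return 0            # no lock file, nothing to do
  pid="$(tr -cd '0-9' < "$lock")"       # keep only the digits
  if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
    echo "Lock $lock is held by running PID $pid, leaving it alone"
    return 1
  fi
  echo "Removing stale lock file $lock"
  rm -f -- "$lock"
}

# Usage (assumed lock path, taken from the RXTX warning earlier in the thread):
#   remove_if_stale /var/lock/LCK..ttyACM0
```

Inside a freshly started container the PID check is of limited use, because PIDs start over and the recorded number may belong to an unrelated process; there, the unconditional removal shown above is the simpler choice.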

correction:
“OH on Docker running on a Pi with an Aeotec Gen5 Z-Wave stick having restart issues” is kind of an edge case, is more what I meant

This is definitely the nrjavaserial bug, it is a file leak of the lock files

this is the reference module by silicon labs available at digikey for $40 usd right now, make sure you get the correct one for your region
https://www.digikey.com/en/products/detail/silicon-labs/ACC-UZB3-U-STA/6111632
I use a Linear HUSBZ-1, which does both Zigbee and Z-Wave; it was only $30 USD back then, now $50

purchased in Oct 2018 rock solid

Thanks! Will try this out. Also a good opportunity to learn more on unit files.

Thanks! Will go down this path if the frequent “offlines” become too annoying or if I don’t manage to get @Ardanedh’s solution to work.

Sry, I did understand your post correctly in the first place. I was only surprised to read that the Gen5+ stick, which is pretty standard for Z-Wave (which itself is pretty standard as a wireless protocol), running on Docker (which I understood many people also use), is an edge case in itself. But then again, if “standard” means only ~20% per case, the entire chain results in 0.2 × 0.2 × 0.2 = 0.008, i.e. ~1% of all installations, hence not more noise on that particular issue in the forum. :wink:

Agreed, @Ardanedh’s workaround is very elegant :+1:

What I am ultimately trying to do is draw enough attention to the nrjavaserial issue that hopefully someone with more Java programming ability than myself contributes a fix, perhaps as was done for the Modbus binding. My linked post above is from July 2021, and the original issue was discovered well before that. Obviously it is still tripping up the lucky few.

Keep in mind that you can’t really draw conclusions about “what most people are using” from discussions in this community, because we rarely hear from people who don’t have any issues. It’s perhaps more accurate to say that discussion here is a reflection of “what people are struggling with”.

Anecdotally, I would say that there are more people talking about the Aeotec Gen5 Z-Wave stick (in all of its variations) than any other controller. I wouldn’t be surprised if it was both the most used device, and the one that causes the most problems for users.

Also anecdotally, I would guess that there are more instances of openHABian than any other setup. Both because openHABian is the logical starting point for new users (particularly those with less technical skill) and because a lot of intermediate/advanced users are comfortable dedicating an RPi to openHAB to keep things simple. However, that might have changed in the past two years due to the shortage of RPis.

Anyway, what I really came here to say is that I use a Zooz USB Z-Wave Plus S2 Stick ZST10 controller and have never had any issues with it. This should not be confused with the Zooz USB 700 Series Z-Wave Plus S2 Stick ZST10, since 700-series controllers are not supported by openHAB at this time.

Yes, Zooz gave their new device an even longer name and the exact same “ZST10” model number. No, Zooz doesn’t really understand how marketing works. Solid controller, though.

Just to note that the binding uses whatever serial library openHAB core provides (through org.openhab.core.io.transport.serial). It’s not something the binding can change unless we move away from using the OH-provided services and directly link a serial library (which was frowned upon in the past).

Personally I’ve stopped using nrjavaserial for other projects as it’s just too much hassle and causes too many problems.

Thanks @rpwong, very true points!

So then there appear to be two good hardware alternatives available (GoControl and Zooz), which have proven to work reliably for @Andrew_Rowe and @rpwong respectively, in case @Ardanedh’s workaround for users of the Gen5+ doesn’t work (which it should) or @Andrew_Rowe is not successful in raising enough attention for the NRJavaserial fix.

I have to say I love this project and this forum! Thanks a lot to all of you! :slight_smile:

Well, from my side, I can say that before some early version 3.x (I have forgotten exactly when this nrjavaserial stuff changed), I never had issues with my serial ports for zwave. I guess it is some kind of race condition.
I would be surprised if Aeotec Gen5 is part of the problem here, like @rpwong says, it is probably the most common stick, but what do I know - it could of course be a combination of the computer and the stick.

Removing the lock file is needed if it is left behind, but that is (at least in my case) not the full solution, at least not if I just restart oH. Maybe it is if I restart the full machine. But since I run oH on a server doing lots of other stuff, I prefer to just try to restart oH a gazillion times until the serial port is working again. But it can literally take hours of trying in the worst case. Maybe this alone is a reason to start using Docker with oH.

This issue is the only real issue that I have with oH, but since I don’t have enough knowledge or skill to fix it, I accept it and will not complain. Apart from it, oH and the zwave binding are the best.
If someone would have a go at a direct binding solution, or a change of the provided serial library into something else, I would be happy to put 100% effort into testing the alternative.