I’ve got an RPi4B running openHABian (OH 4.3.2) with an Aeon ZW090 controller and about 19 nodes. The Z-Wave network is otherwise reliable. However, after running for several days, one or two nodes stop reporting local status changes. At least, it appears that the node does not report.
It’s not always the same devices. Taking the affected thing offline/online fixes the issue for a few days, and so does taking the controller offline/online.
I don’t recall having this issue in years of running OH 2.5.1, though back then I rebooted the RPi every week.
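For anyone who wants to script that workaround instead of clicking through the UI, here is a rough sketch. It assumes "offline/online" means disabling and re-enabling the Thing, and that the openHAB REST endpoint `PUT /rest/things/{thingUID}/enable` with a plain-text `true`/`false` body is available on your install. The base URL, thing UID and API token below are placeholders, not values from this thread.

```python
import time
import urllib.parse
import urllib.request


def build_enable_request(base_url, thing_uid, enabled, token):
    """Build (but do not send) the request that enables or disables a Thing."""
    # Thing UIDs contain colons, so percent-encode the whole UID for the path.
    path = "/rest/things/" + urllib.parse.quote(thing_uid, safe="") + "/enable"
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=b"true" if enabled else b"false",
        method="PUT",
        headers={
            "Content-Type": "text/plain",
            "Authorization": "Bearer " + token,  # assumes an openHAB API token
        },
    )


def toggle_thing(base_url, thing_uid, token, pause_s=5.0):
    """Disable the Thing, wait a moment, then enable it again."""
    for enabled in (False, True):
        req = build_enable_request(base_url, thing_uid, enabled, token)
        with urllib.request.urlopen(req, timeout=10) as resp:
            resp.read()
        if not enabled:
            time.sleep(pause_s)  # arbitrary pause before re-enabling


# Hypothetical usage (placeholder UID and token):
# toggle_thing("http://localhost:8080", "zwave:device:controller:node12", "oh.mytoken")
```

This only automates the workaround; it does not explain why the node stopped updating in the first place.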
I put the Z-wave binding in Debug logging mode. Nodes 12 and 4 were toggled locally but the item linked to each did not change status. Nodes 28, 35 & 33 performed fine.
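If it helps anyone else sift through a debug log like this, below is a small sketch that counts log lines per node and lists the nodes that never show up. It assumes the binding prefixes node-related messages with `NODE <id>:`, and the sample lines in the comments are made up for illustration, not copied from my log.

```python
import re
import sys
from collections import Counter

# Assumed line shape: "... NODE 12: <message>". Adjust if your log differs.
NODE_RE = re.compile(r"\bNODE (\d+):")


def count_lines_per_node(lines):
    """Return a Counter mapping node id -> number of log lines mentioning it."""
    counts = Counter()
    for line in lines:
        match = NODE_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


def silent_nodes(lines, expected_nodes):
    """Return the expected node ids that never appear in the given lines."""
    seen = count_lines_per_node(lines)
    return sorted(n for n in expected_nodes if seen[n] == 0)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python zwave_nodes.py /path/to/openhab.log 4 12 28 33 35
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        log_lines = fh.readlines()
    expected = [int(arg) for arg in sys.argv[2:]]
    print("lines per node:", dict(count_lines_per_node(log_lines)))
    print("silent nodes:", silent_nodes(log_lines, expected))
```

Toggle the devices locally, then run it over that stretch of the log. A node that was toggled but logged nothing points at the device or the mesh; a node that logged traffic but whose item did not change points at the binding side.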
I noticed your devices use a command (HAIL) that I haven’t seen before. Reading up on it, it is old and deprecated, but it means the device is asking to be polled (because something changed). The binding supports the command, but only once the device is fully configured. From the log, it appears that nodes 4 and 12 haven’t finished the heal commands. I would disable the nightly heal and see how things recover over time. Just heal a particular device if needed, not the whole network. There is probably some network congestion too, but I’d start with the heal.
I’ve been running with Heal Time disabled in the Z-Wave Controller Thing for about two weeks. I have not had any issues with local status changes not updating items. However, I have also restarted the RPi4 twice during that time.
I’m not sure exactly what the root cause was, but it looks good so far. I’ll update this if something changes.
So, I allowed the system to run without restarting for about a week and noticed a recurrence of the issue – local status changes not updating items. I tried healing the nodes that were affected, but no luck. I restarted the system and that seemed to restore normal behavior.
I don’t have the time to track this down but I am mitigating the issue. I have a cron job to restart the system once each week.
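In case someone wants to copy the mitigation, here is roughly what such a job could look like as a script that cron calls. This is a sketch under assumptions: it restarts the `openhab.service` systemd unit rather than rebooting the Pi (pass `reboot_host=True` for a full reboot), and the script path and schedule in the comment are hypothetical.

```python
import subprocess
import sys

# Hypothetical root crontab entry, Sundays at 04:00:
#   0 4 * * 0 /usr/bin/python3 /home/openhabian/restart_openhab.py --apply


def restart_command(reboot_host=False):
    """Return the systemctl command for a service restart or a full reboot."""
    if reboot_host:
        return ["systemctl", "reboot"]
    # Assumes the service unit is named openhab.service.
    return ["systemctl", "restart", "openhab.service"]


def run_restart(reboot_host=False, dry_run=True):
    """Run the restart command, or only print it when dry_run is True."""
    cmd = restart_command(reboot_host)
    if dry_run:
        print("would run:", " ".join(cmd))
        return cmd
    subprocess.run(cmd, check=True)  # needs root privileges
    return cmd


if __name__ == "__main__":
    # Nothing is restarted unless --apply is given explicitly.
    run_restart(dry_run="--apply" not in sys.argv)
```

The dry-run default is there so the script can be tried by hand without actually taking the system down.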
However, there must be something else going on. Based on the HAIL theory and the log file above, restarting means each device has to be fully configured again before the binding responds to it, so a restart should aggravate the problem, not fix it. The same goes for healing (either by node or the whole network).