After a reboot yesterday (Ubuntu kernel upgrade), about two-thirds of my Z-Wave network is dysfunctional. Looking at the log, the Z-Wave controller is ignoring messages from my nodes because they are uninitialized.
After that reboot ~2 days ago, this has been the situation. I have tried rebooting multiple times and restarting OpenHAB (official version 2.2), to no avail, and I have waited days for initialization to happen. I managed to get one Fibaro Dimmer 2 up and running by triple-clicking the switch, after which it apparently managed to initialize, but I could not repeat this on the next dimmer. I did a full re-inclusion of a Fibaro Wall Plug, and that one is functioning.
The question is what to do next? Can this be fixed, or do I need to rebuild the entire Z-wave network from scratch (this will be many days of work)? Can I do something to understand why it happened, to ensure it will not happen again?
Looking at the log, the uninitialized nodes seem to send messages (node 28, e.g., is an Aeotec Multisensor 6 reporting the temperature), but they are ignored by the controller (Aeotec Z-Stick Gen5).
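For anyone wanting to check which nodes actually show up in a debug log like this, here is a minimal Python sketch that tallies log lines per node. It assumes node-related lines carry a `NODE <id>:` prefix; the function name `count_lines_per_node` is hypothetical and not part of the binding.

```python
import re
from collections import Counter

# Assumption: node-related debug lines contain a "NODE <id>:" prefix.
NODE_PATTERN = re.compile(r"\bNODE (\d+):")


def count_lines_per_node(lines):
    """Return a Counter mapping node id -> number of log lines mentioning it."""
    counts = Counter()
    for line in lines:
        match = NODE_PATTERN.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

Passing an open log file to `count_lines_per_node` and printing the sorted result gives a quick view of which nodes are talking at all, which can then be compared against the nodes the controller is ignoring.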
Are the nodes that aren’t initialising all battery devices? Are they waking up at all - if not, they won’t initialise if the persistence file was somehow deleted. If this is the case, you should wake up the sleeping devices (possibly a few times) so that they can be initialised.
Attached is also a 1000-line Z-Wave debug excerpt (from which the above log view is taken).
I also think the Network view in my first post shows something quite strange: the nodes should not be daisy-chained from the controller. Previously, when the network was working fine, almost all nodes had a direct connection to the controller, even though the controller is non-optimally placed in the house (in a concrete garage attached to the house).
Not much to comment on with this - there’s only a single message being sent in this log, and it gets a response. I guess this short log is after the initialisation, and there are no commands being sent so very little is happening.
If the controller is reporting it only knows about 14 nodes, then only these 14 nodes will work. This probably explains the problem you’re seeing.
I don’t think there’s any way out if the controller doesn’t know about the devices any more - they will likely need to be reincluded. Of course the big question is why it changed - maybe the memory in the controller has an issue - who knows…
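To work out which devices would need re-inclusion, the list of nodes the controller still reports can be compared with the nodes you expect to have. A minimal Python sketch, assuming both lists have been written down by hand; the function name `missing_nodes` and the node IDs in the example are hypothetical:

```python
def missing_nodes(expected, known_to_controller):
    """Return the node ids that are expected but that the controller no longer lists."""
    return sorted(set(expected) - set(known_to_controller))


# Hypothetical node ids, for illustration only.
expected = [1, 2, 15, 28]
known_to_controller = [1, 2]
print(missing_nodes(expected, known_to_controller))  # -> [15, 28]
```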
Thanks for the comments. I guess I’ll bite the bullet and rebuild the network after a factory reset of the stick.
If the stick is not to be trusted, I guess a mitigating action would be to buy a second spare stick, and keep a backup of the stick data (using the Aeotec software on some Windows computer).
One time I had a sequential block of nodes, 224-232, drop off the network. I was ready to rebuild the network, but on a whim I tried just putting the controller and devices into inclusion mode and they all rejoined. I can’t explain how/why this worked, but it never reoccurred and everything has worked fine since. Maybe this would work for you.
I had initiated the inclusion through OH, which is the safest way to do inclusion. Otherwise, you run the risk of an unhealthy mesh, which, as I understand it, cannot be healed through OH unless you are running the development zwave binding. Still, it can take a while to straighten it out.
I have never done inclusion through OH, always with the stick unplugged, maybe because I didn’t manage it in the first place. Trying to do it now, I am getting errors that the binding/thing does not support discovery:
karaf> discovery start zwave:serial_zstick:ff1568d3
log: 16:23:42.857 [WARN ] [internal.DiscoveryServiceRegistryImpl] - No discovery service for thing type 'zwave:serial_zstick:ff1568d3' found!
karaf> discovery start 242 # for the z wave bundle
log: 16:25:18.248 [WARN ] [internal.DiscoveryServiceRegistryImpl] - No discovery service for binding id '242' found!
And I get a similar red error message box from HABmin when clicking the magnifying glass with a ‘+’ sign in it.
Now that 2.3 is out, I will first upgrade to that.
The upgrade to 2.3 and the rebuild of the Z-Wave network are done (as of many days ago). The topology looks much better now. Inclusion via the binding was OK for most devices, but not all. Thanks for that tip, it saved me a LOT of time.
If you plan to have nodes powered off for a long while, consider excluding them and reincluding when you want to use them. Or just leave them powered up. I’ve run into routing issues when I’ve powered devices off for a while… like some power monitors that I basically use for testing. Your node 15 looks like it could be such a problem.
No, it is not intentional to keep them offline. Some are waiting for me to find the time, at least one is awaiting an electrician for final installation, and some red nodes are not really dead; it is just the binding that temporarily lost contact - so it seems.