Short version: I have some flaky devices that periodically fail and need to be reintegrated with the network. However, the Z-Wave binding seems convinced that they’re still alive, even when the devices themselves are powered down or factory-reset. How do I get the binding to accept that such a device is, in fact, dead, allowing me to invoke the replace-failed-node workflow?
OpenHAB 4.3.1 using an Aeon Aeotec Z-Stick (v5, I think)
Longer version: I have a number of POPP/Danfoss/Devolo TRVs[1] which seem prone to losing their connection to the Z-Wave network when the battery fails. I’m generally not able to recover them using OpenHAB. Instead, I’ve used the (now-defunct) Open ZWave Control Panel (ozwcp) to execute the following sequence of actions:
factory-reset the TRV - this usually isn’t necessary as it seems to get into this state itself
shut down OpenHAB
allow ozwcp to initialize the Z-Wave controller. It will try to initialize the broken TRV as part of this, but will generally not mark it as failed.
select the broken TRV
run ozwcp’s “has device failed” command
ozwcp will flag the device as DEAD (this may take a couple of attempts, since my understanding is that (a) Z-Wave prefers not to assume battery devices are dead and (b) the process of marking something as DEAD requires it to fail to respond to some number of communication attempts)
run ozwcp’s “replace failed device” command
press the inclusion button on the device
ta-da! device is back on the network
restart OpenHAB
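For anyone trying to follow the logic of the steps above, here is a rough sketch of the "check, retry, then replace" part. Everything in it is hypothetical: `FakeController`, `is_failed_node`, `replace_failed_node` and `recover_node` are names made up for illustration, not the ozwcp or OpenHAB binding API. It only models the behaviour I observed, namely that the controller needs to be asked more than once before it flags the node as failed, and that replacement only works after that.

```python
from dataclasses import dataclass, field


@dataclass
class FakeController:
    """Hypothetical stand-in for a Z-Wave controller; not a real library API."""
    # Assumption: number of "is failed" queries needed before a node is flagged.
    queries_until_failed: dict
    failed: set = field(default_factory=set)
    _queries: dict = field(default_factory=dict)

    def is_failed_node(self, node_id: int) -> bool:
        """Ask the controller whether it considers the node failed."""
        count = self._queries.get(node_id, 0) + 1
        self._queries[node_id] = count
        if count >= self.queries_until_failed.get(node_id, float("inf")):
            self.failed.add(node_id)
        return node_id in self.failed

    def replace_failed_node(self, node_id: int) -> bool:
        """Succeeds only if the node is already on the failed list."""
        if node_id not in self.failed:
            return False
        self.failed.discard(node_id)
        return True


def recover_node(controller, node_id: int, max_attempts: int = 5) -> bool:
    """Repeat the failed-node check, then start replacement once it is flagged."""
    for _ in range(max_attempts):
        if controller.is_failed_node(node_id):
            return controller.replace_failed_node(node_id)
    return False
```

With a fake controller that needs three queries for node 7, `recover_node(FakeController({7: 3}), 7)` gets through, while calling `replace_failed_node` without the check first is refused.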
Obviously having to shut down OpenHAB to do this is a bit annoying and I’d rather be able to do it from within OpenHAB. I’m currently looking through the code for the binding and it looks like both “has device failed” and “replace failed device” are implemented, but it looks like “has device failed” is only run as part of the “Set device as FAILed” action, and is run after “replace failed device”; there’s no independent way I can see to trigger the “has device failed” query. I’m honestly not even sure if this is needed, but it does seem to be a necessary part of the ozwcp process to get the controller to admit that the device is in fact dead.
(chasing further: “replace failed device” calls “requestSetFailedNode”, which sends “ReplaceFailedNodeMessageClass”, which is a tiny bit confusing - is there a reason it’s done this way?)
Short version: It is nearly impossible to remove a dead battery device with the Zwave binding.
Because battery devices are assumed to be “sleeping”, the “check if failed” message will not be sent. You could try putting the controller into exclusion mode and pressing the button on the battery device. However, if you have already included it again, that will likely remove the new node. I’d recommend using Silicon Labs Simplicity Studio. It doesn’t care whether it is a battery device. Also note that nodes are stored on the controller, not in OH. That is why, if you try “Delete Thing”, they will pop back up when you scan.
That doesn’t really help, I’m afraid, since it leaves me with the problem of having to turn off OpenHAB in order to fix the network, which is what I’m trying to avoid having to do. Is this a known bug in the driver? As described, I’m able to use a different tool to get the desired outcome, and you’ve suggested a third tool, so that suggests that it’s definitely possible to do this, but that the OpenHAB implementation doesn’t support it?
In theory it is supported, in practice it is not. Also, the problem is with battery devices, not powered nodes. If powered nodes do not respond they can be removed.
Can you elaborate on “in theory it is supported”? How is it supposed to be used, so I can at least try that out?
(also, again, “the problem is with battery devices, not powered nodes” doesn’t really help. I’m not having a problem with powered nodes, as I indicated in the original post. Don’t get me wrong, I appreciate your desire to help.)
What has to happen is that the battery node fails to wake up within 2x the wake-up interval (this is from memory; I’m not the developer, but I have some experience with the binding). However, if OH is restarted at any point within that period, the timer is stopped. Basically the node has to be declared dead by that timeout. Sending the “check if failed” message will never work for a battery device with the OH Z-Wave binding, sorry.
You could try a test; include a battery device, make sure it is fully configured, then pull the battery. It should get marked as dead in a day or two. Don’t stop or restart OH during the test period. Then it can be deleted.
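To make the timing rule concrete, here is a minimal sketch of how I understand it. This is not taken from the binding source; the function name `is_presumed_dead`, its parameters, and the factor of 2 are assumptions based on my recollection above, including the assumption that a restart makes the count start over.

```python
from datetime import datetime, timedelta


def is_presumed_dead(last_wakeup, wakeup_interval, now, binding_started, factor=2):
    """Return True if no wake-up has been seen for `factor` wake-up intervals.

    Assumption: a restart resets the timer, so the window is measured from
    whichever is later, the last wake-up or the binding's start time.
    """
    timer_start = max(last_wakeup, binding_started)
    return now - timer_start > factor * wakeup_interval


# Example with a hypothetical 1-hour wake-up interval.
interval = timedelta(hours=1)
last_seen = datetime(2025, 1, 1, 0, 0)
started = last_seen - timedelta(days=1)
print(is_presumed_dead(last_seen, interval, last_seen + timedelta(hours=3), started))
```

If the binding had instead been restarted two hours after the last wake-up, the same check three hours in would still say "not dead", which is why restarts during the test period matter.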
Ok, so one of these devices has been offline since December 27th so I think that’s fairly reliably proving this doesn’t work. I guess I can file an issue and see what happens.
Ok. Just check under device properties for the last wake-up, since that is what the timer is supposed to trigger off. Also note the wake-up interval in the issue. Lastly, if the UI doesn’t have the “reinitialize device” bar at the bottom, the node is not fully configured. Provide all of that with the issue.