I can’t see anything strange in my trace. My feeling is that the controller still has some nodes configured (even after multiple hard resets; I’ve also removed all nodes.xml files and re-included everything from scratch) and that the issues come from there.
Not sure. Any suggestions on how to troubleshoot this issue? Of course I can’t have nodes going up and down randomly; I need to stabilise things somehow.
I’ve also considered whether it was a RAM issue … but I’ve added 512 MB to my virtual machine, so it now has 1 GB of RAM.
Interesting: I see that even the nodes considered down by Paper UI or HABmin are somehow reachable …
1. I sent a power reset to a wall plug tagged as down, and the node came up immediately.
2. Same with a Danfoss: when I manually changed the temperature, the item was updated accordingly and the node came up.
I can understand this behavior for battery-powered nodes (like 2), but not so much for nodes like 1: I expect those plugs to always be UP.
Do you mean they are being marked as dead? This is a known issue that is hopefully one of a few things left that @chris had mentioned would be getting corrected before merging the development version of the zwave binding into master. As you’re seeing, the devices are still on the network and responsive, but for some reason the controller is reporting them as dead when asked for their status. If you don’t have a hardware issue with one or more of your devices, an OH restart usually clears it up. Sometimes only temporarily. I find if I use a device a lot, this also sometimes brings it back from the dead. But I have also seen energy meters that are marked as dead but come back to life every 10s when they report, then go back to dead. For me, this has only been a cosmetic issue.
I’m on 2.3.0 1238 and using the latest development build of the zwave binding.
Yes, this can be correct. It is possible to send a message to a device, but if the device does not respond, or at least the response is not received, then it will be marked DEAD.
This is normally caused by devices that do not respond. If the device doesn’t respond, then it gets marked as DEAD. This doesn’t stop the binding from sending messages to the device (otherwise it would never be able to come back online), but it will reduce retries to limit the “damage” to the network of a non-responding node.
If a node is periodically not responding, then it could indicate it’s a bit marginal with its link to neighbours, and adding more devices might help.
I’m not really sure what you are saying is a bug? Please provide a clear explanation of the issue, and provide a debug log showing the problem.
Sorry, “bug” is not the right word. You explained the behavior, and I understand it.
But I would suggest implementing a sort of “dead peer detection” mechanism in software before considering the node DEAD. Maybe 5 retransmissions, one every retry-interval seconds? That is quite a usual solution for a network protocol.
Yes - this is what is implemented. There are 3 retries, and if the device does not respond after 3 retries, it is marked dead. Once it is dead, there are 0 retries until the device is alive again.
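To make the retry behaviour described above concrete, here is a minimal Python sketch. It is not the binding's actual code: `Node`, `send_with_retries` and the `transmit` callback are hypothetical names made up for illustration, and it assumes that "3 retries" means one initial transmission plus up to 3 more.

```python
from dataclasses import dataclass
from typing import Callable

MAX_RETRIES_ALIVE = 3  # from the thread: 3 retries while the node is alive
MAX_RETRIES_DEAD = 0   # from the thread: 0 retries once the node is DEAD


@dataclass
class Node:
    """Hypothetical stand-in for a Z-Wave node; not a binding class."""
    node_id: int
    dead: bool = False


def send_with_retries(node: Node, transmit: Callable[[], bool]) -> int:
    """Send one message and return how many transmissions were attempted.

    `transmit` is a hypothetical callback that returns True when a
    response from the device was received, False otherwise.
    Assumption: one initial transmission plus the allowed retries.
    """
    retries = MAX_RETRIES_DEAD if node.dead else MAX_RETRIES_ALIVE
    attempts = 0
    for _ in range(1 + retries):
        attempts += 1
        if transmit():
            node.dead = False  # any received response revives the node
            return attempts
    node.dead = True  # no response after all allowed attempts
    return attempts
```

Under these assumptions a DEAD node still gets each message sent once, which would explain why switching a plug from HABmin can bring it back: the single transmission gets a response and the node is marked alive again.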
But the node marked as DEAD is not sending any updates about, for example, the real-time power consumption of the attached fridge. I can “wake up” the node by switching it OFF and back ON via HABmin, but that is not good, you know, for the fridge itself.
I don’t really know how to answer. If the value is changing, then probably it’s coming from the device and the binding is just displaying this value.
I would strongly suggest using the debug logs when looking for issues like this. That is the only way to know what is really happening. If the device is returning 4.4W, then the binding will display 4.4W. My guess is that switching the fridge off from inside the fridge might not completely power everything down. Alternatively, the device itself might not measure with good accuracy at low power (this is not uncommon). Finally, maybe there’s a bug in the binding (but I think it’s unlikely that the binding would just give you the wrong value - this is probably a device issue).