Zwave Network: stability issue

ariela · March 23, 2018, 3:45pm

Hi folks,

frankly I don’t know what’s happening, I hope you can help me.

Symptom: Z-Wave nodes (battery-powered or not) randomly stop communicating with the controller.

System: openHAB 2.3.0 1234 (openhabian)

zwave devices: 7 Fibaro FGWP101 and 5 Danfoss L-13
controller: AeonStick S2

I can’t see anything strange in my logs. My feeling is that the controller still has some nodes configured (even after multiple hard resets; I’ve also removed all nodes.xml and re-included from scratch) and that this is causing the issues.

I’m not sure. Any suggestions on how to troubleshoot this? Of course I can’t have nodes going up and down randomly, so I need to stabilise the network somehow.
I also considered that it might be a RAM issue, but I’ve added 512 MB to my virtual machine, so it now has 1 GB of RAM.

Any idea will be much appreciated.

Thanks
Andrea

ariela · March 23, 2018, 3:54pm

More details:

Interesting: even the nodes that Paper UI or HABmin consider down are somehow reachable …

1. I sent a Power reset to a wall plug tagged as down, and the node came up immediately.
2. Same with a Danfoss: after manually changing the temperature, the item was updated accordingly and the node came up.

I can understand this behavior for battery-powered nodes (like 2), but not so much for nodes like 1: I would expect those plugs to always be UP.

Does it make any sense?
Andrea

5iver · March 23, 2018, 5:08pm

Do you mean they are being marked as dead? This is a known issue that is hopefully one of a few things left that @chris had mentioned would be getting corrected before merging the development version of the zwave binding into master. As you’re seeing, the devices are still on the network and responsive, but for some reason the controller is reporting them as dead when asked for their status. If you don’t have a hardware issue with one or more of your devices, an OH restart usually clears it up. Sometimes only temporarily. I find if I use a device a lot, this also sometimes brings it back from the dead. But I have also seen energy meters that are marked as dead but come back to life every 10s when they report, then go back to dead. For me, this has only been a cosmetic issue.

I’m on 2.3.0 1238 and using the latest development build of the zwave binding.

ariela · March 23, 2018, 7:11pm

Thanks for your answer.

Yes, I don’t have any hardware issue, so this is for sure a bug.

This is definitely unpredictable. Now everything seems fine, but randomly there are nodes marked as DEAD.

We’ll see in the future. openHAB 2.3.0 1238 also here.

thanks
Andrea

chris · March 23, 2018, 7:19pm

Yes, this can be correct. It is possible to send a message to a device, but if the device does not respond, or at least the response is not received, then it will be marked DEAD.

This is normally caused by devices that do not respond. If the device doesn’t respond, then it gets marked as DEAD. This doesn’t stop the binding from sending messages to the device (otherwise it would never be able to come back online) but it will reduce retries to limit the “damage” that a non-responding node does to the network.

If a node is periodically not responding, then it could indicate it’s a bit marginal with its link to neighbours and adding more devices might help.

I’m not really sure what you are saying is a bug? Please provide a clear explanation of the issue, and provide a debug log showing the problem.

ariela · March 23, 2018, 7:28pm

Hi chris,

sorry, “bug” is not the right word. You explained the behavior, and I can understand.

But I would suggest implementing some kind of “dead peer detection” in software before considering a node DEAD. Maybe 5 retransmissions, one every retry-interval seconds? That is a fairly common approach in network protocols.

Does it make any sense?
Andrea

chris · March 23, 2018, 7:59pm

Yes - this is what is implemented. There are 3 retries, and if the device does not respond after 3 retries, it is marked dead. Once it is dead, then there are 0 retries until the device is alive again.
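
As a rough illustration of the retry behaviour described above, here is a minimal Python sketch. It is not the binding’s actual code: the class name `NodeLink`, the `transmit` callback and the reading of “3 retries” as one initial transmission plus three retries are all assumptions made for the example.

```python
from typing import Callable

MAX_RETRIES = 3  # retry count quoted in the post; everything else is hypothetical


class NodeLink:
    """Toy model of one node's ALIVE/DEAD state (not the zwave binding's API)."""

    def __init__(self, transmit: Callable[[str], bool]) -> None:
        # transmit(message) returns True if a response was received.
        self.transmit = transmit
        self.dead = False

    def send(self, message: str) -> int:
        """Send a message and return the number of transmissions attempted."""
        # A DEAD node still gets the message, but with 0 retries.
        retries = 0 if self.dead else MAX_RETRIES
        attempts = 0
        for _ in range(1 + retries):  # assumption: one initial try plus retries
            attempts += 1
            if self.transmit(message):
                self.dead = False  # any response brings the node back to life
                return attempts
        self.dead = True  # no response after all attempts
        return attempts


# Simulated link: five lost responses, then one that gets through.
responses = iter([False, False, False, False, False, True])
node = NodeLink(lambda message: next(responses))

first = node.send("poll")   # initial try + 3 retries, node becomes DEAD
second = node.send("poll")  # single try while DEAD, still no response
third = node.send("poll")   # single try, response received, node is alive again
print(first, second, third, node.dead)
```

The point of the sketch is that a DEAD node is never cut off: every message is still transmitted once, so a single successful response is enough to mark it alive again.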

ariela · March 24, 2018, 9:53am

Understood,

but the node marked as DEAD is not sending any updates, for example the real-time power consumption of the attached fridge. I can “wake up” the node by switching it OFF and back ON via HABmin, but that is not good for the fridge itself.

Do you think this will be fixed in the future?

thanks
Andrea

chris · March 24, 2018, 1:50pm

I don’t know what you are referring to exactly? Will what be fixed?

ariela · March 24, 2018, 2:31pm

I’m doing a test.

Right now the wall plug connected to my fridge is shown as OFFLINE, but in fact it is alive.
The power sensor reads 1.6.

After switching the fridge off (it has a manual switch inside the fridge itself), I now see 4.4.

So, even if the item has been updated (so you are right, no fix is needed), the value makes no sense. It should be 0, right?

Powering ON, now it’s 1.5

I don’t understand this, sorry. I was expecting 0: if the fridge is powered off, the power should be 0.

mmmm …

chris · March 24, 2018, 2:50pm

I don’t really know how to answer. If the value is changing, then probably it’s coming from the device and the binding is just displaying this value.

I would strongly suggest using the debug logs when looking for issues like this. That is the only way to know what is really happening. If the device is returning 4.4W, then the binding will display 4.4W. My guess is that switching the fridge off from inside the fridge might not completely power everything down. Alternatively, the device itself might not measure with good accuracy at low power (this is not uncommon). Finally, maybe there’s a bug in the binding (but I think it’s unlikely that the binding would just give you the wrong value - this is probably a device issue).

ariela · March 24, 2018, 2:56pm

I will do more tests and let you know.

In the meantime, THANKS so much for the support

Andrea

chris · March 24, 2018, 2:58pm

No problem - in case you don’t already know, there is a log viewer here. This will view the ZWave data in a nicer way than the text file…