Slow Z-wave network

Hello!

My Z-wave network is very slow and unresponsive and it is related to me putting the controller in a different location from where it used to be. I would like to be able to pin down the exact problem so that I can see if there is some other way to solve it as I really don’t want to have the controller where it used to be.

I am running OpenHAB 2.5.1 with the Z-Wave binding version 2.5.1 on an Intel NUC (x86_64) running Ubuntu. My Z-wave controller is an Aeon Labs Z-Stick (Gen 5). I have no battery powered devices. This is my list of devices:

That screenshot is taken around 15 minutes after rebooting the entire NUC. I know nothing about the Z-Wave protocol, but isn’t it strange that all the devices aren’t initialized after 15 minutes when there are no battery powered ones? Also I don’t understand why the controller placement makes such a difference. Isn’t this a mesh network protocol, which means it should be sufficient if the controller can reach only one of the devices and the packets can get relayed through that device? It seems that the network is idle 99.9% of the time when looking at the Z-Stick LED, but still almost nothing happens. It can take several minutes from one node changing state (between PING, REQUEST_NIF and ONLINE) until the next node changes state. Why doesn’t it try to communicate with the nodes quicker in the beginning?

I have waited two days after moving the controller so it should have performed two nightly network heals.

What information do I need to gather to help you answer these questions? Are there any steps I should perform and then post logs from?

I have good general technical knowledge in programming, electronics, embedded devices and so on, so don’t be afraid to explain technical details. However, I know almost nothing about the Z-wave protocol and the software stack, so I don’t know where to start looking. I actually programmed a Z-wave controller and device as a project back at the university 10+ years ago but unfortunately I remember nothing of it :slight_smile:

Now 20 minutes later, after writing this message, there is still one device stuck in REQUEST_NIF.

Thanks for the help in advance!

/Niclas

While zwave is a mesh, I understand it also involves learned pathways, which you’ve changed. You probably need to look into the “heal” utility.

I think they did that. :wink:

You might try updating to 2.5.2 that was just released. I think there were some timeout changes made in the binding that may help.

Thank you for the tip. Will try upgrading asap!

(I’ve never written on this forum before. Where do I find the option to quote a part of a post?)

Highlight the portion of a post you’d like to quote and a control will appear to quote it.

You may want to go through the unfortunately very much unadvertised interactive tutorial…

Just send one of the following in a private message to @discobot to start the tutorial…

New User
start tutorial

Advanced User
start advanced tutorial

Thanks! That was quite funny :slight_smile:

I have now tried the latest update to OpenHAB and the Z-wave binding and unfortunately it did not affect our problem. What is the best approach in figuring out what is going on under the hood?

Buy yourself a UZB-3 or similar, flash it with the zniffer firmware and get zniffing. It is the only way to see what is going on under the bonnet.

sidewalk pavement

Thanks for the tip! By chance I just stumbled over your fine posts about Z-wave routing and am reading the second one right now. Very appreciated information! I have just installed the PC Controller and am playing around with it.

Hello again,

My network worked ok for some time, but some days ago we had some power outages so the controller and all devices restarted a few times. After this the network has been totally unresponsive. I am not able to control any devices at all through the app or through Paper UI. Even after waiting two days from a reboot there are still a lot of devices stuck in “Node initialising: PING”, and some of them are REALLY close to the controller.

I set the loglevel to DEBUG and rebooted the controller. I have attached the log from the first ten minutes. Can anyone see any explanations to my issues in the log? @chris?

Edit: Actually the log was too big to attach. Here is a link: https://drive.google.com/file/d/1sQhEiMB9dU8DnAtQvyJ2DxrwFIjqBR5_/view?usp=sharing

Regards,
Niclas

That is a lot of noise.

Did you manage to get a zniffer?

I have ordered one but it hasn’t shown up yet.

A few minutes into the log I powered up my multisensor 6 (node 24) which I previously saw generated a lot of traffic. Every ten seconds it seems to send a new luminance update which is fine, but it seems to send that same value 14 times every ten seconds. Why is that? I have seen the same behavior from other devices that send values like power reports.

It’s impossible to tell from the logs, however it’s likely that something in the network is performing retries and the controller therefore receives multiple frames. This is likely to be the root cause of your issue - ultimately this is flooding the network, overloading the controller, and then the controller is rejecting messages that the binding is sending as it is too busy to process them.

A sniffer may allow you to work out what is happening and what node is causing the trouble.
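Once frames have been captured (from a sniffer or extracted from a debug log), the repeated reports described above can be counted with a few lines of Python. This is only a minimal sketch: the `Frame` structure, its field names and the sample values are hypothetical and do not reflect the binding's log format or any sniffer export format, so the parsing step is left out.

```python
from collections import Counter
from typing import Dict, Iterable, NamedTuple, Tuple


class Frame(NamedTuple):
    """One received report. Hypothetical structure, not a real log format."""
    timestamp: float  # seconds since start of capture
    node: int         # sending node ID
    payload: str      # any string that identifies the report content


def duplicate_bursts(
    frames: Iterable[Frame], window: float = 10.0, threshold: int = 3
) -> Dict[Tuple[int, str, float], int]:
    """Count identical reports per node in fixed time windows.

    Returns {(node, payload, window_start): count} for every report
    that was seen at least `threshold` times inside one window.
    """
    counts: Counter = Counter()
    for frame in frames:
        # Assign each frame to the fixed window that contains its timestamp.
        window_start = int(frame.timestamp // window) * window
        counts[(frame.node, frame.payload, window_start)] += 1
    return {key: n for key, n in counts.items() if n >= threshold}


# Made-up sample: node 24 repeats one value 14 times, node 7 reports once.
sample = [Frame(0.5 * i, 24, "lux=120") for i in range(14)]
sample.append(Frame(3.0, 7, "power=5"))
for (node, payload, start), n in duplicate_bursts(sample).items():
    print(f"node {node}: '{payload}' seen {n} times in window starting at {start}s")
```

Fixed windows keep the sketch short; a burst that straddles a window boundary is split across two windows, so a sliding window would be more accurate if that matters.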

Until it arrives all I can suggest is

Work through all devices and settings and turn reporting as low as you can.

Turn polling down.

Turn off poll after.

Check your scripts and do not send if the device is already at that state.
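The last suggestion, not sending a command when the device is already in the requested state, can be sketched as a small guard. This is an illustration in Python rather than openHAB rule syntax, and `send_if_changed` and the `send` callback are hypothetical names, not part of openHAB or the Z-Wave binding.

```python
from typing import Callable, Optional


def send_if_changed(
    current: Optional[str], desired: str, send: Callable[[str], None]
) -> bool:
    """Call send(desired) only when the known state differs from it.

    `current` is the last known state, or None if it is unknown.
    Returns True if a command was sent, False if it was skipped.
    """
    if current == desired:
        # Device already reports the requested state: skip the radio traffic.
        return False
    send(desired)
    return True


# Usage with a stand-in sender that just records what would be transmitted.
sent = []
send_if_changed("ON", "ON", sent.append)   # skipped, already ON
send_if_changed("OFF", "ON", sent.append)  # sent
send_if_changed(None, "ON", sent.append)   # sent, state unknown
print(sent)
```

An unknown state is treated as "different" here so that a command is never silently dropped; whether that is the right choice depends on how reliable the reported state is.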

Nothing helps. I removed and re-added all devices after upgrading from 2.4 to 2.5.4, and now my Z-Wave network is not really working any more. It seems there is an issue in 2.5.4 that puts a lot of background traffic on the Z-Wave network, although the normal log does not show much activity.

There really have been very very few changes to the binding for some time now, and if there is more data on the network, then this should not be related to the binding - the log will show exactly what the binding is doing - nothing more, and nothing less - so you can be sure this is what the binding is doing :wink:

A thread leak in the miio binding was just fixed. Are you running that?

I have been troubleshooting my network for some time now. Yesterday I excluded all the devices, performed a factory reset of my Z-stick and then included all the devices again. I think there was some improvement, but there are still issues. It just seems that much of the time nothing happens on the network and nothing happens in the binding. Here is a log of me turning off (or turning on, I don’t remember) a node: slow-polling.txt (41.8 KB).

The actual action happens quickly, but then the polling takes many seconds. As a result the network is totally unresponsive if I perform another action immediately after the first one, for example when I realize I turned on the wrong light and want to turn it off again :slight_smile:

That does not appear to be a debug log as specified in the binding documentation for troubleshooting purposes.

It is. It is both a debug log and a zniffer log. The zniffer log comes first.