ZWave: system migrated: on/off sensors now very slow

I intuitively always connect USB 2 devices to USB 2 ports and USB 3 devices to USB 3. :relaxed:

I replaced my USB extender cable with a new ‘high quality shielded’ USB 2.0 cable (with only 4 pins/cores and external shield), and plugged it into one of the Pi4 USB 2.0 ports.

From our discussions, I think this must be the best possible configuration for minimum interference (best shielding, lowest number of cores, lowest number of Pi4 USB busses touched, and lowest Pi4 touched bus clock rate). Unfortunately it does not solve my original (slowness) problem. So I guess the slowness is not due to interference after all.

My feeling is that the slowness is because the sensor’s ‘Z-Wave Instant’ transmissions are not getting through, so it is falling back to the ‘Z-Wave Polling’ transmissions instead; which obviously take longer.

I don’t see anything in the Configuration Parameters that may have changed in the migration from OH2 to OH3 relating to ‘Instant vs Polling’ transmissions. Did something change in the OH3 version of the binding that could be causing this?

Did you ever do any of my suggestions?

Sorry if I missed this but how many devices are we talking about here? Also, what is the range they are operating in? (one floor, two floors, just general distance for radios to operate in) How many mains powered devices, how many battery powered devices?
As I read this thread, I don’t think it is any of this stuff with cables or ports. I think you have ghost nodes or something causing a lot of traffic. It could be a node that is reporting every few seconds and swamping the network. Maybe the new stick caused a perimeter to be reset??? I’m guessing, but set the log to debug and watch the traffic. Supply info on the above questions and we’ll figure it out.

I did already try the trace logging on the z-wave binding but couldn’t see much to indicate potential problems. But to be honest I don’t really know what messages to search for.

And I just tried ‘top’ and the biggest CPU consumer is the openhab process which is around 30% – so I guess that is not the problem?

Many thanks for the offer

  • not that many: 17 devices, of which 8 are battery powered
  • a regular house, 3 levels, brick construction, wood floors, ~15m corner to corner

Could you please explain what that even means?

There is one node in the stick that does not correspond to a physical device; and I don’t know how to remove it.

Ok. Will do.

Errr. Forgot to ask: what shall I look for?

PS let me add some more details…

  • On my OH2 system, I had most of the devices’ polling periods set to their (fibaro & aeotec) default values of 24 hours, and I was relying on the ‘z-wave instant’ transmissions to get the data through. The response times were on the order of seconds.

  • On my OH3 system the ‘z-wave instant’ transmissions seem not to be that ‘instant’. The status does come through (and I don’t have to wait 24 hours for the poll), but it takes ~10 minutes rather than just a few seconds.

  • Therefore on my OH3 system I am experimenting with reducing the polling period to 30 seconds (at least on the powered devices). That means that I do get the statuses in ~30 seconds again. But still slower than the several seconds I was used to…

I am getting trace logs at a rate of 80 MByte / hour :slight_smile:

Most likely this is the clue! I’ve had similar problems (with OH2). Removing dead nodes helped a lot.
Look here for instructions on how to do that with Aeotec Stick: Remove a ghost Z-Wave Node from HABmin - #8 by devTechi

I agree with Frank !!!

Follow that link, or search this forum; there is software (zentools or something like that, runs on Windows) if it can’t be done within OpenHAB (sorry, I don’t know OH3 very well yet).
A ghost node will bugger up everything and 99% of the time when folks are saying zwave worked great but now it is slow, it turns out to be a ghost node.

Sorry, I misspelled that, I meant parameters…
What I mean by that is that most zwave devices have various parameters, such as their reporting interval. I’ve seen cases where several devices have been set to have very short reporting intervals (like every 5 seconds or every 15 seconds), and the sheer traffic they generate in a large system can swamp it. I thought maybe when you switched sticks, perhaps such an interval got changed back to a default or something. Maybe just check them out and make sure nothing is too crazy.
Anyhow, search and find the tools you need to get rid of the ghost node and I’m sure that will help. If you can’t find the software or a method, check back in here and I’ll dig around

ummm… watch the timing, watch for stuff waiting on other stuff, and watch for the controller sending a message to a node and the node not responding or taking a long time to respond. I know that is general, but if you post the log, don’t filter it and we’ll take a look.
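With logs arriving at tens of megabytes per hour, it can help to let a script do the "watch the timing" part. The following is only a rough sketch. It assumes each log line starts with a timestamp such as `2021-02-01 12:34:56.789` and that lines about a device contain `NODE <id>:`; check both against your own log and adjust. The name `find_gaps`, the 5-second threshold, and the sample lines are made up for illustration and are not real binding output.

```python
import re
from datetime import datetime

# Assumed line prefix, e.g. "2021-02-01 12:34:56.789 [DEBUG] ..."
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})")


def find_gaps(lines, threshold=5.0, node=None):
    """Yield (gap_seconds, previous_line, line) for gaps >= threshold.

    If node is given, only lines containing "NODE <node>:" are considered,
    so a gap means that node's traffic went quiet for that long.
    """
    marker = f"NODE {node}:" if node is not None else None
    prev_ts = prev_line = None
    for line in lines:
        if marker and marker not in line:
            continue
        match = TS_RE.match(line)
        if not match:
            continue  # skip continuation lines, stack traces, etc.
        ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
        if prev_ts is not None:
            gap = (ts - prev_ts).total_seconds()
            if gap >= threshold:
                yield gap, prev_line, line
        prev_ts, prev_line = ts, line


# Made-up sample lines in the assumed format.
sample = [
    "2021-02-01 12:00:00.000 [DEBUG] [example] - NODE 5: message sent",
    "2021-02-01 12:00:01.000 [DEBUG] [example] - NODE 9: something else",
    "2021-02-01 12:00:10.500 [DEBUG] [example] - NODE 5: response received",
]
for gap, before, after in find_gaps(sample, threshold=5.0, node=5):
    print(f"{gap:.1f}s between:\n  {before}\n  {after}")
```

To run it on a real log, pass an open file object instead of `sample`; the function reads line by line, so large files are fine.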

The tool to get rid of the ghost node is called Zensys tool. You can find it here (Post 5): Looking for Zensys Tools - #5 by marc_o

There is also a newer version called ZWaveControllerUI, which is part of “Z-Wave PC-Controller” from SiliconLabs. Unfortunately it is quite tricky to get: you must create a free account at SiliconLabs, download and install “Simplicity Studio”, then use Simplicity Studio to download “PC Controller” somehow. Very tricky and complicated! Simplicity Studio is everything except simple. If you get the Zensys tool it will do the job.

BTW: Looking into the Z-Wave log will not help. I’ve learned that you can only see the communication between openhab and the stick, which is quite boring, because the delay happens between stick and node. You can’t see this in the Z-Wave log. If you want to see the communication between stick and node you need a “Zniffer”. Getting this is complicated too (you need “Simplicity Studio” :frowning: ) but it provides a lot of insight into what is going on in your Z-Wave network.

I agree with the suggestion to remove the ghost node. I have done so myself in the past.
It is also a great time to back up the stick, but do it after you remove the node(s). :slight_smile:

I would say this is a bit on the high end. My java process on both my x86 system and Pi4 is ~5%.
If I were you, I would start by logging into karaf and post your binding list here. Make sure all versions are release current, and that you are not using any .jar drop-ins:

openhab> list | grep Add-ons
225 │ Active │  80 │ 3.1.0.202101300322      │ openHAB Add-ons :: Bundles :: Astro Binding
226 │ Active │  80 │ 3.1.0.202101300322      │ openHAB Add-ons :: Bundles :: Automower Binding
227 │ Active │  80 │ 3.1.0.202101300326      │ openHAB Add-ons :: Bundles :: Daikin Binding
228 │ Active │  80 │ 3.1.0.202101300326      │ openHAB Add-ons :: Bundles :: Dresden Elektronik deCONZ Binding
229 │ Active │  80 │ 3.1.0.202101300333      │ openHAB Add-ons :: Bundles :: HarmonyHub Binding
230 │ Active │  80 │ 3.1.0.202101300335      │ openHAB Add-ons :: Bundles :: hue Binding
231 │ Active │  80 │ 3.1.0.202101300339      │ openHAB Add-ons :: Bundles :: Kodi Binding
232 │ Active │  80 │ 3.1.0.202101300342      │ openHAB Add-ons :: Bundles :: Mail Binding
233 │ Active │  80 │ 3.1.0.202101300346      │ openHAB Add-ons :: Bundles :: Nanoleaf Binding
234 │ Active │  80 │ 3.1.0.202101300347      │ openHAB Add-ons :: Bundles :: Network Binding
235 │ Active │  80 │ 3.1.0.202101300350      │ openHAB Add-ons :: Bundles :: Onkyo Binding
236 │ Active │  80 │ 3.1.0.202101300356      │ openHAB Add-ons :: Bundles :: SamsungTV Binding
237 │ Active │  80 │ 3.1.0.202101300356      │ openHAB Add-ons :: Bundles :: Shelly Binding
238 │ Active │  80 │ 3.1.0.202101300358      │ openHAB Add-ons :: Bundles :: SqueezeBox Binding
239 │ Active │  80 │ 3.1.0.202101300359      │ openHAB Add-ons :: Bundles :: Systeminfo Binding
240 │ Active │  80 │ 3.1.0.202101300400      │ openHAB Add-ons :: Bundles :: TRÅDFRI Binding
241 │ Active │  80 │ 3.1.0.202101300402      │ openHAB Add-ons :: Bundles :: Yamaha Receiver Binding
242 │ Active │  80 │ 3.1.0.202101180339      │ openHAB Add-ons :: Bundles :: ZWave Binding
251 │ Active │  80 │ 3.1.0.202101300404      │ openHAB Add-ons :: Bundles :: IO :: openHAB Cloud Connector
252 │ Active │  80 │ 3.1.0.202101300404      │ openHAB Add-ons :: Bundles :: Persistence Service :: InfluxDB
253 │ Active │  75 │ 3.1.0.202101300405      │ openHAB Add-ons :: Bundles :: Transformation Service :: JavaScript
254 │ Active │  75 │ 3.1.0.202101300405      │ openHAB Add-ons :: Bundles :: Transformation Service :: JSonPath
255 │ Active │  75 │ 3.1.0.202101300332      │ openHAB Add-ons :: Bundles :: Transformation Service :: Map
256 │ Active │  75 │ 3.1.0.202101300405      │ openHAB Add-ons :: Bundles :: Transformation Service :: RegEx

Then, by stopping bundles one at a time with bundle:stop, you can check which one is using a suspicious amount of CPU.
Start each one again with bundle:start.

You could also have a look in the event log for clues:

$ tail -F -n 200 /var/log/openhab/events.log

Another clue that this could be ghost-node related is that the Z-Wave retransmit timeout is 5s, so you will see delays in multiples of 5s, which fits the 30s you are seeing.

It seems to be showing the load on a single CPU core, so as the Pi4 is multicore (4?) the overall load is about 10%
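As a quick sanity check on that estimate: by default `top` reports a process's CPU usage relative to one core, and the Pi 4 has four cores. A minimal sketch of the conversion (the function name is just for illustration):

```python
def overall_cpu_percent(per_core_percent, cores):
    """Convert a per-core CPU figure into a share of the whole machine."""
    return per_core_percent / cores


# 30% of one core on a 4-core Pi 4
print(overall_cpu_percent(30, 4))  # 7.5
```

So the 30% seen in `top` corresponds to roughly 7.5% of the whole machine, in the same ballpark as the "about 10%" estimate.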

@Andrew_Rowe @frankb @OMR thank you all for your suggestions about using the ZenSys tool; some initial feedback as follows…

  1. I used the Aeotec version of the ZenSys tool (just to avoid the mentioned hassle with Silicon Labs logins). The Aeotec web page is here, and their direct download here. (note: you need to ignore Windows nag about the installer being unsigned). And just for completeness the Aeotec Z-Stick cloning tool is here.

  2. The ZenSys tool process described here enabled me to successfully remove one ‘ghost node’.

  3. The ZenSys trace logging also identified one device that was flooding the network with power meter reports at ca. 100-millisecond intervals. In the OpenHAB Thing configuration I could change a device parameter to stop that.
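For anyone hunting a similar flooding device, the check in item 3 can be roughly automated once you have pulled (timestamp, node) pairs out of whatever trace you have. This is only a sketch under that assumption: the function name `flooding_nodes`, the thresholds, and the sample data are all hypothetical, and extracting the pairs from a ZenSys or openHAB log is left to you.

```python
from collections import defaultdict
from statistics import median


def flooding_nodes(events, min_interval=1.0, min_reports=5):
    """Return {node_id: median_interval_seconds} for nodes reporting too often.

    events: iterable of (timestamp_seconds, node_id) pairs, in any order.
    A node is flagged when the median time between its reports is below
    min_interval and it has at least min_reports reports.
    """
    by_node = defaultdict(list)
    for ts, node in events:
        by_node[node].append(ts)

    flagged = {}
    for node, stamps in by_node.items():
        if len(stamps) < min_reports:
            continue  # too few samples to judge
        stamps.sort()
        intervals = [b - a for a, b in zip(stamps, stamps[1:])]
        typical = median(intervals)
        if typical < min_interval:
            flagged[node] = typical
    return flagged


# Hypothetical data: node 7 reports every 0.1 s, node 3 every 30 s.
events = [(i * 0.1, 7) for i in range(50)] + [(i * 30.0, 3) for i in range(10)]
print(flooding_nodes(events))
```

With this made-up data only node 7 is flagged. Using the median rather than the mean keeps one long pause from hiding an otherwise chatty device.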

I guess that items 2. & 3. above will probably have fixed my problem; but I will let it run for a day and report back with the final answer.

PS I have another question concerning Security Keys: The Aeotec clone tool shows that my Z-Stick has a certain key, and OpenHAB Thing configuration shows a different key. Should these values be the same? Or does it matter if they are different? Indeed are these ‘keys’ referring to completely different things?

Excellent!

yes, between the one ghost node (which is itself enough to screw things up) and the one node flooding the network. Am I reading that right that the power meter was reporting ten times a second? Yikes! Glad you found some issues, let us know how it runs
Great job running it down :+1:
Thanks for the links to software, I think I should bookmark those for other folks to find - it is a common problem

Yes. There was a parameter to do with timing or sensitivity; and the manual wrongly said “set to 0 means off”; but in reality it seems that 0 means “as fast as possible”. Go figure… :wink:

Glad you figured it out.

Here is my take on Z-Wave delays:

How to remove ghost devices:

Yes, I think it is fixed now. Many thanks to all who have helped me.