[SOLVED] Z-Wave on 2.5.1 Broken

That would all be good, but for hours on end your config is talking to one of your secure nodes every five seconds

and on

and on and on

and they do always seem to take a lot longer than my secure nodes to return a nonce, and that is from the controller. All those seconds while the controller is waiting, it can send no more commands.

In general terms you also have an odd node that occasionally spams your network, but this traffic goes on for hours on end.

And then, because your network is so busy with incoming reports and outgoing requests, you get a load of CAN.
As you are an IT pro.

I get 0 CAN with 150 nodes other than 1 or 2 at system start.

If you are getting these, your configuration and network health need looking at. There is nothing that can be done about these. The controller is a serial connection trying to do async communication, and there is nothing wrong with how the binding is controlling this.

Seriously get yourself a zniffer to see the actual packets and tune down your traffic and you will have a more robust network.
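Until a zniffer arrives, a rough first look is to count how often a marker such as CAN turns up in the binding's debug log, minute by minute. This is only a sketch: the marker text and the assumption that each log line starts with a `YYYY-MM-DD HH:MM` timestamp are mine, so check them against your own log before trusting the numbers.

```python
from collections import Counter


def count_marker_per_minute(lines, marker="CAN"):
    """Count lines containing `marker`, grouped by minute.

    Assumes (hypothetically) that every line starts with a
    'YYYY-MM-DD HH:MM' timestamp, i.e. the first 16 characters
    identify the minute. Adjust the slice for other formats.
    """
    counts = Counter()
    for line in lines:
        if marker in line:
            counts[line[:16]] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample lines, not real binding output.
    sample = [
        "2020-01-01 10:00:01 example line mentioning CAN",
        "2020-01-01 10:00:30 example line, nothing special",
        "2020-01-01 10:01:05 example line mentioning CAN",
    ]
    for minute, n in sorted(count_marker_per_minute(sample).items()):
        print(minute, n)
```

To run it on a real log, pass an open file object instead of the sample list. Minutes with many hits are the ones worth lining up against what your rules and polling were doing at the time.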

I think only locks are supposed to self-report when the physical mechanism latches, but I do the same as you if it is not as important as a lock: I trust that if the node is not going dead there was an ACK, so it probably worked.

Let’s face it, nothing is foolproof. None of the relay devices on the market actually check that the relay physically made or broke. Power monitoring may tell you, but if the relay is fused shut, even if you poll the device and it says off, your lights are on, so like you I take the risk.

and from the people that build the chips

Z-Wave is a radio technology with limited bandwidth. Therefore, it is NOT RECOMMENDED to use polling.

This is slightly off-topic, but I have a question out of pure curiosity.

Even though a single z-wave controller can handle many nodes, is there any benefit to installing two controllers to separate z-wave devices into zones and/or categories, which would reduce the burden on each controller? Or would that be a net negative since the binding still does the heavy lifting and you’d be running two z-wave networks that may interfere with each other?

This may have been discussed elsewhere, but I didn’t find anything in a quick search and I imagine others might have the same thought after reading this thread. At the very least, perhaps it would make it easier to identify z-wave network problems by isolating issues to a specific controller.

It is sort of relevant, as someone with the issues described here may imagine it will all be fixed by another controller. This is far from proven.

Certainly two controllers in close proximity do not like each other’s network, but if you can separate them then you can split the traffic, though even then it may not be great.

A few fibaro users use a feature that coordinates via the IP network to run two controllers, say a HC2 and a HCL. If these are at the far ends of a large property it does sort of work, but if they still pump too much info around the networks then it still all falls in a heap.

There are also a lot of other issues with the fibaro solution. I tried it for a bit but eventually just got down to reducing traffic. The extra complexity is just not worth it.

A slave controller is even more of a PITA. All sorts of issues not least that devices only have one lifeline.

The best I have achieved is working within the limits of a single controller.
My method is:

  1. get a zniffer
  2. reduce reporting. If you do not need it turn it off if you can. Some devices do not allow this or are buggy.
  3. try to limit to about 10 commands a second max, other than for very short bursts. If you have the majority of nodes in direct range you may get a little more. With lots of hops you will find this is a lot. The binding will queue them up for you, but having a big queue of outgoing commands and a load of incoming reports is not great on that serial interface. Giving things time to happen avoids pauses when you want instant action.
  4. no polling if you can get away without it.
  5. do not rebuild regularly
  6. only secure when needed… secure is more traffic
  7. extend wake as far as you can live with. Better battery life and you still get the reports.
  8. try your controller in a different place if you cannot make it work from where it is
  9. get a zniffer. It is cheap and very effective
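If your own scripts or rules fire off commands in bulk, the "about 10 commands a second" rule from the list above can be roughed out as a paced queue. This is a minimal sketch under my own assumptions: `PacedSender` and the `send` callback are hypothetical names, it has nothing to do with how the binding queues internally, and it paces strictly, so unlike the rule it does not allow short bursts.

```python
import time
from collections import deque


class PacedSender:
    """Send queued commands no faster than `max_per_second`.

    `send` is a hypothetical callback that actually issues one command.
    `clock` and `sleep` are injectable so the pacing can be tested
    without real waiting.
    """

    def __init__(self, send, max_per_second=10.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._send = send
        self._interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._queue = deque()
        self._next_slot = 0.0  # earliest time the next command may go out

    def submit(self, command):
        """Queue a command; nothing is sent until drain() is called."""
        self._queue.append(command)

    def drain(self):
        """Send everything queued, spacing commands by the interval."""
        while self._queue:
            now = self._clock()
            if now < self._next_slot:
                self._sleep(self._next_slot - now)
            self._send(self._queue.popleft())
            # Schedule the next slot one interval after this one.
            self._next_slot = max(now, self._next_slot) + self._interval


if __name__ == "__main__":
    sender = PacedSender(print, max_per_second=10)
    for node in range(5):
        sender.submit(f"command for node {node}")
    sender.drain()
```

With lots of hops you would lower `max_per_second`; with most nodes in direct range you might raise it a little. A zniffer tells you whether the number you picked is actually kind to your network.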

Others may have a different recipe and there is no general solution as this is radio in your home. Your home is different to mine and you have different devices and different needs but the above are a set of general rules that a lot of people have tried and they work.

The more nodes the more rigorously you have to apply them.

The more routing the more rigorously you have to apply them.

Thanks for the detailed response! Even if others have different recipes, I think these are good tips for reducing traffic.

I find this interesting. I only have two ZWave devices, so traffic hasn’t been an issue for me. My lock is secured, and my motion/temp/humidity sensor is not. However, the sensor offers an option for secure inclusion. I just didn’t feel the need to do that, perceiving that it would be more of a pain if I have to replace my controller.

A lot of modern devices do.

It would be nice to just have everything secure, but as you can see from the logs above that means extra commands, so there is a compromise, even if, as in your case, you just want to keep it simple.

Any suggestions on best approach? Googling around here isn’t immediately clear on where and exactly what to procure. Is there a setup or a kit this group would recommend that’s known to be decent?

There is a thread but it is long and rambling. For now Thomas alias ~@tinman has written one on the fibaro forum.

He is one of the best pro installers around.

The link is halfway down the page I linked you to, but here it is again:

https://forum.fibaro.com/topic/29923-tutorial-z-wave-diagnostics-with-pc-controller-and-zniffer/

If you have a spare controller then fine; if not, a UZB3 for your region from Digi-Key or Mouser is as good as you can get.

https://www.silabs.com/products/development-tools/software/z-wave

OK, thank you! That makes sense now. I was originally understanding a Zniffer to be a physical, specialized piece of hardware; I didn’t realize it was a firmware update to a Z-Wave controller that enabled it. Got it now, I’ll pick up a UZB3 and give this a shot in the coming days. Thanks so much for your help and patience.

I’ve narrowed this issue down to 3 “failed” nodes, that when interacted with in any way via the Z-Wave controller, the controller freezes. I can’t even remove these nodes from the network as the command to remove the failed nodes causes a Z-Stick hang. This is reproducible using OH2 and the Z-Wave PC Controller.

If you have a Windows machine, please search for @robmac's tutorial on how to use the PC Controller software to get rid of zombie nodes.

Good news you found them.

If they show in the list in PC controller then in the network management tab select the failed nodes. Click the mark as failed button. Then remove failed. They will be gone from your life.

I think this also works in HABmin (mark as failed, then remove failed), though I have not needed to do this.

That is supposed to work but I have not had good success trying that.

It does exactly the same thing as the PC controller to remove a device. The only question is if the device is in the failed nodes list - if the device fails to initialise, then it should be in the failed nodes list.

The only thing the PC controller does differently is to send a command to the device immediately before removing it to ensure it’s in the failed nodes list. Otherwise it’s the same. I performed some testing on this recently as part of the new binding development and it always removed the node.

I have not tested this since 2.5 was released as stable.

It won’t have changed. The question really is if the device is considered FAILED by the controller - it should be if it’s not responding and the binding is trying to initialise, but there’s no guarantee of that unfortunately.

The new binding completely changes the way transactions are managed, which makes it easier to sequence things, which in turn means I can send the NOOP command required to put the device into the failed list. But the new binding is a while away yet :wink:

If only it were that easy. This step is where the Z-Stick hangs. All other nodes seem to be working great, but on these three, any kind of command, including the NOP sent to validate if the node is online, causes the Z-Stick light to stop flashing and it hangs. I have to power cycle the stick by removing it to get it to work again. I need a method to either delete these nodes without sending them any NOP or command, or perhaps this Z-Stick is defective.

I’ve ordered a new Zooz Z-Wave USB stick as this is the second Z-Stick to have stability issues like this. I am going to try to add the new USB stick into my network and transfer the roles over to it, to see if I can preserve my existing network and config but then purge those broken nodes. I should know on Friday.

If anyone has ideas on how to forcibly remove these nodes when the Z-Stick hangs, I would definitely like to learn though.

That is bad luck.

I am guessing you are doubly snookered as you probably are not lucky enough to have a backup of the NVM as you had not known about PC Controller before this happened.

I am guessing this is an Aeon as you talk about lights?

I used to use an Aeon a long time ago and think the NVM can get corrupted if the power goes.

Are the devices still functioning and it is an aeon?

A bit of a long shot.

The only good thing about the Aeon IMO is that you can take it out of the USB port and, as it is a controller with a battery, you can go and add / remove devices without it plugged in. You might have known this but a lot of people do not.

Take it to each failed device in turn, hold the button down on the controller stick and put it in exclude mode (fast flashing yellow LED). I think it takes a few seconds. Activate the devices as described in their manuals, and if a device is removed the stick will flash its lights differently to confirm.

Possibly this will remove and tidy up as it might just be a corrupt record in the controller.

If they are already removed or dead this is not going to work but worth giving it a try.

If it works then back it up.

The devices are actually dead or, in one case, have been removed. It is an Aeon Z-Stick Gen5. I am hoping my new Zooz that arrives today is a bit more stable and I can transition to that. My UZB3 also arrived today so I can set up my Zniffer.

Well light at the end of the tunnel.

You will enjoy your zniffer.