[SOLVED] Unresponsive Z-Wave Network: Tools and Approaches to track down the issues

Analysis of issues within a Z-Wave network

I am sure many of us who run bigger Z-Wave networks have already run into challenges or eventually will. Over the past months I had to track down issues myself, until finally my Z-Wave network became so unresponsive that it took up to several minutes for an actuator to react to a command, which basically drove me nuts!

Thanks to a lot of community ideas, in particular from @OMR (and a superb support guy from Aeotec named Chris!), I found a lot of ways to analyse, track down and improve my network. As I learned so much during this time, I thought I should share my experience. Feel free to add yours in the comments.

I also found out that shifting to a new Z-Wave stick was much simpler than I had expected (at least in my case).

So without further ado, here are my learnings and recommendations.

In a nutshell, what I have learned

  • My zwave network has become totally stable now and responsive after

    • I moved to a new Aeotec Gen5+ stick and
    • I removed all dead nodes
  • Stick Shift: You can easily back up a no-name stick and restore the backup to an Aeotec stick

  • With the Aeotec stick you can include a device by just pressing the button on the stick while it is unplugged and not connected to the computer, and then doing the inclusion (pretty fancy…)

  • You are better off deleting dead nodes (devices that are not working anymore) from the network

  • Some devices (like some older Fibaro roller shutters) are a nightmare to exclude (it just doesn’t work), so you have to reset them and manually delete them from the Z-Wave network (several support messages with Fibaro didn’t fix the problem)

  • Zwave battery devices still are a nightmare until you finally get them working… It is very helpful if they provide manual wakeup, manual reporting and manual NIF-reports (like the Aeotec Motion Sensor does though even in this case it took me days to get it included). With the sensative strips I have basically given up! Only buy battery devices if you do have time and patience :wink:

  • During inclusion, in particular with battery devices, you may end up stuck in a REQUEST_NIF state. I learned that some devices (like my motion sensor) let you send a NIF manually (a node information frame is essentially the beacon that a node sends out at the start of the inclusion process), which helped me get further through the inclusion process.

  • If a (battery) device says it is offline try healing it and then wake up the device explicitly (see device manual how to do so)

  • If you would like to see what is happening on the Z-Wave network, purchase the Suphacap | Suphammer. However, I have to admit that it hasn’t yet helped me to really identify issues; I have only learned more about Z-Wave from it.

  • Use the zensys z-wave pc controller tool to delete nodes. Here is exactly how to do that:

    • Highlight the node you believe has failed from the top left frame’s node list/table
    • Toggle the “Queue Override” checkbox for this selected node (a check should appear)
    • Send a NOP (i.e. no operation) command by selecting the exclamation mark from the tools menu
    • Check to see if the node is failed by selecting the “is node failed” icon from the tools menu. If it has failed, you should see a message stating it has failed in the lower right frame (i.e. log actions tab)
    • Select the “remove failed” icon from the tools menu
    • If you don’t immediately see the node disappear, don’t worry - it is gone: shut down the zensys tool and restart it, and in 95% of cases the node has gone away (that drove me crazy the first time).
  • If you want to “talk” to your device with the zensys tool, first tick the device’s check box (Queue Override), then press the Get Info icon; only then do you see all commands that are available for that device.

  • There is actually a manual available for the Zensys PC controller that is worth looking into (see links below)
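For anyone who prefers to see the failed-node removal order written down as logic, here is a minimal sketch of the steps listed above (NOP first, then the "is node failed" check, then "remove failed"). It does not talk to a real stick: `FakeController` and all of its methods are hypothetical stand-ins invented for this illustration, not an API of the zensys tool.

```python
from dataclasses import dataclass, field


@dataclass
class FakeController:
    """Hypothetical stand-in for the PC controller tool; not a real API."""
    failed_nodes: set = field(default_factory=set)  # nodes that do not answer
    node_list: set = field(default_factory=set)     # nodes known to the stick
    log: list = field(default_factory=list)

    def send_nop(self, node_id: int) -> bool:
        """Send a NOP and return True if the node acknowledged it."""
        acked = node_id not in self.failed_nodes
        self.log.append(f"NOP to node {node_id}: {'ACK' if acked else 'no ACK'}")
        return acked

    def is_failed_node(self, node_id: int) -> bool:
        return node_id in self.failed_nodes

    def remove_failed_node(self, node_id: int) -> None:
        self.node_list.discard(node_id)
        self.log.append(f"Removed failed node {node_id}")


def remove_if_failed(ctrl: FakeController, node_id: int) -> bool:
    """Follow the NOP -> is-failed -> remove-failed order; True if removed."""
    if node_id not in ctrl.node_list:
        return False
    ctrl.send_nop(node_id)                # poke the node first
    if not ctrl.is_failed_node(node_id):  # only failed nodes get removed
        return False
    ctrl.remove_failed_node(node_id)
    return True


ctrl = FakeController(failed_nodes={7}, node_list={1, 7, 12})
removed_dead = remove_if_failed(ctrl, 7)    # dead node is removed
removed_live = remove_if_failed(ctrl, 12)   # responsive node is kept
```

The point of the ordering is that a node that still answers the NOP never reaches the removal step.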

Helpful and updated links


I have not used the Suphammer, but based on the screen shots, the zniffer setup (that I use) seems more readable. For me it just required a UZB ($35 USD) and the free Silabs software. I found the instructions here. Output looks like this:

Also great post. Glad your issues were resolved.

Lastly, I also had great luck with Chris from Aeotec in the past. Hopefully that is a real person and not just a screen name for all their tech support.

Bob


Thanks also for this post, it (finally) made me install the UZB I have had lying on my desk.
Now that I did, I’m seeing many messages from node 000 with CRC errors, maybe 1 in 20, but sometimes there are batches.
Is that RF noise? I have another system running on 868 MHz. It is not supposed to create that much traffic though, as it consists of battery-powered nodes only, which only ever send when they cannot avoid it.
Does anyone have an idea how to find out?

It’s hard to say. I get a few. I believe most are frames the Zniffer did not fully receive because of its positioning and/or a weak signal. Mine is right next to the Z-Stick. Some can be deduced from context. For instance:

Node 75 is actually outside of the basement at the far end of the house. It sends the outdoor humidity and temperature hourly. Note the low RSSI on the CRC frame. However, since there are two hops, line 122 is likely the multisensor report going from 58 to 60 (like line 127, which did get picked up), and lines 126 and 133 are likely missing ACKs from 58 to 75. Looking at the OH UI, both measurements got through.

CRC errors in Zniffer normally just mean the node is out of range of the Zniffer stick. If you consider that some of your nodes need a repeater to communicate across your network, it is not surprising to see CRC errors from nodes outside the normal Z-Wave range of the Zniffer. It is best to run Zniffer on a laptop so you can move around and capture the traffic from different places.

That said, it is worth checking the RSSI column: if it shows a good signal and you still get CRC errors, it could be a bigger issue. I have seen traces with many CRC errors in networks with older devices and some FLiRS devices, and on investigation the CRC errors were issues with the beaming; node 000 rings a bell with that issue. Do you want to post some traces?
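To make the RSSI check above a bit more concrete, here is a rough sketch that splits CRC-error frames into "weak signal" (probably just out of range of the sniffer) and "good signal" (worth a closer look). The `Frame` fields, the sample values and the -85 dBm cut-off are all assumptions made up for this example; they are not a Zniffer export format or an official threshold, and it assumes the RSSI values are in dBm.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """One captured frame; field names are hypothetical."""
    line: int
    source: int
    rssi_dbm: float
    crc_ok: bool


def triage_crc_errors(frames, weak_below_dbm=-85.0):
    """Return (weak, suspicious) line numbers of CRC-error frames.

    weak_below_dbm is an assumed cut-off, not an official value.
    """
    weak, suspicious = [], []
    for f in frames:
        if f.crc_ok:
            continue  # only CRC-error frames are of interest here
        (weak if f.rssi_dbm < weak_below_dbm else suspicious).append(f.line)
    return weak, suspicious


# Made-up sample capture for illustration only.
capture = [
    Frame(line=1, source=58, rssi_dbm=-60.0, crc_ok=True),
    Frame(line=2, source=0, rssi_dbm=-95.0, crc_ok=False),
    Frame(line=3, source=0, rssi_dbm=-55.0, crc_ok=False),
]
weak, suspicious = triage_crc_errors(capture)
```

Frames that end up in `suspicious` are the ones where moving the sniffer closer would not explain the error.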

If you look at the HEX of the CRC frame, you can sometimes work out which part is the uncorrupted command and what got mixed up.
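As a hedged illustration of reading the HEX by hand, the sketch below splits a singlecast frame into header fields. It assumes the layout I understand from ITU-T G.9959 (4-byte home ID, source node, 2-byte frame control, length, destination node, payload, then a one-byte XOR checksum seeded with 0xFF as used at the lower data rates). The frame bytes are made up, and frames at 100 kbit/s carry a 2-byte CRC instead, which this sketch does not handle.

```python
def split_frame(hex_string: str) -> dict:
    """Split a singlecast frame into header fields (assumed layout)."""
    raw = bytes.fromhex(hex_string)
    if len(raw) < 10:
        raise ValueError("too short for header plus checksum")
    return {
        "home_id": raw[0:4].hex(),
        "source": raw[4],
        "frame_control": raw[5:7].hex(),
        "length": raw[7],
        "destination": raw[8],
        "payload": raw[9:-1].hex(),
        "checksum": raw[-1],
    }


def xor_checksum(data: bytes) -> int:
    """8-bit XOR checksum seeded with 0xFF (assumed for the lower data rates)."""
    checksum = 0xFF
    for byte in data:
        checksum ^= byte
    return checksum


# Made-up frame: home ID d6b26201, node 58 -> node 60, three payload bytes.
body = bytes.fromhex("d6b262013a41010d3c310501")
frame_hex = (body + bytes([xor_checksum(body)])).hex()
fields = split_frame(frame_hex)
length_matches = fields["length"] == len(bytes.fromhex(frame_hex))
checksum_matches = fields["checksum"] == xor_checksum(body)
```

If the home ID and node IDs look sane but the length or checksum does not match, the damage is probably in the payload or the tail of the frame.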

Yeah I have done that too. I do not get many, just happened to be zniffing when I saw the post. Of my 46 nodes, 36 are direct shots, 4 need a hop and 6 are 2 hops. For me, never really see CRC on the close-in nodes and it is always with low RSSI.

Maybe surprisingly :wink:, I have a desk top PC, so can’t go wandering around. My setup

Bob

great thread, thanks @stefan.hoehn you are a rock star :+1:


Bob,

Yes, Chris Cheng is a real person who does the support alongside Sebastian. I asked him and he confirmed that :slight_smile: Probably some of the best support I ever got.


I tried to download the Software as described in your linked page (I registered first)

but I always get the following message first:

“Please download either a Controller SDK or Embedded SDK before downloading Z-Wave tools. Contact support if any queries.”

How did you get around this? (or do I have to download Simplicity Studio?)

cheers
Stefan

One of my problems is that I don’t take great notes. I vaguely remember something like that and poked around the site and may have downloaded a SDK, but I did not pay anything, that’s for sure. Also I had Simplicity studio installed some time ago. It has the Zniffer app, but you need the programmer and it does not have that. Since I have both a standalone Zniffer and one in the Studio, I must have gotten the download to work, let me see if I can recreate/recall/find what I did.

Bob

edit: So I think I downloaded the SDK here. I downloaded another copy and it looks like what I have.

Were you able to use the Aeon Backup Software to back up your old non-Aeotec stick?

Yes, that was what surprised me so much. It worked like a charm.

That is indeed surprising… which stick did you have?

I replaced the ZME UZB1 with the Aeotec Gen5+:

Thanks Bob, I found out that the issue is actually that the tabs, including the download tab, don’t work in Firefox, so one has to use a different browser. I added the Silicon Labs links to the top of the thread.

Btw, I definitely recommend reading the “Z-Wave networking Basics” pdf that comes along with the SDK download.

Even though the installation process is described in full detail and nothing is missing, here are some tips for getting through it successfully, as the process is far from intuitive. :wink:

  • Reserve time - it will take some minutes!
  • Download the SDK - as far as I have noticed you don’t really need it but it needs to be done to be allowed to download the other two downloads
  • Then download the sniffer and the programmer. Unzip both.
  • Have patience, read carefully and follow the instructions step by step…
  • I needed to start ZWaveZnifferUI.exe (not ZWaveZniffer.exe)

Finally the nerd factor is pretty high when it works :nerd_face:


Good Post. Should be helpful to others !

  1. About a year ago @robmac wrote a series for the OH community on Zwave networking Basics that was based on the SDK .pdf you noted and included useful advice and observations relating to OH. I have referred back to it several times. It’s a good read also.
  2. Agree it can be addicting. When I started zniffing a couple of months ago, I would walk around and trigger sensors and use the UI to trigger switches just to watch the frames.

Bob


Thanks for posting this.

I’m struggling with how to recover nodes 2-5 on this stick. Is there any way except by resetting the stick?


Nodes 2-5 were test devices that I subsequently removed but the node numbers can’t be re-used?

Using PC Controller you can remove the zombie nodes. I think it is up to the controller whether to re-use the IDs though. Removing the nodes is recommended to maintain network health.

The short answer is no, not without a factory reset. I have read that once the maximum number is reached (and it’s somewhere in the 230-255 range), new nodes will be assigned to the unused spaces (like 2-5 in your case), but I can’t verify that. I have 47 nodes and my last one is 79, for example.
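For what it’s worth, the allocation behaviour I describe above (keep counting upwards, and only fall back to the gaps after the top of the range is reached) could be sketched like this. This is only a model of what I have read, not verified controller firmware behaviour; `next_node_id` is a made-up helper and the default `max_id=232` is an assumption for the sketch.

```python
def next_node_id(in_use: set, last_assigned: int, max_id: int = 232) -> int:
    """Pick the next node ID: count upwards first, reuse gaps only after wrapping.

    max_id=232 is an assumption for this sketch.
    """
    for offset in range(1, max_id + 1):
        # Step past last_assigned, wrapping from max_id back to 1.
        candidate = (last_assigned - 1 + offset) % max_id + 1
        if candidate not in in_use:
            return candidate
    raise RuntimeError("no free node ID left")


# Nodes 2-5 were removed, the last ID handed out was 9: the gap is skipped.
early = next_node_id({1} | set(range(6, 10)), last_assigned=9)

# Only once the top of the range has been used do the gaps get filled.
after_wrap = next_node_id({1} | set(range(6, 233)), last_assigned=232)
```

Under this model the removed IDs 2-5 stay unused for a long time, which matches what people see on their sticks.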

Bob