All Z-Wave nodes being marked dead

Hi all,

I’ve got an issue that’s been driving me crazy for the last few months. Prior to this, OpenHAB and my Z-Wave network were working great. Sorry for the wall of text, but background helps here.

My environment at the time:

  • OpenHAB 1.6.2
  • Z-Wave binding 1.7.0 snapshot (got it from a thread to get a scene controller to work)
  • Aeotec Z-Stick S2
  • Dedicated machine running Windows Server 2008 R2 with Java 8u31

This seemed to start a few days after adding an Aeotec ZW089-A recessed door sensor, though I still have to check whether my current network has the same issues without it. One morning I went to hit a button on a scene controller and it gave me the ‘failed’ blinking lights. I checked my OpenHAB server and the log showed everything as ‘dead’. OpenHAB had red icons across the board. I closed it and fired up Zensys Tools to see what it would say about the nodes. Basically, the Z-Stick could send commands out to devices (dimmers and other binary/multilevel devices would respond to a basic on/off), but it never got a response from anything. The packet log in Zensys Tools showed only transmitted activity and didn’t reflect any updates from my Aeotec multisensor, door sensor or scene controllers. In short, there was no incoming data at all.

Reinserting the stick and rebooting didn’t help, so I did a factory reset on the stick and included all my devices again. It worked for a short while, then the exact same thing happened again. I tried updating OpenHAB and my bindings to 1.8.2, but the issues persisted. Assuming it was a hardware issue, I eventually opened a ticket with Aeotec. After I showed them the results from Zensys Tools, they helped me update the firmware and reset the stick again, and they ended up sending me a refurbished Z-Stick S2. The replacement worked for a few days and then started showing the same issues!! I didn’t have time to muck with it for a while, but this week I reset it, added only one dimmer and used a different PC to test. Same issue: commands get sent, responses don’t come in. No node information requests, no command acknowledgements, nothing. I reset and included it again a few times to be sure, but no dice. I figure it’s possible to have one Z-Stick die and an old refurb die along with it.

So, I bit the bullet and got a new Z-Stick Gen5. Last night I added everything and ran the new diagnostics utility from Aeotec–it reported back saying the network was healthy and all nodes were talking. I fired up OpenHAB and everything lit up green and was working great–even the multisensor and door sensor…

…Until this morning. The scene controller gave me the angry blinks again and, when I logged into the server, all nodes had failed and were ‘marked dead by the controller’. Zensys Tools shows the same behavior and the new Gen5 diagnostics tool has every node failing its tests.

My setup is pretty small and consists of 4 in-wall dimmers, a plug-in dimmer, 2 scene controllers, a thermostat, a SmartStrip, a siren, a multisensor and a door sensor. All of these devices are within range of each other, with the exception of the door sensor, which still connects directly to the controller.

At this point, I’m not exactly sure what I should do next. I don’t see how OpenHAB could be causing the issue (especially across different versions), but I’m thinking of trying different software to at least rule it out. Anyhow, I’m most interested in what I need to do to troubleshoot within OpenHAB. How should I proceed?

Thanks in advance,

Ed

Alright, so far the network is stable without the recessed door sensor. It’s been up and running for a few days now!

Since the breakages seem to occur at night, I’m guessing that the device doesn’t like the Heal process. What would be the best way to test this? I have a spare Z-Stick S2 now, so setting up a test system just for this should be pretty simple. I’m just not sure of the best way to provide useful data to the Z-Wave folks here. I haven’t gotten a response yet, so I’ll keep trying to eliminate what I can.

Does anyone else have any trouble with these sensors? Did I get a dud?

I would get a debug log and check what’s happening using the log viewer.

It may not be the heal that’s causing it, but simply that the heal is the only time in the day that the controller communicates with the device… A look at the log would definitely give you a better idea of what’s happening…
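If the debug log turns out to be large, a small script can narrow it down to one device before you load it into the log viewer. The following is only a rough sketch: it assumes that lines about a device contain the text "NODE <id>:", which may not match your log exactly, and the function names are made up for illustration.

```python
import re
import sys
from typing import Iterable, Iterator, List


def lines_for_node(lines: Iterable[str], node_id: int) -> Iterator[str]:
    """Yield only the lines that mention the given node.

    ASSUMPTION: relevant lines contain the marker "NODE <id>:".
    Adjust the pattern if your log uses a different marker.
    """
    pattern = re.compile(rf"\bNODE {node_id}:")
    for line in lines:
        if pattern.search(line):
            yield line


def filter_log(path: str, node_id: int) -> List[str]:
    """Read a log file and return the lines for one node, newlines stripped."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\n") for line in lines_for_node(handle, node_id)]


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python filter_node.py <logfile> <node id>
    for matched in filter_log(sys.argv[1], int(sys.argv[2])):
        print(matched)
```

Running it once for the suspect sensor and once for a node that behaves well gives two short files that are easier to compare around the time the network went dead.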

Thanks Chris,

I’ll look into getting debug logs. Any idea why communication with one node would cause the network to collapse like that, though?

It’s hard to say - there are a lot of factors that can influence wireless communications. If the ‘one device’ is a significant point in the mesh network then it might be a major issue if it stopped working.

Something to bear in mind is that wireless networks aren’t necessarily symmetrical. So, just because a node can send to another node, it does not mean that there’s a return link. Many things influence this, but consider the case where a device might transmit at slightly higher power so it can be heard by more nodes. If the return link uses lower power then you have an asymmetric link.
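To put rough numbers on the asymmetric case, here is a minimal link-budget sketch. Every value in it (transmit powers, path loss, receiver sensitivity) is a made-up assumption for illustration, not a measured Z-Wave figure. The only rule it uses is that a one-way link works when transmit power minus path loss is at or above the receiver's sensitivity.

```python
def received_dbm(tx_power_dbm: float, path_loss_db: float) -> float:
    """Signal strength at the receiver: transmit power minus path loss."""
    return tx_power_dbm - path_loss_db


def link_works(tx_power_dbm: float, path_loss_db: float,
               rx_sensitivity_dbm: float) -> bool:
    """A one-way link closes if the received signal reaches the sensitivity."""
    return received_dbm(tx_power_dbm, path_loss_db) >= rx_sensitivity_dbm


# All numbers below are hypothetical.
PATH_LOSS_DB = 98.0          # same physical path in both directions
RX_SENSITIVITY_DBM = -95.0   # assume both receivers are equally sensitive

controller_tx_dbm = 4.0      # controller transmits slightly louder
sensor_tx_dbm = 0.0          # sensor replies at lower power

controller_to_sensor = link_works(controller_tx_dbm, PATH_LOSS_DB, RX_SENSITIVITY_DBM)
sensor_to_controller = link_works(sensor_tx_dbm, PATH_LOSS_DB, RX_SENSITIVITY_DBM)

print("controller -> sensor:", controller_to_sensor)   # True  (-94 dBm received)
print("sensor -> controller:", sensor_to_controller)   # False (-98 dBm received)
```

With these assumed numbers the command reaches the device but the reply falls 3 dB short, which is the "commands go out, nothing comes back" pattern in miniature.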

There are many other things that can cause this. The answer in a mesh network is to ensure you have more nodes than you need so there are different links that can be used if one route goes down for any reason…
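The redundancy point can be sketched as a reachability check over one-way links. The topologies and node names below are hypothetical and the search is a plain breadth-first search, not how a Z-Wave controller actually calculates routes.

```python
from collections import deque


def has_route(links, source, target, down=frozenset()):
    """Breadth-first search over one-way links, skipping nodes in `down`."""
    if source in down or target in down:
        return False
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for neighbour in links.get(node, ()):
            if neighbour not in seen and neighbour not in down:
                seen.add(neighbour)
                queue.append(neighbour)
    return False


# Hypothetical topologies: each key lists the nodes it can transmit to.
sparse = {
    "controller": ["dimmer_a"],
    "dimmer_a": ["sensor"],
}
redundant = {
    "controller": ["dimmer_a", "dimmer_b"],
    "dimmer_a": ["sensor"],
    "dimmer_b": ["sensor"],
}

# Take dimmer_a out of service and see whether the sensor is still reachable.
print(has_route(sparse, "controller", "sensor", down={"dimmer_a"}))     # False
print(has_route(redundant, "controller", "sensor", down={"dimmer_a"}))  # True
```

Because the links are stored one way, the same check can be run from the sensor back to the controller to see whether a return route exists.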

I’m not sure if that helps, but these sorts of wireless links certainly come with their challenges and they need to be able to cope with interferers - this is one of the conditions for using these unlicensed bands…

Cheers
Chris