[SOLVED] Z-Wave unreliable in 2.5.0.M4

Hi,

It seems that something has changed in the Z-Wave binding since at least 2.5.0.M3. I noticed that in 2.5.0.M3 devices would randomly go offline without explanation, and would only come back after I manually performed a network sync. Every most devices would go offline, and when I’d wake up in the morning my z-wave light bulbs will be turned on, and openhab would say they’re offline. So I rolled back to 2.5.0.M1, and have had no issue for weeks. Yesterday, I saw that 2.5.0.M4 had been released, so I updated my system to see if this issue has been fixed. It has not. I noticed after a few hours, one (little used) device went offline but I figured it was an isolated instance because everything else appeared to be working. This morning I woke up to find my bulbs turned on, and offline like before. I check openHAB, and a bunch of my devices are marked offline again. Now, I’ve tried to search for this issue before and I couldn’t quite find something similar… I would like to try to resolve this issue, rather than forever being stuck at 2.5.0.M1, so if any one has any idea what could be going on, I’d love some input.

Do you really mean 2.4.0.M4? Or do you mean 2.5.0.M4 ?

2.4.0.M4 was released on 25 Sep 2018.
2.5.0.M4 was released on 20 Oct 2019.

2 Likes

Also, 2.5M4 has been stable for me as was 2.5M3. There were some snapshots that had z-wave binding issues but that has been resolved for about a week or so.

Do you excatly know since which snapshot it has been resolved? Before or after 1731?

Whoops. Yes, I meant 2.5.0… I have edited my original post. Thanks.

1 Like

Not sure. Here is the PR that fixed it.

1 Like

It seems to all be related to the daily network heal (currently configured for 2AM)

2019-10-24 02:37:27.380 [hingStatusInfoChangedEvent] - 'zwave:device:d662a132:node23' changed from ONLINE to OFFLINE (COMMUNICATION_ERROR): Node is not communicating with controller
2019-10-24 02:37:27.387 [hingStatusInfoChangedEvent] - 'zwave:device:d662a132:node3' changed from ONLINE to OFFLINE (COMMUNICATION_ERROR): Node is not communicating with controller
2019-10-24 02:37:27.395 [hingStatusInfoChangedEvent] - 'zwave:device:d662a132:node28' changed from ONLINE to OFFLINE (COMMUNICATION_ERROR): Node is not communicating with controller
2019-10-24 02:37:27.402 [hingStatusInfoChangedEvent] - 'zwave:device:d662a132:node17' changed from ONLINE to OFFLINE (COMMUNICATION_ERROR): Node is not communicating with controller
2019-10-24 02:37:27.409 [hingStatusInfoChangedEvent] - 'zwave:device:d662a132:node16' changed from ONLINE to OFFLINE (COMMUNICATION_ERROR): Node is not communicating with controller
2019-10-24 02:37:27.416 [hingStatusInfoChangedEvent] - 'zwave:device:d662a132:node15' changed from ONLINE to OFFLINE (COMMUNICATION_ERROR): Node is not communicating with controller
2019-10-24 02:37:27.423 [hingStatusInfoChangedEvent] - 'zwave:device:d662a132:node14' changed from ONLINE to OFFLINE (COMMUNICATION_ERROR): Node is not communicating with controller
2019-10-24 02:37:27.434 [hingStatusInfoChangedEvent] - 'zwave:device:d662a132:node9' changed from ONLINE to OFFLINE (COMMUNICATION_ERROR): Node is not communicating with controller
2019-10-24 02:37:27.441 [hingStatusInfoChangedEvent] - 'zwave:device:d662a132:node7' changed from ONLINE to OFFLINE (COMMUNICATION_ERROR): Node is not communicating with controller
2019-10-24 02:37:27.448 [hingStatusInfoChangedEvent] - 'zwave:device:d662a132:node6' changed from ONLINE to OFFLINE (COMMUNICATION_ERROR): Node is not communicating with controller
2019-10-24 02:37:52.520 [hingStatusInfoChangedEvent] - 'zwave:device:d662a132:node15' changed from OFFLINE (COMMUNICATION_ERROR): Node is not communicating with controller to ONLINE
2019-10-24 02:38:00.237 [hingStatusInfoChangedEvent] - 'zwave:device:d662a132:node15' changed from ONLINE to OFFLINE (COMMUNICATION_ERROR): Node is not communicating with controller
2019-10-24 02:38:22.520 [hingStatusInfoChangedEvent] - 'zwave:device:d662a132:node15' changed from OFFLINE (COMMUNICATION_ERROR): Node is not communicating with controller to ONLINE
2019-10-24 02:38:31.795 [hingStatusInfoChangedEvent] - 'zwave:device:d662a132:node15' changed from ONLINE to OFFLINE (COMMUNICATION_ERROR): Node is not communicating with controller
2019-10-24 02:38:52.520 [hingStatusInfoChangedEvent] - 'zwave:device:d662a132:node15' changed from OFFLINE (COMMUNICATION_ERROR): Node is not communicating with controller to ONLINE
2019-10-24 02:39:30.853 [hingStatusInfoChangedEvent] - 'zwave:device:d662a132:node15' changed from ONLINE to OFFLINE (COMMUNICATION_ERROR): Node is not communicating with controller
2019-10-24 02:39:53.039 [hingStatusInfoChangedEvent] - 'zwave:device:d662a132:node15' changed from OFFLINE (COMMUNICATION_ERROR): Node is not communicating with controller to ONLINE
2019-10-24 02:39:59.080 [me.event.ThingUpdatedEvent] - Thing 'zwave:device:d662a132:node9' has been updated.
2019-10-24 02:40:02.111 [me.event.ThingUpdatedEvent] - Thing 'zwave:device:d662a132:node9' has been updated.
2019-10-24 02:40:02.114 [hingStatusInfoChangedEvent] - 'zwave:device:d662a132:node9' changed from OFFLINE (COMMUNICATION_ERROR): Node is not communicating with controller to ONLINE
2019-10-24 02:40:05.049 [hingStatusInfoChangedEvent] - 'zwave:device:d662a132:node14' changed from OFFLINE (COMMUNICATION_ERROR): Node is not communicating with controller to ONLINE

Any phantom or ghost nodes in the network? Devices trying to route through non-existent nodes can cause many random network issues. The bad nodes need to be removed from the Z-Wave controller to resolve this.

I have one node doing this now for me but I think my issue is marginal W-Wave signal at that location. Mains powered nodes or repeaters can help resolve issues like that,

It was fixed on 17 Oct 2019. Snapshot 1731 was released on 20 Oct 2019.

Having a similar (or the same?) problem, so it’s not fixed.

Please open an issue on GitHub then. That is how the PR I linked to finally got resolved. Chris is extra busy right now with Zigbee and posting on GitHub lets other developers try to help.

1 Like

I have the SAME issue!

1 Like

Nope.

All of the affected devices are mains powered.

Mine are battery powered…

See also this topic.

I’ll do further investigations (when I’m back from work :wink: ) . After that I’ll open an issue on GitHub.

1 Like

The one device giving me headaches at healing is mains powered.

Correction, 1 of them is battery powered.

1 Like

I have 19 zwave devices with openHAB 2.5.0~S1733-1 (Build #1733), no issues at all till now. May I ask you to explain a bit more?

Andrea

You got me thinking about this, so I investigated further, and found 1 device that I couldn’t get to respond properly. I tried to mark it as failed and remove it, and I kept getting errors, I eventually had to restart the ZWave binding to get it to remove the old node:

2019-10-24 08:13:42.645 [INFO ] [g.discovery.internal.PersistentInbox] - Added new thing 'zwave:device:d662a132:node8' to inbox.
2019-10-24 08:13:59.308 [ERROR] [essage.ReplaceFailedNodeMessageClass] - NODE 8: Replace failed node not placed on stack due to error 0x18.
2019-10-24 08:13:59.572 [ERROR] [essage.ReplaceFailedNodeMessageClass] - NODE 8: Replace failed node not placed on stack due to error 0x18.
2019-10-24 08:13:59.834 [ERROR] [essage.ReplaceFailedNodeMessageClass] - NODE 8: Replace failed node not placed on stack due to error 0x18.
2019-10-24 08:15:30.257 [INFO ] [g.discovery.internal.PersistentInbox] - Added new thing 'zwave:device:d662a132:node8' to inbox.
2019-10-24 08:17:06.254 [ERROR] [essage.ReplaceFailedNodeMessageClass] - NODE 8: Replace failed node returned with response 0x3.
2019-10-24 08:17:43.762 [ERROR] [message.RemoveFailedNodeMessageClass] - NODE 8: Remove failed node failed as Controller Busy!
2019-10-24 08:17:44.021 [ERROR] [message.RemoveFailedNodeMessageClass] - NODE 8: Remove failed node failed as Controller Busy!
2019-10-24 08:17:44.283 [ERROR] [message.RemoveFailedNodeMessageClass] - NODE 8: Remove failed node failed as Controller Busy!
2019-10-24 08:27:07.726 [INFO ] [g.discovery.internal.PersistentInbox] - Added new thing 'zwave:device:d662a132:node29' to inbox.
2019-10-24 08:27:20.161 [INFO ] [ommandclass.ZWaveVersionCommandClass] - NODE 29: Command Class COMMAND_CLASS_BASIC has version 0!
2019-10-24 08:30:11.240 [ERROR] [essage.ReplaceFailedNodeMessageClass] - NODE 8: Replace failed node returned with response 0x3.
2019-10-24 08:32:31.994 [ERROR] [essage.ReplaceFailedNodeMessageClass] - NODE 8: Replace failed node failed as Controller Busy!
2019-10-24 08:32:32.252 [ERROR] [essage.ReplaceFailedNodeMessageClass] - NODE 8: Replace failed node failed as Controller Busy!
2019-10-24 08:32:32.514 [ERROR] [essage.ReplaceFailedNodeMessageClass] - NODE 8: Replace failed node failed as Controller Busy!
2019-10-24 08:33:03.360 [ERROR] [essage.ReplaceFailedNodeMessageClass] - NODE 8: Replace failed node failed as Controller Busy!
2019-10-24 08:33:03.619 [ERROR] [essage.ReplaceFailedNodeMessageClass] - NODE 8: Replace failed node failed as Controller Busy!
2019-10-24 08:33:03.880 [ERROR] [essage.ReplaceFailedNodeMessageClass] - NODE 8: Replace failed node failed as Controller Busy!
2019-10-24 08:33:13.392 [ERROR] [essage.ReplaceFailedNodeMessageClass] - NODE 8: Replace failed node failed as Controller Busy!
2019-10-24 08:33:13.644 [ERROR] [essage.ReplaceFailedNodeMessageClass] - NODE 8: Replace failed node failed as Controller Busy!
2019-10-24 08:33:13.903 [ERROR] [essage.ReplaceFailedNodeMessageClass] - NODE 8: Replace failed node failed as Controller Busy!
2019-10-24 08:33:14.161 [ERROR] [message.RemoveFailedNodeMessageClass] - NODE 8: Remove failed node failed as Controller Busy!
2019-10-24 08:33:14.420 [ERROR] [message.RemoveFailedNodeMessageClass] - NODE 8: Remove failed node failed as Controller Busy!
2019-10-24 08:33:14.683 [ERROR] [message.RemoveFailedNodeMessageClass] - NODE 8: Remove failed node failed as Controller Busy!
2019-10-24 08:33:23.950 [ERROR] [message.RemoveFailedNodeMessageClass] - NODE 8: Remove failed node failed as Controller Busy!
2019-10-24 08:33:24.467 [ERROR] [message.RemoveFailedNodeMessageClass] - NODE 8: Remove failed node failed as Controller Busy!
2019-10-24 08:33:24.722 [ERROR] [message.RemoveFailedNodeMessageClass] - NODE 8: Remove failed node failed as Controller Busy!
2019-10-24 08:37:28.591 [ERROR] [essage.ReplaceFailedNodeMessageClass] - NODE 8: Replace failed node failed as Controller Busy!
2019-10-24 08:37:28.851 [ERROR] [essage.ReplaceFailedNodeMessageClass] - NODE 8: Replace failed node failed as Controller Busy!
2019-10-24 08:37:29.108 [ERROR] [essage.ReplaceFailedNodeMessageClass] - NODE 8: Replace failed node failed as Controller Busy!

I have also set my network to heal at 9 am to see if this resolves the issue.

sometimes it’s useful also to remove the device from the USB key. What controller are you using btw?

Another thing to do:

  • stop openhab

  • remove the file with the zwave node in the name in the zwave folder (/var/lib/openhab2/zwave) -> something like network_xxxxxxxxxx__node_8.xml

  • start openhab of course

1 Like

The Z-Wave network is stored on the controller. Deleting the xml file achieves nothing useful. I have put my USB controller on a Windows box & used Zensys Tools to delete phantom nodes.

I now suspect I have dome phantom nodes and will likely try OpenZWave Control Panel on Linux, with OpenHAB shut down.