[SOLVED] Z-Wave unreliable in 2.5.0.M4

I do, with 90% certainty, because I was having this issue on M3 but didn’t have time to analyze it, so I rolled back to M1 and ran for weeks without issue. Then, after updating to M4, it immediately started happening again. @5iver appears to have had the same experience, and there is an active bug open, #1195, so Chris is aware of it.

I wish I had enabled debug logging beforehand so I could contribute some useful logs to this bug too. But since removing the faulty node seems to have fixed it, I have nothing useful to contribute.

File an issue on GitHub so it can be dealt with by a developer. Chris is busy working on Zigbee but there are other devs there too.

There is already an open issue for this.

No, Robert, you are contributing, thanks. This is how the system works: bringing up the issue and having folks look into it helps nail down the problems. Alex pretty much single-handedly figured out the problem with the REST documentation, which led to a fix. Mark is actively working with Chris on the Z-Wave stuff.
It does sound like this may be a recent regression, since we now know you had a bad node and rolling back to M1 cured the issue (even though the bad node had not yet been discovered).

@mhilbush
@Bruce_Osborne

Ok. I got it healed manually (one node that had not been healed since the last nightly heal), and HABmin now shows LAST HEAL TIME updated to the current date/time.

But you won’t believe what I did.

Set “Heal device” in HABmin and triple click (= inclusion/exclusion) the device many times.

Before that, I had tried a single click (= wake up) many times with no success.

Whether it’s really healed, I’ll be able to tell you within the next few days…

@mhilbush
@Bruce_Osborne
@ariela
@Andrew_Rowe
@rrgeorge
@5iver

Here are my findings:

NETWORK HEAL

[Screenshots for node1, node2, node3, node4, node5, node6 and node7; a higher-resolution image of node1 is posted further down]

As you can see all FLiRS (node3, node4, node5, node6, node7) were healed! OK!

@mhilbush
@Bruce_Osborne
@ariela
@Andrew_Rowe
@rrgeorge
@5iver

Now the interesting part: all “non listening nodes” (node8, node9, node10)

[Screenshots for node8, node9 and node10; a higher-resolution image of node10 is posted further down]

As you can see, they were not healed automatically.

I had to wake them up manually.

node10: There you can see, it first woke up on its own, healing FAILED.

Later I manually woke it up, then it was healed.

Here is the complete DEBUG.log : DEBUG.log (654.2 KB)

So you can load and filter it on your own, here:

https://www.cd-jackson.com/index.php/openhab/zwave-log-viewer
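If you prefer to filter the log locally instead of using the viewer, a minimal sketch like the following could pull out the lines for a single node. It assumes that the binding tags node-specific lines with `NODE <id>:`; the function names and the file path are only illustrative.

```python
import re
from typing import Iterable, Iterator, List


def lines_for_node(lines: Iterable[str], node_id: int) -> Iterator[str]:
    """Yield only the log lines that mention the given node.

    Assumption: node-specific lines carry a "NODE <id>:" tag.
    """
    # The trailing colon keeps "NODE 1:" from also matching "NODE 10:".
    tag = re.compile(rf"\bNODE {node_id}:")
    for line in lines:
        if tag.search(line):
            yield line


def load_node_lines(path: str, node_id: int) -> List[str]:
    """Read a log file (hypothetical path) and return the lines for one node."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return list(lines_for_node(handle, node_id))
```

For example, `load_node_lines("DEBUG.log", 10)` would return everything logged for node10, which makes it easier to compare the failed automatic heal with the manual one.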

PS: ALL images were readable (big enough) on my PC, but they were compressed during upload, SORRY!

Explanation:
Healing started at 21:20. The controller tried 5 times with 3 requests, and they all FAILED. The 6th try was then successful. Node1, node2, node3, node4, node5, node6 and node7 were healed by 21:23.

Node8, node9 and node10 were not healed within the next 30 minutes. Then (21:50) node10 woke up on its own (= wake-up interval), but healing FAILED and it went OFFLINE.

After this event, I woke up the three remaining nodes (8, 9, 10) manually and they were healed.

node9: start: 21:57 ; end: 21:58
node8: start: 22:05 ; end: 22:05

node10: I woke it up twice, at 22:02 and 22:11 (in the meantime, a lot of motion/tamper events were detected, because I was in front of the sensor or moved it slightly to press the button).
node10: start: 22:02; end: 22:11
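To put the timeline into numbers, here is a small sketch that computes how long each heal took. The times are copied from the post above; the names `HEAL_WINDOWS`, `minutes_between` and `heal_durations` are made up for this illustration, and the timestamps only have minute resolution.

```python
from datetime import datetime

# (start, end) times copied from the post above, all on the same evening.
HEAL_WINDOWS = {
    "node1-node7": ("21:20", "21:23"),
    "node9": ("21:57", "21:58"),
    "node8": ("22:05", "22:05"),
    "node10": ("22:02", "22:11"),
}


def minutes_between(start: str, end: str) -> int:
    """Return the whole minutes between two HH:MM times on the same day."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


def heal_durations(windows):
    """Map each node (or node group) to its heal duration in minutes."""
    return {name: minutes_between(s, e) for name, (s, e) in windows.items()}


if __name__ == "__main__":
    for name, minutes in heal_durations(HEAL_WINDOWS).items():
        print(f"{name}: {minutes} min")
```

A result of 0 minutes for node8 just means the heal started and ended within the same minute.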

Apologies I’ve not had the chance to look at this as I’m completely overloaded with other work for the next couple of days and am then away from home until mid November so just don’t have time until then.

[Higher-resolution screenshots for node1 and node10]

If there is something you want Chris or the other devs to address, open an issue on GitHub.

There is already an issue reported on GitHub. I just wanted to show that I have the same problem and that manual healing (by waking up the node) works. Maybe someone can spot something in the log file that leads to the solution.

Guys,

may I ask you to help me understand the issue better? In my setup (latest snapshot, 2.5.0~S1733-1 (Build #1733)), I see all my Z-Wave devices successfully healed at 2.30 AM, as expected, and everything seems to be working correctly.

What is the symptom I need to look at?

thanks again for your patience
Andrea

@ariela

Do you have “battery powered” and “non listening” nodes? You can check in HABmin: select a node → properties (down at the bottom), where you can see: last heal, routing, beaming, frequently listening, listening, last wake up, neighbors. If your nodes are neither “frequently listening” nor “listening”, then you could also have this issue.
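The check described above can be sketched roughly like this. The `NodeFlags` class and its field names are hypothetical and just mirror the “listening” and “frequently listening” properties shown in HABmin; this is not the binding’s own data model.

```python
from dataclasses import dataclass


@dataclass
class NodeFlags:
    """Hypothetical container for two properties shown in HABmin."""
    listening: bool
    frequently_listening: bool


def node_kind(flags: NodeFlags) -> str:
    """Classify a node by how it listens for incoming messages."""
    if flags.listening:
        return "listening"
    if flags.frequently_listening:
        return "frequently listening (FLiRS)"
    # Neither flag set: the node sleeps until its wake-up interval.
    return "non-listening"


def may_need_manual_wakeup(flags: NodeFlags) -> bool:
    """True for the nodes that were not healed automatically in this thread."""
    return node_kind(flags) == "non-listening"
```

In my case, node3 to node7 would come out as FLiRS, while node8, node9 and node10 would be flagged as possibly needing a manual wake-up.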

This doesn’t look like the issue I reported in #1195, with additional documentation in #1174.

The issue I reported in #1195 has a very specific log file signature as shown in this log file snippet. I don’t see that signature in the log file you posted.

@mhilbush

Ok, thanks. 🙂

It seems to be different, but the result is the same. The nodes weren’t healed.

I never had this before; it only started after the update from 1486 (02 Jan 2019) to 1731 (20 October 2019).

I have 5 Danfoss thermostat valves … battery powered, not listening and not even frequently listening. But those valves are working correctly; I mean, there are no errors/issues in my logs. Last heal is always as expected.

What is the issue?

Then you don’t have any.

Did you update from xxxx to 1733, or did you install from scratch?

Last updates:

  • from 1657 to 1712
  • from 1712 to 1731
  • from 1731 to 1733