[SOLVED] Node heal fails during UPDATE_NEIGHBORS

This is partly true, but that doesn’t make the neighbour tables inaccurate. The neighbour tables show possible routes, and it’s unlikely that a route will be established through a neighbour that the node can’t hear.

There are two methods for routing in a Z-Wave network (apologies if I’m duplicating something said earlier): either the controller sets preferred routes, or the device tries to establish a route itself. If the controller defines the routes, it uses the information in the neighbour tables. This is why it is useful to perform the heal periodically: the neighbour information can change due to all sorts of “random” RF “magic”, even if you don’t think anything has changed in the house. It is partly why the neighbour table is built with a 6dB margin - to try to ensure that the link is maintained if the received power drops by 75% - and it is why the binding performs a neighbour table update as part of the heal.
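For anyone wondering where the 75% figure comes from, here is a quick sketch of the decibel arithmetic. The function names are made up for illustration and are not part of the binding; the only input taken from the post above is the 6dB margin.

```python
def power_fraction_remaining(margin_db: float) -> float:
    """Fraction of the original power left after a drop of margin_db decibels."""
    # A decibel value is 10 * log10(power ratio), so invert that relation.
    return 10 ** (-margin_db / 10)


def power_drop_percent(margin_db: float) -> float:
    """Percentage of power lost for a drop of margin_db decibels."""
    return (1 - power_fraction_remaining(margin_db)) * 100


if __name__ == "__main__":
    # A 6 dB margin corresponds to roughly a quarter of the power remaining.
    print(f"{power_drop_percent(6.0):.1f}% drop")
```

So a link that was recorded with 6dB to spare should, in principle, survive the signal falling to about a quarter of its original power.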

If the node establishes its own route, it uses a different mechanism (i.e. explorer frames) to try to find one. However, even in this case it is highly likely that the neighbour table reflects the usable routes, and it’s unlikely that a route will be established via a neighbour to which there is no link.
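To illustrate the point about routes following the neighbour table, here is a toy sketch that treats the table as a graph and searches it for a path. This is only an illustration under assumptions: the node IDs and table contents are invented, `find_route` is a hypothetical helper, and a plain breadth-first search is not how the controller or explorer frames actually choose routes.

```python
from collections import deque


def find_route(neighbours, source, target):
    """Breadth-first search over a neighbour table.

    neighbours maps a node ID to the list of node IDs it can hear.
    Returns the list of node IDs along one shortest path, or None
    if no chain of neighbour links connects source to target.
    """
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for nxt in neighbours.get(node, ()):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


# Invented example table: node 1 is the controller, node 5 hears nobody.
table = {1: [2, 3], 2: [1, 4], 3: [1], 4: [2], 5: []}

if __name__ == "__main__":
    print(find_route(table, 1, 4))  # route via node 2
    print(find_route(table, 1, 5))  # no link, so no route
```

In this toy table, node 4 is only reachable through node 2, and node 5 has no neighbours, so no route to it can be found.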

Just to be clear on what this change addresses… IIRC, the binding has a 75 second timer on the UPDATE_NEIGHBORS transaction. Am I remembering this correctly? And, if so, does your change ensure that this timeout is used instead of the “normal” 5 second timer used on most other transactions?

Yep - that’s correct. The timer was only being applied to certain types of responses, so it wasn’t being used in this case.
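As a rough sketch of the behaviour being described (not the binding’s actual code, which is written in Java): the timeout is chosen per transaction type, with a long value for UPDATE_NEIGHBORS and a short default for everything else. The constant and function names here are hypothetical; only the 75 second and 5 second values come from the posts above.

```python
# Hypothetical names; only the 75 s and 5 s values come from the thread.
DEFAULT_TIMEOUT_S = 5.0
LONG_TIMEOUTS_S = {
    "UPDATE_NEIGHBORS": 75.0,  # neighbour updates can take much longer
}


def timeout_for(transaction_type: str) -> float:
    """Return how long to wait for a response to the given transaction type."""
    return LONG_TIMEOUTS_S.get(transaction_type, DEFAULT_TIMEOUT_S)


if __name__ == "__main__":
    print(timeout_for("UPDATE_NEIGHBORS"))
    print(timeout_for("SOME_OTHER_TRANSACTION"))
```

The idea is that the long timeout is looked up from the transaction type alone, so it applies regardless of which kind of response the transaction is waiting for.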

Woot! I can confirm your fix instantly fixed the single node heals for my 2 problematic nodes!

This is great news, been fighting this for quite some time! Thanks a lot Chris!

I guess this fix and this issue on github are mentioned here. Right?

Sorry - I’m not sure what you mean by “mentioned here”? Can you ask the question a different way please? (sorry).

Is this “update” this fix?

Yes.

I applied this and after it seemed to work for some hours, tonight’s heal seems to have killed my z-wave network.
Lots of timeouts/unreachable nodes, lots of “CANCELLED” messages, trying a network heal using ZWay software this time but many timeouts.
I’m still busy trying to recover (and don’t know yet what happened)
Be careful for the time being.

The change was quite minimal and it’s hard to understand how it could have made things time out more than it did. Please can you provide logs so I can see what happened - otherwise there’s no way to really comment or understand the issue you had (or fix it).

I’m desperately trying to get the network to work, but still many timeouts. Seems many nodes do not reach the controller any more. Must get it to work first but am out of ideas ATM.

Ok, so you don’t have logs for the issue itself?

In general, the heal is exactly the same as it always was - there’s not been any change to this for a long time. The only thing that changed is the amount of time the binding waits for a response - what it does is the same so it seems a bit hard to understand why it would change things so much, but without logs, it’s impossible to say what happened - sorry.

Just got my home working again. I could extract logs from some hours ago, but I doubt they’ll be of much use without me explaining what I did back then, and that I don’t fully remember.
Also, I now believe my issues were due to the heal itself and not due to your fix. Or, more precisely, they were related to the fact that I enabled heals after I had been running for a long time without them. There had been a number of HW changes, and ghost nodes were still included with the controller. Now that I have tidied that up it’s working (I hope so at least).
I see nothing wrong that could be attributed to your change.
So apologies for shouting ‘stop’, I think your change is fine to use (I now run the code too, but heals will remain disabled for the time being as there are still potential issues with heals, as @mhilbush pointed out).

Added bonus: this fix fixed the network-wide heal for me as well, and consequently it also fixed my ZWave Network view. Great stuff!

I make no further comment about that diagram. It is very pretty. Enjoy it.

@mhilbush
@5iver

Could the above-mentioned fix also solve your issues?

#1195
#1178

Did you try it out?

Highly unlikely, as my issue didn’t involve the NeighborUpdate transaction.

Is this fix included in the 2.5.2 release? It’s not listed in the changelog.

Yes, it was merged a few weeks back so is in the latest versions.
