Very slow ZWave network

In that case I suggest attaching a longer debug log.

@stefan.oh You may have missed the one I posted two days ago in response to your request. Does it not contain enough?
I can redo another one with the latest situation (two nodes were taken out of the network)

The log from 2 days ago would only be a valid starting point if no changes were made to the setup. But I was under the impression that there were changes. That was the reason for me to ask for a more recent log.

I'll turn on the debugging and will post a log once it is long enough.

@stefan.oh Here is as long a log as the 1MB upload limit will permit.
oh.log (1004.9 KB)

Splendid :+1: The log covers approx. 9 hours, which should give us an idea of what is happening.
I've uploaded the log into the Z-Wave Log Viewer that Chris provides on his website.
Looking at the Filter option I count only 27 nodes that take part in the communication. Your network map shows way more than these 27 nodes. Also in the network map I see #76 as the node with the highest number, but in the filter view of the log viewer there is a node #255. Filtering for this node does not give any hints, maybe this is just data garbage that cannot be associated with a single node.
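To cross-check that count without the log viewer, a rough sketch like the one below could list which node IDs appear in the file at all. It assumes the `NODE <id>:` tag that the binding writes in its debug lines; the function names and the expected-node list are made up for illustration and are not part of openHAB.

```python
import re

# Matches the "NODE <id>:" tag seen in the binding's debug lines (assumed format).
NODE_TAG = re.compile(r"\bNODE (\d+):")


def nodes_in_log(lines):
    """Return the set of node IDs that appear in the given log lines."""
    seen = set()
    for line in lines:
        match = NODE_TAG.search(line)
        if match:
            seen.add(int(match.group(1)))
    return seen


def compare_with_map(lines, expected_nodes):
    """Return (silent, unknown): IDs on the map but absent from the log,
    and IDs in the log that are not on the map."""
    seen = nodes_in_log(lines)
    expected = set(expected_nodes)
    return sorted(expected - seen), sorted(seen - expected)


if __name__ == "__main__":
    sample = [
        "2020-12-27 08:04:32.099 [DEBUG] [ng.zwave.internal.protocol.ZWaveNode] - NODE 23: SECURITY not supported",
        "2020-12-27 08:04:32.102 [DEBUG] [ng.zwave.internal.protocol.ZWaveNode] - NODE 29: SECURITY not supported",
    ]
    # Hypothetical list of IDs; replace it with the IDs from your own network map.
    silent, unknown = compare_with_map(sample, [1, 23, 28])
    print("on the map but silent:", silent)
    print("in the log but not on the map:", unknown)
```

Feeding it the real log (for example `open("oh.log", errors="replace")`) together with the node IDs from the network map would show both kinds of mismatch in one go.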

Filtering randomly for one node, #61, shows some traffic. Temperature is reported, but it seems that you took care to avoid traffic: only changes of at least 0.5 degF, or multiples of that, are reported. That's good; avoiding unnecessary traffic is one of the top priorities in a network with limited bandwidth and many communicating participants.
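For anyone wondering why such a reporting threshold reduces traffic, here is a minimal sketch of the idea. It is not the device firmware, only an illustration; the function names and the 0.5 default are assumptions chosen to match the numbers above.

```python
def should_report(last_reported, current, threshold=0.5):
    """True when the reading moved at least `threshold` away from the last reported value."""
    return last_reported is None or abs(current - last_reported) >= threshold


def filter_reports(readings, threshold=0.5):
    """Return only the readings a threshold-based sensor would actually send."""
    reported = []
    last = None
    for value in readings:
        if should_report(last, value, threshold):
            reported.append(value)
            last = value  # the next comparison is against what was last sent
    return reported


if __name__ == "__main__":
    # Five raw readings, but only three of them would go over the air.
    print(filter_reports([70.0, 70.25, 70.5, 70.75, 71.5]))
```

The smaller the threshold, the more reports each sensor sends, which matters once dozens of nodes share the same radio channel.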

And to the next random node, #28. Only ONE message is shown. Either it is a node that has not much to report or it has no chance to do so. I cannot tell.

Node #8 is similar: it is seen in the logs at only 2 points in time.

On to node #75. Now it gets interesting. There are several messages "DeleteSUCReturnRoute". Is there more than one controller in your setup? What are the settings for your controller(s)?

Node #48: This node communicates at only one point in time. But the log shows that it took over 20 seconds to receive an acknowledgement from the device. Only 3 other nodes were communicating at 16:41:xx. That does not look like congestion of the network.

Node #55: Between 7:54 and 9:39 the node does not report anything; there are messages containing "DeleteSUC ReturnRoute". But at 16:22 it starts reporting kWh, at first every two minutes, then more erratically.
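The per-node observations above (one message only, two points in time, a long silence) can also be pulled from the raw file. The sketch below is only a rough helper, assuming every relevant line starts with a `YYYY-MM-DD HH:MM:SS.mmm` timestamp and carries a `NODE <id>:` tag; the function names are invented for this example.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed line shape: "YYYY-MM-DD HH:MM:SS.mmm [LEVEL] [logger] - NODE <id>: text"
LINE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) .*?\bNODE (\d+):")


def node_activity(lines):
    """Map node ID -> sorted list of timestamps at which the node appears."""
    activity = defaultdict(list)
    for line in lines:
        match = LINE.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
            activity[int(match.group(2))].append(stamp)
    for stamps in activity.values():
        stamps.sort()
    return dict(activity)


def longest_gap(stamps):
    """Longest silence between consecutive entries, or None with fewer than two entries."""
    if len(stamps) < 2:
        return None
    return max(later - earlier for earlier, later in zip(stamps, stamps[1:]))


if __name__ == "__main__":
    sample = [
        "2020-12-27 08:04:32.099 [DEBUG] [ng.zwave.internal.protocol.ZWaveNode] - NODE 23: SECURITY not supported",
        "2020-12-27 08:04:32.102 [DEBUG] [ng.zwave.internal.protocol.ZWaveNode] - NODE 29: SECURITY not supported",
    ]
    for node, stamps in sorted(node_activity(sample).items()):
        print(node, len(stamps), stamps[0], stamps[-1], longest_gap(stamps))
```

A node with a single entry or a gap of several hours stands out immediately in that listing, which makes it easier to decide which nodes deserve a closer look in the log viewer.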

I cannot pinpoint anything that is obvious (to me). But I'm curious why a lot of nodes shown in your network view are not seen in the logfile.
Looking at the file without the log viewer at Chris's website I see messages like

2020-12-27 08:04:32.099 [DEBUG] [ng.zwave.internal.protocol.ZWaveNode] - NODE 23: SECURITY not supported
2020-12-27 08:04:32.099 [DEBUG] [ng.zwave.internal.protocol.ZWaveNode] - NODE 23: Command Class COMMAND_CLASS_SWITCH_MULTILEVEL is NOT required to be secured
2020-12-27 08:04:32.102 [DEBUG] [ng.zwave.internal.protocol.ZWaveNode] - NODE 29: SECURITY not supported

Are your devices included with security enabled? That puts a strain on all the processing, in the devices as well as on the controller. Maybe that is a hint.
Did you check the load on the system where OH is running?


This morning, I discovered one thing that may have caused the network to relapse.
I had unplugged a 'smart' power strip. It's node #55, which you did spot. When I moved things around before the trouble started, a printer was unplugged from that strip and another one plugged in. So I thought it could have been a contributing factor in the network problem. After a heal everything went well.
Last night, I discovered that my spouse had plugged that strip back in... So that seems an indication that it might be a trigger of the trouble.
I unplugged it again and started a heal 43 minutes ago. The network is not back to normal yet. The heal takes a long time.

node75 is not something I see configured nor is it in the inbox.
node8 is a relay switch.
node28 is a dimmer and is physically the closest to the NUC/Zwave controller stick
node23 and node29 are dimmers. I have no idea why they show up with security.
I do have two door locks that have ZWave but I never paired them, so they should not appear in the log.
I have a memory that node255 represents the controller itself. I got this impression from some tinkering a long time ago and could be wrong.

As a side question: issues of slowness have been dogging me throughout my use of OH/Zwave for about 5 years now. I've added Zigbee devices but have far fewer of those than ZWave (4 lightbulbs and 2 switches). Is Zigbee also prone to issues and I am just lucky, or is it a more reliable network/protocol?

The system running OH has very low load. Its only use is to run OH.

@chris is our expert developer for both those bindings so I will let him comment.

The controller is usually #1 as far as I have seen it in the past. In your network map you will notice it is painted a bit bigger than the other nodes and has an orange ring.

I am a bit confused about the network map you posted on the one side and other information on the other side: the log does not show traffic for a lot of nodes printed in the network map. But the log shows entries for a node #75 that is NOT shown in the picture you posted. Is it one and the same installation we are looking at?

Sometimes it showed up as 255, for whatever reason...

Another discrepancy to the picture with the network map. There is no node #255 but node #1 is there. Or is the controller #1 AND #255? Maybe :hushed:

2 controllers? :confounded:

Yes, if this rare situation happens it is both... one controller, two node IDs.
But I am not able to explain that :grinning:

I think that is the situation I am having. The controller is #1 AND #255. Maybe it's related to my problem.
I am now considering buying a new ZWave stick and transferring the settings from the one I have to the new one. I've seen an inexpensive SLUSB001A at DigiKey. Maybe it's worth a try, though I have not seen much said about that stick.

That is a newer 700 Series stick (Z-Wave Plus 2). I am not sure it would work with the current binding, which was designed for the 500 Series (Z-Wave Plus) sticks.


I don't think so. In the past when this happened it was not the root cause of problems.
Why not go with @stefan.oh's analysis from above:

Get rid of all the nodes which are not present anymore. In addition to that, do a soft reset of the stick (shut down the server, remove the stick, wait a couple of minutes, reinsert the stick and fire up openHAB).


Correct, confirmed from Chris:


I don't think that is the issue. There are gaps in the node numbers because I had to either replace or re-pair devices. But I do have all the nodes shown in the network and OH shows them all online. I've removed one device I thought might be the cause. For a while, after a heal, that seemed to have worked.
It was plugged back in by my spouse; I've redone the procedure and it is still not working.

What I really do not understand, fundamentally, is that the rate at which the Zwave stick's light changes color goes down dramatically when the network is in trouble. Doesn't that mean that something unusual is going on on the stick's side? The log slows down dramatically as well. Is there a way to see what the stick is doing when it is flashing slowly?

Many times the nodes do not get completely excluded from the network, leaving zombie nodes on the controller which can mess up the network routing and cause performance issues.

How do you remove them? There is node75, and I do not know at all why it's there. It does not appear in the list of things nor in the inbox.

Edit: node 75 has now appeared in the inbox. Does hitting remove there remove it from the stick's setup?