Z-wave device initialization and repeated Timeout messages at startup

I started having an email conversation with @chris on this, but it is good information for everyone, so we agreed to bring the discussion here.

Summary of what I am seeing: slow initialization of z-wave devices with many Timeouts in the logs, at binding start up.

The messages look like this:

Timeout while sending
Got an error while sending data. Resending message.
Too many retries. Discarding message:

before finally, minutes later:

Initialising Thing Node…

After the Initialising Thing Node… message, everything seems to work, but this is not proper as the network is solid, and does not time out after this message.

Sometimes i got a lot of 255 timeouts. Rebooting the hole system works perfectly for me. As you mentioned this is only a startup problem. If everything up and running, than its and stays stable.

The logs are your friend - you should check them to see what it means… I think we’ve discussed this in the past, and these timeouts are associated with requesting a NIF - the request goes to the controller, but the timeout is ultimately because the device doesn’t respond…

As @xsnrg said, we were discussing this offline, but I figured others would benefit from the discussion - or my analyses of the result.

Firstly, debug logs are imperative to work out what’s up. In this case, the errors are caused by timeouts to device requests or config parameters - ultimately caused by database errors.

So, to analyse this issue, I would strongly recommend running your debug logs into the log view. This should help you visualise what is happening in a reasonably easy to digest way. From Jims log, we see things like the following -:

This shows timeouts on node 14. We can see that openHAB is sending a CONFIGURATION_GET command - the “Send OK” message means the controller accepted the message, and the “ACK’d by device” message means the device received the command. However, we time out, so either we just happened to miss the response, or, the device didn’t send one. We can assume the latter given this happens multiple times.

This is an annoying feature of ZWave - when a device doesn’t process a request, it doesn’t respond at all - personally I’d prefer if it sent a NAK so we know! Anyway, we can now assume (hopefully correctly!) that node 14 doesn’t have a parameter 2, and we should update this in the database to solve this problem.

We also see the same thing for parameter 100 of node 6 further down, and the same for parameter 255 - in this case, I believe this is a ZW100, and these parameters should be set to write only as they are reset commands.

So, I would urge people to use the log viewer if you are seeing issues like this - it should help to debug any problems in the binding, and the database.

2 Likes

I’m having precisely this situation with my Aeon Labs DSC06 switch. Is there a way to isolate the zwave logs into their own file, or does the viewer just ignore everything else?

@chris can answer more definitively, but it appears the log reader only reports on what it knows how to deal with.

As for the dsc06, I made adjustments to the database entry for the device already. We just need to wait for the changes to make it to the binding now.

The log reader will read through the log, and pull out and process specific messages. So, it doesn’t matter if there are other, non-ZWave messages in the log - it will ignore them.

The reader actually ignores most of the ZWave stuff in the log as well, and it’s actually processing the data received from the stick rather than the ‘human readable’ debug messages most of the time.