ZWave binding updates

mhilbush · January 28, 2019, 12:52pm

No, there’s more to it than that.

Once the binding identifies the device, it queries the device for its command class versions, configuration parameters, associations, etc. This can sometimes involve quite a bit of back and forth communication between the binding and the device. For some battery devices, it’s not uncommon to have to wake them up multiple times before initialization completes. If initialization is not complete, a binding restart will cause this process to start over (I saw multiple binding restarts in your log). You can tell if initialization compete for a node by the existence of a file in userdata/zwave called network_XXXXXXXX__nodeNN.xml, where NN is the node number.

epicurean · January 28, 2019, 12:55pm

I saw a few of those files **_nodeNN.xml before. For a clean start , should I remove those files and then delete and try to add the z wave node 2 and 4 again?

When I try to set the association groups for node 2 and 4, I keep getting this error:

2019-01-28 20:52:24.578 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 2: Configuration update received

2019-01-28 20:52:24.579 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 2: Configuration update set group_1 to [controller] (ArrayList)

2019-01-28 20:52:24.579 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 2: Unknown association group 1

I suppose this has to do with what you just explained? that the z stick controller did not have enough information?

mhilbush · January 28, 2019, 1:01pm

No. Once created, those files should not be removed (unless there’s a need to reinitialize the node). The existence of those files reduce the amount of “work” the binding has to do at startup time (i.e. the binding reads the info from the file rather than having to query the device).

Probably.

I would focus on trying to get nodes 2 and 4 initialized.

If you’re on a linux platform, if you tail -f openhab.log, you should see a flurry of zwave log messages when you wake up a device. That would be an indication that the device is reporting to the binding that it’s awake.

epicurean · January 28, 2019, 1:11pm

Noted on the NOT removing the xml files.
But delete and include the things again? is the original order of the node numbering important?

both node 2 and 4 are now showing : NODE INITIALISING: VERSION in habadmin

mhilbush · January 28, 2019, 1:29pm

No. They appear to be included just fine.

There are several initialization stages after the VERSION stage. Are you seeing the WAKE_UPs in the log file?

epicurean · January 28, 2019, 1:47pm

Nope, no WAKE_UPs yet in the tail -f/logs/openhab.log

all still at Node Initialising:VERSION

Update:
Now node 4 is at Node Initialising:STATIC_VALUES. no change in node 2 yet

Update 2
Node 4 is at Node Initialising: ASSOCATIONS, no change in node 2.

Update 3
I finally see Node Initialising :SET_WAKEUP in node 4. Nothing in node 2. Wait…its back to ASSOCIATION again in Node 4, and then now at GET_CONFIGURATION

shorty707 · January 29, 2019, 8:39am

I see a extremely different performance on my zwave devices.
a lot of times it is just instantly reacting to commands … but sometimes the same device takes many seconds to respond. (I have a dense mesh network with ~80 nodes)

the log below is a qubino dimmer actor which I just turned off and on a few times … out of like 10 commands one gets very slow… the slow one is attached.
do you see a reson in the log for the performance decrease?

node56.txt (89.8 KB)

thanks for any hints

mhilbush · January 29, 2019, 11:45am

You’re referring to the ~17 seconds from when the command is received until the binding sends the MULTILEVEL_SET, right?

Is there any activity with the other nodes during this time (i.e. are you filtering the display on node 56)?

shorty707 · January 29, 2019, 1:24pm

Correct.
The JPG from @chris log viewer is filtered.

The attached log (TXT file) is unfiltered and the Full log for that Time Slot with all Traffic from Zwave (Debug)

Thanks!

mhilbush · January 29, 2019, 1:41pm

Ok, that’s what I suspected. I’m assuming that there are transactions in play with other nodes that are blocking the node 56 transactions. Those other transactions are possibly timing out, which easily could introduce multiple-second delays (the timeout is 5 seconds).

If you can post a link to the log on a file sharing service, I’ll take a look.

shorty707 · January 29, 2019, 5:44pm

@mhilbush
Hey its in the post above.
Not working?

mhilbush · January 29, 2019, 6:08pm

Ah, sorry, I was looking in the wrong post.

Looks like there are some issues with node 50. That’s what appears to be causing the delay in the transmission of the command to node 56. I also don’t know what would be causing the CANs.

shorty707 · January 29, 2019, 6:47pm

node 50 is also a qubino dimmer for the stair lights …

no clue why those rejected is in the log and also what the CANs which are not showing a node id actually are.

last hope is @chris who could have an idea what I could look into for finding the root cause

mhilbush · January 29, 2019, 7:46pm

When my dimmers get wigged out, a power cycle usually brings them back to normal operation. My Leviton dimmers have an air gap switch that avoids having to kill the power at the breaker. Not sure about the qubinos.

shorty707 · January 31, 2019, 6:38pm

I still stuggle here…the devices behave differently every time I restart the zwave bundle.

Does the log tell me there is a queue of 44 messages for node 55? or a queue for all devices?
anyways still the node responds so commands I sent - since it is a main. confused.

19:33:15.295 [DEBUG] [rnal.protocol.ZWaveTransactionManager] - ZWaveReceiveThread queue empty
19:33:15.297 [DEBUG] [rnal.protocol.ZWaveTransactionManager] - Transaction SendNextMessage 0 out at start. Holdoff false.
19:33:17.173 [DEBUG] [rnal.protocol.ZWaveTransactionManager] - NODE 55: sendTransaction org.openhab.binding.zwave.internal.protocol.ZWaveSerialPayload@ae7e164
19:33:17.175 [DEBUG] [rnal.protocol.ZWaveTransactionManager] - NODE 55: Adding to device queue
19:33:17.177 [DEBUG] [rnal.protocol.ZWaveTransactionManager] - NODE 55: Added 29241 to queue - size 44
19:33:17.179 [DEBUG] [rnal.protocol.ZWaveTransactionManager] - Transaction SendNextMessage 0 out at start. Holdoff false.
19:33:17.180 [DEBUG] [zwave.internal.protocol.SerialMessage] - Assembled message buffer = 01 05 00 51 37 28 B4 
19:33:17.183 [DEBUG] [ding.zwave.handler.ZWaveSerialHandler] - NODE 255: Port closed sending REQUEST Message = 01 05 00 51 37 28 B4 
19:33:17.185 [DEBUG] [rnal.protocol.ZWaveTransactionManager] - Transaction SendNextMessage started: TID 29241: [WAIT_RESPONSE] priority=Controller, requiresResponse=true, callback: 40
19:33:19.186 [DEBUG] [nsactionManager$ZWaveTransactionTimer] - NODE 55: TID 29241: Timeout at state WAIT_RESPONSE. 3 retries remaining.
19:33:19.187 [DEBUG] [nsactionManager$ZWaveTransactionTimer] - TID 29241: Transaction is current transaction, so clearing!!!!!
19:33:19.188 [DEBUG] [ve.internal.protocol.ZWaveTransaction] - TID 29241: Transaction CANCELLED
19:33:19.189 [DEBUG] [rnal.protocol.ZWaveTransactionManager] - NODE 55: notifyTransactionResponse TID:29241 CANCELLED
19:33:19.190 [DEBUG] [rnal.protocol.ZWaveTransactionManager] - Transaction SendNextMessage 0 out at start. Holdoff false.
19:33:19.191 [DEBUG] [rnal.protocol.ZWaveTransactionManager] - TID 29241: Transaction event listener: DONE: CANCELLED -> 
19:33:19.193 [DEBUG] [ialization.ZWaveNodeInitStageAdvancer] - NODE 55: Node Init response (0) org.openhab.binding.zwave.internal.protocol.ZWaveTransactionResponse@15844f52

mhilbush · January 31, 2019, 7:25pm

No. It says there are 44 messages in the send queue. This covers all nodes (except the controller, which I think has its own queue). So it’s quite likely that, for example, there are messages in the queue waiting for battery-powered device(s) to wake up, lower priority messages, etc.

As an example, over a 6 month period (I keep the last 500 zwave log files), my average queue size is about 30 (the max was 99).

As we saw above from a sample of your log file, node 50 was introducing delay. If there are other nodes behaving similarly, that easily could explain why you’re seeing unpredictable performance with your devices.

Also, every time you restart the binding, the network will be slow initially. This is because the binding is busy initializing (i.e. interrogating) the devices in your network. For large networks, the initialization can take quite a few minutes – even longer if there are nodes whose transactions are timing out.

If it were me, I would identify the nodes causing issues, and either resolve why they’re behaving the way they are, or remove them from the network.

shorty707 · January 31, 2019, 8:02pm

Qubino DIN Dimmers… 16 of those… Hard to remove and replace.
There was a discussion some time ago that qubino devices were not sending status updates correctly due so something. (need to look it up)

Still they seem to behave strangely… And cause trouble on the network. I would not expect that for a main powered dimmer something should be queued.

If they send status updates or not seems to be different after every restart of OH or zwave bundle…which would point to differently set association group 1 issues during thing initialization. But only for qubino??

mhilbush · January 31, 2019, 8:14pm

Agreed.

Possibly the command repoll being too short at 1500 ms? Sometimes dimmers need a longer repoll period so that they have time to get to their final state.

Not normally. But I can imagine cases where things are queued. For example, during a repoll of a device with several channels, if the request for the first channel is delayed, the poll requests for the other channels will sit in the queue until the first request times out. If there are multiple devices that are polled around the same time, then their requests would sit in the queue, too. Not saying that’s the case here, but it’s definitely possible for a mains device to have requests sitting in the queue.

That’s odd. Unfortunately, I don’t have any experience with these devices.

Were you able to resolve anything with node 50? Did power cycling it help? Is it still experiencing timeouts? If so, imight be time to do an exclude, factory reset, include.

shorty707 · January 31, 2019, 8:54pm

also a Qubino DIN Dimmer. Without a power cycle it randomly works flawless or starts suddenly with strange stuff like in that.log. applies for all qubinos and since I got 16 my chances are high to get trouble and delays in the network.

well my real issue is. the device has 2 channels for the light status. a switch and a dimmer channel.
totally randomly I get updates for only 1 one of those (mostly dimmer and seldom switch) but also sometimes with no indication why I see both channels get their updates (e.g. Dimmer goes to 0 and switch to Off).since its unpredictable its impossible to put rules on that to automate stuff. grrr
this bevaviour however seems to be applied randomly after a zwave or OH restart and stays like this until another restart which made me wonder if it could be related to what the binding does when the device is initialzed.

c_nagel · February 5, 2019, 8:04am

Hi @chris,
to give you a short feedback:
All problems are gone! You know what? After playing around with my thermostat rules I realized a tremendous increase of log entries in the event.log files suggesting something like “command loops” between the controller and the thermostats. So I renamed the rules files to temporarely eliminate them … and everything was fine immedialtely! My smarthome works absolutely properly.
Conclusion: all the time before I didn’t realize that I don’t need rules for changing the thermostat setpoints when I change them via sitemap on my iPhone. That’s it.
Maybe this knowledge helps anybody with similar problems.
Anyway: thank you and @mhilbush and @sihui very much again for your support and sorry for beeing confused!
Regards, Christoph