Lots of CANCELLED transactions in z-wave log after upgrading to OH 2.5

roher · December 25, 2019, 9:07pm

Hello,
After upgrading Openhab (and Z-wave binding) to version 2.5 I see lots of CANCELLED transactions in the z-wave log, for example:

2019-12-25 21:27:38.077 [DEBUG] [nal.protocol.ZWaveTransactionManager] - NODE 40: Bump transaction 4637 priority from Immediate to Immediate
2019-12-25 21:27:38.077 [DEBUG] [nal.protocol.ZWaveTransactionManager] - NODE 40: Adding to device queue
2019-12-25 21:27:38.077 [DEBUG] [nal.protocol.ZWaveTransactionManager] - NODE 40: Added 4637 to queue - size 18
2019-12-25 21:27:38.078 [DEBUG] [nal.protocol.ZWaveTransactionManager] - Transaction SendNextMessage 1 out at start. Holdoff false.
2019-12-25 21:27:38.128 [DEBUG] [sactionManager$ZWaveTransactionTimer] - NODE 40: TID 4558: Timeout at state WAIT_REQUEST. 3 retries remaining.
2019-12-25 21:27:38.128 [DEBUG] [sactionManager$ZWaveTransactionTimer] - TID 4558: Transaction is current transaction, so clearing!!!!!
2019-12-25 21:27:38.128 [DEBUG] [e.internal.protocol.ZWaveTransaction] - TID 4558: Transaction CANCELLED
2019-12-25 21:27:38.128 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 40: Got an event from Z-Wave network: ZWaveTransactionCompletedEvent
2019-12-25 21:27:38.129 [DEBUG] [nal.protocol.ZWaveTransactionManager] - NODE 40: notifyTransactionResponse TID:4558 CANCELLED
2019-12-25 21:27:38.129 [DEBUG] [nal.protocol.ZWaveTransactionManager] - Transaction SendNextMessage 0 out at start. Holdoff false.
2019-12-25 21:27:38.130 [DEBUG] [wave.internal.protocol.SerialMessage] - Assembled message buffer = 01 09 00 13 28 02 84 08 25 7E 18
2019-12-25 21:27:38.130 [DEBUG] [ing.zwave.handler.ZWaveSerialHandler] - NODE 40: Sending REQUEST Message = 01 09 00 13 28 02 84 08 25 7E 18
2019-12-25 21:27:38.130 [DEBUG] [nal.protocol.ZWaveTransactionManager] - TID 4558: Transaction event listener: DONE: CANCELLED ->

Soon after upgrade all my wired Z-Wave devices were marked as DEAD, and went OFFLINE in the Things list. I’ve stopped Openhab, reinserted my Z-Wave USB controller into the PC - in order to reset it by power - and launched Openhab again. It was yesterday. No devices went offline after that, but I still see lots of CANCELLED transactions in the logs.

Attaching all logs collected after restart for reference (they are archives - remove “.log” extension to open):
Logs.zip.001.log (1023.9 KB)
Logs.zip.002.log (1023.9 KB)
Logs.zip.003.log (982.3 KB)

That said, devices are functioning almost normally at the same time, even when they were DEAD and OFFLINE - their statuses are being updated, they are receiving commands, except that sometimes there are delays of several seconds (which were quite rare before), and 2 or 3 times devices didn’t receive their commands.

I’ve also tried to update config parameter of one of my battery devices (waking it up after saving), and transaction was cancelled as well, updated value was not applied.

Before upgrade I was using Openhab 2.4 and Z-Wave 2.5.0.201901040921 snapshot binding, and have never seen such behaviour.

Bruce_Osborne · December 25, 2019, 9:16pm

Verify you have no invalid addons defined such as restdocs (moved from misc to ui) and yahooweather (removed due to service stoppage). Invalid addons cause all addoins to be reloaded evry minute in an attempt to load the invalid ones.
Check addons.cfg and addons.config. The addons changes were mentioned in upgrade Warnings and the Release Notes.

roher · December 26, 2019, 7:08am

@Bruce_Osborne, checked that already - I don’t have invalid addons, here is my addons.cfg:

package = expert

# Access Remote Add-on Repository
# Defines whether the remote openHAB add-on repository should be used for browsing and installing add-ons.
# This not only makes latest snapshots of add-ons available, it is also required for the installation of
# any legacy 1.x add-on. (default is true)
#
remote = true

# Include legacy 1.x bindings. If set to true, it also allows the installation of 1.x bindings for which there is 
# already a 2.x version available (requires remote repo access, see above). (default is false)
#
legacy = true

# A comma-separated list of bindings to install (e.g. "sonos,knx,zwave")
binding = amazondashbutton,amazonechocontrol,astro,exec1,netatmo,squeezebox,weather1,zwave

# A comma-separated list of UIs to install (e.g. "basic,paper")
ui = paper,basic,habmin,habpanel

# A comma-separated list of persistence services to install (e.g. "rrd4j,jpa")
persistence = jdbc-mysql,mapdb

# A comma-separated list of actions to install (e.g. "mail,pushover")
action = telegram

# A comma-separated list of transformation services to install (e.g. "map,jsonpath")
transformation = map,javascript

# A comma-separated list of voice services to install (e.g. "marytts,freetts")
voice = voicerss

# A comma-separated list of miscellaneous services to install (e.g. "myopenhab")
misc = openhabcloud

Bruce_Osborne · December 26, 2019, 10:09am

Most people have found their issues in the addons.config file located in userdata.

roher · December 26, 2019, 2:23pm

I don’t have any duplicates in addons.config in userdata, here it is:

:org.apache.felix.configadmin.revision:=L"3"
action="telegram"
binding="amazondashbutton,amazonechocontrol,astro,exec1,netatmo,squeezebox,weather1,zwave"
legacy="true"
misc="openhabcloud"
package="expert"
persistence="jdbc-mysql,mapdb"
remote="true"
service.pid="org.openhab.addons"
transformation="map,javascript"
ui="paper,basic,habmin,habpanel"
voice="voicerss"

Bruce_Osborne · December 26, 2019, 2:25pm

Hmmm. Looks OK to my untrained eye. Perhaps somebody else has ideas.

roher · December 26, 2019, 2:26pm

I hope @chris & @sihui, our Z-Wave experts, will be able to help

Bruce_Osborne · December 26, 2019, 2:27pm

They will want debug logs as detailed in the binding instructions.

roher · December 26, 2019, 2:28pm

Already attached in the initial message.

Bruce_Osborne · December 26, 2019, 2:33pm

That looks filtered though and the unfiltered log is needed to see what is happening on the network.

There is also a log viewer here.

https://www.cd-jackson.com/index.php/openhab/zwave-log-viewer

roher · December 26, 2019, 2:34pm

Nope, I didn’t filter anything - took the files from the log folder as is, just zipped them into a multipart archive.

Bruce_Osborne · December 26, 2019, 2:36pm

Sorry, I just looked at within the code fences in that post.

EDIT: I see you were using a snapshot binding before you upgraded. Did you delete all of the manually installed addons to get the newer versions?

roher · December 26, 2019, 2:41pm

Yes of course, I have removed the jar file from addons folder, and included “zwave” into addons.cfg, so that 2.5 release version has been installed. Same with another binding which was on snapshot version as well.

sihui · December 26, 2019, 3:11pm

I am seeing some timeouts and things going offline after your nightly heal, it might be worth a try to disable nightly heal if you don’t plan to move your devices:

Commands are rejected from the controller for different nodes:

Neighbor updates are failing:

State updates are sent multiple times for different devices, are you sure you have the association groups set correctly (only one (the correct) group to your controller, not several groups, this can cause issues):

roher · December 26, 2019, 3:53pm

Hi, thank you for the answer, will try to comment on each point:

Ok, will try. I thought nighly healing has only positive effect

Yes, I’ve seen this in the logs, but the main question why has it started to happen after upgrade? Everything was ok before, and I didn’t perform any hardware changes (except adding new Popp Mold sensor, discussed in another thread).
27, 56, 63 are all Wired devices and not so far from the controller.

No, device 37 has only one association with the controller - device status group:

Maybe this is because this is a multichannel device (Fibaro multisensor), and updates are coming from different sensors?

BTW, again, part of my wired devices are gone offline (sorry for russian device names on the screenshot):

sihui · December 26, 2019, 4:44pm

Strange, sorry, no idea.

There have been issues for several users if the nightly heal is enabled:

roher · December 26, 2019, 6:58pm

Now all my wired devices have gave offline. But they are “working” - receiving commands and sending status updates.

And all of them are stuck on the NodeNeighborUpdate failed, and due to this they cannot complete initial initialization after binding restart, so looks like healing isn’t the cause (I’ve recalled one detail - I’ve tried to heal one of wired devices manually via Habmin, and the log told me smth like “healing cancelled because initialization is not completed”).
I’ve also seen NodeNeighborUpdate failed for battery devices, but they are not going offline after that.

I start thinking that this NodeNeighborUpdate request has failed for literally all devices in my network.

Also noted one more interesting message in the fresh log:

2019-12-26 19:17:53.820 [DEBUG] [nal.protocol.ZWaveTransactionManager] - Completing UNINTIALIZED transaction 7130!!! How?!?

That’s ok, not all ideas come right away at the moment.
@chris, maybe you will have some?

BTW, I’m a Java developer myself, so if I can help you guys somehow, I’m ready. The only problem is lack of time, as usual

sihui · December 26, 2019, 7:01pm

Did you try a soft reset on your controller? (power down the server, physically remove the stick, plug it back in, start the server)

roher · December 26, 2019, 7:16pm

I’ve tried it on 24th, the first time when all my wired devices went offline:

As you can see, it didn’t help.

I have one idea, will try it tonight and post here if it will work.

Bruce_Osborne · December 26, 2019, 8:28pm

I think stopping & restarting the zwave binding should trigger a full device discovery cycle.