Z-wave things lose their Lifeline Association Group

Landstad · April 27, 2018, 11:18pm

openHABian on a Raspberry Pi. Latest snapshot.

I have a few things that every once in a while lose their lifeline.

For a few days everything works well, but suddenly (for example) clicking on my Node Octan will not make the music start. I go check the log and it says “[me.event.ThingUpdatedEvent] - Thing ‘zwave:device:mydevice:nodeXX’ has been updated.”, but there will be no log entry for Item change.

I’ve learned to quickly check the association group (AG) in habmin, find that AG 1 Lifeline is missing and add the controller to the lifeline. Then it works fine for a few days again.

What is happening? Why is the thing continuously losing its lifeline AG?

PS: The controller is the only AG for all things (i.e., I only have the lifeline added to things).
PS2: I get 500-errors when trying to change things in paperUI, java.lang.ClassCastException: java.math.BigDecimal cannot be cast to java.lang.Integer. I have not been able to figure out what causes this.

Landstad · October 3, 2018, 9:45am

Update:

(Some of) my things still lose their lifeline once in a while. I think it only happens after:

Disconnecting the controller in order to go include new things
Upgrading
Restart of the pi/services

The 500-errors stopped at some point during this summer - probably a version upgrade.

chris · October 3, 2018, 9:56am

This is likely just a UI issue as there is a known bug in HABmin caused by the widget that is used to display the select boxes. Unfortunately it’s not maintained so updating HABmin to resolve this is difficult.

Landstad · October 3, 2018, 10:50am

This might be the case sometimes, BUT:

It’s the same in paperUI and things stop working; i.e., the controller does not pick up any events. Often the latter is how I figure out that a thing has lost its lifeline association.

chris · October 3, 2018, 10:54am

The binding should not change the associations when it starts - only when a device is reinitialised. If you use PaperUI, be aware there is a similar bug there, and this may contribute to the problem since when you save the configuration in PaperUI, it will send ALL updates to the binding and this might cause it to remove associations.

In any case, please provide a debug log file showing the issue, and I will take a look.

Landstad · October 5, 2018, 6:40am

Thanks. I appreciate that, but it’s hard to pinpoint what in the log would make sense to send.

Regarding the bug you are writing about:
I have a hard time saving associations for some things (especially a Fibaro Swipe - the others are OK now). This might be related. Do you know if there is a fix for this bug on its way?

chris · October 5, 2018, 6:43am

Ok, but then you must appreciate it’s not possible to find the issue. You need to spend some time trying to replicate the problem you are reporting so that there is something to work on.

Landstad · October 5, 2018, 6:53am

For sure @chris! If I find a way to replicate consistently and pinpoint log items that seem relevant I’ll update.

My question regarding a fix was however regarding the UI error in habmin/paperui that you wrote about. This error has probably added to my confusion so a fix might help me with #1.

chris · October 5, 2018, 7:09am

I agree that the error in HABmin, and I think also a similar one in PaperUI, is annoying (or, more than annoying), but I’m not sure how to resolve this as it’s due to a 3rd party widget that isn’t supported now, and I’ve not managed to find a replacement.

Landstad · October 8, 2018, 7:33pm

Hello again,

I still have (at least) two nodes that I cannot get association groups assigned to.

Most other nodes that I include get the node_1 association by default, but I have tried excluding and reincluding and they get no association group.

When I try to set the association group the things get “pending…” status in habmin. In paperUI everything looks alright until I restart the service or disconnect the controller (not sure which). Then they return to null again.

Here is a log file from when trying to first update one node (#45) in paperUI, thereafter in habmin (node #39) with controller added.

In info log it only reads things like “configStatusMessages=[ConfigStatusMessage [parameterName=config_52_2, type=PENDING, messageKey=null, arguments=null, message=null, statusCode=null” for all parameters.
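A line like the one above can be scanned with a short script to list which parameters are still pending. This is only a minimal sketch: it assumes the log text keeps the `parameterName=..., type=...` shape quoted above, and the function name `pending_parameters` is made up for illustration.

```python
import re

# Matches the "parameterName=..., type=..." pairs in the quoted log format.
# The exact format is an assumption based on the single line quoted above.
_STATUS_RE = re.compile(r"parameterName=(?P<name>[^,\]]+),\s*type=(?P<type>[A-Z_]+)")


def pending_parameters(log_text):
    """Return the names of all parameters whose status type is PENDING, in order."""
    return [
        m.group("name")
        for m in _STATUS_RE.finditer(log_text)
        if m.group("type") == "PENDING"
    ]


if __name__ == "__main__":
    sample = (
        "configStatusMessages=[ConfigStatusMessage [parameterName=config_52_2, "
        "type=PENDING, messageKey=null, arguments=null, message=null, statusCode=null]"
    )
    print(pending_parameters(sample))
```

Running it over the whole info log would show whether every parameter of a node is pending or only a few of them.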

I have tried most things I believe and am pretty stuck. Any tips appreciated!

dastrix80 · October 8, 2018, 7:54pm

This has been happening to me since using zwave

chris · October 9, 2018, 7:04am

Please try to help with debugging the issue then - it will help you, and everyone else. Groups should not just “disappear” by themselves so you should be able to find something in the logs that we can look at.

dastrix80 · October 9, 2018, 7:11am

Hi Chris, next time it happens I’ll have a look for it… When I change my dimmer to anything more than 0, the switch should turn on. When it doesn’t, I always know it’s the Lifeline at fault.

chris · October 9, 2018, 7:17am

Thanks. Note that the problem will have occurred well before this - probably it is either during binding startup (I think this is unlikely) or when you change the configuration of the device (this would be my expectation). Either the UI sends something wrong, or the binding is interpreting it incorrectly…

chris · October 9, 2018, 7:52am

@Landstad do you have this log still? Can you provide it from a bit earlier please? There is a command in the log to remove the associations from the lifeline group, but the log doesn’t actually show this command being generated (just sent). I suspect that it might have been generated earlier, and then stuck in the TX queue and sent a lot later (as it is a low priority message). Your queue is quite long at this point so I think this explains what I see, but I’d like to see where this command is being generated if possible.

For the PENDING messages, if this is node 39, then I guess that this is a battery device since I see nothing in the log at all for this device. If so, the pending messages are normal - the configuration will stay like this until the battery device wakes up and the data can be transferred. Pending is removed after the binding receives the next report of the configuration state.

Also, please confirm you are using a binding no more than 1 week old as there was a bug fix with queue prioritisation that might also be a factor in this. I’m not completely sure this would explain the issue anyway, but we should make sure this is the latest.

Landstad · October 9, 2018, 7:58am

Hello,

Thanks for having a look.

I started the debug-log at 10 past and:

Added an association group via PaperUI
Added an association group to another thing via habmin (node 39 - not a battery device)

I copied everything from 10 past onward, so relevant log items should be there.

Both nodes are non-battery-nodes (Fibaro Dimmer 2).

Both nodes react to commands sent, but do not report anything.

Regarding the binding:
I think I deleted and re-added the things Sunday morning, but I can do it again tonight just to make sure.

chris · October 9, 2018, 8:11am

Unfortunately not. At least, the updates that you made did not generate these commands - they must have been generated earlier and were already in the queue (there were 49 messages already queued when this log started). This shows that the binding would not have removed the association as a result of this update. Something DID remove the association, but it’s not in this part of the log.

If you mean node 39, there are no messages at all logged for this node. It may just be that with the queue length, the log doesn’t yet have these commands in it. I do see the commands from the UI, but within this log, nothing actually gets sent to the device.
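A quick way to check whether a log contains anything at all for a given node is to count lines per node. The sketch below assumes that debug lines carry a `NODE <id>:` tag; that marker, the function name `count_lines_per_node` and the sample lines are assumptions for illustration, so adjust the pattern to whatever the real log shows.

```python
import re
from collections import Counter

# Assumed marker: debug lines that carry a "NODE <id>:" tag.
_NODE_RE = re.compile(r"\bNODE (\d+):")


def count_lines_per_node(lines):
    """Count how many log lines mention each node id."""
    counts = Counter()
    for line in lines:
        match = _NODE_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample lines, not taken from a real log.
    sample = [
        "10:10:01 [DEBUG] NODE 45: example message",
        "10:10:02 [DEBUG] NODE 45: another example message",
        "10:10:03 [DEBUG] line without a node tag",
    ]
    counts = count_lines_per_node(sample)
    print(counts[45], counts[39])
```

A count of zero for a node means the log window holds nothing for it, which is different from the node being silent for good.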

Note that “commands” like turning on and off a light are considered high priority and will go straight to the top of the queue. Configuration is considered low priority and will be sent when there are no higher priority commands to send.
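The effect of this prioritisation can be shown with a toy queue. This is a sketch of the general idea only, not the binding's actual code: the class `TxQueue`, the two priority levels and the message strings are all hypothetical.

```python
import heapq
import itertools

HIGH, LOW = 0, 1  # smaller number is sent first (hypothetical levels)


class TxQueue:
    """Toy transmit queue: high priority first, FIFO within a priority."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker keeps insertion order

    def add(self, priority, message):
        heapq.heappush(self._heap, (priority, next(self._seq), message))

    def send_next(self):
        _, _, message = heapq.heappop(self._heap)
        return message

    def __len__(self):
        return len(self._heap)


def drain(queue):
    """Send everything and return the messages in the order they went out."""
    sent = []
    while len(queue):
        sent.append(queue.send_next())
    return sent


if __name__ == "__main__":
    q = TxQueue()
    q.add(LOW, "remove association")  # queued first
    q.add(HIGH, "light on")
    q.add(HIGH, "light off")
    print(drain(q))
```

The configuration message was queued first but leaves last, which is why a command can show up as sent in a log long after the part of the log where it was generated.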

What hardware are you using? Is this a slow computer?

There is no need to delete anything - just update to the latest snapshot. If you are using the snapshot runtime, then all you need to do is to uninstall the binding from PaperUI, wait 30 seconds, then install it again.

Landstad · October 9, 2018, 8:25am

Hello,

I think we are talking past each other.

The log shows me trying to add association groups (both node 45 and 39, and both are non-battery-devices). For these two nodes the association group got lost at some point when switching to 2.4 and I haven’t been able to add association groups since.

I am running openhabianpi on an RPi 3, latest snapshot.

chris · October 9, 2018, 8:45am

That’s understood, but the log doesn’t have all the information in it as the logs are not complete. By this I mean there was a lot that happened before the log starts, and the queue already has some stuff in it.

The log shows that for node 42, the controller is set, but sometime before the log starts, something has queued a message to remove this, but due to prioritisation, this command is not sent for quite a long time. I would like to see what created this message, but unfortunately it’s not in the log, so I’ve no way to know where this was generated.

If you don’t have the full log, then unfortunately I can’t work out why this is happening…

Ok, that may be a small contributor to the queue size at least. Note that this shouldn’t be an issue, but helps me understand.

Landstad · October 9, 2018, 8:47am

Okay. I will do the same tonight and include the past 20 minutes of logging as well.