Zwave1 to zwave2 migration frustration

zwave
binding
Tags: #<Tag:0x00007f0144b5f580> #<Tag:0x00007f0144b5f3f0>

(Markus Storm) #1

Hi @chris,

sorry to direct this right to you, but you’ve been extremely helpful in the past, and you’re the real savvy one here. I’m desperate, I have a long standing story of trying to get the OH2 zwave binding to fully work for my housing setup.
But I have a couple of battery zwave devices that just keep failing to initialize or, to be more precise, to get properly identified by the binding.
Thing is, they work fine with the OH1 legacy binding. There’s OH1 binding generated XML files for all of them.
I do find all of the bad devices in your device database, too.
For most of them, I do have identical devices that DO work. It’s all ‘standard’ multi sensors like Fibaro and Aeon gen 6 devices, the probably most uncommon ones are two Philio PST multi sensors. The only commonality seems to be it’s all battery multi sensors.
Running any of the OH2 bindings (2.1 release, latest snapshot or latest refactored), these devices fail to initialize.
Habmin and PaperUI keep telling me the type of node is unknown (FWIW, I’m running OH2.2 snapshot #1020 at the moment).
I’ve made various attempts at deleting things, rediscover them, reinitialize by clicking the buttons on them, wait, repeat, etc., re-include them, moved them close to the controller. I do see their zwave messages when I single- or triple-click.
I’ve purged/reinstalled OH multiple times and even tried copying (and properly edited) XML files from working siblings before restarting, but they all still fail to initialize. The binding still feels they’re unknown type node.

EDIT: I just noticed that the bad nodes are ALL tagged to be of thingTypeUID “qubino_zmnhda_00_000” in jsondb.
Might be coincidence because that type is the first or last in a list somewhere in the code and as such used as the default if the device cannot be identified.
For a bad node in question like 122, I have NOT seen any log entry saying ‘NODE 122: Device discovery resolved to thingType XXXX’ while for all working nodes that I’ve checked, I did find such a log line.

Some more oddities:
on OH2 (snapshot) binding start, there’s some error messages saying the (legacy binding generated) XML files cannot be (fully) parsed, but I think this is mostly unrelated because some of those incriminated nodes (such as the 42 and 19 shown below, which are Fibaro FGMS and FGD devices) still do work, and there’s affected nodes without XML files generated under the new binding (such as 122, a Fibaro FGSD-002).
Then again, that might be a pointer or not. 122 is a smoke detector device that I used to include, exclude, include multiple times, resulting in different nodeIDs, too. Same is true for various other (but not all) of the affected nodes.
I remember the 122 smoke detector even used to have a higher ID at some earlier point in time. That’s about the only occasion that I have seen my controller assign a lower number to a new device.
And 122 is a smoke detector that should not have a GARAGE_DOOR zwave class type, should it ?

2017-08-24 10:24:39.628 [ERROR] [ve.internal.protocol.ZWaveController] - NODE 42: Restore from config: Error deserialising XML file. com.thoughtworks.xstream.converters.reflection.AbstractReflectionConverter$UnknownFieldException: No such field org.openhab.binding.zwave.internal.protocol.ZWaveNode.deviceClass
---- Debugging information ----
field               : deviceClass
class               : org.openhab.binding.zwave.internal.protocol.ZWaveNode
required-type       : org.openhab.binding.zwave.internal.protocol.ZWaveNode
converter-type      : com.thoughtworks.xstream.converters.reflection.ReflectionConverter
path                : /node/deviceClass
line number         : 2
version             : 1.4.7
-------------------------------

2017-08-24 10:27:38.682 [ERROR] [ve.internal.protocol.ZWaveController] - NODE 19: Restore from config: Error deserialising XML file. com.thoughtworks.xstream.converters.reflection.AbstractReflectionConverter$UnknownFieldException: No s
uch field org.openhab.binding.zwave.internal.protocol.ZWaveNode.healState
2017-08-24 10:27:38.948 [ERROR] [ve.internal.protocol.ZWaveController] - NODE 122: Restore from config: Error deserialising XML file. com.thoughtworks.xstream.converters.ConversionException: No enum constant org.openhab.binding.zwave.i
nternal.protocol.ZWaveDeviceClass.Generic.GARAGE_DOOR : No enum constant org.openhab.binding.zwave.internal.protocol.ZWaveDeviceClass.Generic.GARAGE_DOOR

I do not see anything wrong about the initialization process, here’s the (grep’ed-out) messages for that node 122 if run under the refactored binding.
I’m probably missing some important messages, but there’s so many of them when the binding starts initializing all nodes…

2017-08-24 10:20:29.343 [DEBUG] [l.initialization.ZWaveNodeSerializer] - NODE 122: Serialise aborted as static stages not complete
2017-08-24 10:24:35.838 [INFO ] [age.SerialApiGetInitDataMessageClass] - NODE 122: Node found
2017-08-24 10:24:37.015 [DEBUG] [ve.internal.protocol.ZWaveController] - NODE 122: Init node thread start
2017-08-24 10:24:39.254 [DEBUG] [l.initialization.ZWaveNodeSerializer] - NODE 122: Serializing from file /var/lib/openhab2/zwave/network_c602b032__node_122.xml
2017-08-24 10:24:39.348 [DEBUG] [l.initialization.ZWaveNodeSerializer] - NODE 122: Error serializing from file: file does not exist.
2017-08-24 10:24:40.263 [DEBUG] [alization.ZWaveNodeInitStageAdvancer] - NODE 122: Starting initialisation from EMPTYNODE
2017-08-24 10:24:41.407 [DEBUG] [ve.internal.protocol.ZWaveController] - NODE 122: Init node thread finished
2017-08-24 10:24:41.413 [DEBUG] [alization.ZWaveNodeInitStageAdvancer] - NODE 122: Node advancer - advancing to IDENTIFY_NODE
2017-08-24 10:24:41.814 [DEBUG] [alization.ZWaveNodeInitStageAdvancer] - NODE 122: Node advancer: Initialisation starting
2017-08-24 10:25:09.313 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 122: Controller status changed to ONLINE.
2017-08-24 10:25:09.315 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 122: Controller is ONLINE. Starting device initialisation.
2017-08-24 10:25:09.323 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 122: Updating node properties.
2017-08-24 10:25:09.326 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 122: Updating node properties. MAN=2147483647
2017-08-24 10:25:09.330 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 122: Properties synchronised
2017-08-24 10:25:09.338 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 122: Initialising Thing Node...
2017-08-24 10:25:09.340 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 122: Initialising cmd channel zwave:device:ddd23e6c:node122:switch_binary for OnOffType
2017-08-24 10:25:09.341 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 122: Initialising state channel zwave:device:ddd23e6c:node122:switch_binary for OnOffType
2017-08-24 10:25:09.344 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 122: Initialising cmd channel zwave:device:ddd23e6c:node122:switch_dimmer for OnOffType
2017-08-24 10:25:09.346 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 122: Initialising cmd channel zwave:device:ddd23e6c:node122:switch_dimmer for PercentType
2017-08-24 10:25:09.347 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 122: Initialising state channel zwave:device:ddd23e6c:node122:switch_dimmer for PercentType
2017-08-24 10:25:09.351 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 122: Initialising cmd channel zwave:device:ddd23e6c:node122:sensor_temperature for DecimalType
2017-08-24 10:25:09.352 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 122: Initialising state channel zwave:device:ddd23e6c:node122:sensor_temperature for DecimalType
2017-08-24 10:25:09.361 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 122: Initialising cmd channel zwave:device:ddd23e6c:node122:meter_kwh for DecimalType
2017-08-24 10:25:09.363 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 122: Initialising state channel zwave:device:ddd23e6c:node122:meter_kwh for DecimalType
2017-08-24 10:25:09.365 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 122: Initialising cmd channel zwave:device:ddd23e6c:node122:meter_watts for DecimalType
2017-08-24 10:25:09.366 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 122: Initialising state channel zwave:device:ddd23e6c:node122:meter_watts for DecimalType
2017-08-24 10:25:09.368 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 122: Initialising cmd channel zwave:device:ddd23e6c:node122:sensor_binary1 for OnOffType
2017-08-24 10:25:09.370 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 122: Initialising state channel zwave:device:ddd23e6c:node122:sensor_binary1 for OnOffType
2017-08-24 10:25:09.371 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 122: Initialising state channel zwave:device:ddd23e6c:node122:sensor_binary1 for OnOffType
2017-08-24 10:25:09.386 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 122: Initialising cmd channel zwave:device:ddd23e6c:node122:sensor_binary2 for OnOffType
2017-08-24 10:25:09.388 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 122: Initialising state channel zwave:device:ddd23e6c:node122:sensor_binary2 for OnOffType
2017-08-24 10:25:09.405 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 122: Initialising state channel zwave:device:ddd23e6c:node122:sensor_binary2 for OnOffType
2017-08-24 10:25:09.406 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 122: Polling intialised at 1800 seconds - start in 1800000 milliseconds.
2017-08-24 10:25:09.409 [DEBUG] [ding.zwave.handler.ZWaveThingHandler] - NODE 122: Device initialisation complete.

In short, I’m stuck. Please get me a hint what to try or watch for. I really don’t know how to approach the problem.
For a start, I feel that one of the most annoying parts of the problem is that there’s no “Re-initialize” button available in habmin as long as the node is of unknown type.
Please could you change that ? I don’t know a means to trigger initial initialization (or identification, actually) other than to fully restart OH. Yes, of course I will still need to press the button(s), but at least that would speed up the process A LOT.

Any other idea or procedure to point me in the right direction is highly appreciated, too.
I really want to nail this thing down this time.

thanks in advance
Markus

PS: why I’m so frustrated:
It’s extremely stressful and time consuming for me to switch forth and back to see if a particular change makes a difference as this is my production config (the house I live in :)). I have a slow Raspi server, restarting OH often does not work (hangs, some Karaf or OH2 module dependency problems, not sure), lots of rules to process, and every switch results in a backlog of zwave messages whenever the zwave binding restarts to take half an hour to process. I’ve done that lots of times, and I’m fed up with it.
Then again, it’s extremely annoying, too, to keep running the house on OH2 with zwave1, as there is no working habmin, so I have to restart all of OH to temporarily use habmin2 (and zwave2 binding) whenever I need to change any of my zwave device’s configs (partial restarts, i.e. binding exchange only as per Karaf console, never seem to work, always results in some hanging state).


(Chris Jackson) #2

Hi Markus,
I’m sorry you’re having problems, and I’m really not sure from the message I can comment too much on specifics. I’m happy to look at some logs to see what’s up though.

The file you are loading must be one from the old OH1 binding which has the GARAGE_DOOR type - the OH2 binding doesn’t have this (hence the error), so this isn’t an OH2 problem, but just an old problem with OH1.

I would suggest to delete all XML files for starters so that there’s no confusion with old OH1 files.

I don’t understand this - I thought elsewhere you said they were unknown? If they have a thingTypeUID set, then they should show up as this thing type :confused:.

I don’t understand this either - sorry. Node numbers can only change when a device has been excluded and re-included, so I can’t see how a node number can “change”.

I don’t especially mind changing this logic so that the button is there all the time, however I’m suspicious that it won’t help. If the device isn’t completing initialisation, then there’s probably an underlying reason that we should instead try to find.

I think we need to work through logs - really the information above doesn’t help as it’s just listing issues that I simply don’t understand, and I’d like to see logs to work out what’s really happening…


(Markus Storm) #3

Yes, I’m pretty confused by a number of details, too. And I’ve tried so many little things that I fail to properly remember each and every step, making it impossible to reproduce and fully understand how I arrived there.
But it’s hard not to get nervous and to stay methodical when your house is down and your wife starts knocking at the door :slight_smile:

  • on the GARAGE_DOOR, it’s likely because I kept using the same directory for the XML files when switching forth and back between OH1 and (snapshot) OH2 binding which (unlike your refactored binding) are both using the same file names. Yes I can delete then, but on switching back to OH1, they’re regenerated to include that field again. Yes I can use separate directories. But as I already suspected myself (and thanks for confirming), this is not related to THE problem.

  • when I wrote the nodes were unknown I was referring to habmin and PaperUI output saying just “Z-Wave Node XXX” instead of also showing the HW type like they do for working nodes.
    AFTER posting, I realized the fact that thingTypeUID actually nonetheless is set in the respective jsondb and edited my post. I had hoped this is giving sort of a hint to you.
    Sorry for creating confusion. So yes, thingTypeUID is set, but wrongly so. There’s no bad Qubino devices. And the node records are incomplete, there’s lots of missing fields when compared to a working, otherwise identical node.

  • on the ID number that changed: sorry, my description was probably confusing in the first place.
    I meant to say that I actually excluded the device, actually I’ve even factory-reset it, and then re-included that device and only that full procedure resulted in a new node ID. I was just wondering why that new one was lower than the old one.
    A possible explanation (although I don’t fully remember) is that at times, I used Z-Way to include devices. It has a function that allows to replace a failed (battery) device by a new one while keeping the node ID, so the new device is not assigned the next (i.e. highest) number but that old one. Again, I don’t think it’s related to THE problem, at least not directly.

  • On the (re)initialization button: I agree, it’s a workaround at best. It might help a bit, though, because I was hoping that that way, we can selectively initiate the identification process for a single of the broken devices instead of doing it for all of them (plus the working ones) in one go because that means we have to dissect a really huge log.

Ok, let’s dive into logs and try together to get it straight, thanks for your offer to help.
Please advise how to proceed.
Do you want me to restart with your refactored binding and zwave TRACE logging, then hit the discovery button in habmin?
Anything else? Want me to delete the XML files first (those network_xxx_node_yyy ones the refactored code is using)?
To delete the broken things before restarting ? Or to delete the whole jsondb ?

Are there any further files you want me to provide ?
I guess logs are gonna be huge so I can’t simply upload the file to the forum, and the forum can’t take tars,
so I guess I’ll better post them via a fileshare provider.

PS: on a sidenote, your refactored binding still claims to be 2.1.0-something. I assume that’s just cosmetics and it’s supposed to work with my current 2.2 snapshot OH version, am I right?


(Chris Jackson) #4

Yep - I can appreciate that :wink:

Ok, I’ve no idea how that can happen. The routine to set the thing type searches the database to match the manufacturer ID, device type and device id. It shouldn’t be able to get this wrong - I’d really need to see the log as to why this is set this way.

The controller will select the next unused ID. It will start with ID 2, then 3, etc, up to 232, then go back to use low numbers that are not used. If you have been including devices a lot, then it might wrap around and use lower numbers again. This won’t be an issue though.

No - let’s just use DEBUG - it’s got everything that’s normally needed and TRACE will just make the files bigger (and even more unmanageable).

I suggest to remove the XML files, possibly also delete the log so we know where we’re starting, then restart the binding. Give it a minute or two, then wake up a battery device or two. Then send me the log. I might want the XML files, but let’s start with the log.

Can you open a ticket on my website? Once registered/logged in there’s an option in the menu to open a ticket and you can attach large files there. If you can keep the files to 10MB or less it makes it more manageable (I think by default OH should do this). Possibly the best thing is to zip the logs and any XML files generated give it the current date and attache that file to the ticket system.

Yes - don’t worry about this - it will work with OH2.2.