Zigbee devices lost after reboot

I have 2 zigbee devices and the HUSBZB-1 adapter… I had to reboot my pi for the first time since installing everything and when it came online, my zigbee devices don’t connect. Paper UI lists them as online, but they get no state updates, nor do they take commands.

I also see

2019-05-18 01:46:59.136 [ERROR] [converter.ZigBeeConverterSwitchOnoff] - 000D6F0005077029: Error 0xffff setting server binding
2019-05-18 01:47:07.144 [ERROR] [converter.ZigBeeConverterSwitchOnoff] - 000D6F0005077029: Error 0xffff setting client binding

Listed in the logs… Further reboots don’t seem to help… I’ve read other topics where people delete the binding and all devices and that fixes it… But will I really have to do that on every reboot?

Is there another way to fix this without having to do that every time?

I have a couple Z-Wave devices on the same USB stick and they come online fine… If that matters…

zigbee.log (43.3 KB)

See attached zigbee logs

It seems to be sending and receiving, but has a lot of Unhandled EZSP Frame entries…

And here’s my coordinator definition:

Thing zigbee:coordinator_ember:stick1 "Zigbee USB Stick" [zigbee_port="/dev/ttyUSB1", zigbee_baud="57600", zigbee_flowcontrol=2]

I bound the items inside Paper UI though (maybe related?).

You don’t say what version you are using? Is it quite old (I think so from the log)? If so, I would recommend using a newer snapshot for starters.

@chris, will the libraries in the OH snapshots be getting updated to 1.1.11? Both the OH snapshot and library release are at 1.1.6. I’ve found 1.1.11 much more stable, but as reported, the binding is still crashing at least once a day.

No - in the coming days we plan to bump this to the latest release which is 1.2.0. I’m in the US over the next week, but hopefully it will still get done this week.

Note that 1.2 is NOT compatible with the current binding so don’t try and use it yet.

I don’t see the binding crashing (using 2.5 dated 20190420 and lib 1.11). I just see quite a lot of instability. Even without a reboot or restart of OH, devices drop out of nowhere.

Last night I started all over with my Zigbee devices and re-added everything, including the coordinator. I also managed to get the Xiaomi Aqara window/door sensor added (with just a switch channel) after quite a lot of struggle.
A few hours later, the Aqara was the first to drop offline.
A bit later, my Osram Plug went offline.
And a few minutes ago, the Trust motion sensor went offline.
The Hue Dimmer switch and Hue motion sensor are still running. But if everything goes as it usually does, these will go offline as well within the next day. And finally the coordinator will go offline, and the only way to get it back online is to remove and re-add it, which results in a new device, so I have to change all my items as well.

What’s worrying is that this happens even without doing anything. They simply go offline out of nowhere!!

This is what I typically see with the Zigbee devices, and I’m close to giving up… It’s simply too much hassle even trying to investigate issues like this, especially when I also have to deal with the serial driver issues in between, which break the connection to the Zigbee sniffer as well. There has to be at least some stability to do any decent investigation.
Normally I like to investigate issues, but this has become a nightmare, to be honest! :frowning:

EDIT -
Just as I wrote the above, the Aqara sensor went ONLINE in PaperUI. But it still doesn’t respond.
But the Hue Dimmer Switch is no longer responding (it worked a few minutes ago). It’s still reported ONLINE though…

For me, the coordinator says it’s online but no devices are responding. Do you see these warnings in the log? I upgraded OH and the binding today. Zigbee went from wonderfully stable to having to restart once a day around the time the binding was migrated to the new build system. I’m also on Ember (HUSBZB-1). I haven’t yet tried going back to an older version, but I don’t think I can wait for 1.2.

1.1.6 is what the cli listed. I used the upgrade script here: https://github.com/openhab-5iver/openHAB-utils to install the 1.1.11 snapshot.

Attached are the new logs from openhab startup.

The things still seem unresponsive though.

startup.log (210.0 KB)

How have you configured the child aging?

This is a router, so really shouldn’t drop off the network - maybe it’s close to being out of range, or there is interference…

This is normal if the device stops responding isn’t it? Of course, the big question is why it is not able to be contacted, but if it does stop responding, it should be marked offline.

What do you mean “it doesn’t respond”? It’s a sensor - it doesn’t do anything that requires a response does it?

Possibly there are issues relating to your system configuration. We run many (many!) thousands of instances of this code - not with OH, but other systems - all running Ember and the same ZSS library etc, and all work pretty well.

Possibly there is some sort of serial issue - as I said on github, the error is related to invalid data getting to the NCP. The code hasn’t changed for a long time, so I don’t think it’s an issue with the libraries directly at least.

What version of the system do you run? There are errors in the log that I suspect are caused by the core OH being too old for the binding. I’m not sure if that’s the only issue - there are no real responses to messages on the ZigBee network which is probably unrelated to the framework service errors and class not found errors.

When devices don’t respond it’s hard to know why from these logs since there is just no data.

I’m on the openhabianpi-raspbian-201804031720-gitdba76f6-crc9e93c3eb.img.xz image. It should update to latest stable I think…

Do I need to switch it to testing or unstable to link properly?

I think this is required to work properly since the binding is making use of something new that isn’t in the version from your OS which is now over 12 months old (I suspect there is a newer stable as well).

As I said though - I’m not sure this will solve the issue - it’s hard to see what’s happening when there are no responses in the log, but the exceptions in the log clearly indicate the version incompatibility.

I don’t know much about openHABian, but IIRC OH upgrades are not automatic, and that looks to be an old version of openHABian too. You are probably on OH 2.3. You’ll see a version when you go into the Karaf console.

Ok, I no longer get those exceptions… but.

On startup, I get different values for
Network key final array
Created random ZigBee PAN ID
Created random ZigBee extended PAN ID

eg.

2019-05-18 15:29:11.489 [DEBUG] [bee.handler.ZigBeeCoordinatorHandler] - Link key final array 5A6967426565416C6C69616E63653039
2019-05-18 16:48:07.473 [DEBUG] [bee.handler.ZigBeeCoordinatorHandler] - Network Key 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
2019-05-18 16:48:07.475 [DEBUG] [bee.handler.ZigBeeCoordinatorHandler] - Link Key 5A 69 67 42 65 65 41 6C 6C 69 61 6E 63 65 30 39
2019-05-18 16:48:07.477 [DEBUG] [bee.handler.ZigBeeCoordinatorHandler] - Config: zigbee_initialise found, initializeNetwork=false
2019-05-18 16:48:07.479 [DEBUG] [bee.handler.ZigBeeCoordinatorHandler] - ExtendedPanId or PanId not set: initializeNetwork=true
2019-05-18 16:48:07.481 [DEBUG] [bee.handler.ZigBeeCoordinatorHandler] - Network Key String 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
2019-05-18 16:48:07.485 [DEBUG] [bee.handler.ZigBeeCoordinatorHandler] - Network key initialised 2A71F8C677E8AD9D6BE9CD5F33121FE0
2019-05-18 16:48:07.487 [DEBUG] [bee.handler.ZigBeeCoordinatorHandler] - Network key final array 2A71F8C677E8AD9D6BE9CD5F33121FE0
2019-05-18 16:48:07.489 [DEBUG] [bee.handler.ZigBeeCoordinatorHandler] - Link Key String 5A 69 67 42 65 65 41 6C 6C 69 61 6E 63 65 30 39
2019-05-18 16:48:07.491 [DEBUG] [bee.handler.ZigBeeCoordinatorHandler] - Initialising network
2019-05-18 16:48:07.494 [DEBUG] [bee.handler.ZigBeeCoordinatorHandler] - Channel set to 11.
2019-05-18 16:48:07.514 [DEBUG] [bee.handler.ZigBeeCoordinatorHandler] - Created random ZigBee PAN ID [3B78].
2019-05-18 16:48:07.524 [DEBUG] [bee.handler.ZigBeeCoordinatorHandler] - Created random ZigBee extended PAN ID [C589E4BB2B3CFB01].
2019-05-18 16:48:07.537 [DEBUG] [bee.handler.ZigBeeCoordinatorHandler] - Link key final array

vs

2019-05-18 16:43:58.417 [DEBUG] [bee.handler.ZigBeeCoordinatorHandler] - Network Key 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
2019-05-18 16:43:58.419 [DEBUG] [bee.handler.ZigBeeCoordinatorHandler] - Link Key 5A 69 67 42 65 65 41 6C 6C 69 61 6E 63 65 30 39
2019-05-18 16:43:58.421 [DEBUG] [bee.handler.ZigBeeCoordinatorHandler] - Config: zigbee_initialise found, initializeNetwork=false
2019-05-18 16:43:58.423 [DEBUG] [bee.handler.ZigBeeCoordinatorHandler] - ExtendedPanId or PanId not set: initializeNetwork=true
2019-05-18 16:43:58.425 [DEBUG] [bee.handler.ZigBeeCoordinatorHandler] - Network Key String 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
2019-05-18 16:43:58.431 [DEBUG] [bee.handler.ZigBeeCoordinatorHandler] - Network key initialised 47F1F68C2F87B7A705D7EA63A6BA9A04
2019-05-18 16:43:58.441 [DEBUG] [bee.handler.ZigBeeCoordinatorHandler] - Network key final array 47F1F68C2F87B7A705D7EA63A6BA9A04
2019-05-18 16:43:58.444 [DEBUG] [bee.handler.ZigBeeCoordinatorHandler] - Link Key String 5A 69 67 42 65 65 41 6C 6C 69 61 6E 63 65 30 39
2019-05-18 16:43:58.451 [DEBUG] [bee.handler.ZigBeeCoordinatorHandler] - Initialising network
2019-05-18 16:43:58.453 [DEBUG] [bee.handler.ZigBeeCoordinatorHandler] - Channel set to 11.
2019-05-18 16:43:58.481 [DEBUG] [bee.handler.ZigBeeCoordinatorHandler] - Created random ZigBee PAN ID [3E10].
2019-05-18 16:43:58.502 [DEBUG] [bee.handler.ZigBeeCoordinatorHandler] - Created random ZigBee extended PAN ID [7AC8651F33C5D01B].
2019-05-18 16:43:58.526 [DEBUG] [bee.handler.ZigBeeCoordinatorHandler] - Link key final array 5A6967426565416C6C69616E63653039

Since the PAN ID is changing, wouldn’t that mean every start is creating a new Zigbee network, so devices would have to be rejoined on every reboot? Or does that not matter?

It matters…

The problem is that your coordinator definition doesn’t define all these parameters (panId, ePanId, network key etc). This means the binding creates them for you. It would normally save them, but it can’t because you are using files for your thing definition, so you need to add these parameters yourself or the network won’t survive a restart.
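
For anyone who wants to confirm this from their own logs, here is a minimal Python sketch that pulls the generated values out of two startup excerpts and reports which ones differ. The excerpts are copied from the log lines quoted above (timestamps and logger names trimmed); the function names and regular expressions are illustrative only and are not part of the binding.

```python
import re

# Excerpts from the two startups quoted in this thread.
STARTUP_A = """\
Network key initialised 47F1F68C2F87B7A705D7EA63A6BA9A04
Created random ZigBee PAN ID [3E10].
Created random ZigBee extended PAN ID [7AC8651F33C5D01B].
"""
STARTUP_B = """\
Network key initialised 2A71F8C677E8AD9D6BE9CD5F33121FE0
Created random ZigBee PAN ID [3B78].
Created random ZigBee extended PAN ID [C589E4BB2B3CFB01].
"""

# One pattern per generated value, matching the log wording shown above.
PATTERNS = {
    "network_key": re.compile(r"Network key initialised ([0-9A-F]{32})"),
    "pan_id": re.compile(r"Created random ZigBee PAN ID \[([0-9A-F]{4})\]"),
    "extended_pan_id": re.compile(
        r"Created random ZigBee extended PAN ID \[([0-9A-F]{16})\]"
    ),
}


def extract_network_params(log_text):
    """Return the network values found in one startup's log text."""
    params = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(log_text)
        if match:
            params[name] = match.group(1)
    return params


def changed_params(first, second):
    """Return the sorted names of values that differ between two startups."""
    names = first.keys() | second.keys()
    return sorted(n for n in names if first.get(n) != second.get(n))


a = extract_network_params(STARTUP_A)
b = extract_network_params(STARTUP_B)
print(changed_params(a, b))
# ['extended_pan_id', 'network_key', 'pan_id']
```

If all three values change between restarts, each start is forming a new network, which matches what the logs above show.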

Changing my thing definition to:

Thing zigbee:coordinator_ember:stick1 "Zigbee USB Stick" [
    zigbee_port="/dev/ttyUSB1",
    zigbee_baud="57600",
    zigbee_flowcontrol=2,
    zigbee_channel=11,
    zigbee_panid=15224,
    zigbee_extendedpanid="C589E4BB2B3CFB01",
    zigbee_networkkey="2A71F8C677E8AD9D6BE9CD5F33121FE0"
]

This fixes it: I can reboot and things rejoin. I will have to rebind them all.
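
One detail worth noting: the log prints the PAN ID in hexadecimal (3B78), while the definition above uses the decimal value 15224. The extended PAN ID and network key were copied from the log unchanged. A small Python sketch of that conversion, with hypothetical helper names, assuming the same parameter names as in the definition above:

```python
def pan_id_to_decimal(hex_pan_id):
    """Convert a hex PAN ID as printed in the log (e.g. "3B78") to decimal."""
    value = int(hex_pan_id, 16)
    if not 0 <= value <= 0xFFFF:
        raise ValueError(f"PAN ID out of 16-bit range: {hex_pan_id}")
    return value


def render_network_params(channel, pan_id_hex, extended_pan_id, network_key):
    """Build the network-related lines for a Thing definition."""
    lines = [
        f"zigbee_channel={channel}",
        f"zigbee_panid={pan_id_to_decimal(pan_id_hex)}",
        f'zigbee_extendedpanid="{extended_pan_id}"',
        f'zigbee_networkkey="{network_key}"',
    ]
    return ",\n".join(lines)


print(pan_id_to_decimal("3B78"))  # 15224
print(render_network_params(
    11, "3B78", "C589E4BB2B3CFB01", "2A71F8C677E8AD9D6BE9CD5F33121FE0"
))
```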

If they drop again, I’ll update. But for now, at least on reboots, it’s fixed.

Perhaps Thing file definitions should have required attributes for cases like this, so that an error is logged if a missing attribute is going to cause issues.
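
Until something like that exists, a check along those lines could be run outside openHAB. This is only a rough Python sketch: the list of "required" parameters is my assumption based on this thread, the function name is hypothetical, and it does a simple text search rather than really parsing the Thing file syntax.

```python
import re

# Parameters that, per this thread, must be fixed in a file-based
# coordinator definition for the network to survive a restart (assumption).
REQUIRED_PARAMS = ("zigbee_panid", "zigbee_extendedpanid", "zigbee_networkkey")


def missing_network_params(thing_definition):
    """Return the required parameter names not assigned in the definition."""
    present = set(re.findall(r"\b(zigbee_\w+)\s*=", thing_definition))
    return [name for name in REQUIRED_PARAMS if name not in present]


original = (
    'Thing zigbee:coordinator_ember:stick1 "Zigbee USB Stick" '
    '[zigbee_port="/dev/ttyUSB1", zigbee_baud="57600", zigbee_flowcontrol=2]'
)
print(missing_network_params(original))
# ['zigbee_panid', 'zigbee_extendedpanid', 'zigbee_networkkey']
```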

Thank you for the help though, you’re very responsive :slight_smile:

In this case, I agree it would have been quite helpful. However, if a user is using the GUI for configuration, then the system expects these to be empty, and it creates random configuration settings and then stores them for next time, so this is actually a normal situation.

Unfortunately the binding has no way to know that the user is defining the configuration in files, or through the GUI. If I change this to log an ERROR, then I will probably get lots of people complaining :frowning: .
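
To illustrate the difference between the two cases, here is a toy Python model. It is not the binding's code, and every name in it is made up for the example. It only mimics the behaviour described above: missing values are generated at startup and written back when the configuration store is writable.

```python
import secrets


def start_coordinator(stored_config, can_write_back):
    """Toy startup: fill in missing network values, saving them if possible."""
    effective = dict(stored_config)
    generated = {}
    if "pan_id" not in effective:
        generated["pan_id"] = secrets.randbelow(0xFFFF)
    if "extended_pan_id" not in effective:
        generated["extended_pan_id"] = secrets.token_hex(8).upper()
    if "network_key" not in effective:
        generated["network_key"] = secrets.token_hex(16).upper()
    effective.update(generated)
    if can_write_back:
        # GUI-managed configuration: generated values are kept for next time.
        stored_config.update(generated)
    return effective


# GUI-style configuration: the second start reuses the saved values.
gui_config = {}
first = start_coordinator(gui_config, can_write_back=True)
second = start_coordinator(gui_config, can_write_back=True)
print(first == second)  # True

# File-style configuration: nothing is saved, so every start generates again.
file_config = {}
start_coordinator(file_config, can_write_back=False)
print(file_config)  # {}
```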

This should at least be documented - I’ll add this to the docs…

Thanks.