How to properly replace a broken ZigBee coordinator?

Dear Community,

So far I have not had to replace the ZigBee coordinator in my environment, but a failure could come at any time, and not knocking on my door but already standing in my living room.

I am wondering whether there is a built-in mechanism, or some manual procedure (on top of ordering whatever compatible coordinator is available), to recover quickly from a ZigBee network outage caused by a failed coordinator.

This post by @chris makes me think that something is coming, or is perhaps already implemented, but I am not sure whether it is a console-based solution or (to be) supported by the GUI.
Within the same topic I also read about copying over the network key, but based on other remarks I am not sure if that would be sufficient.

Thanks in advance

This is not a simple task, and it is not supported by most coordinators. I have implemented and tested this for the Ember (which is the only system I know of that supports it). It is available through the console.

The issue is the saving of all security information - most coordinators do not provide ALL the information required.

Note that the coordinator doesn’t store any information about devices, so the only issue is security.

Thanks @chris for the quick reply.

Although I have an Ember-based coordinator, I am not sure whether in a critical situation (that is, with my ZigBee network down) I would be able to quickly retrieve the information and go through the console steps. And if I am not mistaken, the preparation has to be done before the coordinator “dies”.
Other enthusiasts like me would probably not have the skills to manage this in a timely manner when needed.

I only have crazy ideas, like plugging in a second ZigBee dongle, joining it to the network, then unplugging it and letting it sleep in a drawer until needed, perhaps plugging it in again every second month as a kind of DR test. It could then serve as a failover, semi-hot spare device.
…or…
The console commands mentioned could run periodically as a backup for anyone using a compatible coordinator with the binding. When the original coordinator goes offline for good and a new, compatible one is inserted, the settings could be loaded back automatically, or the user could be asked whether that is what they want to do with the new coordinator.

What do you think?

Of course, this could be done, but it’s not currently implemented. As mentioned, this is only possible with the Ember. Possibly I will add this in future but for now you would need to do it manually.

Yes, you did, and I was also referring to a “compatible coordinator” in my suggestion.

I have not found much about that in the ZigBee binding documentation. Is it documented somewhere else?

I do not have a second stick, so obviously I cannot do much with it right now, but I am eager to read how it works and what steps have to be performed.

There is a simple netbackup command. It outputs a string, and if you pass that string back as a parameter to the netbackup command, it restores the state.

That sounds really simple indeed 🙂 I assume it should be run in the Karaf console, shouldn’t it?

Yes, exactly. It produces a simple string output - save that and then restore using the same string.

You should also keep an openHAB backup as the overall network is captured in other files. In theory, this isn’t absolutely needed as devices can be rediscovered, but it is best to keep it.

Many thanks, I will adjust my backup plan accordingly.

One more question about netbackup: is it possible to redirect the output string to a text file and feed it back when restoring?

No - I think it will only work in the console. This was actually written as a debug tool - ideally it should be done a bit differently.

Of course you might be able to redirect the whole console to script it - at least for the backup and hopefully you never have to restore, but if you do, doing it manually should be fine as it’s done very very seldom.
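For the backup half, scripting it could look roughly like the Python sketch below. It is only a sketch under assumptions that are not confirmed in this thread: that the openHAB console is reachable over SSH on its default port 8101 as user `openhab`, that key-based login has been set up so no password prompt appears, and that the backup string is the last output line containing several `>` separators. The function names are made up for illustration.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def build_ssh_command(console_cmd, host="localhost", port=8101, user="openhab"):
    """Build the argv for running one console command over SSH (assumed defaults)."""
    return ["ssh", "-p", str(port), f"{user}@{host}", console_cmd]


def extract_backup_string(console_output):
    """Return the last line with at least two '>' separators, or None.

    Heuristic only: it assumes the backup string is a single line of
    '>'-separated fields, as in the example output in this thread.
    """
    for line in reversed(console_output.splitlines()):
        line = line.strip()
        if line.count(">") >= 2:
            return line
    return None


def save_backup(backup, directory):
    """Write the backup string to a timestamped text file and return its path."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    path = Path(directory) / f"zigbee-netbackup-{stamp}.txt"
    path.write_text(backup + "\n", encoding="utf-8")
    return path


def run_backup(directory, runner=subprocess.run):
    """Run 'zigbee netbackup' over SSH and store the resulting string."""
    result = runner(
        build_ssh_command("zigbee netbackup"),
        capture_output=True,
        text=True,
        check=True,   # raise if ssh or the console command fails
        timeout=30,
    )
    backup = extract_backup_string(result.stdout)
    if backup is None:
        raise RuntimeError("no backup string found in console output")
    return save_backup(backup, directory)
```

Calling `run_backup("/some/backup/dir")` from a scheduled job would then leave one small text file per run. The saved string contains the network security material, so the files should be protected accordingly.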

Could you please let me know exactly how this command needs to be executed?
I have already checked the documentation, but unfortunately I could not get it to work.

Thanks in advance,
Ralf

I am really sorry. In the end I never used that command, and I have temporarily suspended my openHAB setup due to other priorities.

You need to put zigbee in front of the zigbee commands.

zigbee netbackup

should work. Note that this is only supported with Ember dongles as other dongles don’t provide the facilities to read out the information required to restore a network.

Thanks for answering that fast!

Unfortunately I get a “Command not found” error.
As per the documentation, the -dongle and -port options are always required. I guess “--help” should work without those inputs.

[Screenshot: console output with the command highlighted in red]

What version of OH are you running? Do you actually have the command line bundles installed? (edit - the fact that the command is highlighted in red would suggest not).

The zigbee command line bundles should be installed by default, along with the binding, in recent versions of OH (I forget how recent, but maybe the past year or so). If you are running an old version, then this might not be installed and you would need to manage this manually by downloading the bundles and adding them to the addons folder.

I was wondering the same thing as the OP as I have had time-consuming problems with CC2531 stability. I’ve had to rebuild my Zigbee network 3 times now in the space of 2 weeks in turn with two CC2531 adapters and one Sonoff Zigbee bridge, first with zigbee2mqtt, then zigpy.

As I’m adding more and more Zigbee devices (I’m convinced, partly because of this article, that Zigbee is the future for home automation device comms), having to access and re-pair all my active devices is just not practical. Nor would it be an acceptable solution for professional installations.

A bit of research gets this from TI where they describe a Zigbee network cloning solution - https://www.ti.com/lit/an/swra671/swra671.pdf
If the coordinator were to fail, new devices would be prevented from joining the existing network since there would be no means of opening the network for joining, or exchanging network and application keys. Therefore, there is a need to back up the information contained in the coordinator and transfer it to new device so that the new device can resume the role as coordinator. By being able to “clone” the device, the established network can continue as is, and new devices are able to successfully join.

As far as I’m aware, it’s not possible to do this with the standard 2531 firmware at least. The problem is you need to read out a lot of security information that is not available. Custom firmware, or maybe very new TI firmware, might solve this - I’m not sure.

The Ember dongles do have the ability to read out this additional information to transfer the trust centre to a different stick.

Hi there,
first of all let me explain why I revived this topic:
- I read the docs and could not find any hint on how to solve my problem.
- I searched the internet and could not find any other topic closer to my problem.

But if you prefer I will open up a new topic.

So what’s my problem:
I have the Zigbee binding up and running with, I guess, around 60 nodes, using the Zigbee 3.0 USB Dongle Plus–ZBDongle-E as coordinator.

But the Coordinator has a few issues, so I ordered the Elelabs Zigbee USB Adapter (ELU013) to replace it with.

I already tried the netbackup command in the console and received output that looks valid to me:

zigbee netbackup 2023-03-05T21:34:59Z>COORDINATOR>3F22>056043XXXXXXXXXX>CHANNEL_11>8A15DB8E644AXXXXXXXXXXXXXXXXX>00>>001BA622>5A6967426565XXXXXXXXXXXXXXXXX>>>00017000

So what I did next is to physically disconnect the “old” coordinator and connect the “new” one. I had to change the port because for some reason it switched from /dev/ttyACM0 to /dev/ttyUSB0.
And the Thing was back online. Then I opened the console again and entered the command:
zigbee netbackup 2023-03-05T21:34:59Z>COORDINATOR>3F22>056043XXXXXXXXXX>CHANNEL_11>8A15DB8E644AXXXXXXXXXXXXXXXXX>00>>001BA622>5A6967426565XXXXXXXXXXXXXXXXX>>>00017000

But in the Thing code I could see that the panid and the extendedpanid did not match the backup, no matter what I did, and of course no Zigbee Thing came back online.
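One way to see exactly which parts differ would be to run netbackup again on the new stick and compare the two strings field by field. Below is a minimal Python sketch, assuming only that the fields are separated by `>` and that the first field is a timestamp (as the output above suggests). `differing_fields` is a made-up helper name, not something the binding provides, and the sketch deliberately does not try to name the individual fields.

```python
def differing_fields(saved, current, skip_first=True):
    """Return (index, saved_value, current_value) for every field that differs.

    Both arguments are backup strings with '>'-separated fields. The first
    field is skipped by default because it appears to be a timestamp.
    """
    a = saved.strip().split(">")
    b = current.strip().split(">")
    if len(a) != len(b):
        raise ValueError(f"field count differs: {len(a)} vs {len(b)}")
    start = 1 if skip_first else 0
    return [(i, a[i], b[i]) for i in range(start, len(a)) if a[i] != b[i]]
```

An empty list would mean the new coordinator reports the same values as the saved backup, apart from the timestamp.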

Well, I know the obvious answer: reset every Zigbee device and connect it to the new coordinator. But maybe there is a smarter way…

Some basic info about my setting:
openHAB 3.4.2 - Release Build
Raspberry Pi 4 Model B Rev 1.5

thanks in advance - let me know if you need further info.

OK, I found the first mistake myself: the backup string has to be wrapped in double quotes:

zigbee netbackup "2023-03-05T21:34:59Z>COORDINATOR>3F22>056043XXXXXXXXXX>CHANNEL_11>8A15DB8E644AXXXXXXXXXXXXXXXXX>00>>001BA622>5A6967426565XXXXXXXXXXXXXXXXX>>>00017000"

But now I get the following error:
2023-03-06 18:15:45.919 [ERROR] [tsystems.zigbee.ZigBeeNetworkManager] - Cannot add extension ZigBeeDiscoveryExtension when network state is ONLINE

Just guessing:

zigbee ncpstate OFFLINE
zigbee netbackup "..."
zigbee ncpstate ONLINE
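To avoid typing mistakes with the long string, the three lines could be generated from a saved backup with a small Python sketch like the one below. It only builds the text of the commands, and the sequence itself is still the guess from above, not a confirmed procedure. The quoting follows the earlier finding that the string has to be wrapped in double quotes, presumably because the console treats `>` specially. The helper names are hypothetical.

```python
def quote_for_console(backup):
    """Wrap the backup string in double quotes for use as one console argument."""
    backup = backup.strip()
    if '"' in backup:
        raise ValueError("backup string must not contain double quotes")
    return f'"{backup}"'


def restore_commands(backup):
    """Return the guessed offline / restore / online command sequence."""
    return [
        "zigbee ncpstate OFFLINE",
        f"zigbee netbackup {quote_for_console(backup)}",
        "zigbee ncpstate ONLINE",
    ]
```

Printing the result of `restore_commands(saved_string)` line by line gives text that can be pasted into the console one command at a time, checking the log after each step.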