Zigbee Binding - Renaming Thing triggers node discovery

giuseppeiannello · April 29, 2020, 7:51pm

Much appreciated!

abol3z · July 23, 2020, 9:13pm

Hi @giuseppeiannello, I am facing the same issue as you, with devices not joining the network. I have 17 router nodes and 21 end nodes connected now, and connecting the last few end nodes was painfully slow. I am not able to add new devices to the network anymore.
Did the source routing firmware improve things for you? I am thinking of switching to it if it is better.
Also, do you know if flashing the new firmware will require re-pairing? It seems it doesn’t when using zigbee2mqtt.

Hi @chris, any idea on this issue?
Also, can a static thing definition for my devices help in reducing the traffic?

chris · July 23, 2020, 9:21pm

Sorry - I’m not sure what this is. I’m currently running a system with 32 bulbs to stress test the system for a customer in Germany and they all joined ok. That said, there will be some updates that will improve the performance during discovery and when commanding lots of devices in large(ish!) networks. This will be a week or two away.

No - it will make no difference.

abol3z · July 23, 2020, 9:40pm

I started seeing heavy delays with the last 4 devices. I guess you should try reaching 40+; also, end-device performance might be different.

Any ideas about the source routing firmware?

Looking forward to the new improvements.

chris · July 23, 2020, 9:43pm

Why? Do you think it will matter? What in the network do you think changes? Or do you think that the changes I’ve made are unlikely to make a difference? I’m just trying to understand the issue.

abol3z · July 23, 2020, 11:26pm

I’m just reporting my observations: performance kept decreasing until I reached around 35 nodes in total, then it got really bad. I have many routers, so I don’t think it’s a limit problem; that is just the point where I think performance drops badly.

What supports my findings is that, according to zigbee2mqtt, the default firmware is good for a maximum of 30 devices; if your network has more, they suggest going with the source routing firmware. I haven’t tested it yet with openHAB’s Zigbee binding, but according to zigbee2mqtt users it gave a good performance boost there.

EDIT: As to why, I’m not sure whether running device discovery each time a scan starts is the cause or whether it’s firmware related, but since @giuseppeiannello already tried the source routing firmware and it didn’t help, it might be an implementation problem.

I am not sure which changes you are referring to here. I am running version 2.5.5, if that helps.

chris · July 24, 2020, 8:00am

Ok, I thought maybe you had some explanation for the change. The problem really is this is a multidimensional issue. By that I mean it is almost certainly not just the number of devices - it will also be their location and distribution within the network.

What I’ve been testing is 32 devices in the same room - commanded at the same time. As I reported above, there are some changes coming that will help with discovery and network startup - this might be the issue that you are seeing - I don’t know without knowing what your issue actually is.

If the source route firmware requires the host to manage the routing, then this will not work for you as the binding does not do this at the moment. However I’m just guessing here as I don’t know what this firmware is, and I don’t typically work with the CC2531.

giuseppeiannello · July 24, 2020, 8:40am

@abol3z I can confirm that in my case, too, the overall performance drops above 30 nodes. This happens with both openHAB and zigbee2mqtt.

I managed to build a custom firmware a while ago, but the whole network collapses when reaching 40 nodes (bulbs, temperature sensors, motion detectors, smoke detectors, contact sensors).

Source routing (which shouldn’t require any change on the host) doesn’t really make a difference to me, and I can’t go over 40 nodes anyway.

I’ve been looking for a replacement for the CC2531 for quite a while, and I just found a device that ticks all the boxes. I’m still waiting for it to be shipped, but the stick from https://elelabs.com/ seems to have a recent and decent microcontroller, and updateable firmware.

chris · July 24, 2020, 9:04am

I guess it’s different for different coordinators and I’m not familiar with the 2531. Presumably you have to compile the firmware code to handle large routing tables if you don’t do it on the host? I suspect at least in the Ember, doing this would cause the NCP to run out of memory - the 2531 is only an 8-bit processor (8051 I think?) and has less memory than the ARM (I think). Is that what you’re doing though?

This is the Ember NCP that I use and nearly all systems I support commercially are using.

giuseppeiannello · July 24, 2020, 9:16am

Is that what you’re doing though?

The TI firmware has a compile-time flag to enable source routing, and yes, it requires more memory (and that’s why it only supports 5 direct children). But I’m just a tinkerer fiddling with hardware, so I might miss the small details.

This is the Ember NCP that I use and nearly all systems I support commercially are using.

I remember an old post where you floated the possibility of developing/distributing a stick that solves the problems of the existing commercial alternatives (CEL using an obsolete chipset, others - Qivicon IIRC - not having a bootloader and being stuck on the factory firmware).

As the “which adapter” issue is still a thing in the openhab community, and most people have problems because they end up buying the cheap CC2531, would it make sense to make the documentation more explicit?

chris · July 24, 2020, 9:26am

Ok, thanks - just interesting to know. One of the issues I struck with the 32 bulb test I did is with routing table size so this might be a solution at some point, but it would likely be something I would do on the host to avoid issues with memory - otherwise you always have a constraint - 30, 40, 100 - somewhere it will happen since these microcontrollers have limited memory and you end up trading different constraints - you want them all (number of children, number of nodes/routes in the network, number of outstanding frames, etc…) but you can’t have everything due to memory constraints.

Yes, I have done this - well, I’ve produced a stick and had some manufactured, but getting CE approval etc. is really too costly for the number I will likely sell (I’m wondering if elelabs maybe skip this, as I don’t see a CE statement). My stick does a little more in that it also acts as a sniffer at the same time by using some information from Silabs that is not generally available (it’s under NDA).

I’m happy to accept updates to the docs but I’m not sure what you really refer to? More explicit in what respect? Currently the binding supports 4 dongles - they probably all have pros and cons - as with everything in life. If you think something should be clearer, then please feel free to update it.

giuseppeiannello · July 24, 2020, 9:42am

Ok, thanks - just interesting to know. One of the issues I struck with the 32 bulb test I did is with routing table size so this might be a solution at some point, but it would likely be something I would do on the host to avoid issues with memory - otherwise you always have a constraint - 30, 40, 100 - somewhere it will happen since these microcontrollers have limited memory and you end up trading different constraints - you want them all (number of children, number of nodes/routes in the network, number of outstanding frames, etc…) but you can’t have everything due to memory constraints.

Going full Host-NCP? Wouldn’t that be very hard to maintain, compared to just using EZSP/TI stack? How long before the stick becomes purely a radio interface, and everything else is done on the host?

My stick does a little more in that it also acts as a sniffer at the same time by using some information from Silabs that is not generally available (it’s under NDA).

Is that a feature that could be added via firmware changes only, or does it also require hardware changes? I understand if you can’t answer the question.

I’m happy to accept updates to the docs but I’m not sure what you really refer to? More explicit in what respect? Currently the binding supports 4 dongles - they probably all have pros and cons - as with everything in life. If you think something should be clearer, then please feel free to update it.

More explicit in actually recommending a specific model. The current documentation is very “clinical” in this respect, and it doesn’t really help clarify the gory details for the casual/intermediate user. I consider myself a power user, and I struggled to make a decision on the stick and ended up in a tunnel of pain, trial and error, and frustration with the CC2531.
Sure, all have pros and cons, but a few of them have big limitations that should be fully understood, and others are more future-proof, even if more expensive (but TBH 20€ for a stick, even for a hobbyist, is still an affordable price compared to the cost of all the other devices that are part of the network).
I would be happy to PR the docs; I’m just concerned it will look like an advertisement for a specific company.

chris · July 24, 2020, 10:00am

I don’t think so. The NCP sends the routes to the host. The host “just” needs to store them in a table, and send the appropriate one back before (or with) each packet it sends. I’ve not looked at this in detail yet, but I think that’s all there is to it.
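
To make the idea above concrete, here is a minimal sketch of host-side route handling in Python. It is only an illustration under stated assumptions, not how the binding or any NCP firmware does it: the class `SourceRouteTable`, the function `send_unicast`, the callbacks `set_source_route` and `transmit`, and the `max_entries` limit are all hypothetical names invented for this example.

```python
from typing import Callable, Dict, List, Optional, Tuple


class SourceRouteTable:
    """Hypothetical host-side table of routes reported by the NCP."""

    def __init__(self, max_entries: int = 200) -> None:
        # max_entries is an arbitrary illustrative limit, not a real setting.
        self.max_entries = max_entries
        self._routes: Dict[int, Tuple[int, ...]] = {}

    def on_route_record(self, destination: int, relays: List[int]) -> None:
        """Store the relay list the NCP reported for a destination."""
        # Remove and re-insert so the most recently reported route is last.
        self._routes.pop(destination, None)
        self._routes[destination] = tuple(relays)
        if len(self._routes) > self.max_entries:
            # dicts keep insertion order, so the first key is the oldest.
            oldest = next(iter(self._routes))
            del self._routes[oldest]

    def route_for(self, destination: int) -> Optional[Tuple[int, ...]]:
        """Return the stored relay list, or None if no route is known."""
        return self._routes.get(destination)


def send_unicast(
    table: SourceRouteTable,
    destination: int,
    payload: bytes,
    set_source_route: Callable[[int, Tuple[int, ...]], None],
    transmit: Callable[[int, bytes], None],
) -> None:
    """Hand the stored route (if any) back before sending the packet."""
    relays = table.route_for(destination)
    if relays is not None:
        set_source_route(destination, relays)
    transmit(destination, payload)
```

Because the table lives on the host, its size is bounded by host memory rather than by the coordinator's RAM, which is the trade-off discussed in this thread.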

Well, for the coordinator, there’s a lot to be said for this as networks get larger, but the NCP will still be available standalone as most devices are not coordinators.

I can answer the concept and no - it’s a hardware change. With the standard NCP you can get some of the information as a sniffer, but what this does is to get all the data that the coordinator receives, as it receives it. The problem with a separate sniffer is you can never be sure if the device receives a command, or if the sniffer is receiving all the data - with this system it taps into the radio inside the NCP so you see exactly what is sent/received.

Sure - I’m happy if we want to add specific dongles to the list - eg the elelabs one. I think we already mention the Qivicon one (IIRC?).

Understood - I think if we keep it simple it should be fine - as we do already with supported devices.

Thanks.

giuseppeiannello · July 24, 2020, 10:07am

Just looked at the current docs - EFR32 is not mentioned. I’ll open a PR as soon as I get the new sticks and I can test them with the different firmwares.

Is your zigbee library compatible with EZSP v8, or should I stick to v7?

chris · July 24, 2020, 10:08am

Yes. I use this here with firmware version 6.7.

abol3z · July 24, 2020, 10:28am

On zigbee2mqtt also? I’ve seen people on zigbee2mqtt forums with 40 nodes stable but didn’t try it myself.

giuseppeiannello · July 24, 2020, 11:43am

On zigbee2mqtt also? I’ve seen people on zigbee2mqtt forums with 40 nodes stable but didn’t try it myself.

It seems to depend on the devices you have. Some routers are not very good at routing, and this has an impact on network performance too.

zigbee2mqtt is also less chatty on the network, as it has a device database. And now we come full circle, back to the original topic: renaming things triggers a node discovery, and that kills performance on the CC2531 with ~20 nodes (with and without source_routing).

abol3z · July 24, 2020, 11:50am

Agree. It seems to be a mix of multiple problems.
One last question, and sorry for that.

I’ve seen that you tried the configs mentioned in this post. Are they the same as the source routing firmware, or do they improve things further?

I want to go with a solution that can help me complete my installation of 40 nodes until we have the new improvements on the binding.

chris · July 24, 2020, 12:13pm

This is a double-edged sword. The database is a pain to administer, and it requires devices to be in the database before they work if you rely on it. This is the situation we have with ZWave and what I was trying to avoid with ZigBee. These protocols are self-describing, so no database should be needed.

The chattiness should only be an issue during the very first discovery - this is when the data is downloaded from the device, and after this it is stored in a local database. It should not be re-requested each time discovery is started (in general) but there is some additional traffic that occurs at this time.
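
As a rough illustration of the "download once, then read from the local store" behaviour described above, here is a small Python sketch. It is an assumption-laden example, not the binding's actual code: `NodeDescriptorCache`, `get_or_discover`, the JSON file layout, and the `discover` callback are all hypothetical.

```python
import json
from pathlib import Path
from typing import Callable, Dict


class NodeDescriptorCache:
    """Hypothetical on-disk cache of per-node discovery data."""

    def __init__(self, path: Path) -> None:
        self._path = path
        self._nodes: Dict[str, dict] = {}
        # Load whatever a previous run already discovered.
        if path.exists():
            self._nodes = json.loads(path.read_text())

    def get_or_discover(
        self, ieee_address: str, discover: Callable[[str], dict]
    ) -> dict:
        """Return cached data, querying the device only on a cache miss."""
        if ieee_address not in self._nodes:
            # Only this branch generates traffic on the Zigbee network.
            self._nodes[ieee_address] = discover(ieee_address)
            self._path.write_text(json.dumps(self._nodes))
        return self._nodes[ieee_address]
```

With a store like this, starting discovery again (or restarting the host) should not repeat the expensive per-device queries for nodes that are already known.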

It is hard really to know what your problem is, so I cannot be sure what the solution will be and can’t say that the changes I’m making will help for sure. As I mentioned above, there are many issues here - each may require a different solution. There are certainly some issues in the framework at the moment with large networks - multiple broadcasts cause a problem, and if you try and command lots of devices at once, there are problems. This is only tested with the Ember NCP though - maybe these issues don’t exist with the CC2531 - I don’t know. Also depending on the network topology, there could be other issues - it’s quite complex and a lot of this is hidden because it is handled by the coordinator firmware.

giuseppeiannello · July 24, 2020, 12:20pm

My custom firmware is as follows:

stock ZNP-Pro-Secure-Standard 1.2.2a
patch for source_routing coming from a post in the TI forum
source_routing patches from z2m
source_routing patches from Optimizing the cc2531 firmware for OpenHAB and the Zigbee binding

I’m adding devices to the network very slowly - 2/3 at a time and testing the network for ~1 week between changes. I’m currently at 22 devices.