The “default” zigbee firmware on zigbee2mqtt pages I mentioned above is 1.2 HA
I’ve just moved mine to the 3.0.x firmware - it seems to be binding much more easily. The firmware is available from the same site: I grabbed and used CC2531_20190425.zip, which contains CC2531ZNP-with-SBL.hex.
It also appears to be more stable
HOWEVER: There are some caveats
1: by default Zigbee 3.0 only allows binding for 180 seconds after startup
2: The discovery timeout is still too short at times (busy mesh and/or interference)
3: Initializing aqara devices is problematic - the pressure sensors in particular
4: mesh devices don’t seem to rejoin reliably and automatically after a restart
I’ll address these in order. Some of this may be down to impatience on my part but not having a fully “OK” mesh an hour after restarting OH doesn’t seem right.
1: enforced zigbee timeouts
Adding a device outside the 3 minute startup period doesn’t seem to “simply” require going into paperui inbox and scanning as usual.
I’m having to use openhab-cli and issue the following commands if I want to add devices without starting zigbee, restarting the stick (or restarting OH)
(NB: 180 is the default. It can’t be set higher than 199 - this is a zigbee3 security setting)
If the device isn’t added, then you need to start over with those commands until it happens (this is a PITA with Aqaras)
2: Until a device is fully discovered, about the only way to keep things going (especially Aqaras) is to keep firing the discovery command in openhab-cli whilst clicking on their button, until they show back up in HabMin or paperui as “discovered” (if already configured) or in the inbox with their fully identified device name.
NB: “zigbee nodes” will show the device as discovered, online and OK long before the OH interfaces say it’s working properly.
The time taken to do a discovery is definitely related to distance from the stick (signal strength) and wifi interference. One device latched onto a Tradfri repeater in another room and took more than 20 minutes to become active. That’s a lot of button presses.
3: Aqara sensors usually show up as “unitialized” in paperui or “zigbee_device_initialised false” in Habmin
I’m using paperui to initialise but regularly see this:
[ERROR] [r.ZigBeeConverterAtmosphericPressure] - 00158D0005404958: Error 0xffff setting server binding
[INFO ] [ng.zigbee.handler.ZigBeeThingHandler] - 00158D0005404958: Channel zigbee:device:562cd04e:00158d0005404958:00158D0005404958_1_pressure failed to initialise device
it’s always the pressure sensor that fails to initialise. The odd part is that even with the failure, I’m seeing pressure changes.
[vent.ItemStateChangedEvent] - zigbee_device_562cd04e_00158d0005404958_00158D0005404958_1_temperature changed from 29.73 °C to 29.83 °C
[vent.ItemStateChangedEvent] - zigbee_device_562cd04e_00158d0005404958_00158D0005404958_1_humidity changed from 42.83 to 54.94
[vent.ItemStateChangedEvent] - zigbee_device_562cd04e_00158d0005404958_00158D0005404958_1_pressure changed from 1014.8 hPa to 1014.9 hPa
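For anyone wanting to check which devices hit the binding error above, here is a minimal Python sketch that pulls the device address, converter and error code out of log lines. The regular expression is an assumption inferred only from the ERROR line quoted in this post; the real log format may differ between binding versions, and `find_binding_failures` is a made-up helper name.

```python
import re

# Pattern inferred only from the sample ERROR line quoted in this post;
# the real log format may vary between binding versions (assumption).
BINDING_ERROR = re.compile(
    r"\[ERROR\s*\]\s*\[(?P<logger>[^\]]+)\]\s*-\s*"
    r"(?P<ieee>[0-9A-Fa-f]{16}): Error (?P<code>0x[0-9A-Fa-f]+) setting server binding"
)

def find_binding_failures(lines):
    """Return (ieee_address, converter, error_code) for each matching line."""
    failures = []
    for line in lines:
        match = BINDING_ERROR.search(line)
        if match:
            failures.append((match["ieee"], match["logger"], match["code"]))
    return failures

# Two lines copied from the log excerpt above: one failure, one normal update.
sample = [
    "[ERROR] [r.ZigBeeConverterAtmosphericPressure] - 00158D0005404958: "
    "Error 0xffff setting server binding",
    "[vent.ItemStateChangedEvent] - "
    "zigbee_device_562cd04e_00158d0005404958_00158D0005404958_1_pressure "
    "changed from 1014.8 hPa to 1014.9 hPa",
]
print(find_binding_failures(sample))
```

Only the ERROR line matches; the state-change line is ignored, which fits the observation that pressure updates keep arriving despite the failed binding.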
Before restarting OH, ALL of these devices were online.
“zigbee node N” or “zigbee info N/1” shows they’re actually there, but the only way that they can be convinced to show up as “state online” in the CLI is to power cycle the tradfri lights
shut down for 10+ seconds
they come back into the mesh instantly
more importantly - until you do this, they won’t act as routers
the only way I’ve found that works for the Aqara devices is to re-pair them, which is painful. Just bouncing on the button to force a poll isn’t enough - and popping the battery out doesn’t work either
Even despite this, all these devices are showing as connected in paperui/habmin
There’s one other problem - which I think needs a separate ticket - the Aqara vibration sensors have 2 endpoints in them, but paperui/habmin is only showing the “door lock” one (whatever that is). Endpoint 2 - (device type 5F02) doesn’t show up. Sevice 47266 in the table above
The binding defines this - from memory the binding uses 60 seconds.
The maximum is 254 seconds.
This should not be required - it won’t really do anything with the binding or the device. Pushing the button on the device is what is required to keep things moving. Continuing to request join through the coordinator will potentially slow things down.
Don’t worry about the status here - this has nothing to do with openHAB status.
The binding will show all endpoints that provide clusters it supports. Presumably if there’s a second endpoint it doesn’t provide any features known by the binding. Alternatively it was just not discovered properly.
The binding defines this - from memory the binding uses 60 seconds.
OK, I was working from other docs
The maximum is 254 seconds.
CC2531 3.0 docs point to a 199 limit for this device. I did think it was strange, with most limits being defined by binary transitions. Presumably someone at TI just arbitrarily chopped it…
about the only way to keep things going (especially Aqaras) is to keep firing the discovery command in openhab-cli
This should not be required - it won’t really do anything with the binding or the device
Should, but I was finding that just bouncing on the button wasn’t enough
Before restarting OH, ALL of these devices were online.
Don’t worry about the status here - this has nothing to do with openHAB status.
Confusing that it’s in the CLI then. Hopefully anyone in future will see this thread and be less worried
the Aqara vibration sensors have 2 endpoints in them
The binding will show all endpoints that provide clusters it supports.
Let’s set up a separate ticket for this. Right now the endpoint that is being detected shows up as an electronic lock, which isn’t right for a vibration/rotation and temperature sensor
Another FWIW: You’ve said that battery powered ZB devices are supposed to wake up every ~6-8 seconds. We already know that Xiaomi/Aqara devices are out of spec for this, but it looks like they’re out by a factor of 10 - waking up about once every 80 seconds when nothing’s happening if my endpoint query responses are anything to go by
The Aqara devices also report battery voltage and signal quality as part of their reporting (zigbee2mqtt already implements this). It’d be nice to be able to extract this to know when batteries are getting low and to keep an eye on network quality etc.
I would strongly recommend not doing this. It will cause additional traffic that is likely to delay sending of data needed for the discovery. I think this is one of those things where you do something, and it happens to work, so you inadvertently attribute it to that event.
The CLI is not written for openHAB. It is a general CLI used for testing the low level ZigBee framework - it is just imported into OH for the same purpose. The binding uses different means to decide if a device is online.
If it’s confusing we can remove the CLI?
I can only reiterate what I’ve said. The binding will systematically search through all endpoints looking for services that it can support through a channel. Either the discovery was incomplete, or the device doesn’t support any functions that the binding can use.
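As a rough illustration of that search, here is a toy Python sketch. The cluster IDs are standard ZCL identifiers, but the `SUPPORTED_CLUSTERS` table, the `usable_endpoints` helper and the example device are all assumptions made up for this sketch, not the binding’s real tables or the actual contents of the Aqara endpoints.

```python
# Cluster IDs below are standard ZCL identifiers; the SUPPORTED set and the
# example device are illustrative assumptions, not the binding's real tables.
SUPPORTED_CLUSTERS = {
    0x0006: "on/off",
    0x0101: "door lock",
    0x0402: "temperature",
    0x0403: "pressure",
    0x0405: "humidity",
}

def usable_endpoints(endpoints):
    """Map endpoint id -> channel names for clusters we know how to handle.

    Endpoints with no supported cluster are dropped, which is why they
    would never appear in a UI built on this result.
    """
    result = {}
    for endpoint_id, cluster_ids in endpoints.items():
        channels = [SUPPORTED_CLUSTERS[c] for c in cluster_ids if c in SUPPORTED_CLUSTERS]
        if channels:
            result[endpoint_id] = channels
    return result

# Hypothetical two-endpoint device: endpoint 2 only exposes a
# manufacturer-specific cluster (0xFC00 lies in the manufacturer range).
device = {1: [0x0000, 0x0101], 2: [0xFC00]}
print(usable_endpoints(device))  # {1: ['door lock']}
```

In this model an endpoint disappears either because none of its clusters are in the supported table, or because it was never in the input at all (incomplete discovery); the output alone cannot tell the two cases apart.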
That’s not uncommon. There are many different systems employed by battery devices and some are not necessarily respecting the rules…
I believe that battery is reported in a custom cluster so is not supported by the binding - you’d have to ask Xiaomi why they didn’t use standard functions here.
Likewise, the signal integrity is monitored by the binding using standard functions - and not the Xiaomi specific function.
The binding uses different means to decide if a device is online.
If it’s confusing we can remove the CLI?
I think that will cause even more problems than it solves.
A note in the general help that the online/unknown state in the CLI may differ to the GUI should be sufficient
(Today 3 of the Ikea bulbs were showing online in the CLI and offline in the GUI. Sigh… They came back by toggling the devices disabled/enabled in paperui so they can’t have been too far gone but it’s a little worrying - and just to add to the confusion they were showing updates in the logfile despite the GUI thinking they were offline)
[re battery devices and long wakeup intervals]
That’s not uncommon.
Is there any way to accommodate this? Ideally whilst flagging that the device implementation is non-compliant but it’s being accepted anyway?
I think the Robustness principle/Postel’s law applies here (RFC 1122)
[yes, I know the counterpoints to this, having contributed to the Internet drafts. Martin Thomson’s point is valid. Flagging noncompliance both helps ensure people don’t make assumptions and puts pressure on suppliers to fix things]
I believe that battery is reported in a custom cluster so is not supported by the binding
I’m not entirely sure what you mean by a custom cluster
The way I read the coverage is that it’s reported as part of the return line for each of the sensors. zigbee2mqtt shows battery status/battery mV/signalquality/sensor for each sensor at every sensor update and it appears to be embedded in there when looking at the raw return string.
Sorry - I don’t actually know what the issue is. Does it matter if it doesn’t wake up every 7 seconds? Normally not. It will simply mean you need to manually wake up the device if you want to change the configuration - on these devices, that is normally approximately never, so it should not impact support for the device.
Or do I misunderstand what you mean here?
It’s non-standard - ie not defined by the ZigBee standard. There is a standard way to report battery level, and this device does something non-standard.
This has been discussed previously - unfortunately at the moment it is not something that I personally intend to support as I simply have too much else on my plate so I try to focus on supporting standard ZigBee devices at the moment. If someone else wants to add this support, and it doesn’t compromise the structure of the binding, then that is of course great
Noted regarding busy-ness but if there’s a ticket then someone might look at it.
I assume for the purposes of sorting the things out on a longer term basis that an official xiaomi specification would be the best thing you (or anyone else) could lay your hands on?
It was looked at in the past and discussed here -:
The problem is that this is really non-standard. The binding is designed to handle non-standard attributes, but unfortunately in this case they’ve not even used standard ZigBee concepts, so it would probably require a custom handler, which could be done through a completely new binding extension.
The issue is primarily the door/window switches - they’re “dropping out” and being reported as offline after a while (3-6 hours), which makes my noddy scripts using their status go loopy (I’m intending to use these to control radiator valves - open window == no point in attempting to heat the room)
The things rejoin instantly when the status does change from ON to OFF or vice versa. The problem is that in the meantime the magnet sensor is reporting neither ON nor OFF
Unlike cheaper magnet sensors, which simply report a “state change” when a door/window is opened (and nothing when it is closed), these units actually give “on/off” results, so keeping them online or caching their results is relatively important.
The thermometers are less critical if they drop out - they do rejoin instantly when things change. The big problem I’m encountering at the moment is initializing them - until “device initialised” is true, they simply display NaN in the PaperUI control panel and they’re consistently failing on pressure sensor initialization. They may eventually synchronise, but it’s taking anywhere between 5 and 40 minutes of attempts to make this happen, and the more sensors there are the harder it seems to become (I’m up to 12 of these units in the mesh and likely to want to add another 8).
Understood - but having the document is a reasonable step to understanding how badly they’ve borked things without having to rely on guesswork
My experience dealing with Chinese designers is that once you get past the “Why on earth are you crazy enough to want to do that?” barrier they can be extremely helpful (especially when they realise they can usually ship more product by publishing their specs without NDA encumbrances, or by moving to “standardised” specs in firmware updates), but they tend to start with a mindset of not thinking about interoperability. The number of reinvented wheels you run into is staggering, even at companies like Huawei in their network kit (lots of custom SNMP MIBs here instead of using standard ones, then when customers complain loudly about not being able to use standard NMS software the response is to “use our software”)
Just a reminder that a Thing OFFLINE/ONLINE status is not an accurate mirror of target device status.
A sleeping battery device may be ONLINE, as might one smashed with a hammer.
An OFFLINE Thing can pass along communications (else how would we ever find out a device is alive?)
2 restarts later and the json for the CC2531 in org.eclipse.smarthome.core.thing.Thing.json is now reset to what appears to be factory defaults - and won’t take changes (alterations take in the GUI, but they don’t survive a restart)
I’ve lost control of PANID, Extended PANID, channel and mesh update period
Surprisingly the network key is unchanged - but of course with all the above having changed I’ve lost the entire mesh. A re-paired device won’t stay paired after an OH restart (but will survive the stick being pulled out/reinserted etc. whilst OH is running), which points towards an OH issue rather than a device issue.
Altering the json whilst openhab is stopped doesn’t take either
I’ve never set any manual files for things in /etc/openhab2/ so I’m not sure what’s going on.
Yes but losing the existing sensor status when this happens is breaking other stuff.
Door/window sensors are the kind of items where data may not change for days on end.
Whilst the Things in question reappear instantly when a door/window is opened or closed, the problem is that the presumed current status of that object can’t be read
Why is this a problem though? We normally do not read the state - these devices must instead report their state autonomously.
We don’t tend to poll devices - they are configured to report. You don’t want to be polling a device every 10 seconds, or the battery will go flat. Additionally, you don’t want to be polling a door sensor every 10 seconds anyway as the door can be opened and closed in that time, so you’d miss it.
Devices need to report their state through reporting - instantly.
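To make the polling point concrete, here is a small Python sketch comparing a 10-second poller with device-initiated reports. The timeline (a door opening at 12 s and closing at 17 s) and the function names are invented purely for illustration.

```python
# Hypothetical timeline: (seconds, state) transitions of a door contact.
transitions = [(0, "closed"), (12, "open"), (17, "closed")]

def state_at(t):
    """State of the door at time t according to the timeline."""
    current = transitions[0][1]
    for when, state in transitions:
        if when <= t:
            current = state
    return current

def poll(interval, duration):
    """What a poller sampling every `interval` seconds would observe."""
    return [state_at(t) for t in range(0, duration + 1, interval)]

def reports():
    """What we see when the device reports each change itself."""
    return [state for _, state in transitions]

print(poll(10, 30))   # ['closed', 'closed', 'closed', 'closed']
print(reports())      # ['closed', 'open', 'closed']
```

The poller samples at 0, 10, 20 and 30 seconds and never sees the door open, while the reports capture both transitions without the device having to answer a query every few seconds.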
We’d expect an Item state to remain valid in between “stuff happening” e.g. if a battery device periodically reports “closed”, our Item retains “closed” between the reports.
As I understand @Stoatwblr’s problem, after lots of sidetracking about USB communication and Thing status, they’re now saying Items are not behaving like that? (Though I’ve no idea what they are doing instead.)
You’re right - we have been kind of drifting between issues and maybe it’s worth taking a step back. @Stoatwblr can you describe the issue you’re having? It’s probably best to start a new thread as well if it’s no longer related to this original issue.
Ok, so you’re talking about within openhab - not the binding? So you mean when a device is OFFLINE, you can no longer see the status?
I guess then we need to work out why your devices are going OFFLINE. This discussion started by you asking about the polling of the devices every 6 to 8 seconds (or the devices waking up every 6 to 8 seconds, which to me is the same). That has nothing to do with the online status of a device - we need to work out why your devices are going offline then.
I’d suggest to check the logs to see what is going on. There will be logging in the binding logs about the alive tracker. Let’s see what that says.
Apologies for the confusion - we are changing topics a lot here. I’d suggest to start a new thread, clearly stating what the issue is.
Where o where? You haven’t mentioned Item at all. Item.state is where we expect any presumed status to be held, regardless of any communication churning about. What state do you get instead?
To recap; Things are about managing comms with devices, and know nothing about meaningful data e.g. temperature readings.
That’s the business of channels, interpreting data.
But channels are stateless, passing messages only. Items hold data with duration over time, between messages.
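The split can be sketched as a toy Python model. These classes are not openHAB’s real classes or API, just hypothetical stand-ins to show where the state lives; `None` stands in for “no report received yet”, and the thing UID is made up.

```python
class Item:
    """Holds the last reported value between messages."""
    def __init__(self, name):
        self.name = name
        self.state = None  # None stands in for "no report received yet"

    def update(self, value):
        self.state = value

class Thing:
    """Tracks communication status only; knows nothing about readings."""
    def __init__(self, uid):
        self.uid = uid
        self.status = "ONLINE"

def channel(raw_message):
    """Stateless: interprets one message, remembers nothing."""
    return "OPEN" if raw_message == 1 else "CLOSED"

window = Item("window_contact")
sensor = Thing("hypothetical:contact:1")

window.update(channel(0))   # device reports "closed"
sensor.status = "OFFLINE"   # comms status changes later
print(sensor.status, window.state)  # OFFLINE CLOSED
```

In this model the Thing going OFFLINE does not touch the Item, so a rule reading `window.state` would still see the last reported value.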