ZigBee network issues and timeouts. How to tackle devices randomly in wrong status

I am running my installation using a ZigBee controller and devices and I have randomly very regular issues with the device states being ONLINE but not able to control the device or the other way. Status being OFFLINE but me being able to turn on a smart plug:

  • controller: CC2531 stick
  • OpenHab 3.3.0
  • Blitzwolf BW-SHP13 smart plugs with ON/OFF funtionality and measuring energy consumption
  • running on Synology DSM 7.1.1-42962
  • within Docker
  • distance from controller to devices roughly 3 meters

I enabled debug logging for ZigBee related packages in my log4j2.xml:

<Logger level="DEBUG" name="org.openhab.binding.zigbee"/>
<Logger level="DEBUG" name="com.zsmartsystems.zigbee"/>
<Logger level="INFO" name="com.zsmartsystems.zigbee.dongle.ember.internal.ash"/>

1 example issue:
Device is ONLINE in OpenHab and I send the off command to switch the smart plug off. It looks like OpenHab is waiting for the acknowledgement of the off comand and is a facing a TimeOut.
Device address is A4C138DC35975176 in the logs

A4C138DC35975176: Channel zigbee:device:stick:a4c138dc35975176:A4C138DC35975176_1_switch received TIMEOUT in response to OFF

Full export of the log during the off command is below.

How can I tackle this? The distance between the controller and my devices is super low. Devices have power all the time. Is there anything in between the controller and the devices I can check during my debugging?

Is it correct that ZigBee has nothing to do with my WiFi and Internet at home and the internet router? I am just confused after reading other posts.

Full export of the log during the off command:

2022-11-15 09:27:41.268 [DEBUG] [.zigbee.handler.ZigBeeIsAliveTracker] - IsAlive Tracker scheduled task for thingUID=zigbee:device:stick:a4c138a0601c260f in 1830 seconds

2022-11-15 09:27:41.634 [DEBUG] [ng.zigbee.handler.ZigBeeThingHandler] - A4C138DC35975176: Command for channel zigbee:device:stick:a4c138dc35975176:A4C138DC35975176_1_switch --> OFF [OnOffType]

2022-11-15 09:27:41.635 [DEBUG] [e.transaction.ZigBeeTransactionQueue] - A4C138DC35975176: Added transaction to queue, len=1, transaction=ZigBeeTransaction [ieeeAddress=A4C138DC35975176 queueTime=0, state=WAITING, sendCnt=0, command=OffCommand [On/Off: 0000/0 -> 3EFB/1, cluster=0006, TID=--]]

2022-11-15 09:27:41.635 [DEBUG] [transaction.ZigBeeTransactionManager] - Transaction Manager: Send Next transaction. outstandingTransactions=0, outstandingQueues=1, sleepy=0/3

2022-11-15 09:27:41.636 [DEBUG] [transaction.ZigBeeTransactionManager] - 3EFB/1: Sending ZigBeeTransaction [ieeeAddress=A4C138DC35975176 queueTime=1, state=WAITING, sendCnt=0, command=OffCommand [On/Off: 0000/0 -> 3EFB/1, cluster=0006, TID=67]]

2022-11-15 09:27:41.636 [DEBUG] [transaction.ZigBeeTransactionManager] - transactionListenerAdded: 1 outstanding

2022-11-15 09:27:41.637 [DEBUG] [tsystems.zigbee.ZigBeeNetworkManager] - TX CMD: OffCommand [On/Off: 0000/0 -> 3EFB/1, cluster=0006, TID=67]

2022-11-15 09:27:41.637 [DEBUG] [tsystems.zigbee.ZigBeeNetworkManager] - TX ZCL: ZclHeader [frameType=CLUSTER_SPECIFIC_COMMAND, manufacturerSpecific=false, direction=CLIENT_TO_SERVER, disableDefaultResponse=false, manufacturerCode=0, sequenceNumber=67, commandId=0]

2022-11-15 09:27:41.638 [DEBUG] [tsystems.zigbee.ZigBeeNetworkManager] - TX APS: ZigBeeApsFrame [sourceAddress=0000/1, destinationAddress=3EFB/1, profile=0104, cluster=0006, addressMode=DEVICE, radius=8, apsSecurity=false, ackRequest=true, apsCounter=24, rssi=--, lqi=--, payload=01 67 00]

2022-11-15 09:27:41.638 [DEBUG] [converter.ZigBeeBaseChannelConverter] - A4C138DC35975176: Channel zigbee:device:stick:a4c138dc35975176:A4C138DC35975176_1_switch waiting for response to OFF

2022-11-15 09:27:41.638 [DEBUG] [31.network.impl.CommandInterfaceImpl] - ->  AF_DATA_REQUEST (Packet: subsystem=null, length=13, apiId=24 01, data=FE 0D 24 01 FB 3E 01 01 06 00 24 30 08 03 01 67 00 92, checksum=92, error=false) 

2022-11-15 09:27:41.652 [DEBUG] [31.network.impl.CommandInterfaceImpl] - <-  AF_DATA_SRSP (FE 01 64 01 00 64)

2022-11-15 09:27:41.652 [DEBUG] [tsystems.zigbee.ZigBeeNetworkManager] - RX STA: msgTag=67 state=TX_ACK

2022-11-15 09:27:41.652 [DEBUG] [transaction.ZigBeeTransactionManager] - notifyTransactionProgress: TID=67, state=TX_ACK, outstanding=1

2022-11-15 09:27:41.653 [DEBUG] [zigbee.transaction.ZigBeeTransaction] - Transaction state changed: nwk=3EFB, TID=67, event=TX_ACK, state=TRANSMITTED

2022-11-15 09:27:42.212 [DEBUG] [31.network.impl.CommandInterfaceImpl] - <-- AF_DATA_CONFIRM (FE 03 44 80 E9 01 24 0B)

2022-11-15 09:27:42.213 [DEBUG] [31.network.impl.CommandInterfaceImpl] - Received Async Cmd: AF_DATA_CONFIRM(Endpoint=1, Status=Z_MAC_NO_ACK(233), TransID=36)

2022-11-15 09:27:42.213 [DEBUG] [tsystems.zigbee.ZigBeeNetworkManager] - RX STA: msgTag=67 state=RX_NAK

2022-11-15 09:27:42.213 [DEBUG] [transaction.ZigBeeTransactionManager] - notifyTransactionProgress: TID=67, state=RX_NAK, outstanding=1

2022-11-15 09:27:42.213 [DEBUG] [transaction.ZigBeeTransactionManager] - Transaction complete: ZigBeeTransaction [ieeeAddress=A4C138DC35975176 queueTime=578, state=FAILED, sendCnt=1, command=OffCommand [On/Off: 0000/0 -> 3EFB/1, cluster=0006, TID=67]]

2022-11-15 09:27:42.214 [DEBUG] [transaction.ZigBeeTransactionManager] - transactionListenerRemoved: 0 outstanding

2022-11-15 09:27:42.214 [DEBUG] [e.transaction.ZigBeeTransactionQueue] - A4C138DC35975176: transactionComplete, state=FAILED, outstanding=0

2022-11-15 09:27:42.214 [DEBUG] [e.transaction.ZigBeeTransactionQueue] - A4C138DC35975176: Added transaction to queue, len=1, transaction=ZigBeeTransaction [ieeeAddress=A4C138DC35975176 queueTime=579, state=WAITING, sendCnt=1, command=OffCommand [On/Off: 0000/0 -> 3EFB/1, cluster=0006, TID=67]]

2022-11-15 09:27:42.214 [DEBUG] [transaction.ZigBeeTransactionManager] - Transaction Manager: Send Next transaction. outstandingTransactions=0, outstandingQueues=1, sleepy=0/3

2022-11-15 09:27:42.214 [DEBUG] [transaction.ZigBeeTransactionManager] - 3EFB/1: Sending ZigBeeTransaction [ieeeAddress=A4C138DC35975176 queueTime=579, state=WAITING, sendCnt=1, command=OffCommand [On/Off: 0000/0 -> 3EFB/1, cluster=0006, TID=67]]

2022-11-15 09:27:42.214 [DEBUG] [transaction.ZigBeeTransactionManager] - transactionListenerAdded: 1 outstanding

2022-11-15 09:27:42.215 [DEBUG] [tsystems.zigbee.ZigBeeNetworkManager] - TX CMD: OffCommand [On/Off: 0000/0 -> 3EFB/1, cluster=0006, TID=67]

2022-11-15 09:27:42.215 [DEBUG] [tsystems.zigbee.ZigBeeNetworkManager] - TX ZCL: ZclHeader [frameType=CLUSTER_SPECIFIC_COMMAND, manufacturerSpecific=false, direction=CLIENT_TO_SERVER, disableDefaultResponse=false, manufacturerCode=0, sequenceNumber=67, commandId=0]

2022-11-15 09:27:42.215 [DEBUG] [tsystems.zigbee.ZigBeeNetworkManager] - TX APS: ZigBeeApsFrame [sourceAddress=0000/1, destinationAddress=3EFB/1, profile=0104, cluster=0006, addressMode=DEVICE, radius=8, apsSecurity=false, ackRequest=true, apsCounter=25, rssi=--, lqi=--, payload=01 67 00]

2022-11-15 09:27:42.215 [DEBUG] [zigbee.transaction.ZigBeeTransaction] - Transaction state changed: nwk=3EFB, TID=67, event=RX_NAK, state=DISPATCHED

2022-11-15 09:27:42.215 [DEBUG] [31.network.impl.CommandInterfaceImpl] - ->  AF_DATA_REQUEST (Packet: subsystem=null, length=13, apiId=24 01, data=FE 0D 24 01 FB 3E 01 01 06 00 25 30 08 03 01 67 00 93, checksum=93, error=false) 

2022-11-15 09:27:42.232 [DEBUG] [31.network.impl.CommandInterfaceImpl] - <-  AF_DATA_SRSP (FE 01 64 01 00 64)

2022-11-15 09:27:42.233 [DEBUG] [tsystems.zigbee.ZigBeeNetworkManager] - RX STA: msgTag=67 state=TX_ACK

2022-11-15 09:27:42.233 [DEBUG] [transaction.ZigBeeTransactionManager] - notifyTransactionProgress: TID=67, state=TX_ACK, outstanding=1

2022-11-15 09:27:42.233 [DEBUG] [zigbee.transaction.ZigBeeTransaction] - Transaction state changed: nwk=3EFB, TID=67, event=TX_ACK, state=TRANSMITTED

2022-11-15 09:27:42.854 [DEBUG] [31.network.impl.CommandInterfaceImpl] - <-- AF_DATA_CONFIRM (FE 03 44 80 E9 01 25 0A)

2022-11-15 09:27:42.854 [DEBUG] [31.network.impl.CommandInterfaceImpl] - Received Async Cmd: AF_DATA_CONFIRM(Endpoint=1, Status=Z_MAC_NO_ACK(233), TransID=37)

2022-11-15 09:27:42.854 [DEBUG] [tsystems.zigbee.ZigBeeNetworkManager] - RX STA: msgTag=67 state=RX_NAK

2022-11-15 09:27:42.854 [DEBUG] [transaction.ZigBeeTransactionManager] - notifyTransactionProgress: TID=67, state=RX_NAK, outstanding=1

2022-11-15 09:27:42.855 [DEBUG] [transaction.ZigBeeTransactionManager] - Transaction complete: ZigBeeTransaction [ieeeAddress=A4C138DC35975176 queueTime=1219, state=FAILED, sendCnt=2, command=OffCommand [On/Off: 0000/0 -> 3EFB/1, cluster=0006, TID=67]]

2022-11-15 09:27:42.855 [DEBUG] [transaction.ZigBeeTransactionManager] - transactionListenerRemoved: 0 outstanding

2022-11-15 09:27:42.855 [DEBUG] [e.transaction.ZigBeeTransactionQueue] - A4C138DC35975176: transactionComplete, state=FAILED, outstanding=0

2022-11-15 09:27:42.855 [DEBUG] [e.transaction.ZigBeeTransactionQueue] - A4C138DC35975176: transactionComplete exceeded max retries 2

2022-11-15 09:27:42.855 [DEBUG] [zigbee.transaction.ZigBeeTransaction] - Transaction cancelled: ZigBeeTransaction [ieeeAddress=A4C138DC35975176 queueTime=1220, state=CANCELLED, sendCnt=2, command=OffCommand [On/Off: 0000/0 -> 3EFB/1, cluster=0006, TID=67]]

2022-11-15 09:27:42.855 [DEBUG] [transaction.ZigBeeTransactionManager] - Transaction complete: ZigBeeTransaction [ieeeAddress=A4C138DC35975176 queueTime=1220, state=CANCELLED, sendCnt=2, command=OffCommand [On/Off: 0000/0 -> 3EFB/1, cluster=0006, TID=67]]

2022-11-15 09:27:42.856 [DEBUG] [transaction.ZigBeeTransactionManager] - transactionListenerRemoved: 0 outstanding

2022-11-15 09:27:42.856 [DEBUG] [e.transaction.ZigBeeTransactionQueue] - A4C138DC35975176: transactionComplete but not outstanding, state=CANCELLED, outstanding=0

2022-11-15 09:27:42.856 [DEBUG] [transaction.ZigBeeTransactionManager] - Transaction Manager: Send Next transaction. outstandingTransactions=0, outstandingQueues=0, sleepy=0/3

2022-11-15 09:27:42.856 [DEBUG] [transaction.ZigBeeTransactionManager] - Transaction Manager: Send Next transaction. outstandingTransactions=0, outstandingQueues=0, sleepy=0/3

2022-11-15 09:27:42.856 [DEBUG] [zigbee.transaction.ZigBeeTransaction] - Transaction state changed: nwk=3EFB, TID=67, event=RX_NAK, state=CANCELLED

2022-11-15 09:27:42.856 [DEBUG] [converter.ZigBeeBaseChannelConverter] - A4C138DC35975176: Channel zigbee:device:stick:a4c138dc35975176:A4C138DC35975176_1_switch received TIMEOUT in response to OFF

I’m no expert in ZigBee so if I say something wrong hopefully someone more knowledgable will come along and correct me.

But my understanding it that periodically the binding will poll the devices and if the devices do not respond (and it’s mains powered, battery powered devices respond on their own schedule to preserve battery) it will eventually mark it as offline. So what does this mean?

  1. The device may have gone offline but hasn’t missed enough polling periods for the coordinator to give up.

  2. The device may have returned but there hasn’t been a poll from the coordinator yet to let it know.

Whether or not the device is marked as online, the coordinator will try to send out the command.

So you can’t really rely on that online status to know whether or not a given command is going to work or not.

No. Zigbee is a mesh network. All the devices talk to each other (at least those that are in range). There’s no chokepoint or the like.

Zigbee runs on the same frequency bands as WiFi but it’s a wholly different protocol. They can interfere with each other if WiFi is saturated in your area.

I think it is quite unfortunate that the documentation uses CC2531 as an example (at the top of page).
Scrolling down to "Supported Coordinators"reveals parts of why not.
In addition CC2531 do not handle zigbee v.3 devices well.
In my mind CC2531 simply do not have the horsepower to handle many of the zigbee devices aviable today

Where can I propose a documentation fix?

At the bottom of the page there is a link to “Edit this page on GitHub”.

But that example at the top is not intended nor does it read like an endorsement of the CC2531.

The section of the docs that lists the supported coordinators puts it last. The Serial Port configuration table lists it last. The section ion the docs that explicitly discusses that coordinator states

Not Recommended for new systems

And it discusses all the reasons why. So it seems like the there are sufficient warnings against choosing this coordinator already in the docs. But it’s easy enough to change “cc2531” to “ember” in that simplified example. Even the baud rate can be left alone.

Thanks @rlkoshak you’re nearly right (and the general advice is bang on). Just to clarify, the binding typically doesn’t poll unless it really needs to. The binding configures the device so that it periodically sends reports to the coordinator (ie the binding) so that the binding knows it’s alive. This time is kept quite long to avoid congestion - typically it’s around 1 hour from memory. If the device changes state, it will also report back to the binding, so this ‘1 hour’ time is a backstop - the binding expects the device to report at least that often - more often if the state changes.

So, to manage the offline state, the binding keeps track of these reports. It knows that the device is configured to report with a certain time (let’s assume it’s the 1 hour I mention above). The binding will allow 1 frame to be missed, but if 2 times the reporting period passes without receiving a report, then the binding will try and poll the device, and if that fails, then the device will be set offline.

So, the offline time is basically (2 x “reporting interval” + 30 seconds). In the above example, that’s 2 hours 30 seconds.

As Rich said - you can’t 100% rely on the online/offline state, and this will always be the case no matter what polling we do… There will always be a window of uncertainty between the last poll, and the next poll or reporting period. We can of course reduce this time, but this will have other consequences - generally network latency if devices are not reporting properly.

1 Like

I’m improving. :blush: :rofl:

1 Like

Thanks @chris , @rlkoshak and @NilsOF for all your inputs.

That was also my understanding that a regular poll is being done and the status updated accordingly. But since I had the issue over several hours I was thinking about what @NilsOF also said yesterday. The CC2531 being too low on capacity and unable to properly handle ZigBee 3.0 devices.

Therefore I ordered a “SONOFF Zigbee 3.0 USB Dongle Plus,TI CC2652P + CP2102(N)” yesterday and hope to solve my device problems with it. It will arrive in the next hours.

The problem I see now is that I was using the ZigBee binding directly and after some research it looks like I need to install zigbee2MQTT and Mosquito to connect the Sonoff stick to Openhab. And I have no clue how right now.

I will update here if my issues are gone using the new setup after I hopefully successfully implement it.

My understanding is the Sonoff coordinator uses an Ember chipset and therefore should work with the Zigbee binding just fine.

Oh, I was not aware of that :slight_smile:

I have just revceived the stick and finished setting up Mosquitto as message broker and zigbee2mqtt. I also added the MQTT binding and I am now trying to add the things and items to OpenHab, which does not work yet. I guess I have some issues providing the correct topics to the binding/things.

At least the connectivity between the Sonoff to the devices through Zigbee2MQTT is super fast and seems reliable until now.

Let’s see what happens when I control them over OpenHab.

Anyways, I guess my issues were related to the lame CC2531 stick and it’s restrictions at least for Zigbee 3.0 devices.

It’s not about capacity. Older versions of the firmware don’t manage the Zigbee 3 security, so you’ll find that Z3 devices may not be able to join. Once joined, there’s very little difference between Zigbee 3 and the older HA1.2 standard.

From the log provided above, this is a TI chipset. Sonoff have both TI and Ember options available - the new one @in1t5 ordered is also a TI chipset.

1 Like

FYI, there is currently ongoing development on a new “zstack” driver which will add Z-Stack 3 support needed for the newer Texas Instruments CC2652 and CC1352 based Zigbee Coordinator dongles to work with openHAB’s Zigbee Binding. You can check out this discussion thread specifically about that here (which also contains links to working code that you can help test):

https://community.openhab.org/t/discussion-about-texas-instruments-z-stack-3-0-and-cc2652-and-cc1352-zigbee-coordinator-adapters-with-openhab-zigbee-binding/129385

1 Like