Problem with ZigBee Telegesis

Tags: #<Tag:0x00007f5c8efe96a0>

The same is true at my site:
Two motion sensors (z-wave, unrelated) trigger an event which switches a group of (two) zigbee lights on an off. Quite often, i see something like this in the log(s) when lights are switched off and not that often when switched on:

==> /var/log/openhab2/events.log <==
2020-05-18 22:12:04.212 [ome.event.ItemCommandEvent] - Item 'g_L_Sz_Bett' received command 0,0,0
2020-05-18 22:12:04.241 [ome.event.ItemCommandEvent] - Item 'Lam_Sz_BettLi_Farb' received command 0,0,0
2020-05-18 22:12:04.256 [ome.event.ItemCommandEvent] - Item 'Lam_Sz_BettRe_Farb' received command 0,0,0
==> /var/log/openhab2/openhab.log <==
2020-05-18 22:12:04.914 [WARN ] [zsmartsystems.zigbee.ZigBeeExecutors] - Uncaught exception in thread TelegesisCommands-thread-16
java.lang.NullPointerException: null
	at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936) ~[?:1.8.0_252]
	at com.zsmartsystems.zigbee.dongle.telegesis.ZigBeeDongleTelegesis.telegesisEventReceivedRunnable(ZigBeeDongleTelegesis.java:776) ~[?:?]
	at com.zsmartsystems.zigbee.dongle.telegesis.ZigBeeDongleTelegesis$2.run(ZigBeeDongleTelegesis.java:752) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_252]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_252]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]
==> /var/log/openhab2/events.log <==
2020-05-18 22:12:04.918 [vent.ItemStateChangedEvent] - Lam_Sz_BettLi_Farb changed from 39,100,35 to 39,100,0
2020-05-18 22:12:04.935 [GroupItemStateChangedEvent] - g_L_Sz_Bett changed from 0,0,0 to UNDEF through Lam_Sz_BettLi_Farb
==> /var/log/openhab2/openhab.log <==
2020-05-18 22:12:04.923 [WARN ] [zsmartsystems.zigbee.ZigBeeExecutors] - Uncaught exception in thread TelegesisCommands-thread-17
java.lang.NullPointerException: null
	at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936) ~[?:1.8.0_252]
	at com.zsmartsystems.zigbee.dongle.telegesis.ZigBeeDongleTelegesis.telegesisEventReceivedRunnable(ZigBeeDongleTelegesis.java:776) ~[?:?]
	at com.zsmartsystems.zigbee.dongle.telegesis.ZigBeeDongleTelegesis$2.run(ZigBeeDongleTelegesis.java:752) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_252]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_252]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]

It is not related to the system state (coming up or down that is).

Please can you provide a debug log? Is this easily repeatable?

i’ll try to provide some valuable info. However i’m not an expert. So let me know, if you need anything else / different …

raspi 3B+
openhabianpi
openhab 2.5.5

### item definitions
    Group:Color     g_L_Sz_Bett        "Gruppe Bettlicht im Schlafzimmer"                          (g_L)
    Group:Dimmer    g_L_Sz_Bett_FarbT  "Gruppe Farbtemperatur Bettlicht"                           (g_L)
    // LED Ketten unter dem Bett
    Color       Lam_Sz_BettLi_Farb  "Bettlicht links Farbe"                     <colorLight>    (g_Z_Sz,g_L_Sz_Bett)        ["Lighting"]            {channel="zigbee:device:04000B88:ccccccfffebb12f0:CCCCCCFFFEBB12F0_1_color"}
    Dimmer      Lam_Sz_BettLi_FarbT "Bettlicht links Farbtemperatur [%d %%]"    <colorLight>    (g_Z_Sz,g_L_Sz_Bett_FarbT)  ["Lighting"]            {channel="zigbee:device:04000B88:ccccccfffebb12f0:CCCCCCFFFEBB12F0_1_colortemperature"}
    Color       Lam_Sz_BettRe_Farb  "Bettlicht rechts Farbe"                    <colorLight>    (g_Z_Sz,g_L_Sz_Bett)        ["Lighting"]            {channel="zigbee:device:04000B88:ec1bbdfffe271435:EC1BBDFFFE271435_1_color"}
    Dimmer      Lam_Sz_BettRe_FarbT "Bettlicht rechts Farbtemperatur [%d %%]"   <colorLight>    (g_Z_Sz,g_L_Sz_Bett_FarbT)  ["Lighting"]            {channel="zigbee:device:04000B88:ec1bbdfffe271435:EC1BBDFFFE271435_1_colortemperature"}
### rules
val String filename = "bettlicht.rules"
var Timer sys_start_timer = null

rule "Bettlicht - System started"
when
	System started
then
// let openHAB settle and give other rules the chance to fire up
    if (sys_start_timer !== null) return;

    logInfo(filename, "Bettlicht Initialisierung")
    sys_start_timer = createTimer(now.plusSeconds(120), [|
        g_S_Bew_Sz_Bett.sendCommand(OFF)
    //    g_L_Sz_Bett.members.forEach[ i | i.sendCommand("0,0,0") ]
        g_L_Sz_Bett.sendCommand("0,0,0")
    ])
end

rule "Bettlicht"
when
    Item g_S_Bew_Sz_Bett changed
then
    if (g_S_Bew_Sz_Bett.state == OFF) {
        logInfo(filename, "Keine Bewegung mehr, Bettlicht aus")
            g_L_Sz_Bett.sendCommand("0,0,0")
    }
    else if (g_S_Bew_Sz_Bett.state == ON) {
        logInfo(filename, "Bewegung, Bettlicht an")
            g_L_Sz_Bett.sendCommand(35)
            g_L_Sz_Bett_FarbT.sendCommand(80)
    }
end

The problem only occurs, when one of the rules fires up (repeatable). I can’t repeat it sending a single command to a single item via PaperUI. So there might be something wrong with my handling of groups at the end. Anyway: debug logs …

### after log:set ... like shown in the binding manual i get entrires like this every second
### seems to be normal, does'nt it?
2020-05-19 12:27:56.932 [DEBUG] [gesis.internal.TelegesisFrameHandler] - TX Telegesis queue: 1

2020-05-19 12:27:56.935 [DEBUG] [gesis.internal.TelegesisFrameHandler] - TX Telegesis: TelegesisDisplayNetworkInformationCommand [device=null, channel=null, power=null, panId=null, epanId=null]

2020-05-19 12:27:56.938 [DEBUG] [gesis.internal.TelegesisFrameHandler] - TX Telegesis Data:AT+N?

2020-05-19 12:27:56.972 [DEBUG] [gesis.internal.TelegesisFrameHandler] - RX Telegesis Data:+N=COO,11,-07,6981,F59B4F9B118D72DC

2020-05-19 12:27:56.975 [DEBUG] [gesis.internal.TelegesisFrameHandler] - RX Telegesis Data:OK

2020-05-19 12:27:56.979 [DEBUG] [gesis.internal.TelegesisFrameHandler] - RX Telegesis: TelegesisDisplayNetworkInformationCommand [device=COO, channel=11, power=-7, panId=27009, epanId=F59B4F9B118D72DC, status=SUCCESS]
### start of debug.log
2020-05-19 12:47:06.770 [vent.ItemStateChangedEvent] - Sen_Sz_BettLi_Bew changed from OFF to ON

2020-05-19 12:47:06.773 [GroupItemStateChangedEvent] - g_S_Bew_Sz_Bett changed from OFF to ON through Sen_Sz_BettLi_Bew

2020-05-19 12:47:06.777 [GroupItemStateChangedEvent] - g_S_Bew changed from OFF to ON through g_S_Bew_Sz_Bett

==> /var/log/openhab2/openhab.log <==

2020-05-19 12:47:06.804 [INFO ] [arthome.model.script.bettlicht.rules] - Bewegung, Bettlicht an

==> /var/log/openhab2/events.log <==

2020-05-19 12:47:06.820 [ome.event.ItemCommandEvent] - Item 'g_L_Sz_Bett' received command 35
...

Log while switching the ligths on and off via rule. you should find the problem twice.

debug.log (222.5 KB)

I’ve split this to a new topic as the problem is not the same as the previous one. Please can you tell me what version of the binding and telegesis stick you are using. and what the device is.

Sorry for butting in the other thread. I thought … well …

Binding version is 2.5.5 (what oh PaperUI bindings tell me)

The stick is a black box without any label. OH PaperUI Config Things … tells me:
Telegesis ETRX3 Dongle
usb_product_id 33427.0
usb_vendor_id 4292.0

Current firmware version: R309C (unknown)

If this is not the info you need, maybe you can let me know how to find it.

Thanks - that’s fine. One last question that I added late is what is the device that causes this error (ie the light, or switch)?

From what I can see, the received frames are corrupt -:

2020-05-19 12:47:30.510 [DEBUG] [gesis.internal.TelegesisFrameHandler] - RX Telegesis Data:RX:0598,0104,01,01,0006,05:ü

,-58,FF

2020-05-19 12:47:30.512 [DEBUG] [gesis.internal.TelegesisFrameHandler] - RX Telegesis: TelegesisReceiveMessageEvent [ieeeAddress=null, networkAddress=1432, profileId=260, destinationEp=1, sourceEp=1, clusterId=6, messageData=08 FC 0B 00 00, rssi=-88, lqi=255]

2020-05-19 12:47:30.514 [DEBUG] [gesis.internal.TelegesisFrameHandler] - RX Telegesis Data:ACK:F9

2020-05-19 12:47:30.516 [DEBUG] [gesis.internal.TelegesisFrameHandler] - RX Telegesis: TelegesisAckMessageEvent [messageId=249]

2020-05-19 12:47:30.518 [DEBUG] [gesis.internal.TelegesisFrameHandler] - RX Telegesis Data:RX:0598,C05E,01,0,

2020-05-19 12:47:30.520 [DEBUG] [gesis.internal.TelegesisEventFactory] - Error creating instance of Telegesis event

This last frame from the stick is corrupted - it only includes half of the information that’s needed, and this causes the decoder to throw an exception.

I’m not really sure why this is happening - possibly it’s an issue with the serial port. I don’t think this is likely to be an issue with ZigBee itself - if the zigbee frame was corrupted it would simply be dropped by the Telegesis stick, but there is no error detection on the Telegesis frames that come into the binding, so we can’t detect and process errors here (which is why we get the exception).

I know some people with PIs have had issues with USB ports and have needed to add a hub in between to avoid problems. I’m not sure if it could possibly be related to that.

It’s a light. More concret:
plug-in power supply + control unit + led strip
Müller Licht tint 44435

The tint products are said to be zigbee 3.0 compliant. I am unable to find more usefull info about the controler shown below.

Thanks.

Note that this may cause problems - depending on the exact implementation since the Telegesis is NOT ZigBee 3.0 compliant and doesn’t support the ZB3.0 join procedure. This will likely mean that some devices are not compatible and you will likely have problems joining them.

Ahh, you already found something. Thanks a lot for digging out!

I run a RFXCOM and the ETRX3 on USB and a z-wave gpio board. No other serial problems so far on pi3 (complete other story for pi4 … :frowning: )

The more i try on other single items (not groups) without errors, i think this might be some group related problem. I’ll check rules without using groups later.

Do you think that sending data to the stick quite fast one set after another could cause such a problem?

Anyway: thanks for helping to get some clarification!

About the Stick: any other you would prefer? It’s the one i started with quite happy some time ago. And the LED-Strip was the last thing i included (other are HUE).

I investigated a bit more but did not find much. Maybe one of the controlers of the lights sends broken frames.

I’ll check and let you know.

I get the same exceptions using the same Telegenesis USB Stick.

All my Zigbee devices are Samsung Smartthings. I am not sure which are causing the exceptions as everything works.

* today at 5:32 PM  17:32:13.110 [WARN ] [zsmartsystems.zigbee.ZigBeeExecutors] - Uncaught exception in thread TelegesisCommands-thread-72
* today at 5:32 PM  java.lang.NullPointerException: null
* today at 5:32 PM  17:32:20.291 [WARN ] [zsmartsystems.zigbee.ZigBeeExecutors] - Uncaught exception in thread TelegesisCommands-thread-73
* today at 5:32 PM  java.lang.NullPointerException: null
* today at 6:07 PM  18:07:05.055 [WARN ] [zsmartsystems.zigbee.ZigBeeExecutors] - Uncaught exception in thread TelegesisCommands-thread-74
* today at 6:07 PM  java.lang.NullPointerException: null
* today at 6:07 PM  18:07:05.056 [WARN ] [zsmartsystems.zigbee.ZigBeeExecutors] - Uncaught exception in thread TelegesisCommands-thread-75
* today at 6:07 PM  java.lang.NullPointerException: null

As described above, the Telegesis dongle does not have ANY error checking on the data that it sends to the binding. So the binding will simply trap the exception if there is an error.

I’m not sure if this is your issue without more information though. This dongle is used by a major commercial company (with the binding) so I don’t expect major systematic issues).

I took some time to get a better understanding of the problem. The result for now is a list of potential sources and possible solutions:

  1. tint controler - may send corrupt / incomplete data
    -> some checks with single controler (two available)

  2. tint controler - may send some zigbee 3.0 specific frames causing trouble on telegesis
    -> try other usb-stick (bitron with ember chipset availble)

  3. zigebee - over the air mail fail (distance tr/rx to long)
    -> try smaller distance

  4. telegesis - might not be able to receive data fast enough
    -> unlikely, would be a well known problem

  5. telegesis - might produce incomplete frames from correkt data -> broken stick
    -> try other usb-stick (see above)

  6. serial communication - might be to slow to transport data (remembering good old fidonet times i don’t think so with baud set to 19200)
    -> try higher bitrate (config telegesis + openhab/port)

  7. usb port - might have problems, parts of datasets are lost / corupted
    -> try usb-ub
    -> try on other raspi

If i get @chris right, 2) and 7) are the “favorites”.

Although i can live with the logmessages, i’ll try to check the scenarios above. If there is something reasonable missing oder obsolete in the list, let me know. I’ll let you know the results of the tests, when i had time to setup one or another scenario.

Thanks.

My system is operating perfectly, so I will ignore it for now and change the log level.

Just after doing an update via openhabian-confg on my PI 3 B+ (to oh master & stable) -> stable, i took some time and had a look into 7).

First i changed the usb port of the telegensis. Again RX problem like before (as far as i get it right). See ev-7-1.log and oh-7-1.log
oh-7-1.log (152.6 KB) ev-7-1.log (2.3 KB)

Then i gave an usb hub a try. The result seems to be quite the same. See ev-7-2.log and oh-7-2.log
oh-7-2.log (162.0 KB) ev-7-2.log (2.4 KB)

There is no other raspi available atm.

As i am able to switch the lights despite errors / warnings and testing takes quite a lot of time, i am not shure if i will invest more time to understand the issue.

So if you are interested in me testing one of the above scenarios, give me trigger and we’ll see.