KNX bridge keeps losing communication

Hello community,

I’m a total n00b at OH2/Openhabian/Raspberries and the like, so be gentle.

I have a KNX system running, including a Gira X1. Due to the limited possibilities of the Gira X1, I started looking into other options and decided to go for a Raspberry Pi 3B+ running openHABian and OH2.

I managed to set up OH2, including things and items. I haven’t explored beyond switching some lights and logging values from the KNX system, such as temperatures and the position of heating actuator valves. That is working fine, with no lag or delay. So far so good.

Unfortunately the bridge keeps going offline without any apparent cause (to me at least). After some searching within this community and going through several threads, I managed to turn on extra logging at the debug and trace levels. Below are some of the entries in the log.

2018-12-23 17:14:16.930 [hingStatusInfoChangedEvent] - 'knx:ip:X1' changed from ONLINE to OFFLINE (COMMUNICATION_ERROR): server request

2018-12-23 17:14:16.982 [ERROR] [net/IP Tunneling 192.168.178.12:3671] - establishing connection failed, null

2018-12-23 17:14:16.984 [DEBUG] [nx.internal.client.AbstractKNXClient] - Error connecting to the bus: null

java.lang.InterruptedException: null

2018-12-23 17:14:17.063 [hingStatusInfoChangedEvent] - 'knx:ip:X1' changed from OFFLINE (COMMUNICATION_ERROR): server request to OFFLINE (COMMUNICATION_ERROR)

2018-12-23 17:14:17.092 [hingStatusInfoChangedEvent] - 'knx:ip:X1' changed from OFFLINE (COMMUNICATION_ERROR) to ONLINE

If I make a small change in the .things file and hit save, the bridge and the associated things come back online immediately. Sometimes some things show as offline in PaperUI while the associated item is still working. This is also mentioned in other threads, but is said to have no influence on the workings of OH2.

I’ve been playing around with some of the non-mandatory settings for the bridge, such as autoReconnectPeriod and localIp, but nothing seems to cure this problem for more than a couple of hours at most. Yesterday I restarted the Gira X1 and the error didn’t return for half a day, but this morning the whole system had stopped again. However, I’ve also had just 30 minutes between two offline errors. There is no clear pattern to be observed yet.
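To look for a pattern in the drops, the status lines can be pulled out of the event log and the time between consecutive ONLINE to OFFLINE transitions measured. Below is a minimal Python sketch, assuming the log lines have the exact shape quoted above; the function names are made up for illustration, and the sample lines are the ones from this post.

```python
import re
from datetime import datetime

# Matches thing-status lines shaped like the ones quoted in this post.
# Both straight and curly quotes around the thing UID are accepted.
STATUS_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) .*?"
    r"['‘’](?P<thing>[^'‘’]+)['‘’] changed from (?P<old>\w+).*?"
    r" to (?P<new>ONLINE|OFFLINE)"
)

SAMPLE = [
    "2018-12-23 17:14:16.930 [hingStatusInfoChangedEvent] - 'knx:ip:X1' changed from ONLINE to OFFLINE (COMMUNICATION_ERROR): server request",
    "2018-12-23 17:14:17.063 [hingStatusInfoChangedEvent] - 'knx:ip:X1' changed from OFFLINE (COMMUNICATION_ERROR): server request to OFFLINE (COMMUNICATION_ERROR)",
    "2018-12-23 17:14:17.092 [hingStatusInfoChangedEvent] - 'knx:ip:X1' changed from OFFLINE (COMMUNICATION_ERROR) to ONLINE",
]


def offline_drops(lines, thing="knx:ip:X1"):
    """Return the timestamps at which `thing` went from ONLINE to OFFLINE."""
    drops = []
    for line in lines:
        m = STATUS_RE.match(line)
        if m and m["thing"] == thing and m["old"] == "ONLINE" and m["new"] == "OFFLINE":
            drops.append(datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S.%f"))
    return drops


def gaps_in_minutes(drops):
    """Minutes between each drop and the next one."""
    return [(b - a).total_seconds() / 60 for a, b in zip(drops, drops[1:])]


drops = offline_drops(SAMPLE)
print(len(drops), "drop(s) found; gaps in minutes:", gaps_in_minutes(drops))
```

Feeding it the lines of the full event log instead of `SAMPLE` gives a list of gaps that can be compared against other events on the network, such as ETS sessions or app usage.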

Here is the relevant section of my .things file:

Bridge knx:ip:X1 "GIRA X1 KNX/IP GATEWAY" @ "KNX Schakelkast"
[
    type="TUNNEL",
    ipAddress="192.178.168.12",
    portNumber=3671,
    // localIp="192.178.168.16",
    // readingPause=50,
    // responseTimeout=10,
    // readRetriesLimit=3,
    autoReconnectPeriod=10
    // localSourceAddr="1.1.251"  // X1 is 1.1.250; tunnels are 1.1.251, 1.1.252, 1.1.253 and 1.1.254
]
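A side note on the addresses: the log excerpt above shows the tunnel at 192.168.178.12, while this file has 192.178.168.12 (two octets swapped). That may well be a typo made only in this post, but it is cheap to rule out. A minimal Python sketch using the standard `ipaddress` module; the helper name is made up for illustration:

```python
import ipaddress


def is_private_ipv4(addr):
    """True if addr lies in a private (non-routable) range such as 192.168.0.0/16."""
    return ipaddress.ip_address(addr).is_private


for label, addr in [("log", "192.168.178.12"), ("things file", "192.178.168.12")]:
    verdict = "private" if is_private_ipv4(addr) else "NOT private"
    print(f"{label}: {addr} is {verdict}")
```

A gateway on a home LAN would normally have a private address, so an address that is not private is worth comparing against the one shown in the router's device list.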

The Gira X1 is set up via ETS5. It has a main address of 1.1.250 and 4 host addresses, 1.1.251 to 1.1.254. The Gira X1 app (iOS) works fine all the time. Programming KNX actuators and the like via ETS5 is also no problem. In that respect the X1 behaves as an IP interface to KNX TP with no problems.

Can anybody show me a direction to look for possible errors or items to check? Thanks in advance.

One possible root cause could be that your iOS app (or ETS) is conflicting with your OH2 KNXv2 binding over the client Individual Address (IA), e.g. both using 1.1.251. This is the localSourceAddr in your .things file.

Maybe try statically setting OH2 to localSourceAddr="1.1.254" (the last one available), or to localSourceAddr="0.0.0".

I don’t know if you can check the IAs that the GIRA X1 has already released to tunneling clients using its interface.

From the GIRA X1 manual I see that it also supports Routing. Did you try configuring the OH2 KNXv2 binding with type="ROUTER"?

Whoa that’s a fast response :+1:.
I’ve tried various addresses for localSourceAddr, but none gave any permanent improvement. That was one of the thoughts I had: is there another service using the same tunnel? But I haven’t yet figured out how to find out which service is using which tunnel.

Something that came to mind just now: could IPv6 have something to do with this? As I pointed out in my opening post, I’m a n00b on everything to do with computer networks (my last computer lessons were in the mid-'80s, when we had to program in ISO Pascal at Nautical College). I do know that my ISP has given me an IPv6 address at my current home, and that the router hands out both IPv4 and IPv6 addresses; I can see in my router/modem that the RPi 3B+ and the Gira X1 each have both an IPv4 and an IPv6 address.

However, I will try the router option, as I wasn’t aware that the X1 could do routing (which manual did you find that in?).

I don’t think so. The binding will use IPv4 to establish the tunnel.

https://www.gira.com/en/service/download/download.html?id=3330

The 1st (tech doc) and 3rd (install) documents refer to:

Supported protocols DHCP, AutoIP, TCP/IP, UDP/IP
(Core, Routing, Tunneling, Device Management),
ARP, ICMP, IGMP

Note: routers usually have an IA of the form x.y.0.
You may have to reconfigure your X1 to comply with this. Try its current IA of 1.1.250 first; if that doesn’t work, you may need to use ETS to make it behave like a router.

Tried the router way with localSourceAddr set to the Gira X1’s IA of 1.1.250, but no success. Then I tried to set the X1 up as a router by giving it an IA ending in zero (1.1.0), but couldn’t make it happen via ETS. I looked into the problem, and the more I read, the more I doubt that the X1 is actually a KNX IP router, despite what the manual states. I have to investigate this a bit more. To be continued after Xmas. Happy holidays to all reading this.

I was unable to place the Gira X1 at a topological location with an IA ending in 0 (x.y.0). The attempt ended in such chaos that I decided to start the implementation of the X1 in my KNX system from scratch, along with a fresh installation of openHABian and openHAB on the RPi 3B+.

In ETS5 I changed the option to disconnect the connection after use from checked to unchecked, so the connection is no longer dropped after use. I hoped this would stop ETS from using a different tunnel every time, including the one openHAB was using, as I was under the impression that something was “pushing” openHAB out of its connection in favour of its own.

I also did a factory reset of the X1 the hard way (disconnecting power and 2x pushing the programming button, see manual of the X1).

Furthermore, I altered the bridge settings so that the things file only uses the bare minimum, that is, only type and IP address; all other items are commented out.

All in all this seems to have solved the problem. The log does still show that it is unable to get a response from one Individual Address (IA). The message reads as follows:

2019-01-08 20:21:45.758 [WARN ] [calimero.link.192.x.y.z:abcd ] - negative confirmation of 1.1.4: 2e009160110411040080

The number at the end changes with each log entry.
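The hex string in that warning looks like a KNX cEMI frame, and picking it apart shows which addresses are involved. Below is a minimal Python sketch, assuming the standard cEMI L_Data layout (message code, additional-info length, two control bytes, source, destination, length, payload); the function names are made up for illustration and only the fields needed here are decoded.

```python
MESSAGE_CODES = {0x11: "L_Data.req", 0x29: "L_Data.ind", 0x2E: "L_Data.con"}


def format_ia(raw):
    """Format a 16-bit individual address as area.line.device."""
    return f"{raw >> 12}.{(raw >> 8) & 0x0F}.{raw & 0xFF}"


def format_ga(raw):
    """Format a 16-bit group address in 3-level main/middle/sub style."""
    return f"{raw >> 11}/{(raw >> 8) & 0x07}/{raw & 0xFF}"


def decode_cemi(hex_frame):
    """Decode the header fields of a cEMI L_Data frame given as a hex string."""
    data = bytes.fromhex(hex_frame)
    pos = 2 + data[1]                      # skip message code, length byte, additional info
    ctrl1, ctrl2 = data[pos], data[pos + 1]
    src = int.from_bytes(data[pos + 2:pos + 4], "big")
    dst = int.from_bytes(data[pos + 4:pos + 6], "big")
    dst_is_group = bool(ctrl2 & 0x80)      # top bit of control field 2: 1 = group address
    return {
        "message": MESSAGE_CODES.get(data[0], hex(data[0])),
        "confirm_error": bool(ctrl1 & 0x01),   # lowest bit of control field 1: 1 = error
        "hop_count": (ctrl2 >> 4) & 0x07,
        "source": format_ia(src),
        "destination": format_ga(dst) if dst_is_group else format_ia(dst),
        "payload": data[pos + 7:].hex(),
    }


print(decode_cemi("2e009160110411040080"))
```

For the frame quoted above this reports an L_Data.con with the error flag set, and both source and destination equal to individual address 1.1.4, which matches the "negative confirmation of 1.1.4" wording in the log.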

This IA (1.1.4) happens to be one of the four “pseudo” IAs that the X1 adds alongside its own IA (1.1.1); these “pseudo” IAs are apparently the four tunnel connections (1.1.2, 1.1.3, 1.1.4 and 1.1.5) that the X1 offers.

I played around a little with the commented-out settings of the bridge in the things file, but every change led either to losing the connection or, especially when entering a local source address containing one of the three remaining IAs, to a very slow response when switching lights or other equipment connected to the KNX bus.

The only thing that shows up once in a while in the log is an error that Influx is unable to write data to the InfluxDB. I’m not sure, however, whether that is related to my initial problem with the KNX bridge.

Anyhow so far so good. If things change or turn out for the worse I will report that here.

Finally got a chance to start all over with openHAB and “discovered” what might have been causing the dropped communication.

When reviewing my text-based things file I noticed that the KNX addresses were not always correct. What threw me off, however, was that the readings of certain items were correct. To clarify: the addresses of several light switches were wrong, but the temperature readout from the sensors inside those same switches was possible and correct. So there was a flood of error messages about unreachable KNX addresses every time those switches were polled. Since entering the right addresses I have not lost communication once. However, I also changed from wired only to wired plus Wi-Fi on the RPi 3B+, so that might also have something to do with it. Fortunately all is working as expected and I’m now tinkering with HABPanel.
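A small script can catch at least the malformed addresses in a text-based things file before they flood the log. Below is a minimal Python sketch, assuming 3-level group addresses (main 0-31, middle 0-7, sub 0-255) written in `ga="..."` channel parameters with an optional DPT prefix, `+` separators and `<` read markers; the function names and the sample channels are hypothetical. It cannot tell whether a well-formed address is the right one for a device, so a comparison against the ETS project is still needed.

```python
import re

GA_ATTR_RE = re.compile(r'\bga\s*=\s*"([^"]*)"')
GA_RE = re.compile(r"(\d+)/(\d+)/(\d+)")


def is_valid_group_address(ga):
    """Check that ga is a well-formed 3-level KNX group address."""
    m = GA_RE.fullmatch(ga)
    if not m:
        return False
    main, middle, sub = map(int, m.groups())
    return main <= 31 and middle <= 7 and sub <= 255 and (main, middle, sub) != (0, 0, 0)


def find_bad_group_addresses(things_text):
    """Return (line number, address) pairs for every malformed address found."""
    bad = []
    for lineno, line in enumerate(things_text.splitlines(), 1):
        for value in GA_ATTR_RE.findall(line):
            addresses = value.split(":", 1)[-1]      # drop an optional "dpt:" prefix
            for part in addresses.split("+"):
                ga = part.strip().lstrip("<")        # "<" marks an address to be read
                if not is_valid_group_address(ga):
                    bad.append((lineno, ga))
    return bad


# Hypothetical channel definitions, for illustration only.
SAMPLE_THINGS = '''Type switch : light1 "Light 1" [ ga="1/0/1+<1/0/2" ]
Type number : temp1 "Temperature" [ ga="9.001:<3/8/10" ]
Type switch : light2 "Light 2" [ ga="1/0/300" ]'''

print(find_bad_group_addresses(SAMPLE_THINGS))
```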