KNX Binding looses connection, no reconnect

Hi,

yesterday openHAB stopped sending or receiving data to the KNX bus. The only trace of an error is the following line in the log file:

2019-03-03 23:11:18.752 [ERROR] [Xnet/IP Tunneling 192.168.10.15:3671] - establishing connection failed, null

No other error or messages concerning reconnecting to the bus where logged. All other KNX devices seemed to work fine and after a restart openHAB worked again.

A very long time ago a had a similar problem, but when I remember it right, that was still with the 1.x binding. Currently I am using the 2.4 release version.

In particular I am wondering, why no attempts to reconnect to the bus where made. Here is my thing configuration and I assumed, that a reconnect should have been tried after 30s:

Bridge knx:ip:bridge [ 
   ipAddress="192.168.10.15", 
   portNumber=3671, 
   type="TUNNEL", 
   readingPause=50, 
   responseTimeout=10, 
   readRetriesLimit=3, 
   autoReconnectPeriod=10,        //1, 30
   localSourceAddr="0.0.0"     //1.1.235

Hi, I’m having the same problem now for the second time (after 7 days):

2019-03-09 01:11:43.374 [INFO ] [net/IP Tunneling 192.168.0.53:3671] - close connection - server request
2019-03-09 01:11:43.442 [INFO ] [calimero.mgmt.TL 192.168.0.53:3671] - attached link was closed
2019-03-09 01:11:43.472 [INFO ] [calimero.mgmt.MC 192.168.0.53:3671] - attached link was closed
2019-03-09 01:11:43.473 [INFO ] [calimero.mgmt.TL 192.168.0.53:3671] - attached link was closed
2019-03-09 01:11:43.475 [INFO ] [calimero.mgmt.MC 192.168.0.53:3671] - attached link was closed
2019-03-09 01:11:43.476 [INFO ] [.process.process 192.168.0.53:3671] - attached link was closed
2019-03-09 01:11:43.488 [INFO ] [.process.process 192.168.0.53:3671] - detached from 192.168.0.53:3671
2019-03-09 01:11:43.492 [INFO ] [o.device.process 192.168.0.53:3671] - detached from 192.168.0.53:3671
2019-03-09 01:11:43.496 [INFO ] [net/IP Tunneling 192.168.0.53:3671] - establish connection from /192.168.0.40:39000 to /192.168.0.53:3671
2019-03-09 01:11:43.499 [INFO ] [calimero.link.192.168.0.53:3671   ] - link closed
2019-03-09 01:11:43.498 [ERROR] [net/IP Tunneling 192.168.0.53:3671] - establishing connection failed, null

Using OH 2.4 and a MDT KNX IP interface here.

Bridge knx:ip:bridge "IP Interface"
[
    type="TUNNEL",
    ipAddress="192.168.0.53",
    portNumber=3671,
    readingPause=50,
    responseTimeout=10,
    readRetriesLimit=3,
    autoReconnectPeriod=60,
    localIp="192.168.0.40"
]

Restarting the binding helps.
Did you find a solution?

(Might be related: KNX2 stopps working after a while (received disconnect response status 0x21))

No, unfortunately not.

Another one, that has this problem too! I’m using openHAB 2.4.0. on my synology NAS (DS 213j).
OpenHAB looses the KNX connection and there are no log entries, that it tries to reconnect. The only thing that helps, is a restart.
My KNX bridge configuration:

Bridge knx:ip:bridge "KNX IP Gateway" [
    ipAddress="192.168.2.24",
    localIp="192.168.2.21",  //no fqdn here, as this would be resolved to 127.0.0.1
    type="TUNNEL", //or ROUTER
    portNumber=3671, 
    readingPause=50,
    responseTimeout=20,
    readRetriesLimit=5,
    autoReconnectPeriod=60,
    localSourceAddress="0.0.0"  // this physical address must be unused (even used by tunnel is not allowed!)
]

Log entry:

2019-03-13 07:17:02.397 [ERROR] [NXnet/IP Tunneling 192.168.2.24:3671] - close connection - maximum send attempts
tuwien.auto.calimero.KNXAckTimeoutException: maximum send attempts, no service acknowledgment received

OH is running on a Synology here too. I noticed that the connection to the KNX bridge is lost when I do a backup on the NAS which shuts down the database for a moment (JDBC binding complains).

So the question would be: Why are there no reconnect attempts?

Wild guess: Could it be that a KNX item that needs to be persisted receives an update and the JDBC binding is not able to reach the DB (timeout) which - as a consequence - somehow blocks the KNX binding?

And once again connection to KNX lost. I tried the hack mentioned above with a rule, that should restart the KNX binding if it’s state chnages to “UNINITIALIZED”, but that didn’t work ether: The KNX bridge remains in the state “Online”. It seems as if openHAB doesn’t really recognize, that the bridge is disconnected…

I once again had a lost connection without reconnect, but this time I had some additional log messages:

2019-03-21 21:49:31.958 [ERROR] [Xnet/IP Tunneling 192.168.10.15:3671] - establishing connection failed, null
...
2019-03-21 21:50:06.508 [ERROR] [Xnet/IP Tunneling 192.168.10.15:3671] - establishing connection failed, timeout connecting to control endpoint /192.168.10.15:3671
...
2019-03-21 21:50:56.774 [WARN ] [Xnet/IP Tunneling 192.168.10.15:3671] - received service acknowledgment with wrong send sequence 14, expected 15 - ignored
...
2019-03-21 21:51:14.265 [WARN ] [Xnet/IP Tunneling 192.168.10.15:3671] - received service acknowledgment with wrong send sequence 0, expected 1 - ignored
...
2019-03-21 21:51:32.500 [WARN ] [Xnet/IP Tunneling 192.168.10.15:3671] - received service acknowledgment with wrong send sequence 29, expected 30 - ignored
2019-03-21 21:51:38.625 [WARN ] [Xnet/IP Tunneling 192.168.10.15:3671] - received service acknowledgment with wrong send sequence 38, expected 39 - ignored
2019-03-21 21:51:43.761 [WARN ] [Xnet/IP Tunneling 192.168.10.15:3671] - received service acknowledgment with wrong send sequence 43, expected 44 - ignored
2019-03-21 21:53:19.551 [ERROR] [Xnet/IP Tunneling 192.168.10.15:3671] - establishing connection failed, null

At the same time this was happening, openHAB reported errors because it could not connect to a smtp server (which is running on the same machine as openHAB), which might be related or not:

2019-03-21 21:50:36.283 [WARN ] [me.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.eclipse.smarthome.io.monitor.internal.EventLogger@6fe1b607' takes more than 5000ms.
2019-03-21 21:50:05.952 [ERROR] [rg.openhab.action.mail.internal.Mail] - Could not send e-mail to 'openhab@domain.de'.
org.apache.commons.mail.EmailException: Sending the email to the following server failed : mail.domain.internal:25
	at org.apache.commons.mail.Email.sendMimeMessage(Email.java:1421) ~[228:org.openhab.action.mail:1.13.0]
	at org.apache.commons.mail.Email.send(Email.java:1448) ~[228:org.openhab.action.mail:1.13.0]
	at org.openhab.action.mail.internal.Mail.sendMail(Mail.java:157) [228:org.openhab.action.mail:1.13.0]
	at org.openhab.action.mail.internal.Mail.sendMail(Mail.java:89) [228:org.openhab.action.mail:1.13.0]
	at org.openhab.action.mail.internal.Mail.sendMail(Mail.java:67) [228:org.openhab.action.mail:1.13.0]
	at sun.reflect.GeneratedMethodAccessor252.invoke(Unknown Source) ~[?:?]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:?]
	at org.python.core.PyReflectedFunction.__call__(PyReflectedFunction.java:186) [?:?]
	at org.python.core.PyReflectedFunction.__call__(PyReflectedFunction.java:204) [?:?]
	at org.python.core.PyObject.__call__(PyObject.java:496) [?:?]
	at org.python.core.PyObject.__call__(PyObject.java:500) [?:?]
	at personal.mylib$py.notifyMe$17(/srv/openHAB2/conf/automation/lib/python/personal/mylib.py:172) [?:?]
	at personal.mylib$py.call_function(/srv/openHAB2/conf/automation/lib/python/personal/mylib.py) [?:?]
	at org.python.core.PyTableCode.call(PyTableCode.java:167) [?:?]
	at org.python.core.PyBaseCode.call(PyBaseCode.java:307) [?:?]
	at org.python.core.PyFunction.function___call__(PyFunction.java:471) [?:?]
	at org.python.core.PyFunction.__call__(PyFunction.java:466) [?:?]
	at org.python.pycode._pyx35.notifyInternalConnectionState$22(<script>:414) [?:?]
	at org.python.pycode._pyx35.call_function(<script>) [?:?]
	at org.python.core.PyTableCode.call(PyTableCode.java:167) [?:?]
	at org.python.core.PyBaseCode.call(PyBaseCode.java:170) [?:?]
	at org.python.core.PyFunction.__call__(PyFunction.java:434) [?:?]
	at org.python.pycode._pyx35.connectionLoxoneTorChanged$24(<script>:427) [?:?]
	at org.python.pycode._pyx35.call_function(<script>) [?:?]
	at org.python.core.PyTableCode.call(PyTableCode.java:167) [?:?]
	at org.python.core.PyBaseCode.call(PyBaseCode.java:138) [?:?]
	at org.python.core.PyFunction.__call__(PyFunction.java:413) [?:?]
	at core.rules$py.execute$4(/srv/openHAB2/conf/automation/lib/python/core/rules.py:39) [?:?]
	at core.rules$py.call_function(/srv/openHAB2/conf/automation/lib/python/core/rules.py) [?:?]
	at org.python.core.PyTableCode.call(PyTableCode.java:167) [?:?]
	at org.python.core.PyBaseCode.call(PyBaseCode.java:307) [?:?]
	at org.python.core.PyBaseCode.call(PyBaseCode.java:198) [?:?]
	at org.python.core.PyFunction.__call__(PyFunction.java:482) [?:?]
	at org.python.core.PyMethod.instancemethod___call__(PyMethod.java:237) [?:?]
	at org.python.core.PyMethod.__call__(PyMethod.java:228) [?:?]
	at org.python.core.PyMethod.__call__(PyMethod.java:218) [?:?]
	at org.python.core.PyMethod.__call__(PyMethod.java:213) [?:?]
	at org.python.core.PyObject._jcallexc(PyObject.java:3626) [?:?]
	at org.python.core.PyObject._jcall(PyObject.java:3658) [?:?]
	at org.python.proxies.core.rules$_FunctionRule$17.execute(Unknown Source) [?:?]
	at org.eclipse.smarthome.automation.module.script.rulesupport.shared.simple.SimpleRuleActionHandlerDelegate.execute(SimpleRuleActionHandlerDelegate.java:34) [217:org.eclipse.smarthome.automation.module.script.rulesupport:0.10.0.oh240]
	at org.eclipse.smarthome.automation.module.script.rulesupport.internal.delegates.SimpleActionHandlerDelegate.execute(SimpleActionHandlerDelegate.java:60) [217:org.eclipse.smarthome.automation.module.script.rulesupport:0.10.0.oh240]
	at org.eclipse.smarthome.automation.core.internal.RuleEngineImpl.executeActions(RuleEngineImpl.java:1199) [212:org.eclipse.smarthome.automation.core:0.10.0.oh240]
	at org.eclipse.smarthome.automation.core.internal.RuleEngineImpl.runRule(RuleEngineImpl.java:1004) [212:org.eclipse.smarthome.automation.core:0.10.0.oh240]
	at org.eclipse.smarthome.automation.core.internal.TriggerHandlerCallbackImpl$TriggerData.run(TriggerHandlerCallbackImpl.java:91) [212:org.eclipse.smarthome.automation.core:0.10.0.oh240]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:?]
	at java.lang.Thread.run(Thread.java:748) [?:?]
Caused by: javax.mail.MessagingException: Exception reading response
	at com.sun.mail.smtp.SMTPTransport.readServerResponse(SMTPTransport.java:2153) ~[27:javax.mail:1.4.4]
	at com.sun.mail.smtp.SMTPTransport.openServer(SMTPTransport.java:1912) ~[27:javax.mail:1.4.4]
	at com.sun.mail.smtp.SMTPTransport.protocolConnect(SMTPTransport.java:638) ~[27:javax.mail:1.4.4]
	at javax.mail.Service.connect(Service.java:317) ~[27:javax.mail:1.4.4]
	at javax.mail.Service.connect(Service.java:176) ~[27:javax.mail:1.4.4]
	at javax.mail.Service.connect(Service.java:125) ~[27:javax.mail:1.4.4]
	at javax.mail.Transport.send0(Transport.java:194) ~[27:javax.mail:1.4.4]
	at javax.mail.Transport.send(Transport.java:124) ~[27:javax.mail:1.4.4]
	at org.apache.commons.mail.Email.sendMimeMessage(Email.java:1411) ~[?:?]
	... 50 more
Caused by: java.net.SocketTimeoutException: Read timed out
	at java.net.SocketInputStream.socketRead0(Native Method) ~[?:?]
	at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) ~[?:?]
	at java.net.SocketInputStream.read(SocketInputStream.java:171) ~[?:?]
	at java.net.SocketInputStream.read(SocketInputStream.java:141) ~[?:?]
	at com.sun.mail.util.TraceInputStream.read(TraceInputStream.java:110) ~[27:javax.mail:1.4.4]
	at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) ~[?:?]
	at java.io.BufferedInputStream.read(BufferedInputStream.java:265) ~[?:?]
	at com.sun.mail.util.LineInputStream.readLine(LineInputStream.java:89) ~[27:javax.mail:1.4.4]
	at com.sun.mail.smtp.SMTPTransport.readServerResponse(SMTPTransport.java:2131) ~[27:javax.mail:1.4.4]
	at com.sun.mail.smtp.SMTPTransport.openServer(SMTPTransport.java:1912) ~[27:javax.mail:1.4.4]
	at com.sun.mail.smtp.SMTPTransport.protocolConnect(SMTPTransport.java:638) ~[27:javax.mail:1.4.4]
	at javax.mail.Service.connect(Service.java:317) ~[27:javax.mail:1.4.4]
	at javax.mail.Service.connect(Service.java:176) ~[27:javax.mail:1.4.4]
	at javax.mail.Service.connect(Service.java:125) ~[27:javax.mail:1.4.4]
	at javax.mail.Transport.send0(Transport.java:194) ~[27:javax.mail:1.4.4]
	at javax.mail.Transport.send(Transport.java:124) ~[27:javax.mail:1.4.4]
	at org.apache.commons.mail.Email.sendMimeMessage(Email.java:1411) ~[?:?]
	... 50 more

The KNX connection was not resumed for several hours, after restarting openHAB (and no other service), everything was fine again.

Any progress here?

I’ve got excact the same problem… and its annoying.

I also tried running the latest beta of 2.5.0 and its the same problem. He does not try to reconnect.

Thanks,
Patrick

Sorry, no update from me on this problem. Unfortunatly (or better: luckily) the problem disappeared, even that I didn’t change anything… OpenHAB has been running stable for the last 4 months or so…

Did you guys report this on GitHub as a problem?
I think the developers will only look at the problem if they get a bug ticket on the GitHub project. They may not read all those community issues and filter down to the KNX topic …

Nope. Me not. As the problem is gone now, it didn’t invest anymore efforts on it…

Me neither… I dont have any github account or expierience :slight_smile: Anyone here arround could insert the bug ticket for me? :mage:

There is already an issue (I believe it’s not even the only one):

Ah, and I think the original reason is a short outage of the network connection (as the failing email sending shows), but the KNX binding fails to recover.
I experienced the same today… I have a telegram bot which is supposed to notify me when the KNX binding is down, but unfortunately it failed to send the message es well ^.

Hi all - my first comment on the community! I’ve been facing this same issue for a long time. Sometimes I don’t lose the connection in weeks, perhaps months, but many times it just takes a day or two. Seems it’s been getting worse over the last month or so. I’m using ABB 6136/APP-500 ControlTouch visualisation server as a tunnel to KNX bus. Based on this post there’s been similar issues with ABB devices.

openhab.log says:

2020-02-27 20:41:52.762 [WARN ] [Xnet/IP Tunneling 192.168.1.199:3671] - received service acknowledgment with wrong send sequence 237, expected 238 - ignored
2020-02-27 20:41:52.781 [ERROR] [Xnet/IP Tunneling 192.168.1.199:3671] - close connection - maximum send attempts
tuwien.auto.calimero.KNXAckTimeoutException: maximum send attempts, no service acknowledgment received
	at tuwien.auto.calimero.knxnetip.ConnectionBase.send(ConnectionBase.java:250) ~[?:?]
	at tuwien.auto.calimero.knxnetip.KNXnetIPTunnel.send(KNXnetIPTunnel.java:178) ~[?:?]
	at tuwien.auto.calimero.link.KNXNetworkLinkIP.onSend(KNXNetworkLinkIP.java:243) ~[?:?]
	at tuwien.auto.calimero.link.AbstractLink.send(AbstractLink.java:351) ~[?:?]
	at tuwien.auto.calimero.link.KNXNetworkLinkIP.sendRequestWait(KNXNetworkLinkIP.java:222) ~[?:?]
	at tuwien.auto.calimero.mgmt.TransportLayerImpl.connect(TransportLayerImpl.java:327) ~[?:?]
	at tuwien.auto.calimero.mgmt.ManagementClientImpl.send(ManagementClientImpl.java:796) ~[?:?]
	at tuwien.auto.calimero.mgmt.ManagementClientImpl.sendWait2(ManagementClientImpl.java:824) ~[?:?]
	at tuwien.auto.calimero.mgmt.ManagementClientImpl.readDeviceDesc(ManagementClientImpl.java:447) ~[?:?]
	at tuwien.auto.calimero.mgmt.ManagementProceduresImpl.isAddressOccupied(ManagementProceduresImpl.java:310) ~[?:?]
	at org.openhab.binding.knx.internal.client.AbstractKNXClient.isReachable(AbstractKNXClient.java:338) ~[?:?]
	at org.openhab.binding.knx.handler.AbstractKNXThingHandler.pollDeviceStatus(AbstractKNXThingHandler.java:144) ~[?:?]
	at org.openhab.binding.knx.handler.AbstractKNXThingHandler.lambda$1(AbstractKNXThingHandler.java:184) ~[?:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:?]
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:?]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:?]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:?]
	at java.lang.Thread.run(Thread.java:748) [?:?]

so it seems similar to issues above you guys have.

My bridge settings are simple, I’ve removed all unnecessary lines after googling this issue, but this hasn’t helped:

Bridge knx:ip:bridge "KNX ControlTouch" @ "tekninen tila" [ 
    type = "TUNNEL", 
    ipAddress = "192.168.1.199", 
    autoReconnectPeriod = 30
]

I’ve got roughly 50 KNX devices on my house, currently 20 of them has been defined as things on knx.things file.

So far I haven’t found a solution for this. It seems I need to implement some kind of auto-restart rule for the KNX binding. I’ve seen some examples. I think the key is what is the best way to detect the connection close. What’s your proposal?

Hi Heikkij,

unfortunately I can’t say anything about ABB devices - I use a Weinzierl KNX IP Gateway and had the connection problem in my first weeks using openHAB, too. But now the problem is gone. I tweaked the settings of the KNX Bridge a little bit, but can’t say, if these tweaks were deciding.
Here is my bridge definition:

Bridge knx:ip:bridge "KNX IP Gateway" [
    ipAddress="192.168.2.24",
    localIp="192.168.2.20",  //no fqdn here, as this would be resolved to 127.0.0.1
    type="TUNNEL", //or ROUTER
    portNumber=3671, 
    readingPause=50,
    responseTimeout=20,
    readRetriesLimit=5,
    autoReconnectPeriod=60,
    localSourceAddress="0.0.0"  // this physical address must be unused (even used by tunnel is not allowed!)
]

Is there someone who has this problem in oh4 also?

I was on oh 3.4.3 long time, no problems with knx, even before with oh2 no problem.

Yesterday i upgraded my whole openhab server and now i´m on oh 4.1.1.

Now i get this error with knx binding, which i never had before.

I changed the bridge to router-mode, but there i got no connection, only tunnel-mode works. This is the same as in old oh versions, always only tunnel was working.

received service acknowledgment with wrong send sequence 217, expected 218 - ignored

After that i get following errors in log:

response timeout waiting for confirmation

no confirmation reply received for 0.0.0->1.1.255 L_Data.req, system priority hop count 6 repeat, tpdu 81

disconnected not gracefully (timeout)