Time for an update. I have had a few “no response” issues over the last few days, roughly every second or third day. They also occurred with IPv6 completely disabled at the kernel level.
Looking at the logs, each issue seems to be triggered when the jmdns responder tries to send a message to the multicast DNS address (224.0.0.251:5353 for IPv4 and [ff02::fb]:5353 for IPv6). The failing message is one of thousands or tens of thousands of similar or identical messages that normally cause no problems. I’ll copy just a few selected lines from the log to show this:
2022-01-07 10:34:07.454 [DEBUG] [javax.jmdns.impl.tasks.Responder ] - Responder(schaltkiste-fritz-box.local.).run() JmDNS responding
2022-01-07 10:34:07.454 [DEBUG] [javax.jmdns.impl.tasks.Responder ] - Responder(schaltkiste-fritz-box.local.).run() JmDNS responding
2022-01-07 10:34:07.454 [DEBUG] [javax.jmdns.impl.DNSIncoming ] - DNSIncoming() questions:0 answers:1 authorities:0 additionals:0
2022-01-07 10:34:07.454 [DEBUG] [javax.jmdns.impl.DNSIncoming ] - DNSIncoming() questions:0 answers:3 authorities:0 additionals:0
2022-01-07 10:34:07.454 [TRACE] [javax.jmdns.impl.JmDNSImpl ] - send(schaltkiste-fritz-box.local.) JmDNS out:dns[response,224.0.0.251:5353, length=55, id=0x0, flags=0x8400:r:aa, answers=1
answers:
[IPv4Address@1589910590 type: TYPE_A index ...
2022-01-07 10:34:07.455 [TRACE] [javax.jmdns.impl.JmDNSImpl ] - send(schaltkiste-fritz-box.local.) JmDNS out:dns[response,224.0.0.251:5353, length=175, id=0x0, flags=0x8400:r:aa, answers=3
answers:
[Service@1687574434 type: TYPE_SRV index 33, ...
2022-01-07 10:34:07.455 [WARN ] [javax.jmdns.impl.tasks.Responder ] - Responder(schaltkiste-fritz-box.local.)run() exception
java.io.IOException: Das Netzwerk ist nicht erreichbar
at java.net.PlainDatagramSocketImpl.send(Native Method) ~[?:?]
at java.net.DatagramSocket.send(DatagramSocket.java:695) ~[?:?]
at javax.jmdns.impl.JmDNSImpl.send(JmDNSImpl.java:1665) ~[bundleFile:3.5.7]
at javax.jmdns.impl.tasks.Responder.run(Responder.java:154) [bundleFile:3.5.7]
at java.util.TimerThread.mainLoop(Timer.java:556) [?:?]
at java.util.TimerThread.run(Timer.java:506) [?:?]
...
2022-01-07 10:34:07.596 [DEBUG] [javax.jmdns.impl.JmDNSImpl ] - Cancelling JmDNS:
---- Local Host -----
local host info[schaltkiste-fritz-box.local., wlan0:192.168.178.87, DNS: schaltkiste-fritz-box.local. [schaltkiste.fritz.box/192.168.178.87] state: announced task: Renewer(schaltkiste-fritz-box.local.) state: announced]
---- Services -----
Service: openhab._hap._tcp.local.: [ServiceInfoImpl@1030685467 name: 'openHAB._hap._tcp.local.' address: 'schaltkiste.fritz.box/192.168.178.87:9123 null:9123 ' status: 'DNS: schaltkiste-fritz-box.local. [schaltkiste.fritz.box/192.168.178.87] state: announced task: Renewer(schaltkiste-fritz-box.local.) state: announced', has data
ff: 0
ci: 1
sh: N4sWPw==
sf: 0
md: openHAB
s#: 1
c#: 58
id: ba:63:f8:d5:da:b4]
...
The actual exception (java.io.IOException: “Das Netzwerk ist nicht erreichbar”, i.e. “Network is unreachable”) is thrown at the network transport level (java.net.PlainDatagramSocketImpl.send), so it most likely has nothing to do with the HomeKit binding and seems to be caused by a transient problem with the network or the network interface. However, the exception is caught by the responder, which in turn asks the jmdns service to cancel.
From that point on, the openHAB hap service can no longer be discovered on the network. Typically, you can still access your accessories from the Home app for a while, because the clients have cached their connection information for the bridge. Only once this information expires do they fail to discover the service.
Other jmdns modules handle such transport-level exceptions more tolerantly than the responder does. I have examples in the logs where, after a similar exception, the module that caught it asked jmdns to recover the service, and everything was fine afterwards.
So my next step will be to test a modified responder in which such exceptions also lead to a service recovery instead of a cancel.
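To make the difference between the two reactions concrete, here is a minimal sketch in Java. It is not the real JmDNS code: `MdnsService`, `FlakyService`, `ResponderSketch` and all method names are hypothetical stand-ins I made up for illustration. It only shows the idea that the same one-off send failure either takes the service down (cancel) or is survived (recover).

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the mDNS service; NOT the real JmDNS API.
interface MdnsService {
    void send(String message) throws IOException; // may fail on a transient network error
    void cancel();   // stop announcing: the service becomes undiscoverable
    void recover();  // assumed to re-initialise and re-announce the service
}

// Fake service that fails on the first send only and records what happened.
class FlakyService implements MdnsService {
    final List<String> calls = new ArrayList<>();
    private boolean failNext = true;

    @Override
    public void send(String message) throws IOException {
        if (failNext) {
            failNext = false;
            throw new IOException("Network is unreachable");
        }
        calls.add("send:" + message);
    }

    @Override
    public void cancel() { calls.add("cancel"); }

    @Override
    public void recover() { calls.add("recover"); }
}

class ResponderSketch {
    // Reaction that matches what the log shows: a failed send cancels the service.
    static void respondAndCancelOnError(MdnsService dns, String message) {
        try {
            dns.send(message);
        } catch (IOException e) {
            dns.cancel();
        }
    }

    // More tolerant reaction: a failed send triggers a recovery instead.
    static void respondAndRecoverOnError(MdnsService dns, String message) {
        try {
            dns.send(message);
        } catch (IOException e) {
            dns.recover();
        }
    }

    public static void main(String[] args) {
        FlakyService cancelled = new FlakyService();
        respondAndCancelOnError(cancelled, "response");
        System.out.println(cancelled.calls); // [cancel]

        FlakyService recovered = new FlakyService();
        respondAndRecoverOnError(recovered, "response"); // first send fails
        respondAndRecoverOnError(recovered, "response"); // second send succeeds
        System.out.println(recovered.calls); // [recover, send:response]
    }
}
```

In the second variant the responder keeps answering after the glitch, which is the behaviour I hope to get from the modified responder. Whether a recovery is always safe at that point in the real library is something the test will have to show.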