Multicast issues leading to severe network and server degradation

  • Platform information:

    • Hardware: armv7l, x86_64
    • OS: debian, armbian, raspberryos
    • Java Runtime Environment: zulu17.46+19, openjdk 17.0.9
    • openHAB version: 4.1.1
  • Issue of the topic:

    • Multicast advertisement of openHAB instance leads to avahi errors flooding the logs (/var/log/syslog) with avahi_normalize_name() failed messages.
    • I started investigating when I noticed serious network performance degradation (made worse by running multiple openHAB servers on my network) and even unresponsive systems, because the logs were filling the entire partition with several avahi error messages per second. Removing avahi from the systems stopped the logging, but I still saw strange mDNS behavior, even with only one openHAB instance running.
    • My network has several VLANs, but all openHAB servers run on the same network. I use OPNsense as router/firewall with Unbound DNS, although this should not be relevant. I think I have eliminated pretty much all other possible causes, and I suspect others will run into similar issues sooner or later. Many other multicast (IoT) devices on the network do not cause any of these issues and get IP addresses assigned just fine.
  • Logs:

    • Running avahi-browse --resolve --all shows that the multicast advertisement can initially be resolved. However, additional openHAB advertisements keep appearing, not all of which can be resolved to an IP address, and these end in timeouts. With multiple openHAB instances running, a flood of timeouts and errors can be observed.
+ wlp170s0 IPv6 openhab                                       _openhab-server._tcp local
+ wlp170s0 IPv4 openhab                                       _openhab-server._tcp local
+ wlp170s0 IPv4 openhab-ssl                                   _openhab-server-ssl._tcp local
+ wlp170s0 IPv6 openhab-ssl                                   _openhab-server-ssl._tcp local
= wlp170s0 IPv4 openhab-ssl                                   _openhab-server-ssl._tcp local
   hostname = [guesthouse-sst.local]
   address = [192.168.1.48]
   port = [8443]
   txt = ["uri=/rest"]
= wlp170s0 IPv4 openhab                                       _openhab-server._tcp local
   hostname = [guesthouse-sst.local]
   address = [192.168.1.48]
   port = [8080]
   txt = ["uri=/rest"]
+ wlp170s0 IPv4 openhab-ssl (2)                               _openhab-server-ssl._tcp local
= wlp170s0 IPv4 openhab-ssl (2)                               _openhab-server-ssl._tcp local
   hostname = [guesthouse-sst.local]
   address = [192.168.1.48]
   port = [8443]
   txt = ["uri=/rest"]
+ wlp170s0 IPv6 openhab-ssl (2)                               _openhab-server-ssl._tcp local
+ wlp170s0 IPv6 openhab (2)                                   _openhab-server._tcp local
Failed to resolve service 'openhab' of type '_openhab-server._tcp' in domain 'local': Timeout reached
Failed to resolve service 'openhab-ssl' of type '_openhab-server-ssl._tcp' in domain 'local': Timeout reached
Failed to resolve service 'openhab-ssl (2)' of type '_openhab-server-ssl._tcp' in domain 'local': Timeout reached
Failed to resolve service 'openhab (2)' of type '_openhab-server._tcp' in domain 'local': Timeout reached
+ wlp170s0 IPv6 openhab (3)                                   _openhab-server._tcp local
+ wlp170s0 IPv6 openhab (4)                                   _openhab-server._tcp local
Failed to resolve service 'openhab (3)' of type '_openhab-server._tcp' in domain 'local': Timeout reached
+ wlp170s0 IPv6 openhab-ssl (3)                               _openhab-server-ssl._tcp local
+ wlp170s0 IPv6 openhab (5)                                   _openhab-server._tcp local
+ wlp170s0 IPv4 openhab (5)                                   _openhab-server._tcp local
= wlp170s0 IPv4 openhab (5)                                   _openhab-server._tcp local
   hostname = [guesthouse-sst.local]
   address = [192.168.1.48]
   port = [8080]
   txt = ["uri=/rest"]
Failed to resolve service 'openhab (4)' of type '_openhab-server._tcp' in domain 'local': Timeout reached
+ wlp170s0 IPv6 openhab-ssl (4)                               _openhab-server-ssl._tcp local
Failed to resolve service 'openhab-ssl (3)' of type '_openhab-server-ssl._tcp' in domain 'local': Timeout reached
+ wlp170s0 IPv6 openhab (6)                                   _openhab-server._tcp local
Failed to resolve service 'openhab (5)' of type '_openhab-server._tcp' in domain 'local': Timeout reached
+ wlp170s0 IPv6 openhab-ssl (5)                               _openhab-server-ssl._tcp local
Failed to resolve service 'openhab-ssl (4)' of type '_openhab-server-ssl._tcp' in domain 'local': Timeout reached
Failed to resolve service 'openhab (6)' of type '_openhab-server._tcp' in domain 'local': Timeout reached
+ wlp170s0 IPv6 openhab-ssl (6)                               _openhab-server-ssl._tcp local
Failed to resolve service 'openhab-ssl (5)' of type '_openhab-server-ssl._tcp' in domain 'local': Timeout reached
+ wlp170s0 IPv6 openhab (7)                                   _openhab-server._tcp local
+ wlp170s0 IPv6 openhab-ssl (7)                               _openhab-server-ssl._tcp local
+ wlp170s0 IPv4 openhab-ssl (7)                               _openhab-server-ssl._tcp local
= wlp170s0 IPv4 openhab-ssl (7)                               _openhab-server-ssl._tcp local
   hostname = [guesthouse-sst.local]
   address = [192.168.1.48]
   port = [8443]
   txt = ["uri=/rest"]
Failed to resolve service 'openhab-ssl (6)' of type '_openhab-server-ssl._tcp' in domain 'local': Timeout reached
+ wlp170s0 IPv6 openhab (8)                                   _openhab-server._tcp local
Failed to resolve service 'openhab (7)' of type '_openhab-server._tcp' in domain 'local': Timeout reached
+ wlp170s0 IPv6 openhab-ssl (8)                               _openhab-server-ssl._tcp local
Failed to resolve service 'openhab-ssl (7)' of type '_openhab-server-ssl._tcp' in domain 'local': Timeout reached

and this goes on and on…
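To put numbers on output like the above, here is a minimal Python sketch that tallies announcements, failed resolutions, and the highest "(N)" rename suffix per service type. It assumes the output was saved to a text file and that the lines look exactly like the ones pasted here; the function name summarize and the regular expressions are my own and not part of avahi or openHAB.

```python
import re
from collections import Counter

# "+ wlp170s0 IPv6 openhab (3)    _openhab-server._tcp local"
NEW_RE = re.compile(
    r"^\+\s+(\S+)\s+(IPv[46])\s+(.+?)\s+(_\S+\._(?:tcp|udp))\s+(\S+)\s*$"
)
# "Failed to resolve service 'openhab (3)' of type '...' in domain 'local': Timeout reached"
FAIL_RE = re.compile(
    r"^Failed to resolve service '(.+?)' of type '(.+?)' in domain '(.+?)': (.+)$"
)
# Trailing " (N)" that marks a renamed service instance
SUFFIX_RE = re.compile(r"^(.*) \((\d+)\)$")


def summarize(lines):
    """Return (announced, failed, highest_suffix), each keyed by service type."""
    announced = Counter()
    failed = Counter()
    highest_suffix = {}
    for line in lines:
        line = line.rstrip("\n")
        m = NEW_RE.match(line)
        if m:
            name, stype = m.group(3), m.group(4)
            announced[stype] += 1
            s = SUFFIX_RE.match(name)
            if s:
                # Track the largest rename counter seen for this service type
                highest_suffix[stype] = max(
                    highest_suffix.get(stype, 1), int(s.group(2))
                )
            continue
        m = FAIL_RE.match(line)
        if m:
            failed[m.group(2)] += 1
    return announced, failed, highest_suffix
```

For example, summarize(open("browse.txt")) on a saved capture (browse.txt is a hypothetical file name) gives a quick view of how far the rename counter has climbed and how many resolutions timed out.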

I suspect that this issue started sometime in December after an openHAB update, possibly with the introduction of the mDNS add-on discovery feature, but I am not sure because I did not test older versions of openHAB. I did test and confirm this behavior on 5 different servers running different Debian variants and Java 17 variants (OpenJDK / Zulu).

I managed to get my systems under control by turning off openHAB’s mDNS. This can be done by placing a file called org.openhab.mdns.cfg in openhab-userdata/etc with the following content:

enabled=false
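Since the userdata folder is in a different place depending on how openHAB was installed, here is a small Python sketch that writes the file for a directory you pass in. The function name disable_mdns is hypothetical, and the path you give it is an assumption you need to check on your own system (for example by looking at the OPENHAB_USERDATA environment variable).

```python
from pathlib import Path


def disable_mdns(userdata_dir):
    """Write etc/org.openhab.mdns.cfg containing enabled=false under userdata_dir."""
    cfg = Path(userdata_dir) / "etc" / "org.openhab.mdns.cfg"
    # Create the etc folder if it is missing; leave it alone if it exists
    cfg.parent.mkdir(parents=True, exist_ok=True)
    cfg.write_text("enabled=false\n", encoding="utf-8")
    return cfg
```

The function returns the path it wrote, which makes it easy to confirm the file ended up where openHAB actually looks for it.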

Again, I find it hard to imagine that others are not running into similar issues, so I am posting this issue together with the workaround (simply turning multicast off).


Same problem here. It’s causing avahi-daemon to spike CPU across my whole network and clogging my wifi with useless multicast. It seems to get worse over time. Here’s a sample packet capture. In Wireshark, the large 1400+ byte packets decode as multiple mDNS queries with different sequence numbers, just like the small packets, only bigger.

19:29:49.011000 IP 10.69.186.40.5353 > 224.0.0.251.5353: 0 [b2&3=0x200] [43a] - Pastebin.com

Your suggestion didn’t work for me, or I put the file in the wrong place. I can get it to calm down by stopping this bundle:

185 │ Resolved │  80 │ 4.1.1                  │ openHAB Core :: Bundles :: REST mDNS Announcer