Network binding behind two NAT routers not working

I have a network with two routers in series.
The first router has LAN address 192.168.1.254 and is running a DHCP server.
The WAN side of the second router has IP 192.168.1.1 and its LAN side uses 10.0.0.1.
I wanted to set up a pingdevice to monitor the connection to the first router by pinging 192.168.1.254 from my NAS on the second LAN.
However, I cannot get this to work, and the item just shows UNDEF for latency.
The thing configuration looks like this:

Thing network:pingdevice:fibia_router [ hostname="192.168.1.254", retry=1, timeout=4000, refreshInterval=10000 ]

If I change the IP to an external IP like 8.8.8.8 or an internal IP like 10.0.0.1, it works fine, so I guess the item configuration must be correct.

I tried enabling trace logging on the binding, and it just says:

2022-03-01 19:33:44.545 [TRACE] [g.network.internal.PresenceDetection] - Perform ARP ping presence detection for 192.168.1.254 on interface: ovs_eth1
2022-03-01 19:33:44.546 [TRACE] [g.network.internal.PresenceDetection] - Perform ARP ping presence detection for 192.168.1.254 on interface: ovs_eth0
2022-03-01 19:33:44.546 [TRACE] [g.network.internal.PresenceDetection] - Perform ARP ping presence detection for 192.168.1.254 on interface: docker0
2022-03-01 19:33:44.546 [TRACE] [g.network.internal.PresenceDetection] - Perform java ping presence detection for 192.168.1.254

I also tried setting "binding.network:allowSystemPings=false" (is there any way I can confirm that the binding has read the configuration?)

Does anybody know what the issue might be here?

/Thanks

Can you ping from the command line where openHAB is running? If not then there is your problem. Something in the network is preventing pings. Nothing you can do in openHAB is going to fix that.

I'm just guessing, but if both routers are assigning IP addresses then you probably have double NAT.

You may need to put your second router into bridge mode or forward ports.

Yes, if I run ping 192.168.1.254 from the NAS, it works.
If I run "sudo arping -w 2 -c 1 -I ovs_eth0 192.168.1.254", it does not work.
But I guess this is not so surprising, because an ARP broadcast for 192.168.1.254 on the 10.0.0.0 LAN would not find the MAC address. The same is true for any address outside 10.0.0.0/255.255.255.0.
So I guess the Network binding somehow knows not to use arping, or it tries ping after arping has failed.
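
To make the subnet reasoning above concrete, here is a small Python sketch (Python is used only for illustration; the helper name `arp_can_reach` is made up for this example). It only checks whether a target address lies inside the local subnet, which is the condition under which an ARP request can be answered directly:

```python
import ipaddress

def arp_can_reach(target: str, local_network: str) -> bool:
    """Return True if target lies inside the local subnet, where ARP can resolve it."""
    network = ipaddress.ip_network(local_network, strict=False)
    return ipaddress.ip_address(target) in network

# Addresses taken from this thread; the mask 255.255.255.0 is a /24.
print(arp_can_reach("192.168.1.254", "10.0.0.0/255.255.255.0"))  # False
print(arp_can_reach("10.0.0.1", "10.0.0.0/255.255.255.0"))       # True
```

So for the first router, only a routed method such as ICMP ping has a chance of working from the second LAN.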

Yes, I do have double NAT, but I do not see why this should prevent ping from working.

Like I said, I'm just guessing. Since it's known that double NAT can cause communication issues between computers, it seems worth testing. Best case is that it fixes your problem, worst case is that nothing changes and the issue is elsewhere.

Hello, this feels like a routing problem.
In those cases I usually use traceroute to identify where the packet stops.

I did a test with a clean openHAB install on my PC and configured the Network binding to ping 192.168.1.254, and this worked.
So I guess it might be something with my Docker setup.
I tried "arping -i ovs_eth0 10.0.0.1" inside the container and this works, but there is no ping command available, so I do not know how to test ping. (Does anybody know how to do that?)
The container is created with "--net=host", so it should have full access to the network, but it is not running with "high privilege".

My first idea, before seeing this, was that Docker adds another layer of 'NAT' when not using 'host' mode. But in host mode, it should work.

I don't know what is contained in the openHAB Docker image. If it is Alpine-based, you could try installing ping with 'apk'; otherwise, maybe you could just copy 'ping' into the container for testing...

Thanks a lot. I copied ping into the container, and this is the output:

root@my-nas:/openhab/conf# ./ping 192.168.1.254
PING 192.168.1.254 (192.168.1.254) 56(84) bytes of data.
64 bytes from 192.168.1.254: icmp_seq=1 ttl=63 time=0.994 ms
64 bytes from 192.168.1.254: icmp_seq=2 ttl=63 time=0.896 ms

So ping from inside the container works.
Maybe the Java ping is working in a different way?
It would be really nice to know how the Network binding figures out whether it should use arping or ping.
But I cannot find my way around the source code.

As far as I understand the source code, arping is enabled by default.
If the target is not an IPv4 address or the arping tool cannot be found, it will be disabled.

Are you looking in the performPresenceDetection method inside PresenceDetection.java?
To me it looks like it starts multiple threads. Depending on the arp tool, it starts one or more arping threads and then a thread for ICMP ping. This also makes sense looking at the log in the first post.
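
As a rough picture of that pattern (not the binding's actual code, which is Java), here is a Python sketch that runs several detection checks in parallel. The function name `detect_presence` and the stand-in checks are hypothetical, and treating the device as present when any single check succeeds is an assumption of this sketch:

```python
from concurrent.futures import ThreadPoolExecutor

def detect_presence(checks):
    """Run every check in its own thread; report present if any check succeeds.

    `checks` is a list of zero-argument callables returning True or False,
    for example one arping attempt per interface plus one ICMP ping.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(checks))) as pool:
        results = list(pool.map(lambda check: check(), checks))
    return any(results)

# Stand-ins for the log in the first post: three failing arpings, one working ping.
arping_eth1 = lambda: False
arping_eth0 = lambda: False
arping_docker0 = lambda: False
icmp_ping = lambda: True
print(detect_presence([arping_eth1, arping_eth0, arping_docker0, icmp_ping]))  # True
```

Under that assumption, the failing arpings in the log would not matter as long as the ping thread gets an answer.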

So to see if it has something to do with Java ping, I figured I would try to make it use native ping instead. Looking at determinePingMethod() in NetworkUtils.java, I can see that it tries to ping 127.0.0.1 with native ping to decide if it should use Java ping or native ping.

I cannot understand why I cannot get the Network binding to use native ping now that I have installed the ping command inside the Docker container.
When I look at the "thing properties" it says icmp_state = "Java ping".
If I understand the source code correctly, it should say "System ping feature test failed. Using Java ping" if it had tried to use native ping, and it should do that if "binding.network:allowSystemPings=true".
Any suggestions as to what I am missing?
Is there anybody who uses Docker but does not have icmp_state = "Java ping"?
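
For anyone following along, the selection idea described in this post can be pictured with a short Python sketch. This is only an illustration of "probe localhost, fall back if the probe fails"; the function name `choose_ping_method` and the returned labels are made up here and are not taken from the binding:

```python
import subprocess

def choose_ping_method(probe_command):
    """Run a probe command and pick the system ping only if it exits with code 0."""
    try:
        result = subprocess.run(probe_command, capture_output=True, timeout=5)
    except (FileNotFoundError, PermissionError, subprocess.TimeoutExpired):
        # The tool is missing, not executable, or hangs: fall back.
        return "java ping"
    return "system ping" if result.returncode == 0 else "java ping"

# Hypothetical probe using Linux ping syntax against localhost.
print(choose_ping_method(["ping", "-c", "1", "127.0.0.1"]))
```

In this picture, a ping binary that exists but exits with a non-zero code (for example because of missing privileges) leads to the same fallback as a missing binary.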

Just to sum up.
I found that the reason the binding is not using native ping is that the command returns error code 2 when the binding executes it. I guess this is because it does not have the required access, since the Docker container is not running with "high privilege". When the native ping fails, the binding falls back to the Java method isReachable, which it calls "Java ping", but this also does not have the privilege to send an ICMP ping. The Java documentation says that when isReachable does not have access, it tries to use port 7 (Echo). After searching a bit on isReachable, I found that many others have found it to be unreliable. So I guess this is where the core issue is.
For all addresses on the LAN it will use arping, which works reliably. Therefore most users will not have issues.
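
To illustrate the port 7 fallback mentioned above, here is a Python sketch of an unprivileged TCP reachability probe. It is not the Java implementation; the name `tcp_probe` is made up, and counting a refused connection as "reachable" is an assumption of this sketch:

```python
import socket

def tcp_probe(host: str, port: int = 7, timeout: float = 2.0) -> bool:
    """Approximate an unprivileged reachability check with a TCP connect.

    A completed connection or an explicit refusal both show that the host
    answered; a timeout or routing error is treated as unreachable.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except ConnectionRefusedError:
        return True   # the host actively rejected the connection, so it is up
    except OSError:
        return False  # timeout, no route, packet silently dropped, ...

# Address from this thread; the result depends on how the router treats port 7.
print(tcp_probe("192.168.1.254"))
```

If a device silently drops packets to port 7 instead of refusing them, a probe like this times out even though the device is up, which may be one reason this kind of check is seen as unreliable.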
