OpenHAB cloud notifications not reliable, disconnecting frequently

Hi, I have been running Uptime Kuma for some time now, and I monitor 2 instances of openHAB both locally and over the cloud.

Both run 99%+ of the time locally, even though one is in a remote cabin.

I see that the cloud connection drops, which is OK, but it takes a long time to get back online. Why is that?

So here is the first one, OH 2.5 at home on fiber optic, which has 99.99% uptime. The cloud connection of this OH instance has 33% uptime over 30 days. The older OH version may be a factor there.

But why does the remote OH 4.2.0 also have only 90% uptime? It is on 4G, and the modem usually has some 98% uptime. In other words, my notifications aren't working, and whenever I check it, the dang thing is offline.

I only care about it being offline because of notifications, which I want to be reliable for watering and security.

I can always reach the dashboard over VPN even when the cloud connection is offline, so long internet outages are ruled out here.

It may have to do with our ISPs glitching traffic here and there, which messes with the cloud service, but how do I debug it and trick it into going back online faster?
I did enable logging for the cloud service, but there is nothing special to be seen: it says offline, waits for a while, tries again, and eventually gets online, but it takes too long.

I have seen that the trick of toggling reset on the OH cloud service sometimes helps temporarily; it seems to clear some issues for a few days. After a restart of OH it also keeps the connection better. But what is the real fix?

Does anyone have a similar experience?
Solutions appreciated.

OH2.5 on fiber

OH4 on 4G remotely

Cheers Matej

There is already another topic open for the same problem - maybe your issue is covered there?

Hi
Thanks, I can't tell whether that topic has the same cause, as in my case remote access over the cloud is usually also not working; it only works via VPN.

I have a third instance I can check. It's a bit newer, 4.3.2, and it runs on the same fiber internet at home as the 2.5.

I think it has to do with the connection dropping and not coming back. I saw a clue in your thread, @openhabgs: one maintainer said a zombie service was running in parallel, which may cause issues. I didn’t know it could run like that. It sounds suspicious that 2 services try connecting to the same cloud with the same ID, or am I interpreting it wrong?

But I see issues on different versions of OH, which is weird, as I can't point at one version as the cause. I may try out the newest official release as well to see what's the deal.

I don’t think these are the same problem.

Sorry, what are these graphs actually monitoring? Is this an HTTP check, and if so, to what URL exactly? How often is it checking?

Also, posting DEBUG logs from the cloud connector add-on would be helpful to understand if it's going up and down.

Hi

I monitor this URL every 60s, with user and password of course.
https://home.myopenhab.org/basicui/app?sitemap=openhab_doma
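For anyone who wants to reproduce this kind of check without Uptime Kuma, here is a minimal Python sketch of a comparable probe. It assumes the check uses HTTP basic auth and treats any 2xx response as "up"; the function names `build_request`, `is_up` and `uptime_percent` are made up for illustration and are not part of Uptime Kuma or openHAB.

```python
import base64
import urllib.error
import urllib.request

# URL from the post above; the credentials are whatever your myopenhab account uses.
URL = "https://home.myopenhab.org/basicui/app?sitemap=openhab_doma"


def build_request(url, user, password):
    """Build a GET request carrying an HTTP basic auth header (assumed auth scheme)."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def is_up(request, timeout=10, opener=urllib.request.urlopen):
    """Return True for a 2xx answer, False on HTTP errors, timeouts or DNS failures."""
    try:
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # URLError also covers HTTPError (4xx/5xx) and name resolution failures.
        return False


def uptime_percent(results):
    """Share of successful checks, as a percentage; 0.0 when there are no checks."""
    if not results:
        return 0.0
    return 100.0 * sum(results) / len(results)
```

Calling `is_up(build_request(URL, user, password))` once every 60 seconds and feeding the collected booleans to `uptime_percent` gives a number comparable to the graphs, with the caveat that this probe cannot tell a cloud outage from a local DNS or network problem.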

The first check on OH cloud confirms that it is going up and down. I may be able to grab logs later; for a start I snipped this

Thanks for the response :grinning_face_with_big_eyes:

I’ve set up “log harvesting” with

log:set DEBUG org.openhab.io.openhabcloud

I disabled some rule logs, and as soon as I get disconnections recorded I will post the logs.

Great, that will help quite a bit, I hope.

Hi, I got bored waiting for the active OH to go offline, but I got this from the OH 4.2 on my remote Pi4 system, which was already offline when I enabled logging for the cloud. The log repeats every 10s or so.

00:38:23.913 [DEBUG] [.io.openhabcloud.internal.CloudClient] - Socket.IO re-connecting (attempt 5606)
00:38:33.936 [DEBUG] [.io.openhabcloud.internal.CloudClient] - Error connecting to the openHAB Cloud instance: EngineIOException websocket error. Should reconnect automatically.
00:38:33.945 [DEBUG] [.io.openhabcloud.internal.CloudClient] - Socket.IO re-connect attempt error: SocketIOException Connection error
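To get a feel for how long the connector had been retrying, the attempt counter in these lines can be pulled out with a few lines of Python. This is only a sketch: the regular expression is written against the three lines above, and the 10 second interval is an assumption taken from the "every 10s or so" observation, not a documented retry interval. `parse_attempts` and `estimate_retry_hours` are hypothetical helper names.

```python
import re
from datetime import datetime

# Matches the "re-connecting (attempt N)" DEBUG line as it appears in the log above.
ATTEMPT_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}\.\d{3}) \[DEBUG\] \[.*CloudClient\] - "
    r"Socket\.IO re-connecting \(attempt (\d+)\)"
)

SAMPLE = [
    "00:38:23.913 [DEBUG] [.io.openhabcloud.internal.CloudClient] - Socket.IO re-connecting (attempt 5606)",
    "00:38:33.936 [DEBUG] [.io.openhabcloud.internal.CloudClient] - Error connecting to the openHAB Cloud instance: EngineIOException websocket error. Should reconnect automatically.",
    "00:38:33.945 [DEBUG] [.io.openhabcloud.internal.CloudClient] - Socket.IO re-connect attempt error: SocketIOException Connection error",
]


def parse_attempts(lines):
    """Return (time, attempt_number) pairs for every re-connect line found."""
    attempts = []
    for line in lines:
        match = ATTEMPT_RE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%H:%M:%S.%f").time()
            attempts.append((stamp, int(match.group(2))))
    return attempts


def estimate_retry_hours(attempt_number, interval_s=10.0):
    """Rough retry duration, ASSUMING a constant interval between attempts."""
    return attempt_number * interval_s / 3600.0
```

Under that constant-interval assumption, attempt 5606 would correspond to roughly 15 to 16 hours of continuous retrying, which fits with the instance having been offline for a long while before logging was enabled.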

There should be more logs than that? It would be helpful to send many more entries so we can see the pattern; it's hard to tell from just three lines. If this connection attempt is happening over and over again, I would suggest putting the binding into TRACE logging and grabbing those logs (you can set it back to DEBUG afterwards, as it can be quite noisy). The more info I have, the better I can help.

Hi, I got a first solution for the remote Raspberry with OH 4.2; it has now been online for a day.

Thanks. As you said TRACE, I remembered to do my own manual trace to myopenhab as well :sweat_smile:, to test from the inside out.

The websocket error gave me a hint. Then I scrounged the forum for similar events and their cause, and stumbled on my old post, OpenHAB Cloud connector constantly OFFLINE :sweat_smile:

I found the culprit was again DNS: an intermittent Tailnet DNS that got injected as the main DNS at install instead of the default nameserver on the Raspberry. With Tailscale as the main DNS and no backup DNS configured, it is obviously asking for trouble. It's time to ask Tailscale about it.
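An intermittent resolver like this can be made visible by resolving the cloud hostname repeatedly and counting the failures. The sketch below uses the system resolver through `socket.getaddrinfo`; `resolve_failure_rate` is a hypothetical helper, and local caching (for example by systemd-resolved) may hide some failures, so treat the result as a lower bound.

```python
import socket
import time


def resolve_failure_rate(hostname, attempts=20, delay_s=1.0,
                         resolver=socket.getaddrinfo, sleep=time.sleep):
    """Resolve hostname `attempts` times and return the fraction of failed lookups."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    failures = 0
    for i in range(attempts):
        try:
            resolver(hostname, 443)
        except socket.gaierror:
            # Raised when the name cannot be resolved, e.g. the DNS server did not answer.
            failures += 1
        if i < attempts - 1:
            sleep(delay_s)
    return failures / attempts
```

Running `resolve_failure_rate("home.myopenhab.org")` on the openHAB machine itself is the interesting case, because VPN access by IP address keeps working even when name resolution is failing.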

I remembered Jeff Geerling saying it’s always the DNS…

I was unable to fix DNS manually on Linux, as it would get overwritten by TS on restart, so luckily I found that the Tailscale dashboard has an alternate DNS entry to select.

I am thinking I should disable the Tailscale DNS completely if it messes with me any more; I use IPs anyway.

The issue was so silent because remote VPN access worked, so I thought the internet connection was OK.

I will continue to monitor all instances to see what's the deal now, as I suspect they may still have an underlying cause for dropping the cloud connection.

The local OH 2.5 was notorious for being offline, while OH 4.3 still wasn't spotless, staying offline for over a day last week.

Thanks @digitaldan for the ideas and support so far.

Hi

It really appears it was a Tailscale-related issue spread over several systems connected to the same Tailnet VPN.

That default Tailscale DNS, 100.100.100.100, was the cause, as it was intermittent. It gets set by default at install and overrides the DNS set by the router or a fixed setup.

So I had to set a backup DNS like 1.1.1.1 in Tailscale. It's weird that it isn't there by default, dang it.
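A quick way to see whether a machine is in the same situation is to look at which nameservers its resolver config lists. The sketch below parses text in `/etc/resolv.conf` format; `parse_nameservers` and `only_tailscale_dns` are hypothetical helpers. It assumes the system actually uses that file directly: with systemd-resolved it may only show a local stub address, and a backup DNS configured on the Tailscale side will, as far as I understand, not show up in this file either.

```python
TAILSCALE_DNS = "100.100.100.100"


def parse_nameservers(resolv_conf_text):
    """Return the addresses from all 'nameserver' lines, ignoring comments."""
    servers = []
    for raw in resolv_conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or line.startswith(";"):
            continue
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers


def only_tailscale_dns(servers):
    """True when every listed nameserver is the Tailscale resolver address."""
    return bool(servers) and all(server == TAILSCALE_DNS for server in servers)
```

Reading the file with `open("/etc/resolv.conf").read()` and passing the text through both functions shows whether every lookup on that machine goes through the Tailscale resolver.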

How are others who use Tailscale not having that issue? Surely they didn’t poke around in the DNS settings as the first thing after installing Tailscale?

I am probably not the only one using Tailscale with openHAB.

While the fastest solution I found was setting an alternate DNS in the Tailscale web UI, I am not sure it's the most reliable one, since it depends on config in a web UI to guarantee reliable DNS.

Hope that helps others, and that the OH cloud connection will be reliable now.