Disconnected from the openHAB Cloud service

So what do we learn from this: After 1 failed reconnect (so if it’s not just a connection drop which can be fixed instantly) the cloud connector should back off using a random time so that the reconnects are not all at once. In case this happens again and the cloud connectors don’t automatically go into the back off because the cloud dropped all clients and immediately allowed reconnections you can just stop the service for a minute and then restart so everyone is connecting at a random time again. Something simple like between 5 and 60 seconds would probably already help a lot.

Or just rate-limit the amount of new connections per second on the server side, that should slow things down aswell (but might make DoS-Attacks easier).

And also what we learned: The reconnect fix which someone implemented is working :wink: I believe it has never happened before that there were so many reconnections and the server got so overwhelmed?

So i’m not 100% sure what the issue was that caused this, this but the service is up and running. I’m going to try and do some digging into what the root cause is, our provider has had network issues over the last 6 months that have caused similar issues where everyone disconnects, but the service has always recovered on its own. Strange.

1 Like

Hmm…for me it seems that my entire openhab instance has crashed first around 19 CET and then again sometime during the evening.

Could be a coincidence but it seems a bit of strange timing since the install has been running fine for years…

Edit: At 19:06 my OpenHab virtual machine seems to have frozen. At ~20:00 I noticed and it ran fine until 22:44 where it again seems to have died. I suspect out of lack of memory, it has however run all year round on that amount of memory for 2+ years. Maybe there could be some sort of memory issue connected to a fail in the cloud connector? Otherwise I’d like an award for least likely coincidence of 2022.

Working again here. :+1:

hi Folks, just a question around the cloud service…

I had the same issue yesterday and I was wondering, when the cloud service is not available, why is the app not using the direct IP-adress within my home network to access the sitemap?

Many thanks!

BR
Uwe

@Uwe_Samer: my problem was different from yours in the sense that both via app and via broswer on pc everything worked perfectly. what was wrong were the commands given by alexa / google home to those devices that openhab exposes to alexa / google home

(I am the one who tried to incorporate more error recovery logic to the connector)

Yeah this is promising indeed: the theory was that some cloud endpoint changes / issues surface as issues to the client, in such a way that client will remains offline even though cloud would be up. But I guess we have to just wait and see, perhaps only some particular types of errors are problematic, we have never have gotten bottom of the offline client issue.

With the latest openhab, having debug level logging to the cloud connector will reveal errors that were not logged before. Actually, from the logs we could see if the error recovery improvements “kick in” and if the old version would have gotten stuck as offline. You have to enable TRACE or DEBUG level logging for org.openhab.io.openhabcloud

@digitaldan we could also consider tuning the reconnection parameters such that it would be more graceful to the cloud?

That would be great, the initial websocket connection to our cloud service is an expensive operation, so when we have tens of thousands of OH’s all trying to connect at once, we almost DDOS the system.

Unfortunately I am seeing this or a similar behaviour again right now.

Even though I am seeing in the openHAB Logs an entry “Connected to the openHAB Cloud service…” from 2022-06-06 20:33:37.202 (that’s GMT+2) I am seeing “Offline” in myopenhab.org. I increased verbosity to TRACE now but I am not seeing anything in the logs, no pings, nothing. In netstat I am not seeing a connection to 172.104.246.157 though, so myopenhab.org is correct, I am in fact disconnected from the cloud.

Also right before the reconnection there was this:

2022-06-06 20:33:10.246 [ERROR] [io.openhabcloud.internal.CloudClient] - Error connecting to the openHAB Cloud instance. Reconnecting. 
2022-06-06 20:33:37.105 [WARN ] [okhttp3.OkHttpClient                ] - A connection to https://myopenhab.org/ was leaked. Did you forget to close a response body? To see where this was allocated, set the OkHttpClient logger level to FINE: Logger.getLogger(OkHttpClient.class.getName()).setLevel(Level.FINE);
2022-06-06 20:33:37.202 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service....

Did we overwrite the old connection there so it lost it’s last reference there and that’s why it was considered as being leaked at that point in time? If that’s the case we should double-check that on a disconnect-event everything that should be closed is indeed closed. I assume it is not the new connection that is being leaked there as it would be weird that it leaks before a connection success is reported?

I just did a bundle:reload and for some reason it really thinks that it disconnected from the cloud at that point:

2022-06-07 01:10:28.234 [DEBUG] [o.openhabcloud.internal.CloudService] - openHAB Cloud connector deactivated
2022-06-07 01:10:28.235 [INFO ] [io.openhabcloud.internal.CloudClient] - Shutting down openHAB Cloud service connection
2022-06-07 01:10:28.258 [DEBUG] [io.openhabcloud.internal.CloudClient] - Socket.IO disconnected: io client disconnect
2022-06-07 01:10:28.261 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service

I am not sure how that is possible though as there was definitely no connection at that point anymore.

Which version of the openhabcloud addon you are using? I would certainly expect to see socket.io ping/pong messages with the troubleshooting version… Note that ping/pong messages are not logged with release version.

In fact, we have other thread with users reporting successful ping/pong but myopenhab.org saying that the client is offline. This_might_ mean the online/offline tracking server side somehow gets mixed up.

There has been quite a lot of discussion here Openhab cloud connection stops working, does not try to reconnect - #44 by seanch

This is simply the way it is logged, I would not make too large conclusions from that.

Happens to me just now…

2022-10-19 09:32:29.533 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = xxxxxxxx-747c-4eb4-a87a-xxxxxxxxxxxx, base URL = http://localhost:8080)
2022-10-19 09:33:03.352 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service (UUID = xxxxxxxx-747c-4eb4-a87a-xxxxxxxxxxxx, base URL = http://localhost:8080)
2022-10-19 09:33:32.136 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = xxxxxxxx-747c-4eb4-a87a-xxxxxxxxxxxx, base URL = http://localhost:8080)
2022-10-19 09:34:30.267 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service (UUID = xxxxxxxx-747c-4eb4-a87a-xxxxxxxxxxxx, base URL = http://localhost:8080)
2022-10-19 09:34:42.430 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = xxxxxxxx-747c-4eb4-a87a-xxxxxxxxxxxx, base URL = http://localhost:8080)
2022-10-19 09:34:58.911 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service (UUID = xxxxxxxx-747c-4eb4-a87a-xxxxxxxxxxxx, base URL = http://localhost:8080)
2022-10-19 09:35:26.995 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = xxxxxxxx-747c-4eb4-a87a-xxxxxxxxxxxx, base URL = http://localhost:8080)
2022-10-19 09:38:49.026 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service (UUID = xxxxxxxx-747c-4eb4-a87a-xxxxxxxxxxxx, base URL = http://localhost:8080)
2022-10-19 09:39:18.246 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = xxxxxxxx-747c-4eb4-a87a-xxxxxxxxxxxx, base URL = http://localhost:8080)
2022-10-19 09:39:55.081 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service (UUID = xxxxxxxx-747c-4eb4-a87a-xxxxxxxxxxxx, base URL = http://localhost:8080)
2022-10-19 09:40:03.553 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = xxxxxxxx-747c-4eb4-a87a-xxxxxxxxxxxx, base URL = http://localhost:8080)

Status page is saying all is ok…

I’m using openHAB 2.4.0 Release Build.

Tried to restart CloudClient only but it didn’t help.

Reboot of openHab service solved an issue for me.

Happens all the time since at least 2 weeks…

2022-10-19 05:07:57.315 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service (UUID = xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxx,base URL = http://localhost:8080)
2022-10-19 06:05:11.385 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxx,base URL = http://localhost:8080)
2022-10-19 06:06:13.808 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service (UUID = xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxx,base URL = http://localhost:8080)
2022-10-19 06:06:43.205 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxx,base URL = http://localhost:8080)
2022-10-19 06:08:17.418 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service (UUID = xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxx,base URL = http://localhost:8080)
2022-10-19 06:08:47.151 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxx,base URL = http://localhost:8080)
2022-10-19 06:10:38.199 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service (UUID = xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxx,base URL = http://localhost:8080)
2022-10-19 06:11:35.125 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxx,base URL = http://localhost:8080)
2022-10-19 06:15:05.156 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service (UUID = xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxx,base URL = http://localhost:8080)
2022-10-19 06:16:00.755 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxx,base URL = http://localhost:8080)
2022-10-19 06:19:58.123 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service (UUID = xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxx,base URL = http://localhost:8080)
2022-10-19 06:20:24.891 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxx,base URL = http://localhost:8080)
2022-10-19 06:24:29.733 [ERROR] [io.openhabcloud.internal.CloudClient] - Error connecting to the openHAB Cloud instance. Reconnecting.

Same here…

If the issues started 3 weeks ago (your post above is almost week old), it hints that issues have started before [WIP] updated disconnection logic and clean up unused noise by digitaldan · Pull Request #381 · openhab/openhab-cloud · GitHub was merged (9d ago)

Wonder if there was some glitch at the time since so many people have reported issues of these several connect/disconnect events, quite close apart

Based on the posts, restarting openhab fixes the issues. When the above fix is deployed cloud side (maybe it already has been?), it might help to avoid this in the future

My openHAB 3.2.0 installation faces the same issue since about a week or so. For me restarting the openHAB Cloud Connector bundle helps:

openhab> bundle:list|grep -i cloud
300 │ Active  │  80 │ 3.2.0                 │ openHAB Add-ons :: Bundles :: IO :: openHAB Cloud Connector
openhab> bundle:restart 300

It’s happened to me at least three times over the past few days

1 Like

For maintainers it might be interesting to have debug logs, perhaps telling why the client was disconnecter and reconnected

log:set TRACE org.openhab.io.openhabcloud

I am observing the same problem. As well, my log viewer disappears at the same time. I can connect to everything, but :9001 gives me “server is not replying”. Restart open hab is not fixing the issue. Reboot of the server does, but I don’t wan’t to restart my machine every day. Did someone facing the same issue?

1 Like

Same problem here since days Does anyone know about possible problems or changes server side? I didn’t change anything client side and this problem popped up some days ago. It disconnects multiple times a day and restarts solves only temporarily (both openhab service restart and cloud connector instance restart)

1 Like