Disconnected from the openHAB Cloud service

Does someone know the correct place to raise an issue?
Is this correct or just for the self-hosted version?

Same here, I also cannot log in to myopenhab.org.

Hi guys, we are looking at the issue. I don’t see any obvious problems other than that our servers seem to have been unreachable for some amount of time, which disconnected everyone. Then the service became reachable again, so all the OH instances tried logging in at once, and that has slowed our service down to almost a halt. I’m restarting services just as a precaution, since I’m not 100% sure of the root issue. It will take a little time for everyone to get connected again.

Same issue here … I already tried removing the cloud connector and reinstalling it, without success.

Now there is a downtime visible …

To answer some of the questions above:

  • If your log shows your server repeatedly trying to connect to myopenhab and you haven’t made any changes, you can assume the server is down.
  • Outages don’t always show up on status.openhab.org. Sometimes, the issues are in between the myopenhab server and individual OH servers. This means that the server is up, but we can’t reach it.
  • Anything that’s reliant upon myopenhab (Alexa, Google Assistant, mobile apps, etc.) will not work until myopenhab is back online.
  • The community is the correct place to report an outage.

Should I make changes to my server?

No. Do not make changes to your system to try and get it to reconnect, as that may cause you additional problems. Just leave it alone. If it was working before, it will eventually start working again.

How do I report an outage?

If you see a lot of disconnect/connect reports in your log and believe myopenhab is down, do the following:

  1. Look for a banner like this one at the top of every page in the community.

If you see the banner, there’s nothing more you need to do. Someone’s already trying to fix the problem.

If you don’t see a banner:

  1. Check the community to see if anyone else has reported the outage today.
  2. Check status.openhab.org.
  3. If no one else has posted, then start a new post and tag it as “Apps & Services | openHAB Cloud”.
  4. Tag @digitaldan in your post so that he’ll be notified.

If someone has already started a thread, please don’t start another one (we don’t want to spam Dan with notifications). And if Dan has already posted that he’s looking into it, please don’t ask when it will be back up. He’ll always let us know as soon as it’s working again.

There’s also no need to add “same thing for me” posts, since everyone will be affected. If you want to get notifications, you can change your monitoring status to “Watching”.

Adding more comments notifies everyone who previously posted, and they’ll keep relentlessly checking in the hope that the issue has been fixed. Let’s save ourselves the stress. :wink:

How do I reconnect?

After myopenhab is back up, it can sometimes take a few hours for everyone to reconnect (because we’re all trying to connect at the same time). Please be patient while myopenhab catches up.

I’ve personally found that my server will always reconnect on its own. Sometimes, people have to restart the cloud connector to get that to happen.

  1. In Main UI, click on “openHAB Cloud”
  2. Don’t change any settings. Just click the “Save” button at the top right. This will stop and restart the connector, and you’ll see it in your log.

There’s no need to restart the connector if your log shows that your server is still trying to connect on its own. If that’s the case, there’s still too much traffic overwhelming myopenhab.

Sorry, it was not my intention to spam Dan. I had seen the other threads, but they were all quite dated, so I thought I could open a new one.

No worries. I think you were the first to report it this time, so perfectly reasonable to do so.

I’m mostly trying to head off the rising sense of urgency and mild panic that tends to come with these threads. As soon as Dan is aware of the outage and working on it, we just need to be patient. But that’s hard for users who haven’t experienced this and don’t know what’s going on.

The funny thing is that there have probably been other outages that some folks have never noticed, because we were sleeping/working/etc. It just feels like a major issue as soon as we’re aware of it.

I may need to make an edit to my post, as I should have first checked with Dan that it’s okay to tag him. He’s never minded me doing that before when an outage occurs, but you’re right that we don’t want to overwhelm him while he’s trying to sort out the issues.

So what do we learn from this: after one failed reconnect (that is, when it’s not just a connection drop that can be fixed instantly), the cloud connector should back off for a random time so that the reconnects don’t all happen at once. Something simple like between 5 and 60 seconds would probably already help a lot. If this happens again and the cloud connectors don’t automatically back off, because the cloud dropped all clients and immediately allowed reconnections, you can just stop the service for a minute and then restart it, so that everyone connects at a random time again.
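
To make that concrete, here is a minimal Python sketch of the idea. It is not the cloud connector’s actual code; the function names and the 10,000-client figure are made up for the example, and only the 5 to 60 second window comes from the suggestion above. The first reconnect is immediate, and every retry after a failure waits a uniformly random time.

```python
import random
from collections import Counter


def reconnect_delay(failed_attempts, min_s=5.0, max_s=60.0, rng=random):
    """Seconds to wait before the next reconnect attempt (hypothetical helper).

    failed_attempts is how many reconnects have already failed in a row.
    """
    if failed_attempts <= 0:
        # A plain connection drop: try again right away.
        return 0.0
    # After at least one failure, wait a random time so that clients
    # which were all dropped together do not all come back together.
    return rng.uniform(min_s, max_s)


def peak_connections_per_second(num_clients, failed_attempts, seed=0):
    """Simulate num_clients reconnecting and return the busiest second's load."""
    rng = random.Random(seed)
    per_second = Counter(
        int(reconnect_delay(failed_attempts, rng=rng)) for _ in range(num_clients)
    )
    return max(per_second.values())


if __name__ == "__main__":
    print("no backoff:  ", peak_connections_per_second(10_000, failed_attempts=0))
    print("with backoff:", peak_connections_per_second(10_000, failed_attempts=1))
```

Without the random wait, all simulated clients land in the same second. With it, they are spread over roughly 55 seconds, so the busiest second sees only a small fraction of them.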

Or just rate-limit the number of new connections per second on the server side. That should slow things down as well (but might make DoS attacks easier).
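
One common way to do that kind of rate limiting is a token bucket. The sketch below is only an illustration in Python and says nothing about how myopenhab is actually built; the class name, the rate, and the burst size are all assumptions chosen for the example.

```python
import time


class TokenBucket:
    """Allow at most `rate` new connections per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)          # tokens added per second
        self.capacity = float(capacity)  # maximum tokens that can be stored
        self.clock = clock               # injectable so the bucket can be tested
        self.tokens = float(capacity)    # start with a full bucket
        self.last = clock()

    def allow(self):
        """Return True if a new connection may be accepted right now."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill according to the time that has passed, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate=2, capacity=2)
    # Five connection attempts arriving at the same instant:
    # only the burst capacity is let through.
    print([bucket.allow() for _ in range(5)])
```

Rejected clients would have to retry later, which is where the DoS concern comes from: an attacker who keeps the bucket empty also keeps legitimate clients out.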

And also what we learned: The reconnect fix which someone implemented is working :wink: I believe it has never happened before that there were so many reconnections and the server got so overwhelmed?

So I’m not 100% sure what the issue was that caused this, but the service is up and running. I’m going to try and do some digging into the root cause. Our provider has had network issues over the last 6 months that have caused similar situations where everyone disconnects, but the service has always recovered on its own. Strange.

Hmm… for me it seems that my entire openHAB instance crashed, first around 19:00 CET and then again sometime during the evening.

Could be a coincidence but it seems a bit of strange timing since the install has been running fine for years…

Edit: At 19:06 my openHAB virtual machine seems to have frozen. I noticed at ~20:00, and it then ran fine until 22:44, when it seems to have died again. I suspect a lack of memory, although it has run all year round on that amount of memory for 2+ years. Maybe there could be some sort of memory issue connected to a failure in the cloud connector? Otherwise I’d like an award for least likely coincidence of 2022.

Working again here. :+1:

Hi folks, just a question about the cloud service…

I had the same issue yesterday and I was wondering: when the cloud service is not available, why doesn’t the app use the direct IP address within my home network to access the sitemap?

Many thanks!

BR
Uwe

@Uwe_Samer: My problem was different from yours in the sense that everything worked perfectly both via the app and via the browser on my PC. What did not work were the commands given by Alexa / Google Home to the devices that openHAB exposes to Alexa / Google Home.

(I am the one who tried to incorporate more error recovery logic to the connector)

Yeah, this is promising indeed: the theory was that some cloud endpoint changes or issues surface on the client in such a way that the client remains offline even though the cloud is up. But I guess we just have to wait and see. Perhaps only some particular types of errors are problematic; we have never gotten to the bottom of the offline client issue.

With the latest openHAB, enabling debug-level logging for the cloud connector will reveal errors that were not logged before. In fact, from the logs we could see whether the error recovery improvements “kick in” and whether the old version would have gotten stuck offline. You have to enable TRACE or DEBUG level logging for org.openhab.io.openhabcloud.

@digitaldan we could also consider tuning the reconnection parameters such that it would be more graceful to the cloud?

That would be great, the initial websocket connection to our cloud service is an expensive operation, so when we have tens of thousands of OH’s all trying to connect at once, we almost DDOS the system.

Unfortunately I am seeing this or a similar behaviour again right now.

Even though I am seeing in the openHAB Logs an entry “Connected to the openHAB Cloud service…” from 2022-06-06 20:33:37.202 (that’s GMT+2) I am seeing “Offline” in myopenhab.org. I increased verbosity to TRACE now but I am not seeing anything in the logs, no pings, nothing. In netstat I am not seeing a connection to 172.104.246.157 though, so myopenhab.org is correct, I am in fact disconnected from the cloud.

Also right before the reconnection there was this:

2022-06-06 20:33:10.246 [ERROR] [io.openhabcloud.internal.CloudClient] - Error connecting to the openHAB Cloud instance. Reconnecting. 
2022-06-06 20:33:37.105 [WARN ] [okhttp3.OkHttpClient                ] - A connection to https://myopenhab.org/ was leaked. Did you forget to close a response body? To see where this was allocated, set the OkHttpClient logger level to FINE: Logger.getLogger(OkHttpClient.class.getName()).setLevel(Level.FINE);
2022-06-06 20:33:37.202 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service....

Did we overwrite the old connection there, so that it lost its last reference and was therefore considered leaked at that point in time? If that’s the case, we should double-check that on a disconnect event everything that should be closed is indeed closed. I assume it is not the new connection that is being leaked, as it would be weird for it to leak before a connection success is reported.

I just did a bundle:reload and for some reason it really thinks that it disconnected from the cloud at that point:

2022-06-07 01:10:28.234 [DEBUG] [o.openhabcloud.internal.CloudService] - openHAB Cloud connector deactivated
2022-06-07 01:10:28.235 [INFO ] [io.openhabcloud.internal.CloudClient] - Shutting down openHAB Cloud service connection
2022-06-07 01:10:28.258 [DEBUG] [io.openhabcloud.internal.CloudClient] - Socket.IO disconnected: io client disconnect
2022-06-07 01:10:28.261 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service

I am not sure how that is possible though as there was definitely no connection at that point anymore.

Which version of the openhabcloud addon are you using? I would certainly expect to see socket.io ping/pong messages with the troubleshooting version… Note that ping/pong messages are not logged with the release version.

In fact, we have another thread with users reporting successful ping/pong but myopenhab.org saying that the client is offline. This might mean the online/offline tracking on the server side somehow gets mixed up.

There has been quite a lot of discussion in the thread “Openhab cloud connection stops working, does not try to reconnect” (see post #44 by seanch).

This is simply the way it is logged; I would not draw too many conclusions from that.

This is happening to me right now…

2022-10-19 09:32:29.533 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = xxxxxxxx-747c-4eb4-a87a-xxxxxxxxxxxx, base URL = http://localhost:8080)
2022-10-19 09:33:03.352 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service (UUID = xxxxxxxx-747c-4eb4-a87a-xxxxxxxxxxxx, base URL = http://localhost:8080)
2022-10-19 09:33:32.136 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = xxxxxxxx-747c-4eb4-a87a-xxxxxxxxxxxx, base URL = http://localhost:8080)
2022-10-19 09:34:30.267 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service (UUID = xxxxxxxx-747c-4eb4-a87a-xxxxxxxxxxxx, base URL = http://localhost:8080)
2022-10-19 09:34:42.430 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = xxxxxxxx-747c-4eb4-a87a-xxxxxxxxxxxx, base URL = http://localhost:8080)
2022-10-19 09:34:58.911 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service (UUID = xxxxxxxx-747c-4eb4-a87a-xxxxxxxxxxxx, base URL = http://localhost:8080)
2022-10-19 09:35:26.995 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = xxxxxxxx-747c-4eb4-a87a-xxxxxxxxxxxx, base URL = http://localhost:8080)
2022-10-19 09:38:49.026 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service (UUID = xxxxxxxx-747c-4eb4-a87a-xxxxxxxxxxxx, base URL = http://localhost:8080)
2022-10-19 09:39:18.246 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = xxxxxxxx-747c-4eb4-a87a-xxxxxxxxxxxx, base URL = http://localhost:8080)
2022-10-19 09:39:55.081 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service (UUID = xxxxxxxx-747c-4eb4-a87a-xxxxxxxxxxxx, base URL = http://localhost:8080)
2022-10-19 09:40:03.553 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = xxxxxxxx-747c-4eb4-a87a-xxxxxxxxxxxx, base URL = http://localhost:8080)

Status page is saying all is ok…

I’m using openHAB 2.4.0 Release Build.

I tried restarting only the CloudClient, but it didn’t help.

Restarting the openHAB service solved the issue for me.
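
For anyone who wants to quantify this kind of flapping, the connect/disconnect lines quoted in this thread can be turned into session lengths with a few lines of Python. This is only a rough sketch: the regular expression is written against the log format shown above and may not match other openHAB versions or custom log layouts, and the function name is made up.

```python
import re
from datetime import datetime

# Matches the timestamp and the Connected/Disconnected message of CloudClient lines
# in the format quoted in this thread.
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) .*? - "
    r"(Connected to|Disconnected from) the openHAB Cloud service"
)


def session_lengths(lines):
    """Return how long (in seconds) each cloud connection lasted."""
    durations = []
    connected_at = None
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # not a connect/disconnect line
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
        if match.group(2) == "Connected to":
            connected_at = stamp
        elif connected_at is not None:
            # A disconnect that follows a known connect closes one session.
            durations.append((stamp - connected_at).total_seconds())
            connected_at = None
    return durations


if __name__ == "__main__":
    sample = [
        "2022-10-19 09:33:03.352 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service (UUID = x, base URL = http://localhost:8080)",
        "2022-10-19 09:33:32.136 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = x, base URL = http://localhost:8080)",
    ]
    print(session_lengths(sample))
```

Sessions that consistently last well under a minute, as in the log above, point to the connection being dropped shortly after each successful login rather than to the client failing to connect at all.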