Myopenhab.org issues

HI all, after restarting one our cloud process, our redis cache is having issues serving data, i’m investigating this now. The service will be slow to unresponsive while i investigate.

I’m not 100% sure what the root cause of this issue was, Redis was pegged at 100% CPU and was non responsive even after a restart until i shut all our services down. My guess is we have bug that is DOSing our redis instance under certain restart conditions. I started all the processes again, things should start to come back up slowly over the next hour.

4 Likes

Apologies, but i needed to perform one last change, the service is coming back online and will be slow or not available for the next 30 mins.

4 Likes

Thank you for the informations!

I was already wondering if i have a local problem or is it cloud located.

I see a lot of these errors in my logfiles:

2020-09-11 08:56:08.211 [ERROR] [io.openhabcloud.internal.CloudClient] - Error connecting to the openHAB Cloud instance: {}

io.socket.engineio.client.EngineIOException: xhr poll error

	at io.socket.engineio.client.Transport.onError(Transport.java:63) [bundleFile:?]

	at io.socket.engineio.client.transports.PollingXHR.access$100(PollingXHR.java:26) [bundleFile:?]

	at io.socket.engineio.client.transports.PollingXHR$6$1.run(PollingXHR.java:140) [bundleFile:?]

	at io.socket.thread.EventThread$2.run(EventThread.java:80) [bundleFile:?]

	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_222]

	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_222]

	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_222]

Caused by: java.net.SocketTimeoutException: timeout

	at okio.Okio$4.newTimeoutException(Okio.java:230) ~[?:?]

	at okio.AsyncTimeout.exit(AsyncTimeout.java:285) ~[?:?]

	at okio.AsyncTimeout$2.read(AsyncTimeout.java:241) ~[?:?]

	at okio.RealBufferedSource.indexOf(RealBufferedSource.java:345) ~[?:?]

	at okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:217) ~[?:?]

	at okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:211) ~[?:?]

	at okhttp3.internal.http1.Http1Codec.readResponseHeaders(Http1Codec.java:189) ~[?:?]

	at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.java:75) ~[?:?]

	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[?:?]

	at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:45) ~[?:?]

	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[?:?]

	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) ~[?:?]

	at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93) ~[?:?]

	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[?:?]

	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) ~[?:?]

	at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93) ~[?:?]

	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[?:?]

	at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:120) ~[?:?]

	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[?:?]

	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) ~[?:?]

	at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:185) ~[?:?]

	at okhttp3.RealCall$AsyncCall.execute(RealCall.java:135) ~[?:?]

	at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) ~[?:?]

	... 3 more

Caused by: java.net.SocketException: Socket closed

	at java.net.SocketInputStream.read(SocketInputStream.java:204) ~[?:1.8.0_222]

	at java.net.SocketInputStream.read(SocketInputStream.java:141) ~[?:1.8.0_222]

	at sun.security.ssl.InputRecord.readFully(InputRecord.java:465) ~[?:1.8.0_222]

	at sun.security.ssl.InputRecord.read(InputRecord.java:503) ~[?:1.8.0_222]

	at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:975) ~[?:1.8.0_222]

	at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:933) ~[?:1.8.0_222]

	at sun.security.ssl.AppInputStream.read(AppInputStream.java:105) ~[?:1.8.0_222]

	at okio.Okio$2.read(Okio.java:139) ~[?:?]

	at okio.AsyncTimeout$2.read(AsyncTimeout.java:237) ~[?:?]

	at okio.RealBufferedSource.indexOf(RealBufferedSource.java:345) ~[?:?]

	at okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:217) ~[?:?]

	at okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:211) ~[?:?]

	at okhttp3.internal.http1.Http1Codec.readResponseHeaders(Http1Codec.java:189) ~[?:?]

	at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.java:75) ~[?:?]

	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[?:?]

	at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:45) ~[?:?]

	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[?:?]

	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) ~[?:?]

	at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93) ~[?:?]

	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[?:?]

	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) ~[?:?]

	at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93) ~[?:?]

	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[?:?]

	at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:120) ~[?:?]

	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[?:?]

	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) ~[?:?]

	at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:185) ~[?:?]

	at okhttp3.RealCall$AsyncCall.execute(RealCall.java:135) ~[?:?]

	at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) ~[?:?]

	... 3 more

When I open myopenhab.org in Chrome I get this:

Error: /opt/openhabcloud/views/index.ejs:2
1| <% section = 'index' %>
 >> 2| <%- include header.ejs %>
3|       <% if (!user) { %>
4| 
5| <section id="content">

/opt/openhabcloud/views/header.ejs:149
147|                         <% if (openhabstatus == 'online') { %>
148|                         <li class="openhab-online">
 >> 149|                             <a href="/events?source=openhab" title="Since <%= date_util(openhablastonline, timeZone).calendar()%>">Online</a>
150|                         </li>
151|                         <% } else { %>
152|                         <li class="openhab-offline">

Unknown Timezone: 'UTC'
at Function.tzset (/opt/openhabcloud/node_modules/time/index.js:149:15)
at Date.setTimezone (/opt/openhabcloud/node_modules/time/index.js:251:18)
at to_local (/opt/openhabcloud/date_util.js:22:17)
at to_moment (/opt/openhabcloud/date_util.js:35:30)
at Object.module.exports [as date_util] (/opt/openhabcloud/date_util.js:39:12)
at eval (eval at compile (/opt/openhabcloud/node_modules/ejs/lib/ejs.js:481:12), <anonymous>:75:24)
at eval (eval at compile (/opt/openhabcloud/node_modules/ejs/lib/ejs.js:481:12), <anonymous>:146:10)
at returnedFn (/opt/openhabcloud/node_modules/ejs/lib/ejs.js:512:17)
at View.exports.renderFile [as engine] (/opt/openhabcloud/node_modules/ejs/lib/ejs.js:364:31)
at View.render (/opt/openhabcloud/node_modules/express/lib/view.js:135:8)
at tryRender (/opt/openhabcloud/node_modules/express/lib/application.js:640:10)
at Function.render (/opt/openhabcloud/node_modules/express/lib/application.js:592:3)
at ServerResponse.render (/opt/openhabcloud/node_modules/express/lib/response.js:1012:7)
at exports.index (/opt/openhabcloud/routes/homepage.js:8:7)
at Layer.handle [as handle_request] (/opt/openhabcloud/node_modules/express/lib/router/layer.js:95:5)
at next (/opt/openhabcloud/node_modules/express/lib/router/route.js:137:13)
at Route.dispatch (/opt/openhabcloud/node_modules/express/lib/router/route.js:112:3)
at Layer.handle [as handle_request] (/opt/openhabcloud/node_modules/express/lib/router/layer.js:95:5)
at /opt/openhabcloud/node_modules/express/lib/router/index.js:281:22
at Function.process_params (/opt/openhabcloud/node_modules/express/lib/router/index.js:335:12)
at next (/opt/openhabcloud/node_modules/express/lib/router/index.js:275:10)
at SendStream.error (/opt/openhabcloud/node_modules/serve-static/index.js:121:7)

Maybe it will help you @digitaldan

What is alarming me is that I started getting issues connecting to openhab cloud aroun 11AM NZT and now some 8 hours later, I am seeing so much internal service degradation I tried to reboot my openhab server and although it comes back I still see issues connecting with the cloud and the local impact is making it unusable.
I really do not use it for much if anything so I am thinking of disconnecting at this point as I need local service to run better locally.

Of course it may all be a coincidence I will soon know I guess.

Regards
Paul

For some reason the service is not wanting to come back up, still investigating …

I have restarted all services, so the thundering heard of connections is reconnecting, this may take a while for the service to come back online.

now some 8 hours later,

Yes, if you look at the original post, we had an issue 8 hours ago and i just now made change related to this which has caused the impact we have right now.

and the local impact is making it unusable.

I’m not sure why that would be the case? The cloud connector may be trying to reconnect, but i would not think it would cause a noticeable degradation of your server.

It looks much better now… i did not get this disconnect errors since round about 20 minutes… but i still get sometimes this warn message:

2020-09-11 09:34:21.677 [WARN ] [okhttp3.OkHttpClient                ] - A connection to https://myopenhab.org/ was leaked. Did you forget to close a response body? To see where this was allocated, set the OkHttpClient logger level to FINE: Logger.getLogger(OkHttpClient.class.getName()).setLevel(Level.FINE);

The proxy connections are still not stable. I am inclined to let this sit for a bit and see if it clears.

The same here:

A connection to https://myopenhab.org/ was leaked. Did you forget to close a response body?

2020-09-11 09:52:40.244 [ERROR] [io.openhabcloud.internal.CloudClient] - Error connecting to the openHAB Cloud instance: {}
io.socket.engineio.client.EngineIOException: xhr poll error
at io.socket.engineio.client.Transport.onError(Transport.java:63) [bundleFile:?]
at io.socket.engineio.client.transports.PollingXHR.access$100(PollingXHR.java:26) [bundleFile:?]
at io.socket.engineio.client.transports.PollingXHR$6$1.run(PollingXHR.java:140) [bundleFile:?]
at io.socket.thread.EventThread$2.run(EventThread.java:80) [bundleFile:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_231]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_231]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_231]
Caused by: java.io.IOException: 502
at io.socket.engineio.client.transports.PollingXHR$Request$1.onResponse(PollingXHR.java:234) ~[?:?]
at okhttp3.RealCall$AsyncCall.execute(RealCall.java:141) ~[?:?]
at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) ~[?:?]
… 3 more
2020-09-11 09:52:40.251 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = xxx, base URL = http://localhost:8080)

Last update for a little while. My connection just came back, so i expect this to continue to heal itself over the next 30 mins.

2 Likes

looks stable now! last error message in my logs are at 09:34:21 CET.

Since then no further error messages came up!
Thank you for fixing!

When i login @
https://myopenhab.org/login

I get a page with this

Error: /opt/openhabcloud/views/login.ejs:2
1| <% section = ‘login’ %>

2| <%- include header.ejs %>
3|
4|
5|

/opt/openhabcloud/views/header.ejs:154
152|


  • 153|

    154| Offline
    155|


  • 156| <% } %>
    157| <% } %>

    Unknown Timezone: ‘UTC’
    at Function.tzset (/opt/openhabcloud/node_modules/time/index.js:149:15)
    at Date.setTimezone (/opt/openhabcloud/node_modules/time/index.js:251:18)
    at to_local (/opt/openhabcloud/date_util.js:22:17)
    at to_moment (/opt/openhabcloud/date_util.js:35:30)
    at Object.module.exports [as date_util] (/opt/openhabcloud/date_util.js:39:12)
    at eval (eval at compile (/opt/openhabcloud/node_modules/ejs/lib/ejs.js:481:12), :84:24)
    at eval (eval at compile (/opt/openhabcloud/node_modules/ejs/lib/ejs.js:481:12), :146:10)
    at returnedFn (/opt/openhabcloud/node_modules/ejs/lib/ejs.js:512:17)
    at View.exports.renderFile [as engine] (/opt/openhabcloud/node_modules/ejs/lib/ejs.js:364:31)
    at View.render (/opt/openhabcloud/node_modules/express/lib/view.js:135:8)
    at tryRender (/opt/openhabcloud/node_modules/express/lib/application.js:640:10)
    at Function.render (/opt/openhabcloud/node_modules/express/lib/application.js:592:3)
    at ServerResponse.render (/opt/openhabcloud/node_modules/express/lib/response.js:1012:7)
    at /opt/openhabcloud/routes/index.js:93:13
    at Layer.handle [as handle_request] (/opt/openhabcloud/node_modules/express/lib/router/layer.js:95:5)
    at next (/opt/openhabcloud/node_modules/express/lib/router/route.js:137:13)
    at Route.dispatch (/opt/openhabcloud/node_modules/express/lib/router/route.js:112:3)
    at Layer.handle [as handle_request] (/opt/openhabcloud/node_modules/express/lib/router/layer.js:95:5)
    at /opt/openhabcloud/node_modules/express/lib/router/index.js:281:22
    at Function.process_params (/opt/openhabcloud/node_modules/express/lib/router/index.js:335:12)
    at next (/opt/openhabcloud/node_modules/express/lib/router/index.js:275:10)
    at SendStream.error (/opt/openhabcloud/node_modules/serve-static/index.js:121:7)

    Same to me. Online again. Thanks!

    I have just restored backup as I realised removing openhabcloud stopped my echo devices kill form working DOH!

    Hopefully it will now come up straight this time.

    Thanks
    Paul