Web Socket Error on AmazonEchoControl binding every 65 seconds

That is the time from which every connect failed. Something must have changed, and clearly not on the openHAB side. But if we find out what changed, maybe we can adapt to it.

Maybe I’m wrong, but we are all in the CEST timezone, so I guess most of us are using the German API URL:

https://alexa.amazon.de/api

right?

I am referring to the web socket error; the OOM is only a severe side effect.

I’m EST in the US

So you are using alexa.amazon.com right?

Right

But we need two changes then:

  1. Protect against the memory leak situation triggered by this or any similar scenario
  2. Identify specifically what is wrong on this Amazon integration and try to recover the functionality

@J-N-K After the call of “webSocketClient.connect” in the “WebSocketConnection” constructor, the handler should call the method “onWebSocketConnect”, but this never happens. So I think it gets stuck trying to establish the webSocket connection. “initPongTimeoutTimer” is also called in the constructor and tries to close the connection after 60 sec. When the webSocket connection was established, everything worked fine and “initPongTimeoutTimer” was able to close it. Cancelling the “Future” was certainly necessary to fix the memory issue, but the memory issue seems to be only a very bad side effect of the failed connect.
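To illustrate the point about cancelling the “Future”, here is a minimal sketch, not the binding's actual code. The class name “ConnectAttemptSketch” and its parameters are hypothetical, and the Future passed in is only a stand-in for whatever the connect call returns. It shows a timeout task that cancels a handshake that is still pending, so an attempt that never completes is not kept around.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a connection wrapper; NOT the real WebSocketConnection. */
final class ConnectAttemptSketch implements AutoCloseable {
    // Stand-in for the Future returned by the connect call (assumption).
    private final Future<?> connectFuture;
    // Single daemon thread that plays the role of the timeout timer.
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "connect-watchdog");
                t.setDaemon(true);
                return t;
            });

    ConnectAttemptSketch(Future<?> connectFuture, long timeoutMillis) {
        this.connectFuture = connectFuture;
        // Close (and thereby cancel) the attempt if it is still open after the timeout.
        scheduler.schedule(this::close, timeoutMillis, TimeUnit.MILLISECONDS);
    }

    @Override
    public synchronized void close() {
        // A handshake that never completed must be cancelled explicitly,
        // otherwise every retry leaves another pending attempt behind.
        if (!connectFuture.isDone()) {
            connectFuture.cancel(true);
        }
        // Drop the timeout task if it has not fired yet; safe to call repeatedly.
        scheduler.shutdownNow();
    }
}
```

With a Future that never completes, the watchdog cancels it after the timeout. A Future that has already completed is left untouched by close().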

Checked now when it starts:
2020-06-12 18:32:29.510 local German time, i.e. UTC+2.

Doesn’t fix it

Exactly. This was clearly a programming error. But as you said: the root cause is that the connect fails. This is a bit surprising, because in case of a failure onWebSocketError should have been called, and it isn’t called either.

So for debugging we probably have to check whether the Amazon servers don’t respond at all or whether the response processing fails (before it reaches our own code). Even if encrypted connections don’t produce readable content, using tcpdump or Wireshark might give some insight into that.
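On the code side, one way to tell “no callback at all” apart from “error callback” is a small probe like the one below. This is only a sketch: “HandshakeProbe” and “Outcome” are made-up names, and the two callback methods merely mirror the names mentioned in this thread; they would have to be called from the real listener.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical diagnostic helper; records which listener callback fired first, if any. */
final class HandshakeProbe {
    enum Outcome { CONNECTED, ERROR, NO_CALLBACK }

    private final CountDownLatch firstCallback = new CountDownLatch(1);
    private final AtomicReference<Outcome> outcome =
            new AtomicReference<>(Outcome.NO_CALLBACK);

    // To be called from the listener's onWebSocketConnect.
    void onWebSocketConnect() {
        record(Outcome.CONNECTED);
    }

    // To be called from the listener's onWebSocketError.
    void onWebSocketError(Throwable cause) {
        record(Outcome.ERROR);
    }

    private void record(Outcome o) {
        // Only the first callback counts; later ones are ignored.
        if (outcome.compareAndSet(Outcome.NO_CALLBACK, o)) {
            firstCallback.countDown();
        }
    }

    /** Waits up to timeoutMillis and reports what happened so far. */
    Outcome await(long timeoutMillis) {
        try {
            firstCallback.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return outcome.get();
    }
}
```

If await returns NO_CALLBACK after the timeout, neither callback was reached, which matches what is described above and would point at the handshake itself rather than at the binding's own handling.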

Doesn’t fix what? You still see OOM?

I have had sporadic instances of that error on pretty much every day of the month (except the 8th of June), and even earlier during May, but then more continuously on the 12th of June until I uninstalled the binding.

Maybe it is time to ask Amazon what they changed.

By the way, there is a user agent setting in the code. Maybe this is the cause of the error, because Amazon no longer accepts that user agent.

The User-Agent header is not set for the webSocketConnection.
If I check this in Firefox, there is a User-Agent header, for sure.

Yes, I still get the error every 65 seconds

2020-06-14 20:00:24.732 [INFO ] [nternal.WebSocketConnection$Listener] - Web Socket error
java.nio.channels.AsynchronousCloseException: null
at org.eclipse.jetty.client.http.HttpConnectionOverHTTP.close(HttpConnectionOverHTTP.java:181) ~[?:?]
at java.util.ArrayList.forEach(ArrayList.java:1257) [?:1.8.0_252]
at org.eclipse.jetty.client.AbstractConnectionPool.close(AbstractConnectionPool.java:208) [bundleFile:9.4.20.v20190813]
at org.eclipse.jetty.client.DuplexConnectionPool.close(DuplexConnectionPool.java:237) [bundleFile:9.4.20.v20190813]
at org.eclipse.jetty.client.HttpDestination.close(HttpDestination.java:385) [bundleFile:9.4.20.v20190813]
at org.eclipse.jetty.client.HttpClient.doStop(HttpClient.java:260) [bundleFile:9.4.20.v20190813]
at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:93) [bundleFile:9.4.20.v20190813]
at org.eclipse.jetty.util.component.ContainerLifeCycle.stop(ContainerLifeCycle.java:180) [bundleFile:9.4.20.v20190813]
at org.eclipse.jetty.util.component.ContainerLifeCycle.doStop(ContainerLifeCycle.java:201) [bundleFile:9.4.20.v20190813]
at org.eclipse.jetty.websocket.client.WebSocketClient.doStop(WebSocketClient.java:371) [bundleFile:9.4.20.v20190813]
at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:93) [bundleFile:9.4.20.v20190813]
at org.openhab.binding.amazonechocontrol.internal.WebSocketConnection.close(WebSocketConnection.java:171) [bundleFile:?]
at org.openhab.binding.amazonechocontrol.internal.WebSocketConnection$2.run(WebSocketConnection.java:200) [bundleFile:?]
at java.util.TimerThread.mainLoop(Timer.java:555) [?:1.8.0_252]
at java.util.TimerThread.run(Timer.java:505) [?:1.8.0_252]

This is not an out-of-memory error but an error message caused by a failed connect. If openHAB crashes due to out-of-memory, that is a severe problem.

openHAB was crashing due to an OOM error. I can confirm it did so originally (presumably no longer with the patch?). Everything started failing after a certain point and needed a restart; some have even needed a restore from backup.
If this fix avoids the OOM while still showing the websocket failure, that’d be a good improvement.

 2020-06-12 16:43:28.276 [WARN ] [mmon.WrappedScheduledExecutorService] - Scheduled runnable ended with an exception:
    java.lang.OutOfMemoryError: Java heap space
 ...
 ...
    at org.openhab.binding.amazonechocontrol.internal.WebSocketConnection.<init>(WebSocketConnection.java:102) ~[?:?]

Edited, as I am not sure whether you are referring to the situation after the patch or the original one

I installed the new version you provided. So far no more OOMs, but I am still looking carefully at my OH installation.