The time where every connect failed. Something must have changed, and that is clearly not on the openHAB side. But maybe if we find out what has changed, we can adapt to that.
Maybe I'm wrong, but we are all in the CEST timezone... So I guess most of us are using the German API URL:
right?
I refer to the web socket error, oom is only a severe side effect.
I'm in EST in the US
Right
But we need two changes then:
- Protect against the memory leak situation triggered by this or any similar scenario
- Identify specifically what is wrong on this Amazon integration and try to recover the functionality
@J-N-K After the call of "webSocketClient.connect" in the "WebSocketConnection" constructor, the handler should call the method "onWebSocketConnect", but this never happens. So I think it gets stuck trying to establish the WebSocket connection. "initPongTimeoutTimer" is also called in the constructor and tries to close the connection after 60 sec. When the WebSocket connection was established, everything worked fine and "initPongTimeoutTimer" was able to close it. It was necessary to cancel the "Future" to fix the memory issue, for sure. But it seems that this was only a very bad side effect.
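To illustrate the "cancel the Future" part, here is a minimal sketch of the general pattern, not the binding's actual code. `ConnectGuard` and `guard` are hypothetical names, and the `CompletableFuture` only stands in for the `Future` returned by a connect call that never completes. A watchdog task cancels the pending connect if it has not finished within the timeout, so a stuck attempt is not kept alive indefinitely.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration; not part of the binding or of Jetty.
final class ConnectGuard {
    private final ScheduledExecutorService scheduler;

    ConnectGuard(ScheduledExecutorService scheduler) {
        this.scheduler = scheduler;
    }

    /**
     * Schedules a watchdog that cancels pendingConnect if it is still
     * pending once the timeout elapses. Returns the watchdog's own handle.
     */
    Future<?> guard(Future<?> pendingConnect, long timeout, TimeUnit unit) {
        return scheduler.schedule(() -> {
            if (!pendingConnect.isDone()) {
                // Give up on the stuck attempt so it can be released.
                pendingConnect.cancel(true);
            }
        }, timeout, unit);
    }

    public static void main(String[] args) throws Exception {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        try {
            // Stand-in for a connect attempt that never completes.
            CompletableFuture<String> stuckConnect = new CompletableFuture<>();
            Future<?> watchdog = new ConnectGuard(scheduler)
                    .guard(stuckConnect, 200, TimeUnit.MILLISECONDS);
            watchdog.get(); // wait until the watchdog has run
            System.out.println("cancelled: " + stuckConnect.isCancelled());
        } finally {
            scheduler.shutdownNow();
        }
    }
}
```

A connect that completes before the timeout is left untouched, because the watchdog checks `isDone()` first. This only limits the damage of a hanging connect; it does not explain why the connect hangs.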
Checked now when it starts:
2020-06-12 18:32:29.510 local German time. Means UTC +2.
Doesn't fix it
Exactly. This was clearly a programming error. But as you said: the root cause is that the connect fails. This is a bit surprising, because in case of a failure onWebSocketError should have been called, and it isn't called either.
So for debugging we probably have to check whether the Amazon servers don't respond at all or whether the response processing fails (before it reaches our own code). Even if encrypted connections don't produce readable content, using tcpdump or Wireshark might give some insight into that.
Doesn't fix what? You still see OOM?
I have sporadic instances of that error every day of the month (pretty much except the 8th of June), and even earlier during May... but then more continuously on the 12th of June until I uninstalled the binding.
Why oracle time to ask amazon what they changed.
By the way... There is a user agent setting in the code. Maybe this is the cause of the error: Amazon may not accept the user agent.
The User-Agent header is not set for the webSocketConnection.
If I check this in Firefox, there is a User-Agent header, for sure.
Yes, I still get the error every 65 seconds
2020-06-14 20:00:24.732 [INFO ] [nternal.WebSocketConnection$Listener] - Web Socket error
java.nio.channels.AsynchronousCloseException: null
at org.eclipse.jetty.client.http.HttpConnectionOverHTTP.close(HttpConnectionOverHTTP.java:181) ~[?:?]
at java.util.ArrayList.forEach(ArrayList.java:1257) [?:1.8.0_252]
at org.eclipse.jetty.client.AbstractConnectionPool.close(AbstractConnectionPool.java:208) [bundleFile:9.4.20.v20190813]
at org.eclipse.jetty.client.DuplexConnectionPool.close(DuplexConnectionPool.java:237) [bundleFile:9.4.20.v20190813]
at org.eclipse.jetty.client.HttpDestination.close(HttpDestination.java:385) [bundleFile:9.4.20.v20190813]
at org.eclipse.jetty.client.HttpClient.doStop(HttpClient.java:260) [bundleFile:9.4.20.v20190813]
at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:93) [bundleFile:9.4.20.v20190813]
at org.eclipse.jetty.util.component.ContainerLifeCycle.stop(ContainerLifeCycle.java:180) [bundleFile:9.4.20.v20190813]
at org.eclipse.jetty.util.component.ContainerLifeCycle.doStop(ContainerLifeCycle.java:201) [bundleFile:9.4.20.v20190813]
at org.eclipse.jetty.websocket.client.WebSocketClient.doStop(WebSocketClient.java:371) [bundleFile:9.4.20.v20190813]
at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:93) [bundleFile:9.4.20.v20190813]
at org.openhab.binding.amazonechocontrol.internal.WebSocketConnection.close(WebSocketConnection.java:171) [bundleFile:?]
at org.openhab.binding.amazonechocontrol.internal.WebSocketConnection$2.run(WebSocketConnection.java:200) [bundleFile:?]
at java.util.TimerThread.mainLoop(Timer.java:555) [?:1.8.0_252]
at java.util.TimerThread.run(Timer.java:505) [?:1.8.0_252]
This is not an out-of-memory error but an error message caused by a failed connect. If openHAB crashes due to out-of-memory, that is a severe problem.
openHAB was crashing due to the OOM error. Confirmed it did so originally (assuming no longer with the patch?). Everything started failing after a certain point and needed a restart. Some have even needed a restore from backup.
If this fix avoids the OOM while still showing the websocket failure, that'd be a good improvement.
2020-06-12 16:43:28.276 [WARN ] [mmon.WrappedScheduledExecutorService] - Scheduled runnable ended with an exception:
java.lang.OutOfMemoryError: Java heap space
...
...
at org.openhab.binding.amazonechocontrol.internal.WebSocketConnection.<init>(WebSocketConnection.java:102) ~[?:?]
Edited as not sure if you refer to the situation after the patch, or originally
I installed the new version you provided. So far no more OOMs, but I am still looking carefully at my OH installation.