Web Socket Error on AmazonEchoControl binding every 65 seconds

That’s using a different binding - the Amazon Alexa Smart Home Skill.

I’ve got the same problem too - wife isn’t happy. WAF is now zero!

2 Likes

Same issues here.

same issue here. Stable 2.5.5.1

Andrea

I came to the same way and checked the solution on GitHub, but i think (maybe im wrong) it would be the wrong way to implement a tracker for outstanding requests. Because if there is a problem on the amazon server and the request “never” returns, it will never close the webSocketClient properly. It crashes while it tries to close the client but the question is, why the request never returns?!? So i think its better to find the request and check whats wrong and not to wait for the broken request to return. I also tried to set the idle timeout for the client itself to stop the request before the timer tries to close the webSocketClient, but it also ends in an exception.

To the memory test with the modified jar file and the removed timer which try to close the webSocketClient every minute, the memory issue is not present anymore since yesterday.

I‘m currently looking into this, but it‘s a bit difficult because I don‘t own any of these devices.

I think it is only necessary to add an amazon account to the binding and the issue appears.
You should take a look at:

org.openhab.binding.amazonechocontrol.internal.WebSocketConnection

The method:

initPongTimeoutTimer()

starts the timer to close the webSocketClient through the method:

close()

in the same class.

Thanks for that hint. That helped a lot. What happens is:

  • binding tries connect and immediately schedules a pong timeout
  • after successfull connect, the pong is acknowledged and a ping is scheduled
  • when the ping is send, a new pong timeout is scheduled (and acknowleged after the response is received)
  • when pong times out, the connection is closed (this fails, if no connection was established)
  • the binding tried to reconnect and we start from the beginning

The problem is that - at least for me - the connection is never established and so this happens every 60s. close on the connection is called twice (by the pong timeout and a second time by the handler), therefore it tries to stop the websocket connection twice, but the connect future is never terminated. This leads to memory and thread leaks.

I have fixed that and now check that the client is only stopped when it was started before, that the connection is only closed once and the connect future is cancelled. This version is neither leaking memory nor threads on my system. Please check if the functionality is still there.

<link removed>

You can directly update from the karaf console, you need to get your bundle number with bundle:list | grep -i amazon which should result in

259 │ Active │  80 │ 2.5.6.202006141340      │ openHAB Add-ons :: Bundles :: Amazon Echo Control Binding

259 is the bundle number on my system (yours will probably be different) and you can update with

bundle:update 259 <link removed>

Please report back if this fixes your issues.

Edit: Removed link, outdated, see below

13 Likes

Perfect!

Can you tell me what exactly you changed to fix it? Did you a check on the webSocketClient before closing it? Because my curren version has a lot of changes for multi device announcements (too many request fix aso) and some fixes for an issue in volume tts settings. I dont want to miss these fixes and also want to publish them in another thread. I didnt for now because i didnt know if it is ok for the publisher to release a version out of the release chain (git pull request aso).

1 Like

Open a PR, it‘s open source. And everybody is encouraged to do changes. If some more confirm that their issues are fixed, I‘ll open a PR and get this into snapshots as soon as possible. Maybe we can get it into 2.5.6

FYI, I just install the jar file and still getting the web socket error.

Ok, ill try to do so. I never worked on a public project, only on projects at work. (own git) We can pull and push but don’t have to do a PR.

This is the stuff i want to merge:

And this was my explanation:

Yes, that‘s expected if you cancel a running request. But you should no longer see the OOM issue.

Edit: the main problem still is that the websocket connection fails. I don‘t know if this is a problem on Amazon‘s end or something we do wrong. I merely fixed the memory and thread leaks when closing the connection. One could think of reducing the log-message to debug, as the binding tries to reconnect anyway.

I think Jan solved the heap memory problem. I guess the web-socket-message still appears (might be a problem at Amazon side) but it seems the heap memory isn’t increasing anymore (which will cause the system to halt with “out of memory error”)

Thanks for your work. :+1:

I can filter out the logs for now. Would it make a difference by editing my config to eliminate some of the control features and see if the web socket error goes away?

Any suggestion how to filter this out?

thanks again
Andrea

log:set off org.openhab.binding.amazonechocontrol

thank you :slight_smile:

btw any connection between this issue and the issue we are facing from a while in this binding?

2020-06-10 19:28:50.753 [WARN ] [trol.internal.handler.AccountHandler] - handling of websockets fails
org.openhab.binding.amazonechocontrol.internal.HttpException: GET url 'https://alexa.amazon.it/api/activities?startTime=1591810130510&size=10&offset=1' failed: Bad Request
	at org.openhab.binding.amazonechocontrol.internal.Connection.makeRequest(Connection.java:644) ~[bundleFile:?]
	at org.openhab.binding.amazonechocontrol.internal.Connection.makeRequestAndReturnString(Connection.java:499) ~[bundleFile:?]
	at org.openhab.binding.amazonechocontrol.internal.Connection.makeRequestAndReturnString(Connection.java:494) ~[bundleFile:?]
	at org.openhab.binding.amazonechocontrol.internal.Connection.getActivities(Connection.java:938) ~[bundleFile:?]
	at org.openhab.binding.amazonechocontrol.internal.handler.AccountHandler.handlePushActivity(AccountHandler.java:773) ~[bundleFile:?]
	at org.openhab.binding.amazonechocontrol.internal.handler.AccountHandler.handleWebsocketCommand(AccountHandler.java:718) ~[bundleFile:?]
	at org.openhab.binding.amazonechocontrol.internal.handler.AccountHandler.webSocketCommandReceived(AccountHandler.java:706) [bundleFile:?]
	at org.openhab.binding.amazonechocontrol.internal.WebSocketConnection$Listener.onWebSocketBinary(WebSocketConnection.java:404) [bundleFile:?]
	at org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onBinaryMessage(JettyListenerEventDriver.java:103) [bundleFile:9.4.20.v20190813]
	at org.eclipse.jetty.websocket.common.message.SimpleBinaryMessage.messageComplete(SimpleBinaryMessage.java:68) [bundleFile:9.4.20.v20190813]
	at org.eclipse.jetty.websocket.common.events.AbstractEventDriver.appendMessage(AbstractEventDriver.java:65) [bundleFile:9.4.20.v20190813]
	at org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onBinaryFrame(JettyListenerEventDriver.java:74) [bundleFile:9.4.20.v20190813]
	at org.eclipse.jetty.websocket.common.events.AbstractEventDriver.incomingFrame(AbstractEventDriver.java:145) [bundleFile:9.4.20.v20190813]
	at org.eclipse.jetty.websocket.common.WebSocketSession.incomingFrame(WebSocketSession.java:321) [bundleFile:9.4.20.v20190813]
	at org.eclipse.jetty.websocket.common.extensions.ExtensionStack.incomingFrame(ExtensionStack.java:202) [bundleFile:9.4.20.v20190813]
	at org.eclipse.jetty.websocket.common.Parser.notifyFrame(Parser.java:226) [bundleFile:9.4.20.v20190813]
	at org.eclipse.jetty.websocket.common.Parser.parseSingleFrame(Parser.java:262) [bundleFile:9.4.20.v20190813]
	at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.readParse(AbstractWebSocketConnection.java:582) [bundleFile:9.4.20.v20190813]
	at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onFillable(AbstractWebSocketConnection.java:441) [bundleFile:9.4.20.v20190813]
	at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onFillable(AbstractWebSocketConnection.java:428) [bundleFile:9.4.20.v20190813]
	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311) [bundleFile:9.4.20.v20190813]
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103) [bundleFile:9.4.20.v20190813]
	at org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.onFillable(SslConnection.java:426) [bundleFile:9.4.20.v20190813]
	at org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:320) [bundleFile:9.4.20.v20190813]
	at org.eclipse.jetty.io.ssl.SslConnection$2.succeeded(SslConnection.java:158) [bundleFile:9.4.20.v20190813]
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103) [bundleFile:9.4.20.v20190813]
	at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117) [bundleFile:9.4.20.v20190813]
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336) [bundleFile:9.4.20.v20190813]
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313) [bundleFile:9.4.20.v20190813]
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171) [bundleFile:9.4.20.v20190813]
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129) [bundleFile:9.4.20.v20190813]
	at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:367) [bundleFile:9.4.20.v20190813]
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:782) [bundleFile:9.4.20.v20190813]
	at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:918) [bundleFile:9.4.20.v20190813]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]

OFF is probably a bit too much. Setting the Loglevel to WARN should be enough.

1 Like

Maybe. It could be that something changed on Amazon‘s side that lets the request fail (as in your log) and also prevents a successful connection (which results in the OOM here).

It would be very interesting if the lastVoiceCommand feature still works, because it seems the binding is not receiving it anymore. Maybe this has something to do with the “request fail”.