Some rules stop working after a time

  • Platform information:

    • Hardware: Ryzen 5 3600 /16 GB RAM / 512 GB m.2
    • OS: Ubuntu 20.04.1 LTS & Docker
    • Java Runtime Environment: OpenJDK Runtime Environment 1.8.0_265
    • openHAB version: openHAB 2.5.9 Release Build
  • Issue of the topic: It’s a weird one. My config worked exceptionally well until three days ago. Now, after a while, some of the rules just won’t execute any more. I’ve created a rule just for debugging that sends a logInfo every minute, and that rule still works. Others simply don’t. When I restart the openHAB container, everything works again, but the rules stop again after a while.

  • Things I’ve tried:

    • I deleted the tmp and cache folders.
    • I restarted the container.
    • I deactivated some rules for testing.

There is not much to see in the openhab.log. But there is a strange output for the CloudClient:

2020-11-10 05:12:46.544 [ERROR] [io.openhabcloud.internal.CloudClient] - Error connecting to the openHAB Cloud instance: {}
io.socket.engineio.client.EngineIOException: websocket error
at io.socket.engineio.client.Transport.onError(Transport.java:63) [bundleFile:?]
at io.socket.engineio.client.transports.WebSocket.access$400(WebSocket.java:24) [bundleFile:?]
at io.socket.engineio.client.transports.WebSocket$1$5.run(WebSocket.java:107) [bundleFile:?]
at io.socket.thread.EventThread$2.run(EventThread.java:80) [bundleFile:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_265]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_265]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_265]
Caused by: javax.net.ssl.SSLException: Connection reset
at sun.security.ssl.Alert.createSSLException(Alert.java:127) ~[?:1.8.0_265]
at sun.security.ssl.TransportContext.fatal(TransportContext.java:324) ~[?:1.8.0_265]
at sun.security.ssl.TransportContext.fatal(TransportContext.java:267) ~[?:1.8.0_265]
at sun.security.ssl.TransportContext.fatal(TransportContext.java:262) ~[?:1.8.0_265]
at sun.security.ssl.SSLSocketImpl.handleException(SSLSocketImpl.java:1357) ~[?:1.8.0_265]
at sun.security.ssl.SSLSocketImpl.access$400(SSLSocketImpl.java:72) ~[?:1.8.0_265]
at sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:849) ~[?:1.8.0_265]
at okio.Okio$2.read(Okio.java:139) ~[?:?]
at okio.AsyncTimeout$2.read(AsyncTimeout.java:237) ~[?:?]
at okio.RealBufferedSource.request(RealBufferedSource.java:67) ~[?:?]
at okio.RealBufferedSource.require(RealBufferedSource.java:60) ~[?:?]
at okio.RealBufferedSource.readByte(RealBufferedSource.java:73) ~[?:?]
at okhttp3.internal.ws.WebSocketReader.readHeader(WebSocketReader.java:113) ~[?:?]
at okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:97) ~[?:?]
at okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.java:262) ~[?:?]
at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:201) ~[?:?]
at okhttp3.RealCall$AsyncCall.execute(RealCall.java:141) ~[?:?]
at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) ~[?:?]
... 3 more
Suppressed: java.net.SocketException: Broken pipe (Write failed)
at java.net.SocketOutputStream.socketWrite0(Native Method) ~[?:1.8.0_265]
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111) ~[?:1.8.0_265]
at java.net.SocketOutputStream.write(SocketOutputStream.java:155) ~[?:1.8.0_265]
at sun.security.ssl.SSLSocketOutputRecord.encodeAlert(SSLSocketOutputRecord.java:81) ~[?:1.8.0_265]
at sun.security.ssl.TransportContext.fatal(TransportContext.java:355) ~[?:1.8.0_265]
at sun.security.ssl.TransportContext.fatal(TransportContext.java:267) ~[?:1.8.0_265]
at sun.security.ssl.TransportContext.fatal(TransportContext.java:262) ~[?:1.8.0_265]
at sun.security.ssl.SSLSocketImpl.handleException(SSLSocketImpl.java:1357) ~[?:1.8.0_265]
at sun.security.ssl.SSLSocketImpl.access$400(SSLSocketImpl.java:72) ~[?:1.8.0_265]
at sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:849) ~[?:1.8.0_265]
at okio.Okio$2.read(Okio.java:139) ~[?:?]
at okio.AsyncTimeout$2.read(AsyncTimeout.java:237) ~[?:?]
at okio.RealBufferedSource.request(RealBufferedSource.java:67) ~[?:?]
at okio.RealBufferedSource.require(RealBufferedSource.java:60) ~[?:?]
at okio.RealBufferedSource.readByte(RealBufferedSource.java:73) ~[?:?]
at okhttp3.internal.ws.WebSocketReader.readHeader(WebSocketReader.java:113) ~[?:?]
at okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:97) ~[?:?]
at okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.java:262) ~[?:?]
at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:201) ~[?:?]
at okhttp3.RealCall$AsyncCall.execute(RealCall.java:141) ~[?:?]
at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_265]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_265]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_265]
Caused by: java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:210) ~[?:1.8.0_265]
at java.net.SocketInputStream.read(SocketInputStream.java:141) ~[?:1.8.0_265]
at sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:467) ~[?:1.8.0_265]
at sun.security.ssl.SSLSocketInputRecord.readHeader(SSLSocketInputRecord.java:461) ~[?:1.8.0_265]
at sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:70) ~[?:1.8.0_265]
at sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1146) ~[?:1.8.0_265]
at sun.security.ssl.SSLSocketImpl.access$300(SSLSocketImpl.java:72) ~[?:1.8.0_265]
at sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:833) ~[?:1.8.0_265]
at okio.Okio$2.read(Okio.java:139) ~[?:?]
at okio.AsyncTimeout$2.read(AsyncTimeout.java:237) ~[?:?]
at okio.RealBufferedSource.request(RealBufferedSource.java:67) ~[?:?]
at okio.RealBufferedSource.require(RealBufferedSource.java:60) ~[?:?]
at okio.RealBufferedSource.readByte(RealBufferedSource.java:73) ~[?:?]
at okhttp3.internal.ws.WebSocketReader.readHeader(WebSocketReader.java:113) ~[?:?]
at okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:97) ~[?:?]
at okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.java:262) ~[?:?]
at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:201) ~[?:?]
at okhttp3.RealCall$AsyncCall.execute(RealCall.java:141) ~[?:?]
at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) ~[?:?]
... 3 more

But it successfully connected after that:

2020-11-10 05:12:46.554 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = ####, base URL = http://localhost:8080)
2020-11-10 05:12:48.549 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service (UUID = ####, base URL = http://localhost:8080)

Don’t know if it is related to the issue.

Some hints for troubleshooting would be greatly appreciated!

Have a nice day and stay safe!

Nikolaus

Start here: Why have my Rules stopped running? Why Thread::sleep is a bad idea

Hey Rich,

thanks for the reply!

Actually, I read your article some time ago and have never used Thread::sleep in my rules. For my light rules I only use the createTimer(now.plusMinutes(timoutminutes)) method, as suggested. :wink:
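To make the difference concrete, here is a minimal sketch in plain Python (not openHAB Rules DSL) of why a timer is kinder to the rule threads than a sleep. The function names and the 0.2 s delay are made up for illustration, and threading.Timer only stands in for createTimer; the point is just that the sleeping version holds its thread for the whole delay, while the timer version hands the work to a scheduler and returns at once.

```python
import threading
import time

DELAY = 0.2  # hypothetical stand-in for the light timeout, shortened for the demo


def rule_with_sleep(action):
    """Blocks the calling thread for the whole delay, then runs the action."""
    time.sleep(DELAY)
    action()


def rule_with_timer(action):
    """Schedules the action and returns at once, freeing the calling thread."""
    timer = threading.Timer(DELAY, action)
    timer.start()
    return timer


def timed(rule, action):
    """Returns (seconds the rule held the calling thread, the rule's return value)."""
    start = time.monotonic()
    result = rule(action)
    return time.monotonic() - start, result


fired = []
held_by_sleep, _ = timed(rule_with_sleep, lambda: fired.append("sleep"))
held_by_timer, timer = timed(rule_with_timer, lambda: fired.append("timer"))
timer.join()  # wait for the scheduled action so the script exits cleanly
print(f"sleep held the thread {held_by_sleep:.2f}s, timer {held_by_timer:.2f}s")
```

Both variants end up running the action after the delay; only the sleeping one keeps a thread occupied while waiting.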

[Screenshot: 2020-11-10 18_39_47-docker@dockerserv_ ~]

Greetings,

Nikolaus

Thread::sleep is only one of the calls that can make a rule run for an extended amount of time. The executeCommandLine and sendHttpXRequest actions also have a timeout and can consume a thread for extended periods of time.
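The effect can be sketched outside openHAB with an ordinary fixed-size thread pool. This is plain Python, not Rules DSL, and the pool size of 5 is an assumption about the rule engine rather than something read from this installation; the short sleep stands in for an external command waiting on its timeout. With one worker left over, a quick rule starts right away; once every worker is blocked, it has to wait until one of the slow calls returns.

```python
import time
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 5        # assumed size of the rule thread pool
BLOCK_SECONDS = 0.3  # stand-in for a command waiting on its timeout


def slow_rule():
    """Simulates a rule that blocks on an external command."""
    time.sleep(BLOCK_SECONDS)


def measure_quick_rule_delay(slow_rules):
    """Returns how long a quick rule waits before it actually starts running."""
    with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
        for _ in range(slow_rules):
            pool.submit(slow_rule)
        submitted = time.monotonic()
        # The quick rule does nothing but record the moment it starts.
        started = pool.submit(time.monotonic).result()
    return started - submitted


delay_free = measure_quick_rule_delay(POOL_SIZE - 1)  # one worker left over
delay_starved = measure_quick_rule_delay(POOL_SIZE)   # every worker blocked
print(f"one free worker: {delay_free:.2f}s, pool exhausted: {delay_starved:.2f}s")
```

With real timeouts of several seconds and triggers arriving faster than the calls finish, the queue never drains, which from the outside looks like rules that have simply stopped.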

What is happening in your events.log? Are Items still changing and receiving commands?

Is it only Item-triggered rules, or do some cron-triggered rules stop as well?

Is that screenshot taken while everything is working or while the rules have stopped working?


Thanks @rlkoshak for the solution:

The executeCommandLine and sendHttpXRequest actions also have a timeout and can consume a thread for extended periods of time.

And that, ladies and gentlemen, might be the hint I was looking for. I make a ton of HTTP POST requests with curl to send commands to a digital signage system. I’ll look into it.

Edit: You were right. I was clever enough to define a group to set multiple switches, and every switch triggered a curl command with a timeout. In hindsight that was not the smartest move.
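For anyone wondering how quickly this adds up, here is a rough back-of-the-envelope sketch in Python. All the figures are hypothetical (12 switches in the group, a 10 second curl timeout, 5 rule threads) and are not taken from my actual setup; it also assumes the worst case where every single call runs into its timeout.

```python
import math


def worst_case_drain_seconds(members, timeout_seconds, pool_size):
    """Upper bound on how long the pool stays busy if every call times out."""
    waves = math.ceil(members / pool_size)  # calls run pool_size at a time
    return waves * timeout_seconds


# Hypothetical figures: 12 switches, 10 s curl timeout, 5 rule threads.
busy_for = worst_case_drain_seconds(members=12, timeout_seconds=10, pool_size=5)
print(busy_for)
```

Under those assumptions a single command to the group keeps every thread busy for half a minute, and any other rule triggered in that window has to queue behind it.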
