openHAB randomly freezes

Hey dear all,
I have had a problem for quite some time now… I don’t remember exactly when it started.
I have openHABian running on a Pi 4 with 4 GB RAM. It randomly freezes and I have no clue why. I am running 4.1.0, but this problem occurred with previous versions, too.
Symptoms are: everything seems to work properly, the UI is reachable and I see Things online. However, I cannot send commands, e.g. via the sitemap.
Another thing is that values are not updated anymore, as you can see in the pics below. Something happened at around 17:50 and the items are frozen in that state.

After a restart with “sudo systemctl restart openhab.service”, everything works fine again.

Sorry for another pic, I am on a different computer right now and cannot reach my logs (only the frontend). Here you see the log at around 18:00.

Sometimes it happens after 250h uptime, sometimes after 20. Please help me, this is too annoying :wink:

TY!


Were you able to ping the rpi when it happened? In other words, have you ruled out network / connectivity issue?

I have an rpi4, not running openhab. Its only function is to boot up debian, and load a web browser to access a remote web page (openhab main ui, located on another server). It connects via Wifi. Occasionally its wireless interface would “die” and I’d have to reboot it. I don’t know how to bring the wireless interface back up on the rpi without rebooting. I wonder if you’re experiencing something similar.

I didn’t try this.
But as I said, I could reach the UI via browser. And ssh also worked, so I could restart openhab.service.

I had similar issues (don’t remember exactly) because my system ran out of zram due to excessive logging of zigbee2mqtt. Then the system frequently switched zram to read-only.
So maybe check if your zram (e.g. /var/log) is writable after openhab stops.
The other things which should be checked are temperature problems (do you have a fan?) and a right sized power supply.
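
If you want to test the read-only case directly, a small probe like this could be run right after a freeze. It is only a sketch: the function name is made up for illustration, and /var/log is assumed to be the zram-backed path on your system. It tries to create a temporary file in the directory and reports whether that works. Run it with sudo, otherwise a plain permission problem also shows up as "not writable".

```python
import tempfile


def is_writable(directory):
    """Return True if a temporary file can be created and written in `directory`.

    A read-only filesystem, a missing directory, or missing permissions all
    raise OSError and are reported as False.
    """
    try:
        with tempfile.NamedTemporaryFile(dir=directory) as handle:
            handle.write(b"probe")
            handle.flush()
        return True
    except OSError:
        return False


# Example (path is an assumption, adjust to your zram mounts):
# print("/var/log writable:", is_writable("/var/log"))
```

The temporary file is deleted again automatically, so the probe leaves nothing behind.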

According to the log you showed, some events were recorded after 18:00, so events.log is still writable. That does not happen once zram has become read-only.
You can check by issuing

df -h

Is the “freeze” always related to the Miele web not responding? Perhaps an occasional network error is not managed correctly by some binding and causes some anomalous consumption of system resources.
I would look at the different memory channels of the Systeminfo binding to see if some of them show a steady increase prior to the problem. I have read somewhere that there was a problem with the number of threads being opened. There may be a Systeminfo channel for that, too.
Finally, have you tried with a different SD card? A failing SD gives unpredictable errors.
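
If you prefer to watch threads and memory from outside openHAB, here is a rough sketch that reads the Threads and VmRSS fields from /proc/<pid>/status on Linux. The function names and the sampling interval are my own choices, not anything from openHAB, and you have to supply the PID of the openHAB Java process yourself.

```python
import time


def parse_proc_status(text):
    """Extract thread count and resident memory (kB) from /proc/<pid>/status text."""
    fields = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    threads = int(fields["Threads"]) if "Threads" in fields else None
    # VmRSS looks like "812345 kB"; keep only the number.
    rss_kb = int(fields["VmRSS"].split()[0]) if "VmRSS" in fields else None
    return {"threads": threads, "rss_kb": rss_kb}


def read_process_stats(pid):
    """Read the current thread count and resident memory for a running process."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_proc_status(handle.read())


def log_samples(pid, interval_s=60, count=10):
    """Print `count` timestamped samples, one every `interval_s` seconds."""
    for _ in range(count):
        stats = read_process_stats(pid)
        print(time.strftime("%Y-%m-%d %H:%M:%S"), stats)
        time.sleep(interval_s)
```

Calling log_samples(pid) with a larger count and redirecting the output to a file would give you a record to look at after the next freeze. A thread count or memory figure that keeps climbing before the freeze would point towards a leak in some binding.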

@Larsen
Hey, yeah, I do have a relatively beefy fan and even a UPS. The power supply is the official Raspberry Pi supply.

@Lionello_Marrelli
Hi :wink: I had another freeze tonight at around 3:17. Here is the log:

2024-02-01 03:17:42.009 [WARN ] [ernal.webservice.sse.SseStreamParser] - SSE connection failed unexpectedly: java.util.concurrent.TimeoutException: Idle timeout 30000 ms
2024-02-01 03:17:47.619 [WARN ] [io.openhabcloud.internal.CloudClient] - Error during communication: EngineIOException websocket error
2024-02-01 03:17:47.622 [WARN ] [io.openhabcloud.internal.CloudClient] - Socket.IO disconnected: transport error
2024-02-01 03:17:47.623 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = 76...b1, base URL = http://localhost:8080)
2024-02-01 03:17:48.404 [WARN ] [io.openhabcloud.internal.CloudClient] - Error connecting to the openHAB Cloud instance: already connected. Reconnecting after 1359 ms.
2024-02-01 03:17:49.805 [WARN ] [io.openhabcloud.internal.CloudClient] - Error connecting to the openHAB Cloud instance: already connected. Reconnecting after 1359 ms.
2024-02-01 03:17:51.222 [WARN ] [io.openhabcloud.internal.CloudClient] - Error connecting to the openHAB Cloud instance: already connected. Reconnecting after 6741 ms.
==> /var/log/openhab/events.log <==
2024-02-01 03:17:57.856 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'mielecloud:dryer:Miele_Bridge_Cloud:000177757131' changed from ONLINE to OFFLINE (COMMUNICATION_ERROR): Dieses Miele-Gerät ist nicht mit dem Internet verbunden.
==> /var/log/openhab/openhab.log <==
2024-02-01 03:17:58.019 [WARN ] [io.openhabcloud.internal.CloudClient] - Error connecting to the openHAB Cloud instance: already connected. Reconnecting after 10466 ms.
==> /var/log/openhab/events.log <==
2024-02-01 03:17:58.750 [INFO ] [ab.event.ThingStatusInfoChangedEvent] - Thing 'mielecloud:dryer:Miele_Bridge_Cloud:000177757131' changed from OFFLINE (COMMUNICATION_ERROR): Dieses Miele-Gerät ist nicht mit dem Internet verbunden. to ONLINE
==> /var/log/openhab/openhab.log <==
2024-02-01 03:18:04.702 [INFO ] [control.internal.WebSocketConnection] - Web Socket close 1006. Reason: Disconnected
2024-02-01 03:18:04.702 [INFO ] [control.internal.WebSocketConnection] - Web Socket error
org.eclipse.jetty.io.EofException: null
	at org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:280) ~[?:?]
	at org.eclipse.jetty.io.ssl.SslConnection.networkFlush(SslConnection.java:489) ~[?:?]
	at org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.flush(SslConnection.java:1112) ~[?:?]
	at org.eclipse.jetty.io.WriteFlusher.flush(WriteFlusher.java:422) ~[?:?]
	at org.eclipse.jetty.io.WriteFlusher.write(WriteFlusher.java:277) ~[?:?]
	at org.eclipse.jetty.io.AbstractEndPoint.write(AbstractEndPoint.java:381) ~[?:?]
	at org.eclipse.jetty.websocket.common.io.FrameFlusher.flush(FrameFlusher.java:264) ~[?:?]
	at org.eclipse.jetty.websocket.common.io.FrameFlusher.process(FrameFlusher.java:193) ~[?:?]
	at org.eclipse.jetty.util.IteratingCallback.processing(IteratingCallback.java:248) [bundleFile:9.4.52.v20230823]
	at org.eclipse.jetty.util.IteratingCallback.iterate(IteratingCallback.java:229) [bundleFile:9.4.52.v20230823]
	at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.outgoingFrame(AbstractWebSocketConnection.java:581) [bundleFile:9.4.52.v20230823]
	at org.eclipse.jetty.websocket.client.io.WebSocketClientConnection.outgoingFrame(WebSocketClientConnection.java:58) [bundleFile:9.4.52.v20230823]
	at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.close(AbstractWebSocketConnection.java:181) [bundleFile:9.4.52.v20230823]
	at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onFillable(AbstractWebSocketConnection.java:510) [bundleFile:9.4.52.v20230823]
	at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onFillable(AbstractWebSocketConnection.java:440) [bundleFile:9.4.52.v20230823]
	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311) [bundleFile:9.4.52.v20230823]
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105) [bundleFile:9.4.52.v20230823]
	at org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.onFillable(SslConnection.java:555) [bundleFile:9.4.52.v20230823]
	at org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:410) [bundleFile:9.4.52.v20230823]
	at org.eclipse.jetty.io.ssl.SslConnection$2.succeeded(SslConnection.java:164) [bundleFile:9.4.52.v20230823]
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105) [bundleFile:9.4.52.v20230823]
	at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104) [bundleFile:9.4.52.v20230823]
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:883) [bundleFile:9.4.52.v20230823]
	at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1034) [bundleFile:9.4.52.v20230823]
	at java.lang.Thread.run(Thread.java:840) [?:?]
Caused by: java.io.IOException: Broken pipe
	at sun.nio.ch.FileDispatcherImpl.writev0(Native Method) ~[?:?]
	at sun.nio.ch.SocketDispatcher.writev(SocketDispatcher.java:66) ~[?:?]
	at sun.nio.ch.IOUtil.write(IOUtil.java:217) ~[?:?]
	at sun.nio.ch.IOUtil.write(IOUtil.java:153) ~[?:?]
	at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:563) ~[?:?]
	at java.nio.channels.SocketChannel.write(SocketChannel.java:642) ~[?:?]
	at org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:274) ~[?:?]
	... 24 more
2024-02-01 03:18:08.534 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service (UUID = 76...b1, base URL = http://localhost:8080)
2024-02-01 03:35:20.916 [WARN ] [rt.modbus.internal.ModbusManagerImpl] - Try 1 out of 3 failed when executing request (ModbusReadRequestBlueprint [slaveId=2, functionCode=READ_INPUT_REGISTERS, start=340, length=5, maxTries=3]). Will try again soon. Error was I/O error, so resetting the connection. Error details: net.wimpi.modbus.ModbusIOException I/O exception: IOException Error reading response (EOF) [operation ID 1e0c09a4-622f-45a9-a1cf-c8c0a0f5ba42]
2024-02-01 03:49:51.905 [WARN ] [rt.modbus.internal.ModbusManagerImpl] - Try 1 out of 3 failed when executing request (ModbusReadRequestBlueprint [slaveId=2, functionCode=READ_INPUT_REGISTERS, start=50, length=30, maxTries=3]). Will try again soon. Error was I/O error, so resetting the connection. Error details: net.wimpi.modbus.ModbusIOException I/O exception: IOException Error reading response (EOF) [operation ID 7defb4f4-6070-47c7-96de-0115a8bae782]
2024-02-01 04:09:39.870 [WARN ] [rt.modbus.internal.ModbusManagerImpl] - Try 1 out of 3 failed when executing request (ModbusReadRequestBlueprint [slaveId=2, functionCode=READ_INPUT_REGISTERS, start=340, length=5, maxTries=3]). Will try again soon. Error was I/O error, so resetting the connection. Error details: net.wimpi.modbus.ModbusIOException I/O exception: IOException Error reading response (EOF) [operation ID 59cdf16a-96d7-457b-b0ac-f94cdf61e8f4]

As I am a noob, I don’t fully understand this. But does it say it can’t reach the OH cloud? And it says at 3:17:57 that my Miele device (dryer) is not connected to the internet.
So maybe my internet connection gets interrupted somehow and this leads to the OH crash?!
EDIT: Okay wow… Indeed, my router says it has been connected to the internet since 3:17. So this is definitely related. But what can I do, and why does OH crash?!
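
One way to check this correlation for other freezes is to pull every log entry within a couple of minutes of the time the router reports. The following is just a sketch that assumes the line format shown in the log above (timestamp, level, logger, message); the function names and the two-minute window are arbitrary choices for illustration.

```python
import re
from datetime import datetime, timedelta

# Matches lines such as:
# 2024-02-01 03:17:42.009 [WARN ] [ernal.webservice.sse.SseStreamParser] - SSE connection failed ...
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\[(\w+)\s*\] \[([^\]]*)\] - (.*)$"
)


def parse_line(line):
    """Return (timestamp, level, logger, message), or None for non-matching lines."""
    match = LINE_RE.match(line.rstrip("\n"))
    if match is None:
        return None  # stack trace lines, "==> file <==" markers, etc.
    stamp, level, logger, message = match.groups()
    when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")
    return when, level, logger.strip(), message


def entries_around(lines, center, window=timedelta(minutes=2)):
    """Collect parsed entries whose timestamp lies within `window` of `center`."""
    hits = []
    for line in lines:
        entry = parse_line(line)
        if entry is not None and abs(entry[0] - center) <= window:
            hits.append(entry)
    return hits


# Example usage (log path as shown in the output above):
# with open("/var/log/openhab/openhab.log") as handle:
#     for when, level, logger, message in entries_around(handle, datetime(2024, 2, 1, 3, 17)):
#         print(when, level, logger, message)
```

If every freeze lines up with a burst of cloud, SSE or Modbus warnings at the moment the connection drops, that narrows the search to whichever binding does not recover afterwards.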

This is what I get after df -h:

Filesystem      Size  Used Avail Use% Mounted on
/dev/root        59G  6.0G   50G  11% /
devtmpfs        1.9G     0  1.9G   0% /dev
tmpfs           1.9G     0  1.9G   0% /dev/shm
tmpfs           769M  2.0M  767M   1% /run
tmpfs           5.0M     0  5.0M   0% /run/lock
/dev/mmcblk0p1  255M   51M  205M  20% /boot
/dev/zram1      721M  217M  453M  33% /opt/zram/zram1
overlay1        721M  217M  453M  33% /var/lib/openhab/persistence
/dev/zram2      974M   49M  858M   6% /opt/zram/zram2
overlay2        974M   49M  858M   6% /var/log
tmpfs           385M     0  385M   0% /run/user/1000

Your problem does not seem to be related to zram, as the output of df -h shows, so it could be due to openHAB running out of resources (memory or threads). Disabling one binding at a time may be a way to investigate further. Have you searched the forum for issues with the Modbus binding?

Hi,
I have the same issue. openHAB seems to work fine, but none of my Modbus values are updated anymore, and my Homematic binding and components stop working as well. You don’t see anything wrong until you click on a chart or something similar and see flat lines.
I had it with 4.0 and 4.1.0, too. I have now updated to the newest version and hope for the best.

Mine is running on a Rock Pi with 4 GB RAM. RAM usage is only about 45%, and system load is normally about 12%.

BR Marcel


@Lionello_Marrelli

Thanks for your advice. I will search the forum for Modbus issues. I have only been using Modbus for a few weeks, so maybe this is indeed related.

But did you see my edit with the Internet connection? It happens at the exact same time.