Muplex
(Muplex)
September 22, 2025, 7:55pm
1
Hello,
I’m running openHAB 5 on openHABian on a new RPi 5.
It’s the second time in 2 days that openHAB has stopped working completely. I can connect to the UIs and also the sitemap, but no item responds.
Here is the log.
19:43.452 [WARN] [org.openhab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.io.monitor.internal.EventLogger@6863fd22' takes more than 5000ms.
19:03:44.351[INFO] [org.openhab.core.io.websocket.log.LogWebSocket] - WebSocket error: java.util.concurrent.TimeoutException: Idle timeout expired: 10000/10000 ms
19:06:19.003[WARN] [org.openhab.core.internal.events.EventHandler] - The queue for a subscriber of type ‘class org.openhab.core.io.monitor.internal.EventLogger’ exceeds 5000 elements. System may be unstable.
19:07:51.681[INFO] [org.openhab.core.io.websocket.log.LogWebSocket] - WebSocket error:
21:47:19.348[WARN] [org.openhab.core.io.rest.sse.internal.SseItemStatesEventBuilder] - Attempting to send a state update of an item which doesn’t exist: Widget_Settings_Bayesian_Sensor_Details
21:47:21.678[WARN] [org.openhab.core.io.rest.sse.internal.SseItemStatesEventBuilder] - Attempting to send a state update of an item which doesn’t exist: Widget_Settings_Bayesian_Sensor_Details
21:47:22.555 [WARN] [org.openhab.core.io.rest.sse.internal.SseItemStatesEventBuilder] - Attempting to send a state update of an item which doesn't exist: Widget_Settings_Bayesian_Sensor_Details
rlkoshak
(Rich Koshak)
September 23, 2025, 1:23pm
2
The “attempting to send a state update of an item which doesn’t exist” warning comes from MainUI having an Item referenced in an on-screen widget that doesn’t exist. That’s not related to the other errors.
As for the other errors, I’m not sure what could be causing them. Maybe one of the thread deadlocks that have been discovered recently is causing problems.
You can run ttop on the karaf console to see if you have lots of threads stuck in a blocked or waiting state. See High CPU Usage since upgrade to 5.0.1 - #4 by rlkoshak. Beyond that, I’d start disabling add-ons until the problem goes away, to help pinpoint where the problem lies.
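For reference, the same blocked/waiting information that ttop shows can be obtained programmatically from any JVM via the standard ThreadMXBean API. This is a standalone illustration (not openHAB code) of what "threads stuck in a blocked or waiting state" means:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class BlockedThreadReport {
    public static void main(String[] args) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        // List every thread currently BLOCKED (contending for a monitor)
        // or WAITING (parked, e.g. on a condition), like ttop does.
        for (ThreadInfo info : bean.dumpAllThreads(false, false)) {
            Thread.State state = info.getThreadState();
            if (state == Thread.State.BLOCKED || state == Thread.State.WAITING) {
                System.out.println(info.getThreadId() + " " + info.getThreadName() + " " + state);
            }
        }
        // True monitor deadlocks (cycles) can be detected directly:
        long[] deadlocked = bean.findDeadlockedThreads();
        System.out.println("deadlocked=" + (deadlocked == null ? 0 : deadlocked.length));
    }
}
```

A thread showing BLOCKED for a long time on the same monitor, as in the ttop output below, is the symptom to look for.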
Muplex
(Muplex)
September 23, 2025, 1:28pm
3
Thank you,
sometimes troubleshooting is frustrating, but I will try ttop in the console and let you know.
Thanks for your hint!
I tried out the hint - but I do not get anywhere with that.
ttop points to the thread OH-OSGiEventManager:
Thread 464 OH-OSGiEventManager BLOCKED
Stacktrace:
org.openhab.core.io.websocket.log.LogWebSocket.sendMessage line: 228
org.openhab.core.io.websocket.log.LogWebSocket.logged line: 258
org.ops4j.pax.logging.spi.support.LogReaderServiceImpl.fire line: 105
org.ops4j.pax.logging.spi.support.LogReaderServiceImpl.fireEvent line: 83
org.ops4j.pax.logging.log4j2.internal.PaxLoggingServiceImpl.handleEvents line: 414
org.ops4j.pax.logging.log4j2.internal.PaxLoggerImpl.doLog0 line: 1134
org.ops4j.pax.logging.log4j2.internal.PaxLoggerImpl.doLog line: 1099
org.ops4j.pax.logging.log4j2.internal.PaxLoggerImpl.warn line: 465
org.ops4j.pax.logging.internal.TrackingLogger.warn line: 267
org.ops4j.pax.logging.slf4j.Slf4jLogger.warn line: 937
org.openhab.core.internal.events.EventHandler.dispatchEvent line: 160
org.openhab.core.internal.events.EventHandler.handleEvent line: 122
org.openhab.core.internal.events.EventHandler.handleEvent line: 95
org.openhab.core.internal.events.ThreadedEventHandler.lambda$0 line: 66
org.openhab.core.internal.events.ThreadedEventHandler$$Lambda/0x0000000100ae5fc8.run line: -1
java.lang.Thread.runWith line: -1
java.lang.Thread.run line: -1
all threads are blocked by this one:
Thread 34241 OH-LogWebSocket-24 WAITING
Stacktrace:
jdk.internal.misc.Unsafe.park line: -2
java.util.concurrent.locks.LockSupport.park line: -1
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block line: -1
java.util.concurrent.ForkJoinPool.unmanagedBlock line: -1
java.util.concurrent.ForkJoinPool.managedBlock line: -1
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await line: -1
org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block line: 241
org.eclipse.jetty.websocket.common.BlockingWriteCallback$WriteBlocker.block line: 89
org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.blockingWrite line: 110
org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.sendString line: 400
org.openhab.core.io.websocket.log.LogWebSocket.sendMessage line: 232
org.openhab.core.io.websocket.log.LogWebSocket.flush line: 299
org.openhab.core.io.websocket.log.LogWebSocket$$Lambda/0x0000000101ef8000.run line: -1
java.util.concurrent.Executors$RunnableAdapter.call line: -1
java.util.concurrent.FutureTask.run line: -1
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run line: -1
java.util.concurrent.ThreadPoolExecutor.runWorker line: -1
java.util.concurrent.ThreadPoolExecutor$Worker.run line: -1
java.lang.Thread.runWith line: -1
java.lang.Thread.run line: -1
This does not help me narrow it down to a rule or similar.
Anyone with an idea beside rebooting the system every few hours?
running openHAB 5.0.3 - Release Build on a Raspberry Pi 5 Model B Rev 1.0
with NVMe SSD (512GB, with 13GB used)
Nadahar
(Nadar)
November 27, 2025, 4:11pm
5
There’s clearly a bug in LogWebSocket:
```java
private synchronized void sendMessage(String message) throws IOException {
    RemoteEndpoint remoteEndpoint = this.remoteEndpoint;
    if (remoteEndpoint == null) {
        return;
    }
    remoteEndpoint.sendString(message);
}
```
remoteEndpoint.sendString() is a blocking operation that won’t return until the recipient has received the message (and the recipient could be a browser that has since been closed, for example), and the operation is performed while holding a lock (cf. the synchronized keyword).
Any other thread that attempts to acquire this lock will end up waiting for the external recipient to receive the message, which might mean waiting for a timeout.
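To make the failure mode concrete, here is a minimal sketch of one possible mitigation (names like Endpoint and SafeSender are hypothetical stand-ins, not the actual openHAB classes or the eventual fix): take a snapshot of the endpoint reference from a volatile field and perform the potentially slow send without holding a lock on the whole object, so other threads never queue up behind a dead connection:

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class SafeSender {
    // Stand-in for Jetty's RemoteEndpoint (hypothetical, for illustration).
    interface Endpoint {
        void sendString(String msg) throws IOException;
    }

    // volatile: safe to read without entering a synchronized block.
    private volatile Endpoint remoteEndpoint;

    void setEndpoint(Endpoint e) {
        this.remoteEndpoint = e;
    }

    void sendMessage(String message) throws IOException {
        Endpoint endpoint = remoteEndpoint; // local snapshot, no lock held
        if (endpoint == null) {
            return;                         // endpoint already gone, drop message
        }
        endpoint.sendString(message);       // slow I/O runs outside any lock
    }

    public static void main(String[] args) throws IOException {
        List<String> sent = new ArrayList<>();
        SafeSender sender = new SafeSender();
        sender.sendMessage("dropped");      // no endpoint yet: silently ignored
        sender.setEndpoint(sent::add);
        sender.sendMessage("hello");
        System.out.println(sent);
    }
}
```

Note the trade-off: a real fix must still serialize writes to a single Jetty endpoint (concurrent blocking writes to one remote endpoint are not safe), so in practice the lock would need to be narrowed or replaced by a write queue rather than removed outright.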
I would think this should bite many users; my guess is that not that many leave the MainUI log viewer open, so the impact has been limited so far?
A bug should be filed for this.
1 Like
Thanks for that find.
Does that mean I have to restart OH after using the LogViewer to be sure?
Nadahar
(Nadar)
November 27, 2025, 4:33pm
7
I can’t really say, but I would use the LogViewer very sparingly until the bug has been fixed.
Nadahar
(Nadar)
November 30, 2025, 5:51pm
8
I’ve created a GitHub issue to track this bug:
opened 05:49PM - 30 Nov 25 UTC
bug
This has been reported [on the forum](https://community.openhab.org/t/openhab-st… op-working/166412/4).
The stack trace is pretty clear in where the hang happens, and looking at the code in `LogWebSocket`, the code is "open" for this to happen.
```java
private synchronized void sendMessage(String message) throws IOException {
    RemoteEndpoint remoteEndpoint = this.remoteEndpoint;
    if (remoteEndpoint == null) {
        return;
    }
    remoteEndpoint.sendString(message);
}
```
Fundamentally, `remoteEndpoint.sendString()`, a blocking operation that will wait until the recipient processes the message or some timeout occurs, is invoked while holding an instance lock on `LogWebSocket`. This will lead to delays/bad throughput. In addition, `sendMessage()` is sometimes called while a lock on `deferredLogs` is held, which makes the two locks "intertwined", with deadlock potential.
It's clear from the user thread dump information that the `OH-OSGiEventManager` thread indeed becomes blocked by this.
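The "intertwined locks" hazard mentioned in the issue comes down to lock ordering. A minimal sketch (lock names here are hypothetical, not the actual field names) shows the discipline required when two locks can be held at once: every code path must acquire them in the same order, because two threads taking them in opposite orders can deadlock. Here both methods take socketLock before deferredLogsLock, so the program always completes:

```java
public class LockOrdering {
    private final Object socketLock = new Object();
    private final Object deferredLogsLock = new Object();

    // Rule: always acquire socketLock BEFORE deferredLogsLock.
    void flushDeferred() {
        synchronized (socketLock) {
            synchronized (deferredLogsLock) {
                System.out.println("flushDeferred done");
            }
        }
    }

    void log(String msg) {
        // Same order as flushDeferred(); reversing it here would
        // allow a deadlock under concurrent calls.
        synchronized (socketLock) {
            synchronized (deferredLogsLock) {
                System.out.println("log " + msg);
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        LockOrdering lo = new LockOrdering();
        Thread t1 = new Thread(lo::flushDeferred);
        Thread t2 = new Thread(() -> lo.log("x"));
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println("no deadlock");
    }
}
```

Guaranteeing this ordering on every path, including exceptional ones, is exactly why collapsing to a single lock is often the simpler design.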
Nadahar
(Nadar)
December 3, 2025, 2:56pm
9
I have also created what I believe to be a fix for the issue, but so far, it hasn’t gotten any attention/review:
main ← Nadahar:websocket-fixes
opened 01:17PM - 01 Dec 25 UTC
This is my attempt to fix #5162.
I've only redone the concurrency handling, that is, the strategy of locks/synchronization/volatile, in addition to some minor "nullness tweaks".
It's quite complicated to explain exactly why I came to the design I did, it's been a lot of back and forth in my mind regarding some aspects. I'm not saying that this is the only way to solve it, but I think this is *a* solution. The exact choices depend on some things that must be guesstimated when it comes to performance.
What I have done is basically to design the concurrency handling from scratch in the classes that I've modified. I've added some comments to document the strategy. I've not tried to "fix" what was there, because the old strategy wasn't documented, so it's impossible for me to guess where things were missing or where the strategy was flawed. Generally, I can say that I've protected all mutable fields, not just some, and I've reduced the number of locks from two to one in `LogWebSocket`. The reason is that I think whatever performance benefit using two offered was very negligible, if any at all, while the chance of deadlocks increases exponentially when juggling several locks. To avoid deadlocks when using more than one lock, one must ensure that they are *always* locked in the same order, for *any potential path*, including during exceptions. This can be hard to do, requires a lot of discipline, and can be hard to document so that any future modifications of the code are done accordingly.
Feel free to ask about any concrete design choice, and I'll try to explain as best as I can why I did what I did.
I've given it around 16 hours of testing with a lot of logging enabled, without any hiccups, but as always, it's impossible to test all circumstances that the code will see "on the battlefield".