I am using OH3 version 3.3 on openhabian platform (RPi3).
Using only few bindings from original source.
I found recently that the system was not responsive and was very slow especially in the webpage display, iOS app, etc.
So I checked the CPU usage using top and htop and found that the RPi was using only 1 core and using it at 100% (even over 100% sometimes). The process using this CPU time is “java” with user “openhab”. But the core used is not always the same one. It changes over time.
Due to the high CPU usage I am not even able to access the karaf console.
I also noticed in the past that after a while the unit just freezes and stop responding at all.
So, I don’t want to reboot it as I want to find the root cause.
How can I track down what is causing this CPU usage?
How can I see how much CPU is using a specific binding?
Why is openhab using only 1 core?
Any help will be appreciated.
Now I am getting these error messages in the log:
2022-12-29 11:30:00.467 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.internal.items.ItemUpdater@ca3cb0' takes more than 5000ms.
2022-12-29 11:30:18.026 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.internal.items.ItemUpdater@ca3cb0' takes more than 5000ms.
2022-12-29 11:30:49.015 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.io.monitor.internal.EventLogger@12c1a04' takes more than 5000ms.
2022-12-29 11:31:46.338 [INFO ] [ternal.dhcp.DHCPPacketListenerServer] - DHCP request packet listener online
2022-12-29 11:31:46.342 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.automation.internal.module.handler.GroupStateTriggerHandler@1c6836e' takes more than 5000ms.
2022-12-29 11:32:17.136 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.io.monitor.internal.EventLogger@12c1a04' takes more than 5000ms.
2022-12-29 11:32:34.537 [WARN ] [WaveSerialHandler$ZWaveReceiveThread] - Exception during ZWave thread.
java.lang.IllegalStateException: Queue full
at java.util.AbstractQueue.add(AbstractQueue.java:98) ~[?:?]
at java.util.concurrent.ArrayBlockingQueue.add(ArrayBlockingQueue.java:326) ~[?:?]
at org.openhab.binding.zwave.internal.protocol.ZWaveTransactionManager.processReceiveMessage(ZWaveTransactionManager.java:447) ~[bundleFile:?]
at org.openhab.binding.zwave.internal.protocol.ZWaveController.incomingPacket(ZWaveController.java:1094) ~[bundleFile:?]
at org.openhab.binding.zwave.handler.ZWaveControllerHandler.incomingMessage(ZWaveControllerHandler.java:462) ~[bundleFile:?]
at org.openhab.binding.zwave.handler.ZWaveSerialHandler$ZWaveReceiveThread.run(ZWaveSerialHandler.java:384) [bundleFile:?]
2022-12-29 11:33:00.780 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.automation.internal.module.handler.GroupStateTriggerHandler@1b8b098' takes more than 5000ms.
Here more recent error messages:
2022-12-29 12:10:16.103 [INFO ] [openhab.event.ItemStateChangedEvent ] - [FAILED toString()]
==> /var/log/openhab/openhab.log <==
2022-12-29 12:02:28.228 [WARN ] [mmon.WrappedScheduledExecutorService] - Scheduled runnable ended with an exception:
java.util.concurrent.RejectedExecutionException: Queue capacity exceeded
at java.util.concurrent.ForkJoinPool$WorkQueue.growArray(ForkJoinPool.java:916) ~[?:?]
at java.util.concurrent.ForkJoinPool$WorkQueue.lockedPush(ForkJoinPool.java:871) ~[?:?]
at java.util.concurrent.ForkJoinPool.externalPush(ForkJoinPool.java:1902) ~[?:?]
at java.util.concurrent.ForkJoinPool.externalSubmit(ForkJoinPool.java:1921) ~[?:?]
at java.util.concurrent.ForkJoinPool.execute(ForkJoinPool.java:2453) ~[?:?]
at java.util.concurrent.CompletableFuture.asyncRunStage(CompletableFuture.java:1750) ~[?:?]
at java.util.concurrent.CompletableFuture.runAsync(CompletableFuture.java:1944) ~[?:?]
at com.qubular.openhab.binding.vicare.internal.VicareDeviceThingHandler.handleCommand(VicareDeviceThingHandler.java:293) ~[?:?]
at com.qubular.openhab.binding.vicare.internal.VicareBridgeHandler.lambda$featurePoller$0(VicareBridgeHandler.java:118) ~[?:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[?:?]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) ~[?:?]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:829) [?:?]
2022-12-29 12:11:17.546 [WARN ] [mmon.WrappedScheduledExecutorService] - Scheduled runnable ended with an exception:
java.lang.OutOfMemoryError: Java heap space
2022-12-29 12:11:21.932 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.io.monitor.internal.metrics.ThingStateMetric@b3ee26' takes more than 5000ms.
2022-12-29 12:11:35.006 [INFO ] [ve.internal.protocol.ZWaveController] - Starting ZWave controller
2022-12-29 12:11:35.010 [INFO ] [ve.internal.protocol.ZWaveController] - ZWave timeout is set to 5000ms. Soft reset is false.
2022-12-29 12:12:05.493 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.io.monitor.internal.EventLogger@12c1a04' takes more than 5000ms.
I had to reboot the RPi and magically the CPU load decreased, the system returned responsive, etc etc.
As it happens frequently (every month or so) I need to understand what is going on as this behaviour in a domotic system is dangerous.
Any one can help?
Something seems to be causing a memory leak. Once the memory is full many strange things happen. I use the systeminfo binding to monitor the memory allocation, both total memory and java heap space. You may be able to guess what’s the cause by removing some items or bindings until the memory leak does not happen anymore.
a memory leak should not happen in the certified bindings activated from inside OH, as they are checked by the developers, right?
So where the leak could happen?
Developers cannot test all possible user’s configuration. I’ve seen in the past several threads about memory leaks for some bindings.
There have been threads reporting problems with zwave (e.g. this one)
I understand, but my OH3 fails sometime after a month, sometimes after 3 months.
How can I check all the bindings with this timeframe?
after restart, I have
- usedHeap=60%-91% (working smoothly)
- available heap=30-120MB
- memory available=189MB
what should be the threshold for an alarm on the usedHeap?
The out of memory also could be collateral damage. If data is received but can’t be processed queues can increase in size and cause the system to run out of memory. At least that is what I’m reading in your logs. It can be binding locks the shared scheduler causing other bindings not be able to process data, or even the binding itself.
In your log I see an unknown binding
com.qubular.openhab.binding.vicare that also reports it’s queue is full. It could be this binding is blocked, either due to an other binding blocking or the binding blocking itself, while still receiving data and eventually causing it to run out of memory. (Or the queue running full is happening in the zwavel binding). A way to check if threads are blocked can be done in the karaf Console :
threads --monitors --locks
thanks for your reply.
Karaf was not accessible unfortunately.
When I tried to login Karaf, there was a timeout.