Openhab 4.0.x rules need a lot of memory

Wikibear · August 13, 2023, 3:21pm

Thanks Guys, will give a try. If i lost the Denon Binding it was a little bit bad.

Wikibear · August 14, 2023, 4:00pm

I don’t touch the system after setting java opts to 768m. Looks a little bit more stable today at first. I saw some rules errors and i try to fix. The rule i try to fix goes to running after edit and stay there. Nothing happens… It needs 5-10min that Openhab maybe recover by itself. Check Syslog and get this:

Aug 14 17:55:07 openhabian karaf[4024]: Exception in thread "Timer-522" org.eclipse.jetty.websocket.api.WebSocketException: Session closed
Aug 14 17:55:15 openhabian karaf[4024]: #011at org.eclipse.jetty.websocket.common.WebSocketSession.outgoingFrame(WebSocketSession.java:350)
Aug 14 17:55:15 openhabian karaf[4024]: #011at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.uncheckedSendFrame(WebSocketRemoteEndpoint.java:322)
Aug 14 17:55:15 openhabian karaf[4024]: #011at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.blockingWrite(WebSocketRemoteEndpoint.java:109)
Aug 14 17:55:16 openhabian karaf[4024]: #011at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.sendBytes(WebSocketRemoteEndpoint.java:260)
Aug 14 17:55:16 openhabian karaf[4024]: #011at org.smarthomej.binding.amazonechocontrol.internal.websocket.AlexaWebSocket.sendMessage(AlexaWebSocket.java:162)
Aug 14 17:55:16 openhabian karaf[4024]: #011at org.smarthomej.binding.amazonechocontrol.internal.websocket.AlexaWebSocket.sendPing(AlexaWebSocket.java:175)
Aug 14 17:55:16 openhabian karaf[4024]: #011at org.smarthomej.binding.amazonechocontrol.internal.websocket.WebSocketConnection$1.run(WebSocketConnection.java:186)
Aug 14 17:55:16 openhabian karaf[4024]: #011at java.base/java.util.TimerThread.mainLoop(Timer.java:566)
Aug 14 17:55:16 openhabian karaf[4024]: #011at java.base/java.util.TimerThread.run(Timer.java:516)

wborn · August 14, 2023, 4:30pm

If you still have CPU issues with the fix, configure htop to show custom thread names so it is more clear what is consuming your CPU, see:

You can also do this with top as described in the FAQ:

openHAB 4 migration FAQ

Example

top:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
11216 openhab   20   0 1318172 472096   4528 S 256.2  47.4   4740:45 java

top -H -p 11216:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
25409 openhab   20   0 1318172 471856   4528 R  99.7  47.4  93:36.83 upnp-main-queue
30546 openhab   20   0 1318172 471856   4528 R  99.7  47.4  15:19.09 safeCall-queue

Wikibear · August 15, 2023, 10:17am

That’s strange… OH stuck i don’t get acces via http to basic ui and so many different process are open and CPU isn’t really high…

Edit: Wondering that there so many rules will be showed with htop but not running… But the running time of some rules confuses me really. If a rule runs 25-50 sec so what happens if you have 2,3 or more rules they are running in same time.

rlkoshak · August 15, 2023, 3:04pm

Each rule has it’s very own thread. That thread will only be shown as running when the rule is actively executing. If you have 2+ rules running at the same time, each thread should be shown running independently.

Wikibear · August 15, 2023, 3:47pm

OK this say htop and i don’t have access to UI. Website doesn’t open.

Before this happens:

VM Thread is gone from htop list and everything works fine… After 5 to 10 min…

But don’t have access to ui and get a blank white page.

And this comes up in my log:

2023-08-15 17:46:35.022 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.internal.items.ItemUpdater@18ffa70' takes more than 5000ms.
2023-08-15 17:46:35.846 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.io.monitor.internal.EventLogger@605d08' takes more than 5000ms.
2023-08-15 17:46:39.804 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.thing.internal.CommunicationManager@109b371' takes more than 5000ms.
2023-08-15 17:46:39.964 [WARN ] [io.openhabcloud.internal.CloudClient] - Socket.IO disconnected: ping timeout
2023-08-15 17:46:40.462 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = cc...cf, base URL = http://localhost:8080)
2023-08-15 17:46:41.529 [WARN ] [ore.internal.scheduler.SchedulerImpl] - Scheduled job '<unknown>' failed and stopped
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@511aa7[Not completed, task = java.util.concurrent.Executors$RunnableAdapter@1ac6e7a[Wrapped task = org.openhab.core.automation.internal.TriggerHandlerCallbackImpl$TriggerData@19fcaf9]] rejected from java.util.concurrent.ScheduledThreadPoolExecutor@188e080[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1]
	at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2065) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:833) ~[?:?]
	at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:340) ~[?:?]
	at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:562) ~[?:?]
	at java.util.concurrent.ScheduledThreadPoolExecutor.submit(ScheduledThreadPoolExecutor.java:715) ~[?:?]
	at java.util.concurrent.Executors$DelegatedExecutorService.submit(Executors.java:748) ~[?:?]
	at org.openhab.core.automation.internal.TriggerHandlerCallbackImpl.triggered(TriggerHandlerCallbackImpl.java:57) ~[?:?]
	at org.openhab.core.automation.internal.module.handler.GenericCronTriggerHandler.run(GenericCronTriggerHandler.java:91) ~[?:?]
	at org.openhab.core.internal.scheduler.CronSchedulerImpl.lambda$0(CronSchedulerImpl.java:62) ~[?:?]
	at org.openhab.core.internal.scheduler.CronSchedulerImpl.lambda$1(CronSchedulerImpl.java:69) ~[?:?]
	at org.openhab.core.internal.scheduler.SchedulerImpl.lambda$12(SchedulerImpl.java:189) ~[?:?]
	at org.openhab.core.internal.scheduler.SchedulerImpl.lambda$1(SchedulerImpl.java:88) ~[?:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[?:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
	at java.lang.Thread.run(Thread.java:833) ~[?:?]
2023-08-15 17:46:51.245 [INFO ] [.reconnect.PeriodicReconnectStrategy] - Try to restore connection to '192.168.179.65'. Next attempt in 60000ms
2023-08-15 17:46:57.154 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service (UUID = cc...cf, base URL = http://localhost:8080)
2023-08-15 17:46:59.461 [INFO ] [.transport.mqtt.MqttBrokerConnection] - Starting MQTT broker connection to '192.168.179.65' with clientid OpenHAB
2023-08-15 17:47:06.172 [WARN ] [p.internal.http.HttpResponseListener] - Requesting 'https://www.luftfahrtclubbraunschweig.de/tools/data/taf.txt' (method='GET', content='null') failed: java.util.concurrent.TimeoutException: Total timeout 3000 ms elapsed
2023-08-15 17:47:22.777 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.internal.items.ItemUpdater@18ffa70' takes more than 5000ms.
2023-08-15 17:47:25.334 [WARN ] [er.internal.dto.DwdWarningDataAccess] - Communication error occurred while getting data: java.util.concurrent.TimeoutException: Total timeout 5000 ms elapsed
2023-08-15 17:47:34.271 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.internal.items.ItemUpdater@18ffa70' takes more than 5000ms.
2023-08-15 17:47:44.431 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.io.monitor.internal.EventLogger@605d08' takes more than 5000ms.
2023-08-15 17:47:57.386 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.internal.items.ItemUpdater@18ffa70' takes more than 5000ms.
2023-08-15 17:47:59.285 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.thing.internal.CommunicationManager@109b371' takes more than 5000ms.
2023-08-15 17:48:17.561 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.thing.internal.CommunicationManager@109b371' takes more than 5000ms.

After this UI is working again.

And in this time i got that in my syslog:

Aug 15 17:46:36 openhabian karaf[12065]: Exception in thread "Timer-442" org.eclipse.jetty.websocket.api.WebSocketException: Session closed
Aug 15 17:46:39 openhabian karaf[12065]: #011at org.eclipse.jetty.websocket.common.WebSocketSession.outgoingFrame(WebSocketSession.java:350)
Aug 15 17:46:39 openhabian karaf[12065]: #011at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.uncheckedSendFrame(WebSocketRemoteEndpoint.java:322)
Aug 15 17:46:39 openhabian karaf[12065]: #011at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.blockingWrite(WebSocketRemoteEndpoint.java:109)
Aug 15 17:46:39 openhabian karaf[12065]: #011at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.sendBytes(WebSocketRemoteEndpoint.java:260)
Aug 15 17:46:39 openhabian karaf[12065]: #011at org.smarthomej.binding.amazonechocontrol.internal.websocket.AlexaWebSocket.sendMessage(AlexaWebSocket.java:162)
Aug 15 17:46:39 openhabian karaf[12065]: #011at org.smarthomej.binding.amazonechocontrol.internal.websocket.AlexaWebSocket.sendPing(AlexaWebSocket.java:175)
Aug 15 17:46:39 openhabian karaf[12065]: #011at org.smarthomej.binding.amazonechocontrol.internal.websocket.WebSocketConnection$1.run(WebSocketConnection.java:186)
Aug 15 17:46:39 openhabian karaf[12065]: #011at java.base/java.util.TimerThread.mainLoop(Timer.java:566)
Aug 15 17:46:39 openhabian karaf[12065]: #011at java.base/java.util.TimerThread.run(Timer.java:516)

Wikibear · August 17, 2023, 7:03pm

Upgraded to OH4.0.2-1:

VM Thread stay at second position and nothing is working. CPU is low. But I can’t access UI. That happens if I develop a blockly rule or at this moment where I had uninstalled a binding and try to add a new binding. That takes 5-10min and then everything is fine. So why blocked VM Thread other processes?

2023-08-17 21:08:33.167 [WARN ] [io.openhabcloud.internal.CloudClient] - Socket.IO disconnected: ping timeout
2023-08-17 21:08:37.799 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.internal.items.ItemUpdater@1ef6d96' takes more than 5000ms.
2023-08-17 21:08:40.373 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = cc...cf, base URL = http://localhost:8080)
2023-08-17 21:08:48.459 [INFO ] [.reconnect.PeriodicReconnectStrategy] - Try to restore connection to '192.168.179.65'. Next attempt in 60000ms
2023-08-17 21:08:49.806 [INFO ] [.transport.mqtt.MqttBrokerConnection] - Starting MQTT broker connection to '192.168.179.65' with clientid OpenHAB
2023-08-17 21:08:50.231 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service (UUID = cc...cf, base URL = http://localhost:8080)
2023-08-17 21:09:00.010 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.internal.items.ItemUpdater@1ef6d96' takes more than 5000ms.
2023-08-17 21:09:03.731 [WARN ] [p.internal.http.HttpResponseListener] - Requesting 'https://www.luftfahrtclubbraunschweig.de/tools/data/metar.txt' (method='GET', content='null') failed: java.util.concurrent.TimeoutException: Total timeout 3000 ms elapsed
2023-08-17 21:09:10.729 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.io.monitor.internal.EventLogger@51a1c2' takes more than 5000ms.
2023-08-17 21:09:13.820 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.internal.items.ItemUpdater@1ef6d96' takes more than 5000ms.
2023-08-17 21:09:21.262 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.thing.internal.CommunicationManager@147b736' takes more than 5000ms.
2023-08-17 21:09:50.555 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.internal.items.ItemUpdater@1ef6d96' takes more than 5000ms.

rlkoshak · August 17, 2023, 8:05pm

One thing I notice in all your htop screenshots is that your memory usage is really high and, more importantly, your swap usage is pretty high as well. I wonder if you simply do not have enough resources.

I’m guessing this an an RPi 3? With the additional memory requirement for Java 17 maybe an RPi 3 is no longer sufficient to run openHAB.

All things considered, you shouldn’t be seeing any swap in use, let along almost half. When ever a process starts using swap, you can expect your machine to slow down to a crawl.

Andrew_Rowe · August 17, 2023, 10:43pm

That is what I’m thinking, the little Pi is just getting swamped
I have a RPi4 with 4GBs and I remember OH3 ran much slower on it then my little I3 PC. Once set up, it was sufficient for a small OH set up but you had to take your time setting it up. If you tried to do to much, to fast, it would lock up and you had to walk away. OH4 definitely requires more resources then OH3.

wborn · August 17, 2023, 10:57pm

Yes it is probably swapping all the time because your load is high while the CPU is not 100%. You could use iotop to see what it is actually doing.

Wikibear · August 18, 2023, 5:12am

I’m using a SSD. For me the pi is fast enough and it seems that web request via UI are faster now as just before. We don’t have an issue with speed. Only with Blockly rules execution when i develop a rule.

If i don’t touch the system, everything looks good.

Will check what iotop shows me.

On the other side the community must know than, that a raspberry pi 3 doesn’t work with OH4. I thing that’s really bad news for the one or other then.

wborn · August 18, 2023, 6:01am

Maybe it works better with a different memory config, no ZRAM and running fewer other applications. Eg. if you also run InfluxDB and Grafana on it try running these on another system.

Wikibear · August 18, 2023, 6:46am

I use a raspberry pi, cause it’s light weight and don’t need much power. If I must run something on another system the raspberry pi solution can die and I must buy new hardware. Maybe a rasp 4 will be an option then with 2GB RAM. But Raspis are expensive at the moment. That’s one reason to stay on 3. The other one is that the raspberry 3 is fine without hardware issue. I will only buy devices if there aren’t working and you can’t repair it easy. I think green here, sorry. Electricity is also expensive in Germany. Dunno the prices in other Country’s. But I think green and will do everything to have an eye to our environment.

Why ZRAM with SSD? I don’t use ZRAM.

Would be great to know which settings I can do better for memory settings.

Do you have an idea?

Thanks.

hmerk · August 18, 2023, 7:12am

Raspi 4 with 4Gig memory has dropped below 100€ again. Bought one at amazon a couple of weeks ago. Got it for around 80€……

8Gig version is just over 100€….

Lolodomo · August 18, 2023, 7:23am

OH4 with a RPI 3 works with no problem in my case

Wikibear · August 18, 2023, 7:26am

Yes that doesn’t make any sense to discuss that here. You get used old desktop devices with more power and same amount of money. So no worth to discuss about raspi 4 for 80€. Sorry… I start thinking about it, if its round about 40-50€. At the end, we discuss here to buy a new hardware, 'cause OH4 isn’t running stable. I think that must everybody know that a raspberry 3 isn’t usable for Openhab any more to get a stable system. And that must the guys outside of the world out know also BEFORE they upgrade to OH4 and the guys from OH must give support to this guys…

@Lolodomo nice to hear that. Are you using other things on your system only OH4?

I use Zigbee2MQTT, Grafana and InfluxDB. And everything was fine on startup. I see that 3-400 MB RAM is in use and then the ram increasing. And everything was fine after upgrade.

VM Thread comes up and after that nothing is working. So what is VM Thread doing?

I thing there is something wrong with settings or there is something wrong with OH or OH Bindings maybe.

wborn · August 18, 2023, 8:32am

I moved away from using an RPi3 because I ran into memory issues years ago while running OH2, Grafana, InfluxDB together. Running just OH2 would probably have worked fine. All the memory issues I had were mostly due to all the historic data I didn’t want to dispose and which was kept in memory caches by InfluxDB and Grafana.

Wikibear · August 18, 2023, 8:40am

All OOMs in past was a bug or an issue with binding or OH itself in the past. And everything was fine.
At the end, we discuss here hardware related problem? Or is that an issue or missconfig? Or Bug?

Oliver2 · August 18, 2023, 12:21pm

Did you do some lightweight debugging yourself to narrow down where the problem might come from?

for instance you could:

disable one binding after another
disable Zigbee2MQTT, Grafana and InfluxDB one after the other
check your rules which include a timer if all is set correctly and no timers going “mad”. You could disable all rules, restart OH, and enable one rule after the other

If you see no dramatic increase/decrease from one step to another (and allow a couple of minutes between each step) then most likely your total configuration exceeds your PI’s capabilities.

Wikibear · August 18, 2023, 12:41pm

Yes I can do, if I have more free time. Sorry but invest now after this update 30h or more for fixing.
Zigbee Grafana and Influx isn’t the problem, as you can see that this comes from OH don’t make any sens.

Why should be timer as problem? I never had before a problem with any timer. Is something change with timers that this can happen?

As i wrote before it’s running and something must be happend. Maybe i can check in CPU which last rule will started that I can check if this is a rule problem. Sometimes after some hours i get something like that:

2023-08-18 10:48:19.767 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.thing.internal.CommunicationManager@4f72c9' takes more than 5000ms.
2023-08-18 10:48:29.496 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.internal.items.ItemUpdater@1838cf6' takes more than 5000ms.
2023-08-18 10:48:29.580 [WARN ] [io.openhabcloud.internal.CloudClient] - Socket.IO disconnected: ping timeout
2023-08-18 10:48:31.542 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = cc...cf, base URL = http://localhost:8080)
2023-08-18 10:48:37.698 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service (UUID = cc...cf, base URL = http://localhost:8080)
2023-08-18 10:48:47.239 [WARN ] [ing.mqtt.internal.action.MQTTActions] - MQTT publish to openhab/lvr/temp failed!
2023-08-18 10:48:47.239 [WARN ] [ing.mqtt.internal.action.MQTTActions] - MQTT publish to openhab/weather/outertemp failed!
2023-08-18 10:48:47.246 [WARN ] [ing.mqtt.internal.action.MQTTActions] - MQTT publish to openhab/weather/outerhum failed!
2023-08-18 10:48:47.267 [WARN ] [ing.mqtt.internal.action.MQTTActions] - MQTT publish to openhab/lvr/wzhtr failed!
2023-08-18 10:48:47.264 [WARN ] [ing.mqtt.internal.action.MQTTActions] - MQTT publish to openhab/lvr/hum failed!
2023-08-18 10:48:47.326 [WARN ] [ing.mqtt.internal.action.MQTTActions] - MQTT publish to openhab/lvr/sumtemp failed!
2023-08-18 10:48:47.377 [WARN ] [ing.mqtt.internal.action.MQTTActions] - MQTT publish to openhab/lvr/co2 failed!
2023-08-18 10:48:47.414 [WARN ] [ing.mqtt.internal.action.MQTTActions] - MQTT publish to openhab/weather/press failed!
2023-08-18 10:48:47.426 [WARN ] [ing.mqtt.internal.action.MQTTActions] - MQTT publish to openhab/weather/uv failed!
2023-08-18 10:48:47.445 [WARN ] [ing.mqtt.internal.action.MQTTActions] - MQTT publish to openhab/weather/uv failed!
2023-08-18 10:48:47.520 [WARN ] [ing.mqtt.internal.action.MQTTActions] - MQTT publish to openhab/lvr/dfw failed!
2023-08-18 10:48:47.534 [WARN ] [ing.mqtt.internal.action.MQTTActions] - MQTT publish to openhab/lvr/dfe failed!
2023-08-18 10:48:47.586 [WARN ] [ing.mqtt.internal.action.MQTTActions] - MQTT publish to openhab/weather/winddir failed!
2023-08-18 10:48:47.642 [WARN ] [ing.mqtt.internal.action.MQTTActions] - MQTT publish to openhab/weather/windspeed failed!
2023-08-18 10:48:48.900 [INFO ] [.reconnect.PeriodicReconnectStrategy] - Try to restore connection to '192.168.179.65'. Next attempt in 60000ms
2023-08-18 10:48:50.906 [INFO ] [.transport.mqtt.MqttBrokerConnection] - Starting MQTT broker connection to '192.168.179.65' with clientid OpenHAB

If there is no other solution to debug it directly, then I will do not more. I’m not interested to sit here hours for hours to find this issue if there is no other tool that support to find this issue, sorry. Then that’s my problem and I must live with it.

Maybe there is someone who has the same problems like me.