openHAB 4.0.x rules need a lot of memory

Wikibear · August 15, 2023, 3:47pm

OK, this is what htop shows, and I don’t have access to the UI. The website doesn’t open.

Before this happens:

The VM Thread is gone from the htop list and everything works fine… after 5 to 10 min…

But I don’t have access to the UI and get a blank white page.

And this comes up in my log:

2023-08-15 17:46:35.022 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.internal.items.ItemUpdater@18ffa70' takes more than 5000ms.
2023-08-15 17:46:35.846 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.io.monitor.internal.EventLogger@605d08' takes more than 5000ms.
2023-08-15 17:46:39.804 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.thing.internal.CommunicationManager@109b371' takes more than 5000ms.
2023-08-15 17:46:39.964 [WARN ] [io.openhabcloud.internal.CloudClient] - Socket.IO disconnected: ping timeout
2023-08-15 17:46:40.462 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = cc...cf, base URL = http://localhost:8080)
2023-08-15 17:46:41.529 [WARN ] [ore.internal.scheduler.SchedulerImpl] - Scheduled job '<unknown>' failed and stopped
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@511aa7[Not completed, task = java.util.concurrent.Executors$RunnableAdapter@1ac6e7a[Wrapped task = org.openhab.core.automation.internal.TriggerHandlerCallbackImpl$TriggerData@19fcaf9]] rejected from java.util.concurrent.ScheduledThreadPoolExecutor@188e080[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1]
	at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2065) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:833) ~[?:?]
	at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:340) ~[?:?]
	at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:562) ~[?:?]
	at java.util.concurrent.ScheduledThreadPoolExecutor.submit(ScheduledThreadPoolExecutor.java:715) ~[?:?]
	at java.util.concurrent.Executors$DelegatedExecutorService.submit(Executors.java:748) ~[?:?]
	at org.openhab.core.automation.internal.TriggerHandlerCallbackImpl.triggered(TriggerHandlerCallbackImpl.java:57) ~[?:?]
	at org.openhab.core.automation.internal.module.handler.GenericCronTriggerHandler.run(GenericCronTriggerHandler.java:91) ~[?:?]
	at org.openhab.core.internal.scheduler.CronSchedulerImpl.lambda$0(CronSchedulerImpl.java:62) ~[?:?]
	at org.openhab.core.internal.scheduler.CronSchedulerImpl.lambda$1(CronSchedulerImpl.java:69) ~[?:?]
	at org.openhab.core.internal.scheduler.SchedulerImpl.lambda$12(SchedulerImpl.java:189) ~[?:?]
	at org.openhab.core.internal.scheduler.SchedulerImpl.lambda$1(SchedulerImpl.java:88) ~[?:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[?:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
	at java.lang.Thread.run(Thread.java:833) ~[?:?]
2023-08-15 17:46:51.245 [INFO ] [.reconnect.PeriodicReconnectStrategy] - Try to restore connection to '192.168.179.65'. Next attempt in 60000ms
2023-08-15 17:46:57.154 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service (UUID = cc...cf, base URL = http://localhost:8080)
2023-08-15 17:46:59.461 [INFO ] [.transport.mqtt.MqttBrokerConnection] - Starting MQTT broker connection to '192.168.179.65' with clientid OpenHAB
2023-08-15 17:47:06.172 [WARN ] [p.internal.http.HttpResponseListener] - Requesting 'https://www.luftfahrtclubbraunschweig.de/tools/data/taf.txt' (method='GET', content='null') failed: java.util.concurrent.TimeoutException: Total timeout 3000 ms elapsed
2023-08-15 17:47:22.777 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.internal.items.ItemUpdater@18ffa70' takes more than 5000ms.
2023-08-15 17:47:25.334 [WARN ] [er.internal.dto.DwdWarningDataAccess] - Communication error occurred while getting data: java.util.concurrent.TimeoutException: Total timeout 5000 ms elapsed
2023-08-15 17:47:34.271 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.internal.items.ItemUpdater@18ffa70' takes more than 5000ms.
2023-08-15 17:47:44.431 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.io.monitor.internal.EventLogger@605d08' takes more than 5000ms.
2023-08-15 17:47:57.386 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.internal.items.ItemUpdater@18ffa70' takes more than 5000ms.
2023-08-15 17:47:59.285 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.thing.internal.CommunicationManager@109b371' takes more than 5000ms.
2023-08-15 17:48:17.561 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.thing.internal.CommunicationManager@109b371' takes more than 5000ms.

After this the UI is working again.

And during this time I got this in my syslog:

Aug 15 17:46:36 openhabian karaf[12065]: Exception in thread "Timer-442" org.eclipse.jetty.websocket.api.WebSocketException: Session closed
Aug 15 17:46:39 openhabian karaf[12065]: #011at org.eclipse.jetty.websocket.common.WebSocketSession.outgoingFrame(WebSocketSession.java:350)
Aug 15 17:46:39 openhabian karaf[12065]: #011at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.uncheckedSendFrame(WebSocketRemoteEndpoint.java:322)
Aug 15 17:46:39 openhabian karaf[12065]: #011at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.blockingWrite(WebSocketRemoteEndpoint.java:109)
Aug 15 17:46:39 openhabian karaf[12065]: #011at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.sendBytes(WebSocketRemoteEndpoint.java:260)
Aug 15 17:46:39 openhabian karaf[12065]: #011at org.smarthomej.binding.amazonechocontrol.internal.websocket.AlexaWebSocket.sendMessage(AlexaWebSocket.java:162)
Aug 15 17:46:39 openhabian karaf[12065]: #011at org.smarthomej.binding.amazonechocontrol.internal.websocket.AlexaWebSocket.sendPing(AlexaWebSocket.java:175)
Aug 15 17:46:39 openhabian karaf[12065]: #011at org.smarthomej.binding.amazonechocontrol.internal.websocket.WebSocketConnection$1.run(WebSocketConnection.java:186)
Aug 15 17:46:39 openhabian karaf[12065]: #011at java.base/java.util.TimerThread.mainLoop(Timer.java:566)
Aug 15 17:46:39 openhabian karaf[12065]: #011at java.base/java.util.TimerThread.run(Timer.java:516)
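
The repeated "takes more than 5000ms" warnings in the excerpt above can be tallied to see when the stalls cluster and which subscribers are affected. The following is only a minimal sketch: the regular expression is written against the log lines quoted in this thread, and the function name `count_slow_dispatches` is made up for illustration.

```python
import re
from collections import Counter

# Pattern derived from the warnings quoted in this thread (an assumption,
# other openHAB versions may format the line differently).
SLOW_DISPATCH = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2}\.\d{3} \[WARN ?\] .*"
    r"Dispatching event to subscriber '([\w.$]+)@[0-9a-f]+' takes more than"
)


def count_slow_dispatches(lines):
    """Count slow-dispatch warnings per (minute, subscriber class name)."""
    counts = Counter()
    for line in lines:
        match = SLOW_DISPATCH.match(line)
        if match:
            minute, subscriber = match.groups()
            # Keep only the class name, e.g. "ItemUpdater".
            counts[(minute, subscriber.rsplit(".", 1)[-1])] += 1
    return counts


if __name__ == "__main__":
    # Two lines copied from the log excerpt above.
    sample = [
        "2023-08-15 17:46:35.022 [WARN ] [ab.core.internal.events.EventHandler] - "
        "Dispatching event to subscriber "
        "'org.openhab.core.internal.items.ItemUpdater@18ffa70' takes more than 5000ms.",
        "2023-08-15 17:47:22.777 [WARN ] [ab.core.internal.events.EventHandler] - "
        "Dispatching event to subscriber "
        "'org.openhab.core.internal.items.ItemUpdater@18ffa70' takes more than 5000ms.",
    ]
    for (minute, subscriber), n in sorted(count_slow_dispatches(sample).items()):
        print(minute, subscriber, n)
```

To run it against a real log, pass the lines of the log file to `count_slow_dispatches`, for example via `open(path)`; the path depends on the installation.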

Wikibear · August 17, 2023, 7:03pm

Upgraded to OH4.0.2-1:

VM Thread stays in second position and nothing is working. CPU is low, but I can’t access the UI. That happens when I develop a Blockly rule or, as just now, when I had uninstalled a binding and tried to add a new one. That takes 5-10 min and then everything is fine. So why does VM Thread block other processes?

2023-08-17 21:08:33.167 [WARN ] [io.openhabcloud.internal.CloudClient] - Socket.IO disconnected: ping timeout
2023-08-17 21:08:37.799 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.internal.items.ItemUpdater@1ef6d96' takes more than 5000ms.
2023-08-17 21:08:40.373 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = cc...cf, base URL = http://localhost:8080)
2023-08-17 21:08:48.459 [INFO ] [.reconnect.PeriodicReconnectStrategy] - Try to restore connection to '192.168.179.65'. Next attempt in 60000ms
2023-08-17 21:08:49.806 [INFO ] [.transport.mqtt.MqttBrokerConnection] - Starting MQTT broker connection to '192.168.179.65' with clientid OpenHAB
2023-08-17 21:08:50.231 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service (UUID = cc...cf, base URL = http://localhost:8080)
2023-08-17 21:09:00.010 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.internal.items.ItemUpdater@1ef6d96' takes more than 5000ms.
2023-08-17 21:09:03.731 [WARN ] [p.internal.http.HttpResponseListener] - Requesting 'https://www.luftfahrtclubbraunschweig.de/tools/data/metar.txt' (method='GET', content='null') failed: java.util.concurrent.TimeoutException: Total timeout 3000 ms elapsed
2023-08-17 21:09:10.729 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.io.monitor.internal.EventLogger@51a1c2' takes more than 5000ms.
2023-08-17 21:09:13.820 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.internal.items.ItemUpdater@1ef6d96' takes more than 5000ms.
2023-08-17 21:09:21.262 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.thing.internal.CommunicationManager@147b736' takes more than 5000ms.
2023-08-17 21:09:50.555 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.internal.items.ItemUpdater@1ef6d96' takes more than 5000ms.

rlkoshak · August 17, 2023, 8:05pm

One thing I notice in all your htop screenshots is that your memory usage is really high and, more importantly, your swap usage is pretty high as well. I wonder if you simply do not have enough resources.

I’m guessing this is an RPi 3? With the additional memory requirement for Java 17 maybe an RPi 3 is no longer sufficient to run openHAB.

All things considered, you shouldn’t be seeing any swap in use, let alone almost half. Whenever a process starts using swap, you can expect your machine to slow down to a crawl.
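
As a rough way to put a number on the swap usage that htop displays, the sketch below reads `/proc/meminfo` and reports the share of swap in use. It assumes a Linux system with `/proc` mounted; the helper names `parse_meminfo` and `swap_used_percent` are made up for this example.

```python
from pathlib import Path


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: value}, sizes in kB."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def swap_used_percent(meminfo):
    """Return the share of swap in use, or 0.0 when no swap is configured."""
    total = meminfo.get("SwapTotal", 0)
    if total == 0:
        return 0.0
    return 100.0 * (total - meminfo.get("SwapFree", 0)) / total


if __name__ == "__main__":
    path = Path("/proc/meminfo")
    if path.exists():  # only present on Linux
        info = parse_meminfo(path.read_text())
        print(f"MemAvailable: {info.get('MemAvailable', 0) // 1024} MiB")
        print(f"Swap in use:  {swap_used_percent(info):.1f} %")
```

A value that stays well above zero while the UI is unresponsive would be consistent with the swapping theory, although it does not prove it.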

Andrew_Rowe · August 17, 2023, 10:43pm

That is what I’m thinking, the little Pi is just getting swamped.
I have an RPi4 with 4GB and I remember OH3 ran much slower on it than on my little i3 PC. Once set up, it was sufficient for a small OH setup, but you had to take your time setting it up. If you tried to do too much, too fast, it would lock up and you had to walk away. OH4 definitely requires more resources than OH3.

wborn · August 17, 2023, 10:57pm

Yes it is probably swapping all the time because your load is high while the CPU is not 100%. You could use iotop to see what it is actually doing.
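
Besides iotop, the kernel's own counters can show whether pages are actually moving to and from swap. The sketch below samples `pswpin` and `pswpout` from `/proc/vmstat` twice and prints the rate, together with the load average. It assumes a Linux kernel that exposes these counters; the 2-second interval and the function names are arbitrary choices for illustration.

```python
import time
from pathlib import Path


def parse_vmstat(text):
    """Parse /proc/vmstat-style 'name value' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def swap_rates(before, after, seconds):
    """Pages swapped in and out per second between two samples."""
    swapped_in = after.get("pswpin", 0) - before.get("pswpin", 0)
    swapped_out = after.get("pswpout", 0) - before.get("pswpout", 0)
    return swapped_in / seconds, swapped_out / seconds


if __name__ == "__main__":
    vmstat = Path("/proc/vmstat")
    if vmstat.exists():  # only present on Linux
        interval = 2  # seconds between the two samples
        first = parse_vmstat(vmstat.read_text())
        time.sleep(interval)
        second = parse_vmstat(vmstat.read_text())
        rate_in, rate_out = swap_rates(first, second, interval)
        load = Path("/proc/loadavg").read_text().split()[:3]
        print(f"load average: {' '.join(load)}")
        print(f"swap-in:  {rate_in:.1f} pages/s")
        print(f"swap-out: {rate_out:.1f} pages/s")
```

High load combined with sustained non-zero swap rates and a mostly idle CPU would match the behaviour described above.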

Wikibear · August 18, 2023, 5:12am

I’m using an SSD. For me the Pi is fast enough, and it seems that web requests via the UI are faster now than before. We don’t have an issue with speed, only with Blockly rule execution when I develop a rule.

If I don’t touch the system, everything looks good.

Will check what iotop shows me.

On the other hand, the community must then know that a Raspberry Pi 3 doesn’t work with OH4. I think that’s really bad news for quite a few people.

wborn · August 18, 2023, 6:01am

Maybe it works better with a different memory config, no ZRAM and running fewer other applications. E.g. if you also run InfluxDB and Grafana on it, try running these on another system.

Wikibear · August 18, 2023, 6:46am

I use a Raspberry Pi because it’s lightweight and doesn’t need much power. If I have to run something on another system, the Raspberry Pi solution is dead and I have to buy new hardware. Maybe a Raspberry Pi 4 with 2GB RAM will be an option then. But Raspberry Pis are expensive at the moment. That’s one reason to stay on the 3. The other one is that the Raspberry Pi 3 is fine, with no hardware issues. I only buy new devices if the old ones aren’t working and can’t be repaired easily. I think green here, sorry. Electricity is also expensive in Germany. I don’t know the prices in other countries. But I think green and will do everything to keep an eye on our environment.

Why ZRAM with an SSD? I don’t use ZRAM.

It would be great to know which memory settings I could improve.

Do you have an idea?

Thanks.

hmerk · August 18, 2023, 7:12am

Raspi 4 with 4Gig memory has dropped below 100€ again. Bought one at amazon a couple of weeks ago. Got it for around 80€……

8Gig version is just over 100€….

Lolodomo · August 18, 2023, 7:23am

OH4 with a RPI 3 works with no problem in my case

Wikibear · August 18, 2023, 7:26am

Yes, it doesn’t make any sense to discuss that here. You can get used old desktop devices with more power for the same amount of money. So it’s not worth discussing a Raspi 4 for 80€. Sorry… I’ll start thinking about it when it’s around 40-50€. In the end, we are discussing buying new hardware here because OH4 isn’t running stable. I think everybody must know that a Raspberry Pi 3 is no longer usable for a stable openHAB system. And people out there must also know that BEFORE they upgrade to OH4, and the OH team must give support to these people…

@Lolodomo nice to hear that. Are you running other things on your system, or only OH4?

I use Zigbee2MQTT, Grafana and InfluxDB. And everything was fine on startup. I see that 300-400 MB RAM is in use and then the RAM usage increases. And everything was fine after the upgrade.

VM Thread comes up and after that nothing is working. So what is VM Thread doing?

I think there is something wrong with the settings, or maybe something wrong with OH or the OH bindings.

wborn · August 18, 2023, 8:32am

I moved away from using an RPi3 because I ran into memory issues years ago while running OH2, Grafana and InfluxDB together. Running just OH2 would probably have worked fine. The memory issues I had were mostly due to all the historic data I didn’t want to dispose of, which was kept in memory caches by InfluxDB and Grafana.

Wikibear · August 18, 2023, 8:40am

All OOMs in the past were caused by a bug or an issue with a binding or OH itself. And everything was fine.
In the end, are we discussing a hardware-related problem here? Or is it an issue or a misconfiguration? Or a bug?

Oliver2 · August 18, 2023, 12:21pm

Did you do some lightweight debugging yourself to narrow down where the problem might come from?

for instance you could:

disable one binding after another
disable Zigbee2MQTT, Grafana and InfluxDB one after the other
check that your rules which include a timer are set up correctly and that no timers are going “mad”. You could disable all rules, restart OH, and enable one rule after the other

If you see no dramatic increase/decrease from one step to another (and allow a couple of minutes between each step) then most likely your total configuration exceeds your PI’s capabilities.
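
To make the increase or decrease between steps visible, the resident memory of the openHAB Java process can be logged at a fixed interval while bindings or rules are switched off one by one. This is a minimal sketch, assuming Linux and that the PID of the process is already known (for example from htop); `rss_kib` and `log_rss` are hypothetical helper names.

```python
import time
from pathlib import Path


def rss_kib(status_text):
    """Extract the resident set size in KiB from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None  # kernel threads have no VmRSS line


def log_rss(pid, interval=60, samples=10):
    """Print a timestamped RSS reading for `pid` every `interval` seconds."""
    status = Path(f"/proc/{pid}/status")
    for _ in range(samples):
        rss = rss_kib(status.read_text())
        stamp = time.strftime("%H:%M:%S")
        if rss is None:
            print(f"{stamp}  no VmRSS entry for PID {pid}")
        else:
            print(f"{stamp}  RSS = {rss / 1024:.0f} MiB")
        time.sleep(interval)
```

Calling `log_rss(pid)` with the PID of the Java process before and after each change gives a simple timeline to compare against; the default of one reading per minute follows the "couple of minutes between each step" suggestion only loosely.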

Wikibear · August 18, 2023, 12:41pm

Yes, I can do that when I have more free time. Sorry, but I have already invested 30h or more in fixing things after this update.
Zigbee, Grafana and Influx aren’t the problem; as you can see, this comes from OH, so that doesn’t make any sense.

Why should timers be the problem? I never had a problem with any timer before. Has something changed with timers so that this can happen?

As I wrote before, it’s running and then something must have happened. Maybe I can check in the CPU view which rule was started last, so that I can check whether this is a rule problem. Sometimes after some hours I get something like this:

2023-08-18 10:48:19.767 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.thing.internal.CommunicationManager@4f72c9' takes more than 5000ms.
2023-08-18 10:48:29.496 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.internal.items.ItemUpdater@1838cf6' takes more than 5000ms.
2023-08-18 10:48:29.580 [WARN ] [io.openhabcloud.internal.CloudClient] - Socket.IO disconnected: ping timeout
2023-08-18 10:48:31.542 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = cc...cf, base URL = http://localhost:8080)
2023-08-18 10:48:37.698 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service (UUID = cc...cf, base URL = http://localhost:8080)
2023-08-18 10:48:47.239 [WARN ] [ing.mqtt.internal.action.MQTTActions] - MQTT publish to openhab/lvr/temp failed!
2023-08-18 10:48:47.239 [WARN ] [ing.mqtt.internal.action.MQTTActions] - MQTT publish to openhab/weather/outertemp failed!
2023-08-18 10:48:47.246 [WARN ] [ing.mqtt.internal.action.MQTTActions] - MQTT publish to openhab/weather/outerhum failed!
2023-08-18 10:48:47.267 [WARN ] [ing.mqtt.internal.action.MQTTActions] - MQTT publish to openhab/lvr/wzhtr failed!
2023-08-18 10:48:47.264 [WARN ] [ing.mqtt.internal.action.MQTTActions] - MQTT publish to openhab/lvr/hum failed!
2023-08-18 10:48:47.326 [WARN ] [ing.mqtt.internal.action.MQTTActions] - MQTT publish to openhab/lvr/sumtemp failed!
2023-08-18 10:48:47.377 [WARN ] [ing.mqtt.internal.action.MQTTActions] - MQTT publish to openhab/lvr/co2 failed!
2023-08-18 10:48:47.414 [WARN ] [ing.mqtt.internal.action.MQTTActions] - MQTT publish to openhab/weather/press failed!
2023-08-18 10:48:47.426 [WARN ] [ing.mqtt.internal.action.MQTTActions] - MQTT publish to openhab/weather/uv failed!
2023-08-18 10:48:47.445 [WARN ] [ing.mqtt.internal.action.MQTTActions] - MQTT publish to openhab/weather/uv failed!
2023-08-18 10:48:47.520 [WARN ] [ing.mqtt.internal.action.MQTTActions] - MQTT publish to openhab/lvr/dfw failed!
2023-08-18 10:48:47.534 [WARN ] [ing.mqtt.internal.action.MQTTActions] - MQTT publish to openhab/lvr/dfe failed!
2023-08-18 10:48:47.586 [WARN ] [ing.mqtt.internal.action.MQTTActions] - MQTT publish to openhab/weather/winddir failed!
2023-08-18 10:48:47.642 [WARN ] [ing.mqtt.internal.action.MQTTActions] - MQTT publish to openhab/weather/windspeed failed!
2023-08-18 10:48:48.900 [INFO ] [.reconnect.PeriodicReconnectStrategy] - Try to restore connection to '192.168.179.65'. Next attempt in 60000ms
2023-08-18 10:48:50.906 [INFO ] [.transport.mqtt.MqttBrokerConnection] - Starting MQTT broker connection to '192.168.179.65' with clientid OpenHAB

If there is no other way to debug it directly, then I will do no more. I’m not interested in sitting here for hours and hours to find this issue if there is no tool that helps to find it, sorry. Then that’s my problem and I must live with it.

Maybe there is someone who has the same problem as me.

Oliver2 · August 18, 2023, 1:10pm

Your log shows you the symptoms of your problem, not the root cause.
As you saw in the other thread where other users had a problem with CPU being at 100%, it was not obvious to anyone in the beginning that the problem was a bug in JDK17 (which needs to be installed for OH4).

Maybe your problem has a similar cause, which we (including YOURSELF) need to find out. Maybe there is no problem and we will find that, due to the changes in OH4 and the new system requirements (mainly JDK17 and buster), a Pi with 4GB is the new minimum requirement, especially when using MQTT and Grafana. Who knows - we need to find out.

The majority of work in the beginning is on your side whereas the experts have the majority of work towards the end (bugfixing). What I want to say is that everybody has to contribute a little bit of work especially when there are problems with a new version.

You already got a few recommendations from experts in this thread above. Please go ahead and try to narrow it down so that the experts here have a starting point.

You said you do not have time to start analysing YOUR problem? But you already had time to write posts about your problem the last 7 days… (no offense - just realising)

Oliver2 · August 18, 2023, 1:15pm

I’d just like to add my memory consumption, which is almost the same as on your system:

My system: 16.2% of 4GB = ~600MB
Your system: 56% of 1GB = ~600MB
At least we can say that either you and I both have no problem, or we have the same problem
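
The rounding in the two figures above can be checked with a few lines. This sketch assumes the percentages refer to the full nominal RAM size (4 GiB and 1 GiB); the usable memory reported by htop is usually somewhat lower, so the results are approximate.

```python
def used_mib(percent, total_gib):
    """Convert a memory share in percent of `total_gib` GiB into MiB."""
    return percent / 100 * total_gib * 1024


print(round(used_mib(16.2, 4)))  # 664
print(round(used_mib(56, 1)))    # 573
```

Both values land within roughly 50 MiB of the quoted ~600MB, so the comparison holds as an approximation.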

Would be interesting to know how much memory OH3.4 consumed.

Wikibear · August 18, 2023, 1:33pm

I respect everyone who spends time developing, and I respect everyone who tries to help me fix this, and I’m really glad that openHAB exists! That isn’t the point. You don’t see me here often, but in the German group on Facebook I try to help many people when I can. If you don’t see me here in the front row, that doesn’t mean that I’m a parasite who only criticises the system! But if you tell me to sit here and switch off all rules (90 rules) and bindings (maybe 15, but 120+ things) step by step for an event that doesn’t happen often, but has happened twice or more, then sorry, I’m out! How many days would I need for this? Months? Meanwhile OH4.1 is out…

I’m not saying it’s your problem; I’m saying exactly that it’s my problem, and I can only hope that someone else runs into the same trouble, too. If I decide that this is riding a horse through hell and it doesn’t make sense to me to invest more time in something that nobody can do, you must also respect me.

BTW OH3 and OH4 look very similar in RAM consumption… If I remember right, I was also at around 700 MB… But with OH3 I also had a ZWAVE system with the ZWAVE.me solution running and everything was OK. OH3 also ran with Z2M, ZWAVE, Grafana, Influx and an apache2 web server… Everything was nice. But now I’m in trouble with OH4. In the meantime I decided to kill ZWAVE and switch over to Z2M completely. With OH4 there is more RAM free because I don’t use ZWAVE anymore.

Ah, yeah, an MQTT broker is running as well, of course…

Oliver2 · August 18, 2023, 1:42pm

All good Swen. You misunderstood - I am not complaining that you are not contributing in general. I am not complaining at all

I described one potential approach. But I would be interested to see how your system behaves if you disable all rules, which can be done easily through MainUI in one minute. Do you have access to MainUI at least shortly after a restart?

When upgrading from 3.4 to 4, did you follow the upgrade instructions so that the upgrade script was able to run?

Wikibear · August 18, 2023, 1:58pm

Thanks for your help and understanding.

Yes, sure. After the update to 4.0.2-1 it looks more stable than before. But it would be nice to also find the hiccup that occurs when I remove bindings and reinstall them. The DenonMarantz problem that I saw in the syslog is fixed now as well, but openHAB shows the same issue after a while.

Now openHAB is running: