openHAB 3 runs out of memory / java heap space errors, CPU 100%+ after a few hours

Are those DSL scripts created via the UI or from files? I seem to have similar problems with those from the UI…

The UI. And I am quickly realizing there is a potentially huge problem with this. These two DSL scripts, composed in the UI, took 4 to 7 seconds to execute. During those 4-7 seconds, the CPU would be over 100%. And after several hours of normal operation (scripts running every 5-10 minutes), the whole thing would just keel over dead, out of memory and with the CPU pegged.

The exact same operation performed by an ECMA script composed in the UI executes so fast the indicator doesn’t even change to running. This has been running for hours now, and honestly this is the best I’ve ever seen my install of OH3 operate. Everything it does is instant and nothing is bogging down.

What this problem is or why it exists is well over my head, but process of elimination here seems to suggest it exists.

You posted only parts of the rules. Which triggers are you using, and with what settings?

That’s what I also saw. In my case the simplest rules from the UI in DSL take up to 10s with high CPU load, but not necessarily high memory usage. I reduced how often these rules run, but I think I only increased the time between the crashes.

Those are scripts, in their entirety. There are separate rules that trigger them. Those triggers are either item state changes, or an every 5 minutes cron.

I’m experiencing the same problem. I created a new topic, but I think it could be merged with this one.
[OH3] high cpu load, unresponsive OH

Since I moved all my DSL rules back into files, my CPU load looks stable and extremely low! I think that did the trick in my case!

(same rules, same triggers)

A git issue should be opened for this if one does not exist yet.

I thought I had my problems tackled, but I still have a high CPU load.
Can someone run a

shell:threads --list

in the karaf console and see what are the top CPU time consumers? For me DirWatcher and RuleRefresher are suspiciously high!
I use file-only DSL rules, but I also use jython with the (fixed) helper libraries…
When watching the event stream in the debug sidebar, I sometimes see rules get reloaded even though I did not touch the configuration…
I also still see some org.eclipse.x bundles when I do bundle:list -s. Shouldn’t they be gone?
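
If the thread list is long, sorting it by CPU time makes the top consumers easier to spot. Below is a rough Python sketch, not an official tool. It assumes the console output has been pasted into a string and that the columns come in the order Id, Name, State, CPU time, Usr time, separated by `|` or `│`; check that against your own console output first. The function name `top_cpu_threads` and the numbers in `SAMPLE` are made up for illustration and are not real measurements.

```python
import re


def top_cpu_threads(listing: str, n: int = 5):
    """Return up to n (thread name, CPU time) pairs, highest CPU time first.

    Assumes each data row looks like: Id | Name | State | CPU time | Usr time
    """
    rows = []
    for line in listing.splitlines():
        # Split on either the ASCII pipe or the box-drawing column separator.
        cells = [c.strip() for c in re.split(r"[|│]", line)]
        # Skip headers, rulers, and anything that is not a numeric data row.
        if len(cells) < 5 or not cells[0].isdigit() or not cells[3].isdigit():
            continue
        rows.append((cells[1], int(cells[3])))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:n]


# Invented sample data in the assumed layout (illustration only).
SAMPLE = """\
Id | Name          | State         | CPU time | Usr time
---+---------------+---------------+----------+---------
1  | main          | WAITING       | 1200     | 1000
42 | DirWatcher    | TIMED_WAITING | 98000    | 90000
43 | RuleRefresher | TIMED_WAITING | 76000    | 70000
"""

if __name__ == "__main__":
    for name, cpu in top_cpu_threads(SAMPLE, n=2):
        print(f"{cpu:>8}  {name}")
```

Running it twice a few minutes apart and comparing the values shows which threads are actually accumulating CPU time, rather than which ones have simply been alive the longest.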

I tried one more thing and it looks like it helped:

sudo apt purge openhab2

Followed by a reboot.
I upgraded with the openhabian config tool, but somehow there were still leftovers… Now it’s running like a charm.

For me this doesn’t apply unfortunately, because I run OH3 in a Docker container, so I have a “clean” environment.

In my case the RXTXPortMonitor is the highest, but none of the threads are suspiciously high.

@Pedals2Paddles: Did you already create a git issue?

I’m observing a similar issue: after roughly 24 hours of running openHAB 3 in a Docker container (clean install, set up from scratch), the VM throws an out-of-memory error, and it becomes slow beforehand. I suspect heavy rule execution or the exec binding at the moment, and will disable those one by one to hopefully narrow it down a bit…

Same question: have you set up DSL rules via the UI? If so, put them in files like you did in OH2. That did the trick in my case, and since then I have extremely low CPU load and everything is stable.

Please post a link to the git issue so we can play along at home.
Thanks

I don’t get memory leaks, but I have a similar situation where dispatching events takes a very long time:

dispatching event to subscriber ‘org.openhab.core.internal.items.ItemUpdater@1e6b6a0’ takes more than 5000ms

I am updating all items to the semantic model, and it gets slower and slower…

openHAB then loses all network connections, and no network Things can be reached. After a reboot everything is fine.
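
To see how often this warning appears and which subscriber it names, the log can be scanned with a few lines of Python. This is only a sketch built around the message quoted above. The function name `count_slow_subscribers` is hypothetical, and the regular expression assumes the wording stays the same apart from letter case and quote style.

```python
import re
from collections import Counter

# Matches the warning quoted above; accepts straight or curly quotes.
SLOW_DISPATCH = re.compile(
    r"dispatching event to subscriber ['‘’]?(?P<sub>[^'‘’\s]+)['‘’]? "
    r"takes more than (?P<ms>\d+)ms",
    re.IGNORECASE,
)


def count_slow_subscribers(log_lines):
    """Count slow-dispatch warnings per subscriber class."""
    counts = Counter()
    for line in log_lines:
        match = SLOW_DISPATCH.search(line)
        if match:
            # Drop the "@1e6b6a0"-style instance suffix so that warnings
            # from different instances of the same class are grouped.
            counts[match.group("sub").split("@")[0]] += 1
    return counts


SAMPLE_LINES = [
    "dispatching event to subscriber ‘org.openhab.core.internal.items."
    "ItemUpdater@1e6b6a0’ takes more than 5000ms",
    "Dispatching event to subscriber 'org.openhab.core.internal.items."
    "ItemUpdater@1e6b6a0' takes more than 5000ms.",
    "some unrelated log line",
]

if __name__ == "__main__":
    for subscriber, hits in count_slow_subscribers(SAMPLE_LINES).most_common():
        print(hits, subscriber)
```

To run it against a real log, pass an open file object instead of `SAMPLE_LINES`. The log location depends on the install; for a Docker setup it is whatever path the userdata volume is mapped to.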

That actually fixed it for me. Up and running for 4 days now without issues. I only moved 2 frequently triggered rules back to files, and that seems to do the trick.

That problem is solved in the current snapshots!

What specifically has been fixed?