I migrated my OH 3.4 installation to the latest 4.0 stable version using openHABian. I have a Raspberry Pi 4 with 8 GB of RAM. There have been no other changes in the last 3 months (besides widgets and one page). I still use (and prefer) file-based config where possible.
While OH 3.4 ran quite fast, OH 4.0 is not able to fully start with my massive number of things, items and rules. I have already read through the forum and increased zram by a factor of 3, since RAM (max 8 GB) and storage (max 128 GB) should not be a limit anyhow. This brought only a little improvement.
Before rebooting, the log mostly shows entries like:
2023-10-24 20:35:00.754 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber 'org.openhab.core.io.monitor.internal.EventLogger@182bd8c' takes more than 5000ms.
Some information that might help:
sudo systemctl status openhab
Gives:
● openhab.service - openHAB - empowering the smart home
     Loaded: loaded (/lib/systemd/system/openhab.service; enabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/openhab.service.d
             └─override.conf
     Active: active (running) since Tue 2023-10-24 20:35:38 CEST; 7min ago
       Docs: https://www.openhab.org/docs/
             https://community.openhab.org
   Main PID: 10472 (java)
      Tasks: 343 (limit: 4915)
        CPU: 8min 58.356s
     CGroup: /system.slice/openhab.service
             └─10472 /usr/bin/java -XX:-UsePerfData -Dopenhab.home=/usr/share/openhab -Dopenhab.conf=/etc/openhab -Dopenhab.runtime=/usr/share>
To me it looks like openHAB uses only one core, and that core is fully loaded.
I tried loading one rule file after the other. This works, but loading the rules takes EXTREMELY long!
I also found this post.
As an example, loading the following file takes about 20 seconds before the CPU load goes down again.
rule "Statistik Fahrradschuppen"
when
    Item TuerschlossantriebFahrradschuppen1LOCKSTATE changed
then
    if (BootFinished.state != ON) {
        return;
    }
    if (newState.toString == "LOCKED") {
        Statistik_Fahrradschuppen_Verriegeln.postUpdate((Statistik_Fahrradschuppen_Verriegeln.state as Number) + 1)
    }
    if (newState.toString == "UNLOCKED") {
        Statistik_Fahrradschuppen_Entriegeln.postUpdate((Statistik_Fahrradschuppen_Entriegeln.state as Number) + 1)
    }
end

rule "Statistik Gartenschuppen"
when
    Item TuerschlossantriebGartenschuppen1LOCKSTATE changed
then
    if (BootFinished.state != ON) {
        return;
    }
    if (newState.toString == "LOCKED") {
        Statistik_Gartenschuppen_Verriegeln.postUpdate((Statistik_Gartenschuppen_Verriegeln.state as Number) + 1)
    }
    if (newState.toString == "UNLOCKED") {
        Statistik_Gartenschuppen_Entriegeln.postUpdate((Statistik_Gartenschuppen_Entriegeln.state as Number) + 1)
    }
end

rule "Statistik Garagentor"
when
    Item Garagentor1DOORSTATE changed
then
    if (BootFinished.state != ON) {
        return;
    }
    if (newState.toString == "OPEN") {
        Statistik_Garagentor_Oeffnen.postUpdate((Statistik_Garagentor_Oeffnen.state as Number) + 1)
    }
    if (newState.toString == "CLOSED") {
        Statistik_Garagentor_Schliessen.postUpdate((Statistik_Garagentor_Schliessen.state as Number) + 1)
    }
    if (newState.toString == "VENTILATION_POSITION") {
        Statistik_Garagentor_Lueften.postUpdate((Statistik_Garagentor_Lueften.state as Number) + 1)
    }
end
You have plenty of RAM, so the usual culprit is excluded.
You are using Rules DSL so the known problem with JS Scripting on 32-bit ARM is not the problem.
The fact that OH is pegging the CPU on a single thread is suspicious. I'm pretty certain that Java allows the use of multiple threads (i.e. it will spread execution across multiple CPUs), so the fact that only one core is pegged implies that there is only one thread in OH that has run amok.
First thing I would do is set the logging level to trace level logging and see if you can see something churning over and over again.
If that doesn't show anything, try eliminating add-ons one by one until the behavior stops or you run out of add-ons.
If you do run out of add-ons, review all your rules, maybe even add some logging, to see if you have a single rule that is running amok.
That's about all I can offer given the information available.
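To check whether a single thread really is the one pegging a core, the per-thread CPU counters under /proc can be compared. The following is only a minimal sketch, assuming a Linux system such as openHABian; `busiest_threads` is a hypothetical helper name, not part of openHAB. It prints the accumulated CPU time (in clock ticks) and the thread ID for the busiest threads of a process.

```shell
# Hypothetical helper: list the threads of a process by accumulated CPU time.
# Usage: busiest_threads PID [COUNT]   (COUNT defaults to 5)
busiest_threads() {
    for stat in /proc/"$1"/task/*/stat; do
        tid=${stat%/stat}   # strip the trailing "/stat"
        tid=${tid##*/}      # keep only the thread ID directory name
        # Drop the leading "pid (name) " so that a thread name containing
        # spaces cannot shift the columns. In what remains, utime and stime
        # (fields 14 and 15 of the full stat line) are fields 12 and 13.
        sed 's/^.*) //' "$stat" 2>/dev/null |
            awk -v tid="$tid" '{ print $12 + $13, tid }'
    done | sort -rn | head -n "${2:-5}"
}
```

For the service shown above this would be called as `busiest_threads 10472`. If a JDK with `jstack` is installed, the thread ID converted to hexadecimal should match the `nid=0x...` value in a thread dump, which gives the Java thread name.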
As I wrote - it is loading the rules files. I can repeat this for every rule file I have and I see the same behaviour. I guess it is parsing the rule file and not the rule logic itself.
Everything! Set the logging level of org.openhab to TRACE and everything will get set to TRACE unless otherwise set to something else.
It's always been slow to parse .rules files. And nothing changed in that regard between OH 3 and OH 4. There's nothing to explain why that makes a difference.
I took the chance to set up a completely fresh openHABian 4 system. Things got a lot better; nevertheless, rule parsing in particular is still much slower than on the "old" OH 3. I am kind of lost!
It took some time before I could start analysing. Reinstalling OH 4 completely fresh brought an improvement, and more cores are being used. Nevertheless, the system becomes extremely unstable as soon as I change ANYTHING in a file. This often causes OH to restart itself. The identical files worked fine on OH 3, which is very illogical to me.
I activated TRACE, but it generates log files so long that they are not readable any more. What is a good way to continue here?
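One way to make an oversized TRACE log digestible is to count the lines per logger first and only then read the entries of the noisiest ones. This is a rough sketch, assuming the log layout of the WARN line quoted earlier in this thread (timestamp, level in brackets, logger name in brackets); `count_by_logger` is a hypothetical helper name.

```shell
# Hypothetical helper: count log lines per logger name, busiest first.
# Assumes lines shaped like:
#   2023-10-24 20:35:00.754 [WARN ] [logger.name] - message
# Continuation lines (e.g. stack traces) do not start with a digit and
# are skipped.
count_by_logger() {
    sed -n 's/^[0-9][^[]*\[[^]]*\] \[\([^]]*\)\].*/\1/p' "$1" |
        sed 's/ *$//' |          # logger names may be padded with spaces
        sort | uniq -c | sort -rn
}
```

On a package-based install the file is usually /var/log/openhab/openhab.log (an assumption; adjust the path if yours differs), so `count_by_logger /var/log/openhab/openhab.log | head` would show the ten loggers that produce the most lines.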
It could really depend on many things. Can you compare the number of threads started by OH 3 vs OH 4? Maybe some of the bindings changed their threading model.
As far as I remember, there were some optimizations to the filesystem watches to limit resource consumption.
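The thread count of a running process can be read from /proc, which makes the OH 3 vs OH 4 comparison a matter of running the same command on both systems. A minimal sketch, assuming Linux; `thread_count` is a hypothetical helper name.

```shell
# Hypothetical helper: number of threads currently alive in a process.
# Each thread appears as one directory under /proc/PID/task.
thread_count() {
    ls "/proc/$1/task" | wc -l
}
```

With systemd the PID can be looked up in the same step, e.g. `thread_count "$(systemctl show -p MainPID --value openhab)"`. The `Tasks:` line in the `systemctl status openhab` output above (343 in that run) counts the tasks of the whole service cgroup.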
When I activated the console, I got this but openhab has NOT restarted! By the way: I have a lot of debug output when opening openhabian-config even though I never activated this and use a fresh setup!?
When system performance is bad, meaning OH stops responding, I cannot even connect to the OH console. Is this normal in such a case, or does it indicate something?
In most cases, a JVM process becomes unresponsive because of a memory leak. Most of the CPU time is then spent on memory management and not on actual program execution. However, you should then see GC threads in the listing.
You can look at the command line options of the Java process (cat /proc/$PID/cmdline), and you can check /etc/default/openhab and the setenv and oh2_dir_layout scripts within your local installation.
Anything like -Xmx or similar is an explicit setting of a memory limit. If you haven't changed that, then you are running on the defaults, which were fine until you switched to 4.0.
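The arguments in /proc/$PID/cmdline are separated by NUL bytes, so a plain `cat` prints them run together. The sketch below splits them onto separate lines and keeps only the options that look like explicit heap or RAM limits. `jvm_memory_flags` is a hypothetical helper name, and the pattern is an assumption about which flags count as "-Xmx or similar".

```shell
# Hypothetical helper: print explicit JVM memory settings found in a
# cmdline file. Prints nothing (and returns non-zero) if none are set.
# Usage: jvm_memory_flags /proc/PID/cmdline
jvm_memory_flags() {
    tr '\0' '\n' < "$1" |
        grep -E -- '^-Xm[sx]|^-XX:.*(Heap|RAM)'
}
```

For the process shown above this would be `jvm_memory_flags /proc/10472/cmdline`. An empty result suggests the JVM is running with its default heap sizing.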
@Matthias_Kaufmann I'm facing very similar issues. Did you make any progress in the meantime? I'll try to debug this using JMX. Let's hope we'll find the culprit soon…