CPU Load and RAM suddenly increase enormously

Jonas88 · February 24, 2025, 9:06pm

Proxmox LXC, Debian 12 (VM on a Synology NAS) - openHAB 4.3 - also tried Milestone OH 5

Hello!

At the moment I am moving my servers away from Docker / Synology to Proxmox.
I am having mixed success, but openHAB didn’t cause many problems until yesterday at around midnight. The CPU load increased to over 50 % and the RAM maxed out.
Also, the KNX binding stopped working (it failed to connect to the router via tunnel).

I rolled back to a backup and everything worked fine until 60 min ago, when the same behaviour returned and the KNX binding went offline again. However, when I tried Router instead of Bridge it worked, but the CPU load and RAM usage are still very bad.

At that time I was on the Milestone release, but then I tried going back to the stable version 4.3.

The drops in RAM usage correspond to:

Restart
Clear Cache via openhabian
Going back to 4.3

I did some digging and found out that the heap size can cause issues, so here are my current values:

openhab> shell:info
Karaf
  Karaf version               4.4.6
  Karaf home                  /usr/share/openhab/runtime
  Karaf base                  /var/lib/openhab
  OSGi Framework              org.eclipse.osgi-3.18.0.v20220516-2155

JVM
  Java Virtual Machine        OpenJDK 64-Bit Server VM version 21.0.6+7-Debian-1
  Version                     21.0.6
  Vendor                      Debian
  Pid                         5360
  Uptime                      14 minutes
  Process CPU time            10 minutes
  Process CPU load            0.34
  System CPU load             1.00
  Open file descriptors       188
  Max file descriptors        102,642
  Total compile time          4 minutes
Threads
  Live threads                143
  Daemon threads              77
  Peak                        147
  Total started               612
Memory
  Current heap size           1,733,114 kbytes
  Maximum heap size           2,035,712 kbytes
  Committed heap size         1,994,752 kbytes
  Pending objects             0
  Garbage collector           Name = 'G1 Young Generation', Collections = 261, Time = 31.020 seconds
  Garbage collector           Name = 'G1 Concurrent GC', Collections = 90, Time = 5.948 seconds
  Garbage collector           Name = 'G1 Old Generation', Collections = 0, Time = 0.000 seconds
Classes
  Current classes loaded      21,716
  Total classes loaded        21,893
  Total classes unloaded      177
Operating system
  Name                        Linux version 6.8.12-8-pve
  Architecture                amd64
  Processors                  2
  Total physical memory       8,135,388 kbytes
  Free physical memory        672,128 kbytes
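
To put the numbers above into perspective, here is a minimal Python sketch that works out how full the heap is and how much of the uptime went into garbage collection. The helper names are made up for illustration and are not part of openHAB or Karaf. Uptime and process CPU time are only reported to the minute, so the ratios are rough.

```python
import re


def parse_kbytes(line: str) -> int:
    """Extract the kbytes value from a shell:info memory line (hypothetical helper)."""
    match = re.search(r"([\d,]+)\s+kbytes", line)
    if match is None:
        raise ValueError(f"no kbytes value in: {line!r}")
    return int(match.group(1).replace(",", ""))


def parse_gc_seconds(line: str) -> float:
    """Extract the accumulated GC time from a 'Garbage collector' line."""
    match = re.search(r"Time = ([\d.]+) seconds", line)
    if match is None:
        raise ValueError(f"no GC time in: {line!r}")
    return float(match.group(1))


# Values copied from the shell:info output above.
current = parse_kbytes("Current heap size           1,733,114 kbytes")
maximum = parse_kbytes("Maximum heap size           2,035,712 kbytes")
heap_used = current / maximum

gc_lines = [
    "Name = 'G1 Young Generation', Collections = 261, Time = 31.020 seconds",
    "Name = 'G1 Concurrent GC', Collections = 90, Time = 5.948 seconds",
    "Name = 'G1 Old Generation', Collections = 0, Time = 0.000 seconds",
]
gc_total = sum(parse_gc_seconds(line) for line in gc_lines)

uptime_seconds = 14 * 60       # "Uptime 14 minutes", rounded by Karaf
cpu_seconds = 10 * 60          # "Process CPU time 10 minutes", rounded by Karaf
gc_share = gc_total / uptime_seconds
cpu_share = cpu_seconds / uptime_seconds

print(f"heap used: {heap_used:.1%} of the maximum")
print(f"GC time:   {gc_total:.3f} s ({gc_share:.1%} of uptime)")
print(f"CPU time:  {cpu_share:.0%} of one core on average")
```

With these values the heap is roughly 85 % full after only 14 minutes, GC accounts for about 4 % of the uptime, and the process has kept about 71 % of one core busy on average. That suggests the CPU load is not mainly caused by garbage collection, although a heap this full so soon after start is still suspicious.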

I hope someone has an idea what I can try …

Jonas88 · February 26, 2025, 4:00pm

Okay …
I still cannot get it to work …
I see 100 % CPU load and 100 % RAM usage even after a brand new installation of the stable build with openHABian. I tried a Debian 12 and an Ubuntu 24 LXC.

I rolled back to my Synology install (Docker), since I had the same issues on Docker (inside an LXC) as well.

I am feeling pretty bad about not coming any closer to a solution after XX hours of trying.

wborn · February 26, 2025, 5:48pm

If you run top -H it will show the thread names which may help with figuring out what causes the CPU load.

You can also get some nice details about what threads cause the CPU load in the Karaf console using:

ttop --stats=tid,name,state,user_time,cpu_time,user_time_perc,cpu_time_perc --order=cpu_time --millis=300

Executing the threads command on the Karaf console shows what part of the code is being executed by the threads.
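
If neither tool is at hand, roughly the same per-thread view can be read from /proc on Linux. The following is only a sketch under that assumption: the function names are made up for illustration, the sample line is hypothetical, and the kernel truncates thread names to 15 characters, so they may look shorter than in the Karaf console.

```python
from pathlib import Path


def parse_task_stat(stat_line: str) -> tuple[str, int]:
    """Return (thread name, utime + stime in clock ticks) from one stat line."""
    # The name sits in parentheses and may itself contain spaces or ')',
    # so split at the first '(' and the last ')'.
    open_idx = stat_line.index("(")
    close_idx = stat_line.rindex(")")
    name = stat_line[open_idx + 1:close_idx]
    fields = stat_line[close_idx + 2:].split()
    # fields[0] is the state (field 3 in proc(5));
    # utime and stime are fields 14 and 15, i.e. indexes 11 and 12 here.
    return name, int(fields[11]) + int(fields[12])


def busiest_threads(pid: int, limit: int = 10) -> list[tuple[str, int]]:
    """List the threads of a process with the most accumulated CPU time."""
    totals = []
    for stat_file in Path(f"/proc/{pid}/task").glob("*/stat"):
        try:
            totals.append(parse_task_stat(stat_file.read_text()))
        except (OSError, ValueError, IndexError):
            continue  # the thread exited while we were reading it
    return sorted(totals, key=lambda item: item[1], reverse=True)[:limit]


# Hypothetical sample line, only to show what the parser returns.
sample = "5400 (my thread (x)) S 1 5360 5360 0 -1 4194368 10 0 0 0 1234 56 0 0 20 0 143 0 100 0 0"
print(parse_task_stat(sample))
```

Calling busiest_threads(5360) with the Pid from shell:info would list the threads that have used the most CPU since they started. Unlike ttop this is a cumulative total and not a rate, so it is best compared between two runs a few seconds apart.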

Jonas88 · February 26, 2025, 6:27pm

Hello and thanks a lot for your reply!

I fired up the VM again, and after about 3 minutes the RAM usage was growing steadily and the CPU was locked at about 70 - 90 % constantly …

sihui · February 26, 2025, 6:33pm

BTW, I am able to load your Proxmox login page:

Jonas88 · February 26, 2025, 6:34pm

Yes, I am going to fix that with Authelia in the near future. Thanks for the hint!

Jonas88 · February 26, 2025, 6:39pm

By the way: that is the main reason why I am moving away from Synology. The installed reverse proxy locks port 80, so I cannot bind it to Traefik for certificate renewal, and so I cannot use Authelia. Another reason is the lack of IPv6 support for Docker on Synology.

Jonas88 · February 26, 2025, 7:12pm

I guess it was the SLAAC setting in the network settings (I enabled it because the Matter binding wasn’t working), which seems to overload the IPv6 mDNS service … Uptime is only 20 min now, but this was enough before to drive the load and RAM up to 100 % … Thanks for the hint with ttop --stats=tid,name,state,user_time,cpu_time,user_time_perc,cpu_time_perc --order=cpu_time --millis=300, which clearly showed the problem with mDNS.

wborn · February 26, 2025, 7:15pm

Nice! Why does it only have one CPU? Might be a bit restrictive for an application like OH that has lots of threads. On the Synology it probably had more CPUs available?

Jonas88 · February 26, 2025, 7:33pm

Yes. I just gave it one for testing (my production OH environment also runs on Docker on the Synology), and since it went to 100 % every time I restricted it to one core so my system doesn’t freeze completely. I have 4 cores available and allow it 2, which is normally enough. I had to set the network config to static. I tried DHCP but was not able to log into the console from Proxmox. I don’t know if DHCP IPv6 is required for Matter though … my router and DHCP server (UniFi USG) does not support IPv6 at all, I think.

wborn · February 26, 2025, 7:36pm

It is, see:

Jonas88 · February 26, 2025, 7:43pm

Yeah, IPv6 is mandatory. I just don’t know if it will work with a static IPv6 address, since my router doesn’t support IPv6. I’ll give it a try.
I guess so, because I got it to work last week. But even if I don’t, better a stable OH than getting Matter to work.