Very High Memory Usage and frequent restarts with OH4

It did not, so I edited the file.

Thanks for your help.

I’m getting an error with this change:

openhabian@openhabian-spring4:~ $ journalctl -u openhab
-- Journal begins at Tue 2023-07-25 13:14:20 CDT, ends at Tue 2023-07-25 13:45:39 CDT. --
Jul 25 13:37:02 openhabian-spring4 systemd[1]: Started openHAB - empowering the smart home.
Jul 25 13:37:04 openhabian-spring4 karaf[885]: Error: Could not find or load main class “-Xms192m
Jul 25 13:37:04 openhabian-spring4 karaf[885]: Caused by: java.lang.ClassNotFoundException: “-Xms192m
Jul 25 13:37:04 openhabian-spring4 systemd[1]: openhab.service: Main process exited, code=exited, status=1/FAILURE
Jul 25 13:37:04 openhabian-spring4 systemd[1]: openhab.service: Failed with result 'exit-code'.
Jul 25 13:37:04 openhabian-spring4 systemd[1]: openhab.service: Consumed 1.394s CPU time.
Jul 25 13:37:10 openhabian-spring4 systemd[1]: openhab.service: Scheduled restart job, restart counter is at 1.
Jul 25 13:37:10 openhabian-spring4 systemd[1]: Stopped openHAB - empowering the smart home.
Jul 25 13:37:10 openhabian-spring4 systemd[1]: openhab.service: Consumed 1.394s CPU time.
Jul 25 13:37:10 openhabian-spring4 systemd[1]: Started openHAB - empowering the smart home.
Jul 25 13:37:11 openhabian-spring4 karaf[1444]: Error: Could not find or load main class “-Xms192m
Jul 25 13:37:11 openhabian-spring4 karaf[1444]: Caused by: java.lang.ClassNotFoundException: “-Xms192m
Jul 25 13:37:11 openhabian-spring4 systemd[1]: openhab.service: Main process exited, code=exited, status=1/FAILURE
Jul 25 13:37:11 openhabian-spring4 systemd[1]: openhab.service: Failed with result 'exit-code'.
Jul 25 13:37:11 openhabian-spring4 systemd[1]: openhab.service: Consumed 1.281s CPU time.
Jul 25 13:37:16 openhabian-spring4 systemd[1]: openhab.service: Scheduled restart job, restart counter is at 2.
Jul 25 13:37:16 openhabian-spring4 systemd[1]: Stopped openHAB - empowering the smart home.
Jul 25 13:37:16 openhabian-spring4 systemd[1]: openhab.service: Consumed 1.281s CPU time.

From my /etc/default/openhab file

#########################
## JAVA OPTIONS
## Additional options for the JAVA_OPTS environment variable.
## These will be appended to the execution of the openHAB Java runtime in front of all other options.
## 
## A couple of independent examples:
##   EXTRA_JAVA_OPTS="-Dgnu.io.rxtx.SerialPorts=/dev/ttyZWAVE:/dev/ttyUSB0:/dev/ttyS0:/dev/ttyS2:/dev/ttyACM0:/dev/ttyAMA0"
##   EXTRA_JAVA_OPTS="-Djna.library.path=/lib/arm-linux-gnueabihf/ -Duser.timezone=Europe/Berlin -Dgnu.io.rxtx.SerialPorts=/dev/ttyZWave"

# EXTRA_JAVA_OPTS="-XX:+ExitOnOutOfMemoryError"
EXTRA_JAVA_OPTS=“-Xms192m -Xmx768m”

bad quotes ?

I copied and pasted. I will try typing the quotes anew.

I was close. I tried retyping the space thinking it might not be the right space, but that didn’t help.
But the quotes I had copied were curly open-and-close quotes. I replaced them with plain double quotes.
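For anyone else hitting the same error, the fixed line with plain ASCII quotes (same values as before) reads:

EXTRA_JAVA_OPTS="-Xms192m -Xmx768m"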

With that change, openHAB started without error. I will try to remember to report back on how it impacts openHAB performance.

Thanks.


Wow. Performance is vastly improved.

If I understand this correctly, the removal of -XX:+ExitOnOutOfMemoryError means that openHAB will keep running even when java.lang.OutOfMemoryError occurs? This could possibly be problematic on a production system, because it could be left in an unstable state.

Before upgrading to 4.0 I noticed that my system restarted once in a while, so I started wondering if my Raspberry Pi was getting too hot and just crashed, but then found this:

openhabian@openhabian:~ $ journalctl -u openhab
-- Journal begins at Mon 2023-06-26 12:33:31 CEST, ends at Sun 2023-07-02 14:22:04 CEST. --
Jul 02 14:20:23 openhabian karaf[2087]: Terminating due to java.lang.OutOfMemoryError: Java heap space
Jul 02 14:20:23 openhabian systemd[1]: openhab.service: Main process exited, code=exited, status=3/NOTIMPLEMENTED
Jul 02 14:20:23 openhabian systemd[1]: openhab.service: Failed with result 'exit-code'.
Jul 02 14:20:23 openhabian systemd[1]: openhab.service: Consumed 2d 11h 1min 19.345s CPU time.
Jul 02 14:20:28 openhabian systemd[1]: openhab.service: Scheduled restart job, restart counter is at 6.
Jul 02 14:20:29 openhabian systemd[1]: Stopped openHAB - empowering the smart home.
Jul 02 14:20:29 openhabian systemd[1]: openhab.service: Consumed 2d 11h 1min 19.345s CPU time.
Jul 02 14:20:29 openhabian systemd[1]: Started openHAB - empowering the smart home.

So even though such undesired restarts are not really nice and somewhat risky, it might be even worse not to restart?

No. Unless you have a memory leak, OH will stop growing at some stage, and you should let it.
Feel free to re-add the exitOnOOM option on your box, but in general, no.

I’m sorry, but I don’t understand. A few things:

  • What do you mean by I should let OH stop growing? And how is this related to ExitOnOutOfMemoryError?
  • In case of a memory leak, for a production system there’s not much to do except restart and get another stable period until it happens again…?
  • In case of no memory leak, but still hitting the limit, this would probably still lead to instability, so why is it better to keep running?

For a development or test system, I fully agree that a restart would not be preferable because it could hide problems. I just don’t quite follow what benefits you see in keeping a production system running after encountering out-of-memory errors.

I mean you should let it grow; the growth will come to an end.

That’s unrelated to ExitOnOutOfMemoryError but related to the -Xmx value (the memory limit, which determines how quickly we end up with a memory error).

If there’s a memory leak, then the component that contains it (a binding, most of the time) would need to be identified and temporarily removed while waiting for someone to fix the OH code.
Restarting is not a fix; it’s a hack at best.
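For what it’s worth, a suspect binding can usually be stopped temporarily from the openHAB (Karaf) console without uninstalling it. A rough sketch, with zwave used purely as a placeholder for whatever binding you suspect:

ssh -p 8101 openhab@localhost          # openHAB console, default password is habopen
openhab> bundle:list | grep -i zwave    # find the bundle ID of the suspected binding
openhab> bundle:stop <ID>               # stop that bundle while you observe memory usage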

It’s a philosophy question whether the Java process should restart or return OOM. I tend to prefer to let it restart, as then it’ll start working again for at least some time, while you cannot know what the OH code will do when a memory request gets denied with OOM.
So keeping the Exit… option is maybe the better choice, but increase -Xmx.

In that case you should be increasing the limit so it would not ever crash again.

Restarting or increasing memory limits are not fixes.

On this point we agree.

I realized, too, when upgrading to OH 4.0.1, that the container is constantly restarting (thanks to iCloud reporting the connection by email).
I then stepped back to OH 3.4.4; unfortunately, the same constant restarting now happens on 3.4.4 too.
I am pretty sure this was not happening before.
For my containers of different OH versions, I have different mount points on my host system.
Prior to upgrading, I always stop the running old-version container, copy the data to the new version’s location, and start the new-version OH container.
Therefore there is for sure no side effect from my having started OH 4.0.1.
Though one thing that I did before starting the OH 4.0.1 container is that I upgraded the host system to the newest version via apt-get update/upgrade.
Lastly, I also started my oldest version, 3.4.2, which I had run for about half a year, and got the same effect: it runs for a while and then drops dead without any notice in the log.

Any idea how to get more information and debug? I am a little desperate.


'
++ cmp /openhab/userdata/etc/version.properties /openhab/dist/userdata/etc/version.properties
+ '[' '!' -z ']'
+ chown -R openhab:openhab /openhab
+ sync
+ '[' -d /etc/cont-init.d ']'
++ wc -l
++ ls '/usr/bin/s6-*'
+ '[' 0 == 0 ']'
++ find /etc/cont-init.d -type f
++ grep -v '~'
++ sort
+ sync
+ '[' false == false ']'
++ IFS=' '
++ echo gosu openhab tini -s ./start.sh
+ '[' 'gosu openhab tini -s ./start.sh' == 'gosu openhab tini -s ./start.sh' ']'
+ command=($@ server)
+ exec gosu openhab tini -s ./start.sh server
Launching the openHAB runtime...
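Not an openHAB-specific answer, but a few generic Docker commands usually reveal why a container keeps dying (the container name openhab is an assumption, substitute your own):

# restart count, last exit code, and whether the kernel OOM killer hit the container
docker inspect --format 'restarts={{.RestartCount}} exit={{.State.ExitCode}} oomkilled={{.State.OOMKilled}}' openhab
# the last log lines written before the previous crash
docker logs --tail 200 openhab
# host kernel log, in case the OOM killer terminated the container process
dmesg -T | grep -iE 'oom|killed process'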

It’s a philosophy question whether the Java process should restart or handle OOM when whatever limit is hit.
I tend to prefer to let it restart, as then it’ll start working again for at least some time, while you cannot know what the OH code will do when a memory request gets denied with OOM. If it keeps running, that’s probably coincidence; I wouldn’t think it’s prepared to handle that correctly in all places.

I added -XX:+ExitOnOutOfMemoryError back to openHABian. So for people reading this: keep this option when it’s in your system. Just make sure you increase the -Xmx value for OH4 to have more time before Java runs OOM and restarts.
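To double-check which options the running process actually picked up after editing /etc/default/openhab, something like this should work (assuming the standard systemd service name):

# list the -X options of the running openHAB JVM
ps -o args= -p "$(systemctl show -p MainPID --value openhab)" | tr ' ' '\n' | grep -E '^-X'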


I don’t know if it is related, but I have had two episodes since I migrated to OH4 where OH was about halfway dead. Some bindings still worked and their things were online and rules would run. Other bindings stopped working completely and their things were offline. Restarting openHAB (systemctl restart openhab on my Pi) fixed everything. There wasn’t anything in the openhab or events logs about it.

I will put the EXTRA_JAVA_OPTS exit statement back in /etc/default/openhab. I assume it doesn’t matter whether it comes before or after the -Xms line.

Since I’ve set up monitoring for the Java heap space, I can also see that it is increasing; for now there are no errors in the logs.
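For reference, one simple way to sample the heap from the host is jstat (a sketch, assuming a full JDK is installed and openHAB runs as the usual systemd service):

PID=$(systemctl show -p MainPID --value openhab)
# print heap and GC statistics for the running JVM every 5 seconds
sudo -u openhab jstat -gc "$PID" 5000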

The memory usage of my Pi 4 (4 GB) is not getting higher than 40%.

Do I also need to add this -XX:ExitOnOutOfMemoryError to the following line?
EXTRA_JAVA_OPTS="-Xms192m -Xmx768m"

Are these values (192 & 768) OK on every system, or do I need to adjust them for my Pi 4?

If you want it to apply, then yes.
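So a combined line in /etc/default/openhab could look something like this (the values are just carried over from above, not a recommendation):

EXTRA_JAVA_OPTS="-XX:+ExitOnOutOfMemoryError -Xms192m -Xmx768m"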

And the other values? If I understand correctly, -Xmx768m means that Java can use 768 MB? But my Pi is only using 1.6 GB of its 4 GB.

You can increase it to a higher value, but too much memory isn’t good either: it will result in fewer but more ‘intense’ rounds of garbage collection (with a risk of performance dips). The optimum probably depends on the number of add-ons and rules. I use 1 GB.
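If you want data before picking a value, the JVM’s unified GC logging can show how often and how long collections run (Java 11/17 -Xlog syntax; the log path and the 1 GB heap are only examples):

EXTRA_JAVA_OPTS="-Xms192m -Xmx1024m -Xlog:gc*:file=/var/log/openhab/gc.log:time,uptime:filecount=5,filesize=10m"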

On a 4 GB box you should not need to set it.

I’ve now been monitoring the Java heap size for a month, and my Pi reboots every 7 days.
In the 12-24 hours before the restart I notice that rules are no longer executed or data is no longer stored in InfluxDB.

The memory of my Pi is not getting higher than 40%

The problem is that I don’t know Java at all, so I don’t know how to troubleshoot or fix this.
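One low-effort way to capture evidence without knowing Java is to let the JVM write a heap dump when it runs out of memory; someone who does know Java can open the dump later in a tool like Eclipse MAT. A sketch (the dump path is only an example, and the openhab user needs write access plus enough free disk space there):

EXTRA_JAVA_OPTS="-Xms192m -Xmx768m -XX:+ExitOnOutOfMemoryError -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/lib/openhab/heapdump.hprof"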
