How to find the cause of an OutOfMemoryError in OH2?

Hi,

Apparently my OH2 was down, even though I was sure I had left it up and running yesterday. A quick check:

pi@openhab:/etc/openhab2 $ sudo systemctl status openhab2.service
● openhab2.service - openHAB 2 - empowering the smart home
   Loaded: loaded (/usr/lib/systemd/system/openhab2.service; disabled)
   Active: active (running) since Sat 2017-03-11 19:59:11 CET; 23h ago
     Docs: http://docs.openhab.org
           https://community.openhab.org
 Main PID: 4964 (karaf)
   CGroup: /system.slice/openhab2.service
           ├─4964 /bin/bash /usr/share/openhab2/runtime/bin/karaf server
           └─5121 /usr/bin/java -Dopenhab.home=/usr/share/openhab2 -Dopenhab.conf=/etc/openhab2 -Dopenhab.runtime=/usr/share/openhab2/runtime -Dopenhab.userdata=/var/l...

Mar 12 12:16:49 openhab start.sh[4964]: Exception in thread "qtp18811918-6453" java.lang.OutOfMemoryError: Java heap space
Mar 12 12:18:09 openhab start.sh[4964]: SLF4J: Failed toString() invocation on an object of type [org.eclipse.jetty.util.thread.QueuedThreadPool]
Mar 12 12:18:30 openhab start.sh[4964]: Exception in thread "qtp18811918-6454" java.lang.OutOfMemoryError: Java heap space
Mar 12 12:22:15 openhab start.sh[4964]: SLF4J: Failed toString() invocation on an object of type [org.eclipse.jetty.util.thread.QueuedThreadPool]
Mar 12 12:22:46 openhab start.sh[4964]: java.lang.OutOfMemoryError: Java heap space
Mar 12 12:25:53 openhab start.sh[4964]: Exception in thread "ServletModel-9-103" java.lang.OutOfMemoryError: Java heap space
Mar 12 12:29:32 openhab start.sh[4964]: Exception in thread "qtp18811918-6452" SLF4J: Failed toString() invocation on an object of type [org.eclipse.jetty.ut...hreadPool]
Mar 12 12:29:32 openhab start.sh[4964]: java.lang.OutOfMemoryError: Java heap space
Mar 12 12:29:32 openhab start.sh[4964]: java.lang.OutOfMemoryError: Java heap space
Mar 12 12:31:05 openhab start.sh[4964]: Exception in thread "HttpClient@12598806-6412" java.lang.OutOfMemoryError: Java heap space
Hint: Some lines were ellipsized, use -l to show in full.

The log file looks like a battlefield. There is almost no way to find the first error. And even if I could, the first OOM is not necessarily connected to the root cause.

So: how do I find the memory leak?

The last change was a successful update via apt-get after two months or so. Therefore I simply suspect a bug. What can I do to track it down?

Best
Michael

If you are a Java programmer, hook up a profiler and see which part of the system has a memory leak.

If not, you can remove your addons one at a time until the OOM error goes away. Then you will know which addon is causing the problem. If removing none of them helps, you know the bug is in the core.

I’ve seen a couple of other OOM errors posted to this forum over the past couple of weeks so there may be a bug people are running into.

Another possibility: use the JVM -XX:+HeapDumpOnOutOfMemoryError command line option (you must edit the openHAB startup script). More information is here. The article also describes several other ways to manually request a heap dump of a running Java process. Depending on the operating system and your scripting skills, you could externally monitor the process memory usage and trigger a heap dump based on some criteria.

After the OH2 JSR223 feature is merged and released, you’ll be able to write a script to monitor the JVM memory usage and trigger a heap dump using JMX MBeans. I’ve used this technique to diagnose a Sonos-related OOM in OH1. I then used the Eclipse Memory Analyzer for the heap analysis.
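
To illustrate the JMX idea in plain Java (this is not the JSR223 script mentioned above, only a rough sketch): the standard MemoryMXBean reports heap usage, and the HotSpot-specific HotSpotDiagnosticMXBean can write an .hprof file. The class name HeapWatch, the 0.9 threshold, the 10 second interval and the dump file name are all assumptions made up for this example. The code watches the JVM it runs in, so it would have to be loaded inside the openHAB process to be of any use.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper: polls heap usage and writes one heap dump when it gets high.
final class HeapWatch {

    // Fraction of the maximum heap in use, or -1 if the JVM reports no maximum.
    static double usedFraction(MemoryUsage heap) {
        long max = heap.getMax();
        return max > 0 ? (double) heap.getUsed() / max : -1.0;
    }

    // Writes an .hprof dump of live objects. Fails if the file already exists.
    static void dumpHeap(String file) throws IOException {
        HotSpotDiagnosticMXBean diag =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        diag.dumpHeap(file, true);
    }

    public static void main(String[] args) throws Exception {
        double threshold = 0.9; // assumed trigger level, tune as needed
        while (true) {
            MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
            if (usedFraction(heap) >= threshold) {
                // Assumed file name; a timestamp avoids clashing with older dumps.
                dumpHeap("heap-" + System.currentTimeMillis() + ".hprof");
                break;
            }
            Thread.sleep(10_000); // assumed polling interval
        }
    }
}
```

The resulting .hprof file can be opened in the Eclipse Memory Analyzer like any other heap dump. On a small machine the dump itself can take a while and needs free disk space roughly the size of the heap.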

Also watch for high thread counts. They can cause an OOM from native code (due to thread stack memory usage) even if the JVM doesn’t appear to be using excessive memory. The Sonos binding bug, for example, was an issue with runaway thread creation.
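
A thread census is easy to get from the standard ThreadMXBean. The sketch below groups live threads by name with a trailing "-number" stripped, so that names like "qtp18811918-6453" and "qtp18811918-6454" from the log above end up in one bucket. The class name ThreadCensus and the grouping rule are assumptions for illustration, and like any in-process check it only sees the JVM it runs in.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical helper: counts live threads per name group to spot runaway pools.
final class ThreadCensus {

    // Strips a trailing "-<digits>", e.g. "HttpClient@12598806-6412" -> "HttpClient@12598806".
    static String group(String threadName) {
        return threadName.replaceFirst("-\\d+$", "");
    }

    // Number of threads per group, sorted by group name.
    static Map<String, Long> countByGroup(List<String> names) {
        return names.stream()
                .collect(Collectors.groupingBy(ThreadCensus::group, TreeMap::new,
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        List<String> names = Arrays.stream(threads.getThreadInfo(threads.getAllThreadIds()))
                .filter(Objects::nonNull) // a thread may have ended since the ids were read
                .map(ThreadInfo::getThreadName)
                .collect(Collectors.toList());
        System.out.println("live threads: " + threads.getThreadCount());
        countByGroup(names).forEach((g, n) -> System.out.println(n + "\t" + g));
    }
}
```

A group whose count keeps growing between two runs is the first place to look.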

@steve1 the -XX:+HeapDumpOnOutOfMemoryError sounds like the most reasonable option to me.

But after tracking down startup files and googling around, I still can’t find it.

Any hint? (And any hint where I could have found it would be appreciated as well :flushed:)

Is there a /etc/default/openhab2 file on your system? If so, you can add extra JVM arguments to that script. It is processed before the openHAB process is started. There’s some related information here.

That’s where I put my JSR223-related arguments. I’m hoping that the apt-get installation doesn’t overwrite it. (If anybody knows of a better place to put custom JVM arguments that won’t be overwritten by apt-get, let us know!)

@kohlsalem You can also let it run for some time and then create a heap dump from the openHAB console with the dev:dump-create command. This may take several minutes when you run it on a Raspberry Pi.

This will create a ZIP file (e.g. 2017-03-13_221953.zip) with diagnostic data in your userdata directory. This file will also contain the heap dump as heapdump.txt. The extension of this file should actually be .hprof so you can then open it in any program that can open .hprof files. I also usually analyze the heap dump with Eclipse Memory Analyzer.

Might be unrelated, but:
I had my system going bananas here the other day, terminating with an out of memory error.
Even after OH2 restarts and a PC reboot, OH2 was still hogging 400% CPU.

It turned out that a Chrome browser tab left open with Basic UI during a restart/reboot caused it.
Once I closed the browser tab, things settled.
The connection was from my workplace through nginx, but I think I have seen it once on a direct LAN connection as well.

Snapshot from a few days ago.

Thanks! After the “good-morning-OoM-Error-Reboot” I placed:

EXTRA_JAVA_OPTS="-XX:+HeapDumpOnOutOfMemoryError"

Let’s see :slight_smile:

@OMR Hmm. I certainly had a Chrome tab with Basic UI open tonight.

It shouldn’t take down OH, should it?

Shouldn’t no, but it does it anyway! :slight_smile:

@OMR :smiley:

Did you already report an issue?

No, haven’t had the time to follow up on it.

Did closing the browser tab fix it for you too?

@OMR I rebooted before closing :slight_smile: :frowning:

Can you please double check whether you’ve installed the following addon?

openhab> feature:list | grep Front

openhab-binding-fsinternetradio | 0.9.0.b2 | x | Started | openhab-aggregate-xml | Frontier Silicon Internet Radio Binding

In my environment (OH2, 2.0.0.b5) this addon keeps creating new threads called HttpClient until the system runs out of memory.

You can easily check this by calling dev:dump-create and checking the number of threads in threads.txt (first line).

Definitely not…

My add-ons are these: https://github.com/kohlsalem/openhab/blob/master/services/addons.cfg

Can you check this (in the Karaf console):

shell:threads | grep HttpClient | wc -l

Currently 41.

Update: I had several OutOfMemoryErrors, but since I added -XX:+HeapDumpOnOutOfMemoryError: none…

Not sure if this is good or bad :slight_smile:

Did this actually solve your issue?

It doesn’t solve memory leaks. :wink: It just creates a (.hprof) file containing a heap dump whenever the JVM runs out of memory. That way you can analyze the memory contents in a program like Eclipse MAT.