How to find the cause of a OutOfMemoryError in OH2?

Tags: #<Tag:0x00007fc8fc2a6170>

I just got another memory leak and a java… file has been created.
So what to do with it?
for me it looks like crap :wink:

JAVA PROFILE 1.0.1^@^@^@^@^D^@^@^Af¦/e^Z^A^@^@^@^@^@^@^@-`I<83>^X(II)Ljava/time/temporal/TemporalAdjuster;^A^@^@^

Checked with Eclipse MAT:

But how to proceed from here?

It looks like there is some kind of memory leak in the sitemap events mechanism.

If Eclipse MAT shows the same “Problem Suspect” with recent openHAB 2.4.0 builds, we should create an issue for this so it is solved in Eclipse SmartHome.

hmmm… sitemap… I don’t use any sitemap since I switched to HABpanel…!?

HABpanel is using the same Server Sent Event (SSE) mechanism to get notified of updates like Basic UI and Paper UI. According to the heapdump analysis it was consuming more than half of all the memory used by openHAB. Was that heapdump made with openHAB 2.4.0? Do you have a lot of devices running HABpanel?

Yes according to my notes this was with M4 - although I did have the heap dump issue with 2.3 stable as well.

I am running Habpanel just on my cell phone and a wall mounted tabled (different panel config).

Thanks for pointing this out. I will try to eliminate the habpanel access for both devices to nail the root cause down.

1 Like


info: openhab running on a pi3+

Recently, i’ve updated my openhab from 2.3to 2.4. I aslo i added spotify (+market) binding, adjusting stuff in habpanel and sitemap for the new binding and did some more rule. After a couple days i’ve notice that habpanel wouldn’t load anymore. I checked the log and see the famous java heap problem:

 2019-10-13 09:58:02.015 [INFO ] [ui.habmin.internal.servlet.HABminApp] - Started HABmin servlet at /habmin
2019-10-13 09:58:08.463 [INFO ] [.transport.mqtt.MqttBrokerConnection] - Starting MQTT broker connection to '**IPADDRESS**' with clientid paho79431135232 and file store '/var/lib/openhab2/mqtt/**IPADDRESS**'
2019-10-13 09:58:25.412 [INFO ] [mmandclass.ZWaveDoorLockCommandClass] - NODE 6: Door-Lock config report - timeoutEnabled=false timeoutMinutes=254, timeoutSeconds=254
2019-10-13 17:56:39.655 [ERROR] [o.client.mqttv3.internal.ClientState] - paho79431135232: Timed out as no write activity, keepAlive=60,000 lastOutboundActivity=1,571,003,534,821 lastInboundActivity=1,571,003,670,914 time=1,571,003,703,024 lastPing=1,571,003,534,821
2019-10-13 18:13:44.897 [WARN ] [e.jetty.util.thread.QueuedThreadPool] - Unexpected thread death: org.eclipse.jetty.util.thread.QueuedThreadPool$2@2d4e16 in QueuedThreadPool[HttpClient@e2509f]@1bad41e{STARTED,8<=8<=200,i=5,q=0}[ReservedThreadExecutor@12d80b{s=0/4,p=0}]
2019-10-13 18:25:36.709 [WARN ] [e.jetty.util.thread.QueuedThreadPool] - Unexpected thread death: org.eclipse.jetty.util.thread.QueuedThreadPool$2@8bcebf in QueuedThreadPool[HttpClient@1c2769d]@1c01699{STARTED,8<=8<=200,i=5,q=0}[ReservedThreadExecutor@b450f6{s=0/4,p=0}]
2019-10-13 18:25:28.228 [WARN ] [] - Dispatching event to subscriber 'org.eclipse.smarthome.core.internal.items.ItemUpdater@16525e8' takes more than 5000ms.
2019-10-13 18:25:21.105 [WARN ] [e.jetty.util.thread.QueuedThreadPool] - Unexpected thread death: org.eclipse.jetty.util.thread.QueuedThreadPool$2@1c5c303 in QueuedThreadPool[HttpClient@936858]@c4d222{STARTED,8<=8<=200,i=5,q=0}[ReservedThreadExecutor@1fa50a3{s=0/4,p=0}]
2019-10-13 18:26:49.040 [WARN ] [e.jetty.util.thread.QueuedThreadPool] - Unexpected thread death: org.eclipse.jetty.util.thread.QueuedThreadPool$2@1a57005 in QueuedThreadPool[HttpClient@152e1d0]@436c3d{STARTED,8<=8<=200,i=5,q=0}[ReservedThreadExecutor@1629801{s=0/4,p=0}]
2019-10-14 03:00:45.261 [WARN ] [] - Dispatching event to subscriber '' takes more than 5000ms.
2019-10-14 03:08:12.668 [ERROR] [WaveSerialHandler$ZWaveReceiveThread] - Exception during ZWave thread. 
java.lang.IllegalStateException: Queue full
	at java.util.AbstractQueue.add( ~[?:?]
	at java.util.concurrent.ArrayBlockingQueue.add( ~[?:?]
	at org.openhab.binding.zwave.internal.protocol.ZWaveTransactionManager.processReceiveMessage( ~[240:org.openhab.binding.zwave:2.4.0]
	at org.openhab.binding.zwave.internal.protocol.ZWaveController.incomingPacket( ~[240:org.openhab.binding.zwave:2.4.0]
	at org.openhab.binding.zwave.handler.ZWaveControllerHandler.incomingMessage( ~[240:org.openhab.binding.zwave:2.4.0]
	at org.openhab.binding.zwave.handler.ZWaveSerialHandler$ [240:org.openhab.binding.zwave:2.4.0]
2019-10-14 17:27:40.840 [WARN ] [org.eclipse.jetty.server.HttpChannel] - /habpanel/app/widgets/frame/frame.widget.js
java.lang.OutOfMemoryError: Java heap space
2019-10-14 17:27:40.840 [WARN ] [org.eclipse.jetty.server.HttpChannel] - /habpanel/app/widgets/chart/chart.widget.js
java.lang.OutOfMemoryError: Java heap space
2019-10-14 17:28:06.315 [WARN ] [e.jetty.util.thread.QueuedThreadPool] - 
java.lang.OutOfMemoryError: Java heap space
//and so on...

It’s talking about Z-wave in this example, but in other it’s talking about MQTT and in another it’s the execution of a rule that normally is woking normally. I think it’s only a coincidence that this program call something and the java heap space is already full.

Since the first error, i’ve uninstall openhabcloud, market, spotify, network and astro bindings. I’ve commented all the code referring to those even in the sitemap. With no result at all.

I’ve doubled the max heap size and activate creation of .hprof in the file /etc/default/openhab2:

EXTRA_JAVA_OPTS="-Xms250m -Xmx700m"

The heap size hasn’t done anything, it’s still give me the error ususally a ocuple of hours after a reboot. Here’s the result of the .hprof analyser: ( I analyse 3 differrent .hprof with the same result)

But this i cant really make sense if someone could help me.

There is also this error file i found that give some guideline that i’ll to follow next but i don’t have much faith in that…

hs_err_pid30247.log (16.3 KB)

I’m at a point where i would just wipe out the pi and do it all over. It wouldn’t be that much work but i fear that problem would be back after a while…

So if someone could shed some light upon this i think it would help a lot of people


There are reports of memory leaks in several threads but no one has yet been able to pinpoint the cause. We are under the assumption at this point that the source of the leak is not in OH core or else far more users would be reporting them so definitely focus you attention on the add-ons. I agree, the add-on that first encounters the OutOfMemoryError is not likely to be the source of the leak.

You have somewhat of an advantage as you run out of memory a couple of hours after reboot instead of once a month. So you are in a better position than others to systematically uninstall add-ons (don’t forget persistence, actions, and transformations) to see if one seems to make a difference in how quickly it runs out of memory. Given how fast it runs out I would not be surprised if you have more than one.

Ok then let’s do it.

Won’t be long, this is all that is left

# A comma-separated list of bindings to install (e.g. "binding = sonos,knx,zwave")
binding = zwave,kodi,nest,mqtt,exec

# A comma-separated list of UIs to install (e.g. "ui = basic,paper")
ui = paper,habmin,habpanel,basic,classic

# A comma-separated list of persistence services to install (e.g. "persistence = rrd4j,jpa")
#persistence =

# A comma-separated list of actions to install (e.g. "action = mail,pushover")
action = mail

# A comma-separated list of transformation services to install (e.g. "transformation = map,jsonpath")
#transformation =

# A comma-separated list of voice services to install (e.g. "voice = marytts,freetts")
#voice =

# A comma-separated list of miscellaneous services to install (e.g. "misc = myopenhab")
misc =

Tonight is a fisrt i wasn’t even able to open a share or enter in the piwith ssh. No mention of java since my last reboot though:

2019-10-15 07:08:05.346 [WARN ] [e.jetty.util.thread.QueuedThreadPool] - 
java.lang.OutOfMemoryError: Java heap space
2019-10-15 21:26:26.686 [INFO ] [panel.internal.HABPanelDashboardTile] - Started HABPanel at /habpanel
2019-10-15 21:26:27.213 [INFO ] [.core.internal.i18n.I18nProviderImpl] - Time zone set to 'SOMEWHERE'.
2019-10-15 21:26:27.229 [INFO ] [.core.internal.i18n.I18nProviderImpl] - Location set to 'SOMEWHERE'.
2019-10-15 21:26:27.238 [INFO ] [.core.internal.i18n.I18nProviderImpl] - Locale set to 'SOMETHING'.
2019-10-15 21:26:46.883 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'nest.items'
2019-10-15 21:26:47.162 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'kodiSalon.items'
2019-10-15 21:26:47.319 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'spotify.items'
2019-10-15 21:26:47.354 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'astro.items'
2019-10-15 21:26:47.442 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'network.items'
2019-10-15 21:26:47.476 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'zwave.items'
2019-10-15 21:26:47.566 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'kodiSousSol.items'
2019-10-15 21:26:47.693 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'arduino.items'
2019-10-15 21:26:47.732 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'exec.items'
2019-10-15 21:26:58.002 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'kodiSalon.rules'
2019-10-15 21:26:58.919 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'notifications.rules'
2019-10-15 21:26:59.635 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'maison.rules'
2019-10-15 21:27:01.502 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'laveVaisselle.rules'
2019-10-15 21:27:01.612 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'spotify.rules'
2019-10-15 21:27:04.207 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'kodiSS.rules'
2019-10-15 21:27:04.391 [WARN ] [el.core.internal.ModelRepositoryImpl] - Configuration model 'spotify.rules' is either empty or cannot be parsed correctly!
2019-10-15 21:27:04.674 [INFO ] [thome.model.lsp.internal.ModelServer] - Started Language Server Protocol (LSP) service on port 5007
2019-10-15 21:27:05.349 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'devSitemap.sitemap'
2019-10-15 21:27:07.308 [INFO ] [ing.zwave.handler.ZWaveSerialHandler] - Connecting to serial port '/dev/ttyACM0'
2019-10-15 21:27:08.577 [INFO ] [ing.zwave.handler.ZWaveSerialHandler] - Serial port is initialized
2019-10-15 21:27:08.653 [INFO ] [ve.internal.protocol.ZWaveController] - Starting ZWave controller
2019-10-15 21:27:08.655 [INFO ] [ve.internal.protocol.ZWaveController] - ZWave timeout is set to 5000ms. Soft reset is false.
2019-10-15 21:27:10.191 [INFO ] [.dashboard.internal.DashboardService] - Started Dashboard at http://IPADDRESS:8080
2019-10-15 21:27:10.195 [INFO ] [.dashboard.internal.DashboardService] - Started Dashboard at https://IPADDRESS:8443
2019-10-15 21:27:11.691 [INFO ] [arthome.ui.paper.internal.PaperUIApp] - Started Paper UI at /paperui
2019-10-15 21:27:11.870 [INFO ] [ui.habmin.internal.servlet.HABminApp] - Started HABmin servlet at /habmin
2019-10-15 21:27:19.382 [INFO ] [.transport.mqtt.MqttBrokerConnection] - Starting MQTT broker connection to 'IPADDRESS' with clientid ID and file store '/var/lib/openhab2/mqtt/IPADDRESS'
2019-10-15 21:27:23.220 [INFO ] [mmandclass.ZWaveDoorLockCommandClass] - NODE 6: DEVICE config report - timeoutEnabled=false timeoutMinutes=254, timeoutSeconds=254
2019-10-16 02:27:07.636 [WARN ] [.serialmessage.ZWaveCommandProcessor] - SerialMessage class null is not implemented!
2019-10-16 02:27:07.639 [WARN ] [.serialmessage.ZWaveCommandProcessor] - SerialMessage class null is not implemented!
2019-10-16 02:27:07.641 [WARN ] [ve.internal.protocol.ZWaveController] - TODO: Implement processing of Request Message = -- (0x0)
2019-10-16 05:24:31.780 [INFO ] [panel.internal.HABPanelDashboardTile] - Started HABPanel at /habpanel
2019-10-16 05:24:31.815 [INFO ] [.core.internal.i18n.I18nProviderImpl] - Time zone set to 'SOMEWHERE'.
2019-10-16 05:24:31.849 [INFO ] [.core.internal.i18n.I18nProviderImpl] - Location set to 'SOMEWHERE'.
2019-10-16 05:24:31.858 [INFO ] [.core.internal.i18n.I18nProviderImpl] - Locale set to 'SOMETHING'.
2019-10-16 18:29:32.951 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'nest.items'

Yesterday, after the post, i’ve reduce heap size to:
EXTRA_JAVA_OPTS="-Xms64m -Xmx128m"

By default, it’s was
EXTRA_JAVA_OPTS="-Xms250m -Xmx300m"

And i tried without success:
EXTRA_JAVA_OPTS="-Xms250m -Xmx700m"

I’ll let it crash again to see if java is error come back, then i’ll begin to uninstall add-ons.

That points to something else might be going on on the machine itself that may be independent and in addition to a memory leak in OH. Though setting the max heap size to 700 might be way too high for an RPi which doesn’t have that much RAM to begin with. Assuming an RPi 3, that only leaves around 300 MB of RAM to run everything else on the machine.

ok so the java error is back

2019-10-16 18:29:32.951 [INFO ] [el.core.internal.ModelRepositoryImpl] - Loading model 'nest.items'
     *** some startup event ***
2019-10-17 02:39:00.574 [ERROR] [o.client.mqttv3.internal.ClientState] - paho8400789633: Timed out as no write activity, keepAlive=60,000 lastOutboundActivity=1,571,293,516,522 lastInboundActivity=1,571,293,657,254 time=1,571,293,980,058 lastPing=1,571,293,516,522
2019-10-17 02:39:05.710 [WARN ] [e.jetty.util.thread.QueuedThreadPool] - Unexpected thread death: org.eclipse.jetty.util.thread.QueuedThreadPool$2@124bd01 in QueuedThreadPool[HttpClient@7c76b]@59107{STARTED,8<=8<=200,i=5,q=0}[ReservedThreadExecutor@11b29fd{s=0/4,p=0}]
2019-10-17 02:39:04.032 [ERROR] [ntime.internal.engine.RuleEngineImpl] - Rule 'Lave Vaisselle PhotoValue String to Float': An error occurred during the script execution: Java heap space
2019-10-17 03:11:40.073 [ERROR] [org.apache.felix.fileinstall        ] - In main loop, we have serious trouble
java.lang.OutOfMemoryError: Java heap space
2019-10-17 03:11:36.322 [WARN ] [e.jetty.util.thread.QueuedThreadPool] - 
java.lang.OutOfMemoryError: Java heap space

So rebooted at 18:29 and bug happen at 2:39 = took around 8 hours to bug

I’ll remove mail action first, i would be surprise it would causing that.

*** edit 2019-10-18***

2019-10-17 19:44:43.651 [INFO ] [arthome.ui.paper.internal.PaperUIApp] - Started Paper UI at /paperui
*** some startup event ***
2019-10-18 04:11:28.386 [WARN ] [mmon.WrappedScheduledExecutorService] - Scheduled runnable ended with an exception: 
java.lang.OutOfMemoryError: Java heap space

19:44 to 04:11 = 8:30 to bug

Now i removed exec

*** edit 2019-10-19***

2019-10-18 11:24:09.949 [INFO ] [panel.internal.HABPanelDashboardTile] - Started HABPanel at /habpanel
*** some startup event ***
2019-10-18 19:52:03.405 [WARN ] [mmon.WrappedScheduledExecutorService] - Scheduled runnable ended with an exception: 
java.lang.OutOfMemoryError: Java heap space

11:24 o 19:52 = 8:30 to bug

next mqtt

*** edit 2019-10-19***

2019-10-19 08:42:46.013 [INFO ] [panel.internal.HABPanelDashboardTile] - Started HABPanel at /habpanel
    *** some startup event ***
2019-10-19 17:22:27.350 [WARN ] [e.jetty.util.thread.QueuedThreadPool] - Unexpected thread death: org.eclipse.jetty.util.thread.QueuedThreadPool$2@1715177 in QueuedThreadPool[HttpClient@9aed1c]@fede15{STARTED,8<=8<=200,i=5,q=0}[ReservedThreadExecutor@14ed701{s=0/4,p=0}]

8:42 to 17:22 = 8:40

next nest

1 Like

For info, it has been 12 hours without the nest binding and openhab is still running without problem. It’s been a long time since i saw that!

If it’s still running till monday night, i’ll begin to re-install the other bindings. Do you think it’s a good idea?

Well, it’s still autumn. Ask again in December :cold_face:

Instead of installing my bindings back , i’ve decided to install some monitoring using another Pi and the TIG stack (telegraf - influxdb - Grafana). I was hoping to monitor java but in the end it seems a bit over my head. I know i can do it with jolokia and telegraf but i don’t have any idea how to install that jolokia thing. I think you have to install it in the java vm or app but i’m not a developer… If someone is able to implement that in openhab someway I would love to monitor that java stuff.

anyway here what the monitoring look like in grafana. At least i got the memory of the pi:

All credit to the Grafana dashboard # 1375

As for the java problem it’s been over a week without problem. Over the next few days i’ll reinstall my bindings and run that without nest for a while. I’ll probably won’t reinstall nest but we’ll see.

I’ll come back with the results

If you go with Prometheus instead of Telegraph there is a binding that gives you all sorts of information about OH internals. See New Add-on bundle for Prometheus health Metrics

I saw prometheus later after some research on snmp for my Q-nap, I’ll give it a try thanks.

Seems great but it doesn’t seems to work with openhab2.4

I got an error when openhab try to install one of the .jar file.

I’ve managed to make this one work but it only collect item states, which is not really helpfull.

I might try to built around that but i’m not sure how far i can go…

Be sure to leave a comment on the thread I linked to. The guys who put it together are pretty responsive but might not see this post.

true, will do.

As for my java memory problem, I’ve reinstalled all my stuff last saturday :

  + spotify beta
With all the corresponding rules and items

I’ve never had a bug until today.

BUT i’m pretty sure the culprit is the prometheus setup i did, no mention of java there :

2019-10-29 21:00:37.334 [WARN ] [] - Field 'editable' could not be eliminated: Can not set boolean field to null value
2019-10-29 21:00:37.340 [WARN ] [] - Field 'editable' could not be eliminated: Can not set boolean field to null value
2019-10-29 21:00:37.350 [WARN ] [] - Field 'editable' could not be eliminated: Can not set boolean field to null value
2019-10-29 21:00:37.354 [WARN ] [] - Field 'editable' could not be eliminated: Can not set boolean field to null value
2019-10-29 21:00:37.360 [WARN ] [] - Field 'editable' could not be eliminated: Can not set boolean field to null value
2019-10-29 21:00:37.365 [WARN ] [] - Field 'editable' could not be eliminated: Can not set boolean field to null value
2019-10-29 21:00:37.370 [WARN ] [] - Field 'editable' could not be eliminated: Can not set boolean field to null value
2019-10-29 21:00:37.373 [WARN ] [] - Field 'editable' could not be eliminated: Can not set boolean field to null value
2019-10-29 21:00:41.517 [INFO ] [basic.internal.servlet.WebAppServlet] - Stopped Basic UI
2019-10-29 21:00:52.698 [INFO ] [ing.zwave.handler.ZWaveSerialHandler] - Stopped ZWave serial handler
2019-10-29 21:00:53.454 [INFO ] [io.openhabcloud.internal.CloudClient] - Shutting down openHAB Cloud service connection
2019-10-29 21:00:53.503 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = uuid, base URL = http://localhost:8080)
2019-10-29 21:00:53.712 [INFO ] [assic.internal.servlet.WebAppServlet] - Stopped Classic UI
2019-10-29 21:00:53.769 [INFO ] [ui.habmin.internal.servlet.HABminApp] - Stopped HABmin servlet
2019-10-29 21:00:54.942 [INFO ] [arthome.ui.paper.internal.PaperUIApp] - Stopped Paper UI
2019-10-29 21:00:54.979 [WARN ] [e.jetty.util.thread.QueuedThreadPool] - QueuedThreadPool[ESH-httpClient-common]@e0d607{STOPPING,10<=10<=40,i=0,q=6}[org.eclipse.jetty.util.thread.TryExecutor$$Lambda$116/26664676@16b5006] Couldn't stop Thread[ESH-httpClient-common-111,5,main]
2019-10-29 21:00:54.983 [WARN ] [e.jetty.util.thread.QueuedThreadPool] - QueuedThreadPool[ESH-httpClient-common]@e0d607{STOPPING,10<=10<=40,i=0,q=6}[org.eclipse.jetty.util.thread.TryExecutor$$Lambda$116/26664676@16b5006] Couldn't stop Thread[ESH-httpClient-common-117,5,main]
2019-10-29 21:00:55.007 [INFO ] [panel.internal.HABPanelDashboardTile] - Stopped HABPanel
2019-10-29 21:00:55.038 [INFO ] [.dashboard.internal.DashboardService] - Stopped Dashboard

So i went from 8 hours before crash to a couple of days without any problem, I think we can conclude that the nest binding was the problem.

My binding was taking information of 2 smoke Detector with 3 String and 1 Switch item each. Couple of rule that were emailing in case of problem.

I hope this litlle experience will help some people!

Well, I suppose that’s a good news/bad news situation. The good news (if you can call it that) is that the Nest Binding is going to become dead at some point soon anyway as soon as Google decides to turn off the Works with Nest API. The bad news is this means any problems with the binding are unlikely to be looked at and fixed.

I can confirm the issue with nest binding. I did migrate the nest account with google. After that my OH2 started failing with OutOfMemory errors. It could stay running some hours after reboot and then randomly reporting out of heap or metaspace .

"Exception in thread "upnp-main-2" java.lang.OutOfMemoryError: unable to create new native thread"

I did install visualvm to my desktop and connected remotely to OH2 rpi. It showed increasing number of HttpClient threads, over 3700 active threads when stopped responding… Currently without Nest binding my OH2 runs around 150 threads.

Following startup parameters needed in “/etc/default/openhab2” to allow remote visualvm connection.

EXTRA_JAVA_OPTS="-Xms250m -Xmx512m -XX:+UnlockDiagnosticVMOptions -Djava.rmi.server.hostname=ADD_RPI_IP_HERE"