OH3 Help with "OutOfMemoryError: Java heap space" needed

Hi all,

I installed OH3 on openHABian. After adding some Things, Bindings and Rules, I get “java.lang.OutOfMemoryError: Java heap space” errors every 6-7 hours, and the process is unresponsive at that point.
Unfortunately I am not experienced enough to locate the leak in openHAB, so I need your help tracking down the problem.

Here is a screenshot of the heap analyser.

  • Platform information:
    • Hardware: RasPi 4, 4GB Ram
    • OS: Openhabian
    • Java Runtime Environment: zulu11.43.88-ca-jdk11.0.9-linux_aarch32hf
    • openHAB version: Openhab 3.0.0.M4 (Build)

I also have a lot of warnings like this:
2020-12-01 01:38:24.905 [WARN ] [ab.core.internal.events.EventHandler] - Dispatching event to subscriber ‘org.openhab.core.internal.items.ItemUpdater@a99dc6’ takes more than 5000ms.

Thank you!


Since I can only insert one screenshot per post, here is the second:

Please post your rules as I suspect things are wrong there.

Or try disabling all rules and re-enabling them one at a time, one per day.

I found the source of the memory overflow. The rule checks the state of my washing machine and runs every minute. It just consists of a few if statements, average calculations (over 5-10 numbers) and a bit of casting.
How can such a simple script cause an OutOfMemoryError?
The last “if” with the telegram command is never executed because that state is never reached, so the problem has to be in the earlier if blocks.

triggers:
  - id: "1"
    configuration:
      cronExpression: 0 * * * * ? *
    type: timer.GenericCronTrigger
conditions: []
actions:
  - inputs: {}
    id: "2"
    configuration:
      type: application/vnd.openhab.dsl.rule
      script: >-
        if (WaschmaschineStatus.state.toString == "NULL" && (WaschmaschinenPlug_Strom.averageSince(now.minusMinutes(5)) as Number)<0.05){
          WaschmaschineStatus.sendCommand("INAKTIV")
        }else if(WaschmaschineStatus.state.toString == "NULL" && (WaschmaschinenPlug_Strom.averageSince(now.minusMinutes(5)) as Number)>0.05){
          WaschmaschineStatus.sendCommand("AKTIV")
        }
          
        if (WaschmaschineStatus.state.toString == "INAKTIV" && (WaschmaschinenPlug_Strom.averageSince(now.minusMinutes(1)) as Number)>0.05){
          WaschmaschineStatus.sendCommand("AKTIV")
          logInfo("WaMa","active set to 1,  status is: "+WaschmaschineStatus)
        }


        if (WaschmaschineStatus.state.toString == "AKTIV" && (WaschmaschinenPlug_Strom.averageSince(now.minusMinutes(1)) as Number)<0.05){
          WaschmaschineStatus.sendCommand("INAKTIV")
          logInfo("WaMa","active set to 0, status is: "+WaschmaschineStatus)
          val telegramAction = getActions("telegram","telegram:telegramBot:12345678")
          telegramAction.sendTelegram("Waschmaschine fertig")
        }
    type: script.ScriptAction

That’s just coincidence. Something has to be the one that pushes the heap over the limit.
You can increase the limit (EXTRA_JAVA_OPTS in /etc/default/openhab on openHAB 3, /etc/default/openhab2 on openHAB 2), but that’s usually not the problem. Better check for memory leaks in bindings by disabling them, one by one.

Maybe the averageSince in your rule is the source of the leak?
You are even calling it several times in the same rule rather than saving the value in a variable. If it does leak, every extra call makes the leak worse.
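
To illustrate the suggestion above, here is a minimal sketch of the script body with each average queried once and stored in a variable. It is not a confirmed fix for the leak; it only reduces the number of persistence queries per run. The variable names (avg5State, avg1State, threshold, status) are made up for this example, and the null check assumes averageSince can return null when persistence has no data for the window.

```
// Query persistence once per time window and reuse the results.
val avg5State = WaschmaschinenPlug_Strom.averageSince(now.minusMinutes(5))
val avg1State = WaschmaschinenPlug_Strom.averageSince(now.minusMinutes(1))

if (avg5State === null || avg1State === null) {
  // Assumption: no persisted data yet for one of the windows, so skip this run.
  logWarn("WaMa", "No persisted data for WaschmaschinenPlug_Strom, skipping")
} else {
  val avg5 = (avg5State as Number).doubleValue
  val avg1 = (avg1State as Number).doubleValue
  val threshold = 0.05
  val status = WaschmaschineStatus.state.toString

  if (status == "NULL" && avg5 < threshold) {
    WaschmaschineStatus.sendCommand("INAKTIV")
  } else if (status == "NULL" && avg5 > threshold) {
    WaschmaschineStatus.sendCommand("AKTIV")
  } else if (status == "INAKTIV" && avg1 > threshold) {
    WaschmaschineStatus.sendCommand("AKTIV")
    logInfo("WaMa", "switching to AKTIV, 1 min average: " + avg1)
  } else if (status == "AKTIV" && avg1 < threshold) {
    WaschmaschineStatus.sendCommand("INAKTIV")
    logInfo("WaMa", "switching to INAKTIV, 1 min average: " + avg1)
    val telegramAction = getActions("telegram", "telegram:telegramBot:12345678")
    telegramAction.sendTelegram("Waschmaschine fertig")
  }
}
```

Unlike the original, this reads the status once and uses a single else-if chain, so at most one transition happens per run. The two windows (5 minutes and 1 minute) are kept separate, so the filtering per state can still be tuned independently.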

The only binding in use is the MQTT binding. I assume averageSince gets the history it needs from the (InfluxDB) persistence service?
The MQTT binding works fine for other rules.
I’m pretty sure that increasing the limit won’t help here; it would just take longer until the error occurs.

Yes you are right, I could reduce the calls in this case, but I need to adjust the time average for better filtering of the different states.

Try removing averageSince from the rule to see whether the leak goes away.

@takeoo111111 did you ever find the problem? I am having the same issue, but I do not use MQTT or averageSince.

Unfortunately I’m still searching for the problem. It sometimes takes days for the out-of-memory error to occur, so the testing is quite a pain.
At the moment I’m running a rule with an empty DSL script in the “Then” section every 2 minutes. Yesterday the system crashed again, but I did not have the time to look into it and just restarted my Pi.
However, I have a second rule with an “every 5 min” cron trigger and a script, and I don’t have problems with that one (at least not within a couple of days). So either the root cause is related to the interval, or it is something completely different.

From the heap analyzer I can see that there is a Java Map with an enormous number of entries, but I couldn’t find it in the code.

@takeoo111111 have you tried setting the initial heap size to something bigger, e.g. EXTRA_JAVA_OPTS="-Xms350m"?
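
For reference, a sketch of what that could look like in the defaults file. The path assumes an openHAB 3 package install such as openHABian; -Xms350m is the value suggested above, while -Xmx512m is only an illustrative maximum and not a recommendation from this thread.

```
# /etc/default/openhab (assumed location for openHAB 3 package installs)
# -Xms sets the initial heap size, -Xmx the maximum heap size.
EXTRA_JAVA_OPTS="-Xms350m -Xmx512m"
```

openHAB has to be restarted for the change to take effect. If the heap still fills up with a larger -Xmx, that points to a real leak rather than a limit that is simply too low.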

On my RPi3 I was getting constant complaints from Alexa about devices not responding. This was due to items taking longer than 200ms to update when using the Hue Emulator. I guessed that the delay might be caused by garbage collection kicking in. Since I increased the initial heap size, the response time has been below 50ms.

In your screenshots the heap size looks to be ~250 MB. Have you seen it grow any bigger than that?

I’ve had a similar experience with my Dad’s environment as he has been on 3.0.0-1.
After upgrading to snapshot 3.1.0~S2130-1, the unchanged rules no longer cause any side effects. :partying_face: