Sudden extreme instability of runtime after increasing the update rate of a multisensor

Hi all,

I’m running 2.0.0-SNAPSHOT build #503 on Windows 10.

Yesterday I increased the update rate of one of my Aeon Multisensor 6-sensors by reducing the temperature treshold parameter. The idea was to get a tighter temperature control and smoother charts (rrd4j).

Afterwards I’ve seen extreme instability with the runtime crashing after a few hours, apparently due to some memory leak. I put some println’s in an if-statement in my thermostat-rule, and it looks like the rule instance is becoming multiplied after some time and that it’s executing several instances of the same rule at the same time.

I hope someone can point me in the direction of sensible logging (I haven’t been able to figure that out in OH2 yet), but pending that I’m pasting in some output from the console that shows what I’m describing. It corresponds to the following:

if (Sensor1_Temp.state < (Double::parseDouble(temp1_setp.state.toString) + hysteresis)) {
        println("Heater on")
        println(temp1_setp.state)
        println(Sensor1_Temp.state)
        sendCommand(Heater1,ON)
        sendCommand(Heater2,ON)
    }
Heater on
21.0
20.39999999999999857891452847979962825775146484375
Heater on
21.0
20.39999999999999857891452847979962825775146484375
Heater on
21.0
20.39999999999999857891452847979962825775146484375
Heater on
21.0
20.39999999999999857891452847979962825775146484375
Heater on
21.0
20.39999999999999857891452847979962825775146484375
Heater on
Heater on
21.0
21.0
20.39999999999999857891452847979962825775146484375
20.39999999999999857891452847979962825775146484375

fast forward

20.39999999999999857891452847979962825775146484375
Heater on
Heater on
Heater on
Heater on
Heater on
21.0
Heater on
20.39999999999999857891452847979962825775146484375
21.0
20.39999999999999857891452847979962825775146484375
21.0
21.0
21.0
21.0
20.39999999999999857891452847979962825775146484375
20.39999999999999857891452847979962825775146484375
20.39999999999999857891452847979962825775146484375
20.39999999999999857891452847979962825775146484375
20.39999999999999857891452847979962825775146484375
20.39999999999999857891452847979962825775146484375
20.39999999999999857891452847979962825775146484375
20.39999999999999857891452847979962825775146484375
java.lang.NullPointerException
        at java.net.URI$Parser.parse(URI.java:3023)
        at java.net.URI.<init>(URI.java:595)
        at java.net.URI.create(URI.java:857)
        at org.jupnp.transport.impl.ServletUpnpStream.readRequestMessage(ServletUpnpStream.java:101)
        at org.jupnp.transport.impl.ServletUpnpStream.run(ServletUpnpStream.java:61)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Exception in thread "upnp-main-764" java.lang.NullPointerException
        at java.net.URI$Parser.parse(URI.java:3023)
        at java.net.URI.<init>(URI.java:595)
        at java.net.URI.create(URI.java:857)
        at org.jupnp.transport.impl.ServletUpnpStream.readRequestMessage(ServletUpnpStream.java:101)
        at org.jupnp.transport.impl.ServletUpnpStream.run(ServletUpnpStream.java:61)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
java.lang.IllegalStateException: AsyncContext completed

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "SocketListener(JmDNS-IP-3)"

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "upnp-async-2044"

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "upnp-async-2049"

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "upnp-async-queue"

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "upnp-async-2052"

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "upnp-async-2032"

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "JmDNS(JmDNS-IP-1).Timer"


Perhaps sharing the suspect rule would be helpful

Sure thing! :slight_smile:

This rule fires every time any of the sensors give an update, which now happens much more often, perhaps on the order of every 2 to 3 seconds. The above-mentioned println’s are in their respective places, as can be seen.

rule "Heater cycle"
when
    Item Sensor1_Temp received update or
    Item Sensor2_Temp received update or
    Item Sensor3_Temp received update or
    Item temp1_setp received command or
    Item temp2_setp received command or
    Item temp3_setp received command
then
    var Double hysteresis = 0.25
    
    if (Sensor1_Temp.state > (Double::parseDouble(temp1_setp.state.toString) + hysteresis)) {
        println("Heater off")
        println(temp1_setp.state)
        println(Sensor1_Temp.state)
        sendCommand(Heater1,OFF)
        sendCommand(Heater2,OFF)         
    }
    if (Sensor1_Temp.state < (Double::parseDouble(temp1_setp.state.toString) + hysteresis)) {
        println("Heater on")
        println(temp1_setp.state)
        println(Sensor1_Temp.state)
        sendCommand(Heater1,ON)
        sendCommand(Heater2,ON)
    }
    if (Sensor2_Temp.state > (Double::parseDouble(temp2_setp.state.toString) + hysteresis)) {
        sendCommand(Heater3,OFF) 
    }
    if (Sensor2_Temp.state < (Double::parseDouble(temp2_setp.state.toString) + hysteresis)) {
        sendCommand(Heater3,ON)
    }
    if (Sensor3_Temp.state > (Double::parseDouble(temp3_setp.state.toString) + hysteresis)) {
        sendCommand(Heater4,OFF) 
    }
    if (Sensor3_Temp.state < (Double::parseDouble(temp3_setp.state.toString) + hysteresis)) {
        sendCommand(Heater4,ON)
    }            
end

I can spot a potential race condition where the value of item states may change between one if-condition and its opposite partner. Perhpas worth capturing the state into a variable before processing.
It might pay to convert to an if-else structure. As a bonus, that will slightly improve performance as not all conditions will be checked at every pass.

I cannot see any reason for spawning multiple threads though. There is a diagnostic trick to add a simple counter with a global variable. Increment at start of rule, report value somewhere within, decrement at end of rule. You will soon confirm if parallel copies are running. I guess that would lead to considering the item’s ‘update’ mechanism for multiple updates. Although multiple rule copies in progress shouldn’t be a bad thing of itself?

Bear in mind if one sensor updates frequently. the other parts of the rule will also act on stale data from the other sensors regardless. Might not be what you want, could be worth breaking into three separate rules.

Thanks for the tips! I will implement them tonight and see what I get out of the counter.

Good evening!

I’ve reworked my thermostat rule according to your suggestions and implemented the handy global counter to see if multiple rule instances are being spawned. And indeed they seem to be!

First, here is my reworked rule for the bedroom:

rule "Bedroom heater cycle"
when
    Item Sensor2_Temp received update or
    Item temp2_setp received command
then
    
    debugCounter2 = debugCounter2 + 1
    var Double currentTemp = Double::parseDouble(Sensor2_Temp.state.toString)
    var Double currentSetp = Double::parseDouble(temp2_setp.state.toString)
    
    if (Math.abs(currentTemp-prevTemp2)>5) {
        alarmCounter = alarmCounter + 1
    }
    else {
        if (alarmCounter > 0) {
            alarmCounter = alarmCounter - 1
        }
        println("Bedroom")
        println(debugCounter2)
        if (currentTemp > (currentSetp + hysteresis)) {
            sendCommand(Heater3,OFF)
        }
        else if(currentTemp < (currentSetp - hysteresis) {
            sendCommand(Heater3,ON)
        }
    }
    var Double prevTemp2 = currentTemp
    debugCounter2 = debugCounter2 - 1
end

Pasting in some lines from the console showing two instances running:

Bedroom
2
Bedroom
2
Bedroom
2
Bedroom
2
Bedroom
2
Bedroom
2
Bedroom
2

The counter seems to be sensitive to amongst other things the bootup though. In this case the counter started out at 2, but earlier it started at 1 and reached 3 and 4 in the middle of a log:display-print (unsure if calling log:display might have triggered the additional spawning) before going back to 1.

Excerpt from this morning:

Bedroom
3
Bedroom
3
Bedroom
3
Bedroom
Bedroom
Bedroom
5
5
5
Bedroom
4
Bedroom
4
Bedroom
4
Bedroom
4

Something is clearly going on here that is causing multiple rule executions at the same time. The number of instances seem to grow until the memory overflows. I assume the other rules are being doubled as well, but I will investigate that further.

And before it crashed completely:

Bedroom
13
Bedroom
13
Bedroom
13
Bedroom
13
Bedroom
Bedroom
14
14
Bedroom
14
Bedroom
15
Bedroom
15
Bedroom
Bedroom
15
15
Bedroom
15
Bedroom
Bedroom
16
16
Bedroom
15
Bedroom

Doesn’t seem to be any good reason for that. Perhaps the rule is crashing before exiting tidily. Confident the actual trigger is only appearing every few seconds? (see OH log for update)

Hmm, what’s that prevTemp2 declaration at the end of the rule, after it has already been used? I don’t think that does what you want, you should declare it globally outside the rule if you want it to persist between runs.