Rules stop executing after +/- 20 hrs

Hi all,

I am running openHabian 2.3 on a RPI3. There are some rules running on it but after more or less a day the rules stop executing. All other functions are unchanged although the sitemap doesn’t update ‘live’ anymore, just when one refreshes the page itself states show their current status.

I did some debugging and found that my awning/rollershutter rule is the troublemaker of them all.

// zon berekening en uitvoering
rule "Auto Zon"

when

        Item zonBerekening received command

then

        val logName = "zon"

        val int Lux_Gemiddelde = 20

        val float Lux_drempel_achter_middel_start = 20000.0
        val float Lux_drempel_achter_hoog_start = 40000.0
        val float Lux_drempel_voor_middel_start = 20000.0
        val float Lux_drempel_voor_hoog_start = 40000.0

        var boolean schaduw_ochtend_hoog_oud = false
        var boolean schaduw_ochtend_middel_oud = false
        var boolean schaduw_middag_hoog_oud = false
        var boolean schaduw_middag_middel_oud = false


        logInfo(logName, "Rule -Auto zon- triggered")
        drempel_achter_middel.postUpdate(Lux_drempel_achter_middel_start)
        drempel_achter_hoog.postUpdate(Lux_drempel_achter_hoog_start)
        drempel_voor_middel.postUpdate(Lux_drempel_voor_middel_start)
        drempel_voor_hoog.postUpdate(Lux_drempel_voor_hoog_start)

        avgLuxAchter.postUpdate(luxAchter.averageSince(now.minusMinutes(Lux_Gemiddelde), "mysql"))
//      avgLuxVoor.postUpdate(luxVoor.averageSince(now.minusMinutes(Lux_Gemiddelde), "mysql"))

        var stand_ochtend = ""
//      var stand_middag = ""

        var Number A = (sunAzi.state as DecimalType)
        var Number E = (sunEle.state as DecimalType)

        var boolean schaduw_ochtend_middel = (A < 150) && (avgLuxAchter.state as DecimalType > drempel_achter_middel.state as DecimalType)
        var boolean schaduw_ochtend_hoog = (A < 150) && (avgLuxAchter.state as DecimalType > drempel_achter_hoog.state as DecimalType)

//      var boolean schaduw_middag_middel = (A > 160) && ((avgLuxVoor.state > drempel_voor_middel.state as DecimalType)
//      var boolean schaduw_middag_hoog = (A > 160) && (avgLuxVoor.state > drempel_voor_hoog.state as DecimalType)

        // Rolluiken achter
        if(schaduw_ochtend_hoog) {
                stand_ochtend = "DOWN"
        }
        else if(schaduw_ochtend_middel) {
                stand_ochtend = "85"
        }
        else {
                stand_ochtend = "UP"
        }
/*
        // Rolluiken Voor
        if(schaduw_middag_hoog) {
                stand_middag = "DOWN"
        }
        else if(schaduw_middag_middel) {
                stand_middag = "85"
        }
        else {
                stand_middag = "UP"
        }
*/
        // Zet zonnescherm
        if((schaduw_ochtend_hoog || schaduw_ochtend_middel) && autoZbw.state < 4) {
                if(autoZbw.state < 3) {
                        if(E < 35 || A < 97) {
                                zonneschermBenedenWoonkamer.sendCommand(DOWN)
                        }
                        else if(A < 118) {
                                zonneschermBenedenWoonkamer.sendCommand(66)
                        }
                        else {
                                zonneschermBenedenWoonkamer.sendCommand(33)
                        }
                }
                else {
                        zonneschermBenedenWoonkamer.sendCommand(DOWN)
                }
        }
        else if(!schaduw_ochtend_hoog && !schaduw_ochtend_middel && (schaduw_ochtend_hoog_oud || schaduw_ochtend_middel_oud) && autoZbw.state == 0) {
                zonneschermBenedenWoonkamer.sendCommand(UP)
        }

        // Zet rolluiken achter
        val autoAchter = gZonAchterAuto.members.filter[ a | a.state < 4 && a.state != 1  ]
        autoAchter.forEach[ a | 

                val rsNachtA = gAchterNacht.members.findFirst[ na | na.name == a.name + "_nacht" ]
                if(rsNachtA.state == OnOffType::OFF) {
                        (a.name + "_shutter").sendCommand(stand_ochtend)
                }
        ]

/*      // Zet rolluiken voor
        val autoVoor = gZonVoorAuto.members.filter[ v | v.state < 4 && v.state != 1  ]
        logInfo(logName, "There are " + autoVoor.size + " shutters voor in auto-mode")

        autoVoor.forEach[ v |
        logInfo(logName, "It is " + v.name + " turn to check for nightmode")

                val rsNachtV = gVoorNacht.members.findFirst[ nv | nv.name == v.name + "_nacht" ]
                if(rsNachtV.state == OnOffType::OFF) {
                        (v.name + "_shutter").sendCommand(stand_middag)
                        logInfo(logName, "Item " + v.name + "_shutter received command")
                }
        ]
*/
        schaduw_ochtend_hoog_oud = schaduw_ochtend_hoog
        schaduw_ochtend_middel_oud = schaduw_ochtend_middel
//      schaduw_middag_hoog_oud = schaduw_middag_hoog   
//      schaduw_middag_middel_oud = schaduw_middag_middel

end

Now, you might wonder why so much is commented out. This is just because my luxsensor for the front of the house (Dutch word: voor) is not in yet so there is no item to connect to.

After slowly releasing parts of my rule I found that this line:

avgLuxAchter.postUpdate(luxAchter.averageSince(now.minusMinutes(Lux_Gemiddelde), "mysql"))

makes the system stop after a while. I saw in another topic that the mysql persistence made someones rule stop so I now changed to jdbc instead. Since this morning it is running with this new persistence so the results aren’t in yet but I was wondering if my rules is fundamentally wrong in the first place?

None of my rule make use of thread::sleep. No error in the log. CPU usage as expected (low).

Best regards

Good job of narrowing it down to the specific line that is causing problems. That is no small feat.

I’ve not seen the posting where others have had problems with MySQL stopping people’s Rules but I don’t use MySQL so may have skipped over it.

I would guess, and it’s only a guess, that the averageSince method call is taking a really long time to return or never returning. If this is the case, the Rule never exits and ties up one of the five available runtime threads. If this happens five times then there are no threads left and your Rules will stop running.

Actually, looking at my logs. 20hrs is excessive. It’s more in the vicinity of 8 or 9. Now this seems to tie up nicely with my mysql.cfg file:

# the database url like 'jdbc:mysql://<host>:<port>/<database>' (without quotes)
url=jdbc:mysql://127.0.0.1:3306/OpenHAB

# the database user
user=openhab

# the database password
password=xxx

# the reconnection counter
reconnectCnt=0

# the connection timeout (in seconds)
waitTimeout=28800

# Use MySQL Server time to store item values (=false) or use openHAB Server time (=true).
# For new installations, its recommend to set "localtime=true".
# (optional, defaults to false)
localtime=true

and then in particular:

# the connection timeout (in seconds)
waitTimeout=28800

This is equal to 8 hours. So why is this set to 28800? I have the number from a different post and it seemed to work. Lower values did cut out on me with an error in the openhab.log file. This value never did. Values from the lux sensor were still logged.

However, when I created the awning/rollershutter rule, this used mysql to extract data. I see a connection but am uncertain what it means. Hope it could help someone else getting stuck and using the same function from mysql as I do.

So far jdbc is still going…

I can’t say for a fact but I had issues with my rules completely stopping without warning at random intervals when I used mysql binding. I changed to jdbc pointing to same mysql DB and rules have worked flawlessly for about a year now. I did go through my rules at the same time and eradicated all but the shortest sleeps as I was chasing down thread locking such as Rich mentions. So, as I say, I can’t be certain which was the culprit.

8 hour waitTimeout? Holy cow. I can’t imagine any situation where it makes sense to wait that long. Maybe they thought it was milliseconds instead of seconds, but almost 29 seconds still seems like a really long time. I can’t imagine anything more than a few seconds being anything more than an error in any home automation context.

I’d be very interested to know of the JDBC binding works better for you. If it does I think there is enough evidence here to file a bug on the MySQL binding.

So far so good although sitemap didn’t really update icons but at least the rules attached to items still execute.

On the mysql tot jdbc part. Found that here in post 85.

On the part regarding that massive timeout figure, this and this topic refers…

Yup, that was my thread that lead to switching to jdbc from mysql after lots of red herrings. I hope it solves your issue!! I can’t tell you how frustrating it was. I actually was actively considering homeseer cause nobody could figure it out and the rules failing was causing major issues and costing money for my house.