Sorry, I forgot to mention that I know that article and I don’t have a single thread::sleep in these rules and there are rarely any timers at all. Whenever possible I use the expire binding.
All other rules are still working. Only the scheduled ones suddenly stop.
how many cron triggered rules do you run? How often do they trigger? Give us some use case ideas here… what are you doing (with cron triggered rules) Are they triggering every few moments or at some specific time of the day?
Final question: are these rules that have always worked up until now or are they new rules?
if you notice within 45 minutes that they are not working then I’m guess they fire often
I’ve already done that. They simply don’t trigger anymore.
9
Depends… from once a week up to every 60 seconds.
Heating
rule "heating hobby high"
when
Time cron "0 0 6 ? * MON,TUE,WED,THU,FRI *"
then
sendCommand(HobbyRoomThermostatTemp, 21)
end
Vacation Presence Simulation (only 3 items in the group)
var int presence_days = 7
var String persistence = "influxdb"
rule "PresenceSimulation"
when
Time cron "0 0/1 * * * ?" // Every Minute
then
if (VacationMode.state == ON) {
gPresenceSimulation.members.forEach(light,i |
if(light.historicState(now.minusDays(presence_days), persistence).state != light.state) {
logInfo("Pres_Sim",light.name + " switches to " + light.historicState(now.minusDays(presence_days), persistence).state)
sendTelegram("bot2", light.name + " switches to " + light.historicState(now.minusDays(presence_days), persistence).state)
light.sendCommand(light.historicState(now.minusDays(presence_days), persistence).state.toString)
} else {
logInfo("Pres_Sim",light.name + " is already " + light.historicState(now.minusDays(presence_days), persistence).state)
}
)
}
end
That’s a good question!
I noticed the problem when I implemented the last rule, the presence simulation.
It could be that the problem occurred before, but now it is noticed because I need the rule. We are on vacation at the moment and the lights are off
(ref rule logs) The general idea is to check that every preceding call of a rule has an exit. We know you don’t get triggers, but this is looking for rules that hang up before the point of disaster, consuming threads.
It can get terribly confusing to inspect though, especially if you may have multiple starts of a given rule. Although we wouldn’t expect multiple starts of a rule triggered only on cron.
It is possible to add some smarts using global variables as counters, if you have suspects.
var number myRuleStarts = 0
rule "my rule"
...
then
myRuleStarts = myRuleStarts +1
if (myRuleStarts >1) { logInfo("myrule", "WARNING - multiple starts of my rule " + myRuleStarts.toString) }
...
// real rule body
...
myRuleStarts = myRuleStarts - 1
logInfo("my rule", "Exit with code " + myRuleStarts.toString)
end
Exec and HTTP actions included in rules are prime suspects for hangups. Got any?
I’m eyeing the sendTelegram action in the shown rule with suspicion at the moment. That should I’m sure have a default timeout, but I can’t find any reference for that. I think you are still using v1.x action, not v2.x binding? You may have run into a snag there.
If you are confident this is not the area, then the next likely is a binding with scheduled activities (that’s most of them!). At the moment you’re the only complainant, so it would make sense to look at more unusual bindings or the way you have configured them.
Bit mysterious. I’m not surprised there is an exec involved, but you seem to have a correct timeout and no reason to hang up.
Thinks: some other process gobbling up schedule threads could stop the timeout working? Surely not - if there were no threads we could not start the rule in the first place.
If you can verify with a simple rule like the above that the timeout is indeed being ignored please file an issue on openHAB-core (I think that’s where executeCommandLine lives). That would definitely be a bug that needs to be fixed. Though I suspect a fix won’t occur until OH 3.
In the mean time, assuming that it is indeed this rule that is consuming the threads (you have pretty convincing evidence in my opinion) I would add logging (i.e. echo commands) to the airctrl.sh script and log out what executeCommandLine returns so you can figure out which line in the script it’s hanging on and apply a timeout to that command in the script itself. See Put a Timer on Running Commands With Timeout Command in Linux.
It’s good practice to do this but ultimately it’s a workaround and executeCommandLine should be fixed to use the passed in timeout.
If I read the cron expression correctly, this means that the command line is taking longer than five minutes to complete and based on the behavior, it might just be hanging forever.
Yep, if some other process were gobbling up the threads the problem would appear starting up the thread in the first place. executeCommandLine with a timeout argument is blocking so it doesn’t consume a separate thread by itself. It’s running in the same thread as the rule that called it.
Just for info … I experimented in OH 2.5.0 (Windows environment), and cannot fault the Action.
Using executeCommandLine with a wget and duff URL, so that it loops retrying -
Reliably interrupts the script at the given integer timeout mS.
It works with or without retrieving the results e.g. var results = executeCommandLine(...
If I fix up the wget to work, it returns when complete before the given timeout expires.
All seems well for me? Is it possible your script is somehow special, un-interruptable, because it is in a shell?
There was a relevant-to-bash looking change in Jan 2020, I don’t know what OH version that would be released in.