Cron triggered rules are executed twice

rossko57 · September 13, 2019, 7:08pm

I think you should raise a Github issue for this, as it is repeatable. Need some expert advice on how/what to trace.

juelicher · September 13, 2019, 7:33pm

I would open an issue, but I am still not sure in which repository. The bug occurs only with Jython rules, but after a quick lock in the Jython library sources I am quite sure, that it is not in these sources.

Any advice would be appreciated!

5iver · September 13, 2019, 7:34pm

It would be an issue for openhab-core, but if you’ll wait, I’ll take a look. I have not been able to reproduce this. I am wrapping up some major releases and should be done by this weekend.

juelicher · September 13, 2019, 9:56pm

Thanks for your effort! If you need further details please let me know, especially because not all of my cron triggers are showing this behavior.

As a workaround I have written a small function, that checks the time since the rule was triggered last, so that is nor very pressing for me.

Andrew_Rowe · September 14, 2019, 3:41am

Thanks Scott for all you contribute!

hilbrand · September 14, 2019, 8:32am

DSL uses quartz, jython uses new rule engine Cron trigger. Based on your examples it looks like the step part in the Cron expression could have a problem. Because all the ones with the step have the problem, the other not. I did some tests and from that it looks the calculation of the Cron expression is correct. Also for the steps. This seems also logical given the behavior. Because how it works is. When a Cron task is finished the next iteration is scheduled. It does this by calculating in how many seconds to schedule the next call. For some reason either this calculation goes wrong and it’s scheduled to early or the scheduler itself does something wrong. So because of this the task is run to earlier. But then it calculates the next task and that will be the time it should have actually run. Which results in the task run twice. The core schedule part doesn’t contain much logging options, so this doesn’t give us any options there. It could be the number of threads available for the scheduler is to low in this case. But this is just a wild guess. So one option might be to increase the number of threads. I don’t know out of my head where to configure this, but it might be worth a try. I’ll also will have a second look at the scheduler code itself, to see if I can find anything suspicious.

5iver · September 14, 2019, 3:02pm

Is there a possibility that your system clock is being updated? What hardware are you using?

juelicher · September 14, 2019, 4:42pm

I am using a PC Engines APU4 Board with the AMD GX-412TC CPU and Debian 10, with all updates installed. The system has been rebootet several times after I first noticed the bug

The system time is synced via a ntp server. I thought about that, but it seems very unlikly, that this causes the same step of several seconds every day at the same time.

The error occured after updating from 2.4 release to 2.5M2, bevore that it never occured.

BigRedBrent · September 30, 2019, 4:22am

I moved my code over to Jython and have this problem as well. I have rules set to run once a day at midnight, once a week on Monday at midnight, and once a month on the first at midnight.

They are all triggering roughly 15 to 30 seconds before midnight and then triggering again at midnight.

I have created a script to ignore the first one if it is not more than 12 hours away from midnight, and this has allowed me to use the correct one. But it is very odd that this is needed when it was not before I moved away from DSL to Jython.

from datetime import datetime

def MillisecondsUntilMidnight():
    n = datetime.now()
    return 86400000 - ((n.hour * 3600000) + (n.minute * 60000) + (n.second * 1000) + (n.microsecond / 1000))

@rule("Daily Cron")
@when("Time cron 0 0 0 ? * * *")
def DailyCron(event):
    if MillisecondsUntilMidnight() < 43200000: return
    # do something here

I am running it on a QNAP TS-251+ with 8GB RAM:

I am using openhab 2.4.

I very much like coding in Jython though. It is so very much better than coding in Xtend.

rossko57 · October 1, 2019, 10:32am

That does seem to be another confirmation that something is up with NGRE scheduler.

It might be helpful if you can narrow down more precisely the range of the “earliness” trigger. @juelicher is getting a fairly consistent 5 seconds - if you get something of a different order, it could be a clue.

From what @hilbrand says, the new scheduler is completely different, which I did not realize.

BigRedBrent · October 1, 2019, 11:26am

I have been logging the times when they execute, so that shouldn’t be an issue for me to get the times. The earliness does not seem to be consistent for me though. Although I suspect the longer interval schedules may trigger earlier then those with smaller intervals. But I’ll have to look at them again to be sure.

When I get around to it, I’ll post the times that they have been triggering.

BigRedBrent · October 2, 2019, 3:02am

Here are the numbers that I have gathered so far.

Daily: Time cron 0 0 0 ? * * *
2019-09-29 23:59:43.547
2019-09-30 00:00:00.017
2019-09-30 23:59:24.621
2019-10-01 00:00:00.014
2019-10-01 23:59:43.731
2019-10-02 00:00:00.011
2019-10-02 23:59:24.621
2019-10-03 00:00:00.109

Weekly: Time cron 0 0 0 ? * MON *
2019-09-29 23:59:43.551
2019-09-30 00:00:00.015

Monthly: Time cron 0 0 0 1 * ?
2019-09-30 23:59:08.145
2019-10-01 00:00:00.020

That is 16, 35, 16 and 16 seconds early for daily, 16 seconds early for weekly, and over 51 seconds early for monthly. The time stamps I am taking look like they are accurate within 200 milliseconds.

It never seems to not run early, at least with the three triggers I have set up.

rossko57 · October 2, 2019, 3:30pm

I’d recommend opening a github issue, you’ve got clear data and it seems repeatable.

pinkyf · July 5, 2020, 1:19pm

Hello,
are there any news meanwhile?

I’m running version 2.5.0 and get occasionally triggers executed twice as well. An example ist here, the trigger was created during system start and I did not edit any rule later:

/var/log/openhab2/openhab.log.4:2020-06-28 12:16:20.433 [DEBUG] [jsr223.jython.core.triggers         ] - when: Created cron_trigger: [Time_cron_0_5_9_693df540b92811ea9fc3029389483b0b]
/var/log/openhab2/openhab.log.4:2020-06-28 12:16:20.453 [DEBUG] [le.handler.GenericCronTriggerHandler] - Scheduled cron job '0 5 9 * * ? *' for trigger 'Time_cron_0_5_9_693df540b92811ea9fc3029389483b0b_6946f5f0b92811eaa808029389483b0b'.
/var/log/openhab2/openhab.log.4:2020-06-29 09:05:00.056 [DEBUG] [e.automation.internal.RuleEngineImpl] - The trigger 'Time_cron_0_5_9_693df540b92811ea9fc3029389483b0b_6946f5f0b92811eaa808029389483b0b' of rule '82a3f1b3-1b08-4fd5-8600-7fa4bb5eae44' is triggered.
/var/log/openhab2/openhab.log.4:2020-06-30 09:04:59.631 [DEBUG] [e.automation.internal.RuleEngineImpl] - The trigger 'Time_cron_0_5_9_693df540b92811ea9fc3029389483b0b_6946f5f0b92811eaa808029389483b0b' of rule '82a3f1b3-1b08-4fd5-8600-7fa4bb5eae44' is triggered.
/var/log/openhab2/openhab.log.4:2020-06-30 09:05:00.656 [DEBUG] [e.automation.internal.RuleEngineImpl] - The trigger 'Time_cron_0_5_9_693df540b92811ea9fc3029389483b0b_6946f5f0b92811eaa808029389483b0b' of rule '82a3f1b3-1b08-4fd5-8600-7fa4bb5eae44' is triggered.
/var/log/openhab2/openhab.log.4:2020-07-01 09:04:59.789 [DEBUG] [e.automation.internal.RuleEngineImpl] - The trigger 'Time_cron_0_5_9_693df540b92811ea9fc3029389483b0b_6946f5f0b92811eaa808029389483b0b' of rule '82a3f1b3-1b08-4fd5-8600-7fa4bb5eae44' is triggered.
/var/log/openhab2/openhab.log.4:2020-07-01 09:05:00.802 [DEBUG] [e.automation.internal.RuleEngineImpl] - The trigger 'Time_cron_0_5_9_693df540b92811ea9fc3029389483b0b_6946f5f0b92811eaa808029389483b0b' of rule '82a3f1b3-1b08-4fd5-8600-7fa4bb5eae44' is triggered.
/var/log/openhab2/openhab.log:2020-07-02 09:05:00.384 [DEBUG] [e.automation.internal.RuleEngineImpl] - The trigger 'Time_cron_0_5_9_693df540b92811ea9fc3029389483b0b_6946f5f0b92811eaa808029389483b0b' of rule '82a3f1b3-1b08-4fd5-8600-7fa4bb5eae44' is triggered.
/var/log/openhab2/openhab.log:2020-07-03 09:04:59.963 [DEBUG] [e.automation.internal.RuleEngineImpl] - The trigger 'Time_cron_0_5_9_693df540b92811ea9fc3029389483b0b_6946f5f0b92811eaa808029389483b0b' of rule '82a3f1b3-1b08-4fd5-8600-7fa4bb5eae44' is triggered.
/var/log/openhab2/openhab.log:2020-07-03 09:05:00.972 [DEBUG] [e.automation.internal.RuleEngineImpl] - The trigger 'Time_cron_0_5_9_693df540b92811ea9fc3029389483b0b_6946f5f0b92811eaa808029389483b0b' of rule '82a3f1b3-1b08-4fd5-8600-7fa4bb5eae44' is triggered.
/var/log/openhab2/openhab.log:2020-07-04 09:05:00.364 [DEBUG] [e.automation.internal.RuleEngineImpl] - The trigger 'Time_cron_0_5_9_693df540b92811ea9fc3029389483b0b_6946f5f0b92811eaa808029389483b0b' of rule '82a3f1b3-1b08-4fd5-8600-7fa4bb5eae44' is triggered.
/var/log/openhab2/openhab.log:2020-07-05 09:04:59.739 [DEBUG] [e.automation.internal.RuleEngineImpl] - The trigger 'Time_cron_0_5_9_693df540b92811ea9fc3029389483b0b_6946f5f0b92811eaa808029389483b0b' of rule '82a3f1b3-1b08-4fd5-8600-7fa4bb5eae44' is triggered.
/var/log/openhab2/openhab.log:2020-07-05 09:05:00.751 [DEBUG] [e.automation.internal.RuleEngineImpl] - The trigger 'Time_cron_0_5_9_693df540b92811ea9fc3029389483b0b_6946f5f0b92811eaa808029389483b0b' of rule '82a3f1b3-1b08-4fd5-8600-7fa4bb5eae44' is triggered.

Kind regards,
Frank

5iver · July 5, 2020, 2:14pm

I’m not aware of anything and I have never been able to reproduce this. I’m curious to see the ntpd log of someone experiencing this in order to see how often and how much the system clock is being updated, and compare that with the times that the rule was triggered in the OH log. I have not tested while changing the system clock.

I’m also curious what your other cron triggers are and how many there are.

rossko57 · July 5, 2020, 2:45pm

The oddity that stands out with all these reports is the premature triggers. Just how premature seems to vary by host.

pinkyf · July 5, 2020, 4:40pm

Hello Scott, thanks for your response.

I have a ntpd running, but it runs synchronized and there are no time adjustement steps.

Here you see my list cron triggers. (BTW: Is there a console command to retrieve them from a running system?)
/var/log/openhab2/openhab.log.4:2020-06-28 12:06:29.498 [DEBUG] [le.handler.GenericCronTriggerHandler] - Scheduled cron job ‘0 5 7-18/4 * * ? *’ for trigger ‘Time_cron_0_5_7-18_4_090a6240b92711ea8eb /var/log/openhab2/openhab.log.4:2020-06-28 12:06:29.499 [DEBUG] [le.handler.GenericCronTriggerHandler] - Scheduled cron job ‘20 5 0 * * ? *’ for trigger ‘Time_cron_20_5_0_0909ed0fb92711ea89230293894 /var/log/openhab2/openhab.log.4:2020-06-28 12:16:17.315 [DEBUG] [le.handler.GenericCronTriggerHandler] - Scheduled cron job ‘0 30 8 * * ? *’ for trigger ‘Time_cron_0_30_8_67625770b92811eaad050293894 /var/log/openhab2/openhab.log.4:2020-06-28 12:16:17.499 [DEBUG] [le.handler.GenericCronTriggerHandler] - Scheduled cron job ‘0 5 0 * * ? *’ for trigger ‘Time_cron_0_5_0_677b0f8fb92811eab40a029389483 /var/log/openhab2/openhab.log.4:2020-06-28 12:16:17.971 [DEBUG] [le.handler.GenericCronTriggerHandler] - Scheduled cron job ‘30 0 12 * * ? *’ for trigger ‘Time_cron_30_0_12_67ca680fb92811eab7e402938 /var/log/openhab2/openhab.log.4:2020-06-28 12:16:17.973 [DEBUG] [le.handler.GenericCronTriggerHandler] - Scheduled cron job ‘30 0 0 * * ? *’ for trigger ‘Time_cron_30_0_0_67c9f2e1b92811ea8c260293894 /var/log/openhab2/openhab.log.4:2020-06-28 12:16:18.196 [DEBUG] [le.handler.GenericCronTriggerHandler] - Scheduled cron job ‘0 0 7 * * ? *’ for trigger ‘Time_cron_0_0_7_67ef070fb92811ea9672029389483 /var/log/openhab2/openhab.log.4:2020-06-28 12:16:18.197 [DEBUG] [le.handler.GenericCronTriggerHandler] - Scheduled cron job ‘0 15 8 * * ? *’ for trigger ‘Time_cron_0_15_8_67ee91e1b92811eabc1f0293894 /var/log/openhab2/openhab.log.4:2020-06-28 12:16:18.341 [DEBUG] [le.handler.GenericCronTriggerHandler] - Scheduled cron job ‘0 15 21 * * ? *’ for trigger ‘Time_cron_0_15_21_68052721b92811ea8ee802938 /var/log/openhab2/openhab.log.4:2020-06-28 12:16:19.234 [DEBUG] [le.handler.GenericCronTriggerHandler] - Scheduled cron job ‘0 20 22 * 1-4,10-12 ? *’ for trigger ‘Time_cron_0_20_22_1-4_10-12_688b470 /var/log/openhab2/openhab.log.4:2020-06-28 12:16:19.242 [DEBUG] [le.handler.GenericCronTriggerHandler] - Scheduled cron job ‘0 30 23 * 5-9 ? *’ for trigger ‘Time_cron_0_30_23_5-9_688af8eeb92811eab30 /var/log/openhab2/openhab.log.4:2020-06-28 12:16:19.580 [DEBUG] [le.handler.GenericCronTriggerHandler] - Scheduled cron job ‘0 30 8 * * ? *’ for trigger ‘Time_cron_0_30_8_68c2f8deb92811eaba970293894 /var/log/openhab2/openhab.log.4:2020-06-28 12:16:20.325 [DEBUG] [le.handler.GenericCronTriggerHandler] - Scheduled cron job ‘0 5 7-18/4 * * ? *’ for trigger ‘Time_cron_0_5_7-18_4_6933bc11b92811ea80f /var/log/openhab2/openhab.log.4:2020-06-28 12:16:20.327 [DEBUG] [le.handler.GenericCronTriggerHandler] - Scheduled cron job ‘20 5 0 * * ? *’ for trigger ‘Time_cron_20_5_0_693346e1b92811ea97b30293894 /var/log/openhab2/openhab.log.4:2020-06-28 12:16:20.448 [DEBUG] [le.handler.GenericCronTriggerHandler] - Scheduled cron job ‘0 45 18 * * ? *’ for trigger ‘Time_cron_0_45_18_69445de1b92811eab6ea02938 /var/log/openhab2/openhab.log.4:2020-06-28 12:16:20.449 [DEBUG] [le.handler.GenericCronTriggerHandler] - Scheduled cron job ‘0 00 16 * * ? *’ for trigger ‘Time_cron_0_00_16_6943c1a1b92811eaa02a02938 /var/log/openhab2/openhab.log.4:2020-06-28 12:16:20.450 [DEBUG] [le.handler.GenericCronTriggerHandler] - Scheduled cron job ‘0 30 13 * * ? *’ for trigger ‘Time_cron_0_30_13_6942620fb92811ea909a02938 /var/log/openhab2/openhab.log.4:2020-06-28 12:16:20.452 [DEBUG] [le.handler.GenericCronTriggerHandler] - Scheduled cron job ‘0 15 11 * * ? *’ for trigger ‘Time_cron_0_15_11_6941ece1b92811ea9c6f02938 /var/log/openhab2/openhab.log.4:2020-06-28 12:16:20.453 [DEBUG] [le.handler.GenericCronTriggerHandler] - Scheduled cron job ‘0 5 9 * * ? *’ for trigger ‘Time_cron_0_5_9_693df540b92811ea9fc3029389483 9029389483b0b_090a894fb92711eaad13029389483b0b’.
83b0b_090ab05eb92711ea9196029389483b0b’.
83b0b_67627e80b92811ea9391029389483b0b’.
b0b_677ee01eb92811eaa0fb029389483b0b’.
9483b0b_67ca8f1eb92811eaab6c029389483b0b’.
83b0b_67cb2b5eb92811ea9040029389483b0b’.
b0b_67f03f8fb92811ea8307029389483b0b’.
83b0b_67f08db0b92811ea926c029389483b0b’.
9483b0b_68052722b92811ea899c029389483b0b’.
fb92811eaa445029389483b0b_688b6e1eb92811ea967e029389483b0b’.
c029389483b0b_688be34fb92811ea812d029389483b0b’.
83b0b_68c2f8dfb92811eab6ed029389483b0b’.
1029389483b0b_6933e31eb92811ea9f66029389483b0b’.
83b0b_69343140b92811eab7bb029389483b0b’.
9483b0b_69445de2b92811eab32e029389483b0b’.
9483b0b_6945212eb92811ea8c38029389483b0b’.
9483b0b_6945965eb92811ea8741029389483b0b’.
9483b0b_694659b0b92811eab927029389483b0b’.
b0b_6946f5f0b92811eaa808029389483b0b’.

May be I should write a report script to detect such duplicate timer executions. The mentioned cases were immediately visible, because the rule fails in such cases.

5iver · July 5, 2020, 10:07pm

Please use code fences to make things readable.

Are you saying that you have ntpd configured to use localhost as the time source(!)… or that you looked into the log and there were no adjustments to the system time?

No, but you can get them with a script …

from core.log import logging, LOG_PREFIX
LOG = logging.getLogger("{}.TEST".format(LOG_PREFIX))

for rule in rules.getAll():
    cron_triggers = [trigger.configuration.properties["cronExpression"] for trigger in rule.triggers if trigger.configuration.containsKey("cronExpression")]
    if cron_triggers != []:
        LOG.warn("rule: '{}', cron triggers: \n{}".format(rule.name, "\n".join(cron_triggers)))

cweitkamp · July 6, 2020, 9:43am

Hi all,

I think I observed a similar problem in my environment. Some of my rules are triggered twice. I tried to wrap-up my last steps I did before and I have the slight feeling that some triggers / rules were not removed when I modified my script and it could not be parsed because of errors.

In general I see the following log if parsing the script was successful:

2020-07-06 11:29:12.458 [INFO ] [ort.shared.ScriptedAutomationManager] - removeAll added handlers
2020-07-06 11:29:12.564 [INFO ] [me.core.service.AbstractWatchService] - Loading script 'python/personal/kodi.py'
2020-07-06 11:29:23.782 [INFO ] [ort.shared.ScriptedAutomationManager] - removeAll added handlers
2020-07-06 11:29:23.843 [INFO ] [me.core.service.AbstractWatchService] - Loading script 'python/personal/kodi.py'

Whenever the parsing failed I saw - ignore the error itself - if got it sorted:

2020-07-06 11:25:16.540 [INFO ] [me.core.service.AbstractWatchService] - Loading script 'python/personal/kodi.py'
2020-07-06 11:25:25.575 [ERROR] [ipt.internal.ScriptEngineManagerImpl] - Error during evaluation of script 'file:/etc/openhab2/automation/jsr223/python/personal/kodi.py': TypeError: Error when calling the metaclass bases
    cannot create 'NoneType' instances in <script> at line number 3
2020-07-06 11:25:26.593 [INFO ] [me.core.service.AbstractWatchService] - Loading script 'python/personal/kodi.py'
2020-07-06 11:25:35.588 [ERROR] [ipt.internal.ScriptEngineManagerImpl] - Error during evaluation of script 'file:/etc/openhab2/automation/jsr223/python/personal/kodi.py': TypeError: Error when calling the metaclass bases
    cannot create 'NoneType' instances in <script> at line number 3
2020-07-06 11:26:40.914 [INFO ] [me.core.service.AbstractWatchService] - Loading script 'python/personal/kodi.py'
2020-07-06 11:26:52.048 [INFO ] [ort.shared.ScriptedAutomationManager] - removeAll added handlers
2020-07-06 11:26:52.075 [INFO ] [me.core.service.AbstractWatchService] - Loading script 'python/personal/kodi.py'

If you have a closer look at it you will see that in the first example two time "removeAll added handlers" was called whereas the second example containes only one "removeAll added handlers". Maybe a hint where we have to start digging.

5iver · July 6, 2020, 10:12am

I’m not sure this issue is related, since it is reportedly occurring after a fresh startup…

Depending on what you are doing, the scriptLoaded and scriptUnloaded functions may help, but errors in scripts can definitely cause problems. The worst case is when a script hangs the Jython interpretter, requiring an OH restart.