[SOLVED] NGRE cron trigger freezes rule initialization

I’ve run into a recurring situation and can’t determine if the problem is what I’m doing or what the rule engine is doing. The following script causes a reproducible failure of the rule engine. When the script is saved, the first rule shows in the event logs as UNINITIALIZED and then moves on to INITIALIZING and then hangs there. This causes a couple of notable downstream effects:

  1. All other rule initialization to freezes. No other rule changes are registered although all previously initialized rules still function normally.
  2. The rule in question shows up in the PaperUI list (as initializing) but is now a zombie rule. It will remain there even after the rule is commented out of the original script and through OH restarts. The only way to get rid of it is to clear the cache.
from core.rules import rule
from core.triggers import when
import os
import personal.utils
reload(personal.utils)
from personal.utils import tSend


@rule("System Health Reboot", description="Reboot OH on a cron job to maintain stability of some features.", tags=["Server"])
@when("Time cron 0 0 3 ? * 1")
def system_health_reboot(event):
	tSend("System health reboot")
	system_health_reboot.log.info("System health reboot")
	os.system("sudo /bin/systemctl restart openhab2.service")


@rule("Netcap Reset", description="Reset the Netcap permissions after system update if required for dash binding.", tags=["Server"])
@when("Item Server_Run_Netcap changed to ON")
def netcap_reset(event):
	os.system("sudo /sbin/setcap cap_net_raw,cap_net_admin=eip $(/usr/bin/realpath /usr/bin/java)")
	tSend("Netcap permissions set.")

I’ve not been able to get any useful log data out of this, clearly I haven’t set the correct module to debug yet, though I have tried several.

I’ve narrowed down the culprit to the cron trigger. The script initializes and works as expected with an item trigger. In a moment of self doubt I even ran the cron expression through a tester and it parses properly.

I have other rules that run on cron triggers, for example @when(“Time cron 0 0 23 * * ?”), and initialize and run just fine. Have I messed up something here that I just don’t see, or does this look like an issue with the rule engine?

You have a rule that restarts the openHAB service? That sounds like a disaster waiting to happen! Do you have the same problem if you comment out one or the other of the rules? It seems like you are somehow killing the Jython interpreter. I have not tested you rule. What is in your personal.utils.tSend module?

core,triggers.when will validate the expression, so if the rule started, it is valid. Do the logs show it starting?

This issue is actually independent of what is inside the rule. But, this same rule has worked very well for years with DSL and I’m only now porting all my old rules over to jython. I have several things/bindings that just get out of sync or contact every so often and have simply found that a regular reboot eases the problems almost completely. I could, of course, just move the rule to crontab outside of OH, but, as I said, it’s been a rule for probably 3+ years and running flawlessly so I saw no reason not to port it over with my other rules. And the system call, I’m 99+% positive is not what’s causing the issue.

The issue is strictly with this one rule. If I comment out the second rule in the script, the issue still occurs. If I comment out any of the lines inside the rule, the issue still occurs. If I comment out the cron trigger and replace it with a different trigger the issue goes away. If the issue occurs then all new rule initializing is halted for this script or any other. All rules already in place still run just fine (including my other cron trigger rules).

Here’s an example from the logs. You can see the rule start to initialize and then no other activity occurs. 15 seconds later a different rule runs just fine.

2020-06-24 17:03:26.643 [thome.event.RuleAddedEvent] - Rule '5034600d-7c8f-476f-a5aa-aa9b178605ee' has been added.
2020-06-24 17:03:26.644 [.event.RuleStatusInfoEvent] - 5034600d-7c8f-476f-a5aa-aa9b178605ee updated: UNINITIALIZED
2020-06-24 17:03:26.644 [.event.RuleStatusInfoEvent] - 5034600d-7c8f-476f-a5aa-aa9b178605ee updated: INITIALIZING
2020-06-24 17:04:07.231 [vent.ItemStateChangedEvent] - ABW_Indoor_Obs changed from 2020-06-24T17:03:00.000-0700 to 2020-06-24T17:04:00.000-0700
2020-06-24 17:04:47.809 [vent.ItemStateChangedEvent] - Sensor_SF_MB_TriSenseLum changed from NULL to 10
2020-06-24 17:05:07.814 [vent.ItemStateChangedEvent] - ABW_Indoor_Obs changed from 2020-06-24T17:04:00.000-0700 to 2020-06-24T17:05:00.000-0700
2020-06-24 17:05:07.815 [vent.ItemStateChangedEvent] - ABW_Outdoor_Temp changed from 105.2 to 104.9
2020-06-24 17:06:07.078 [vent.ItemStateChangedEvent] - ABW_Indoor_Obs changed from 2020-06-24T17:05:00.000-0700 to 2020-06-24T17:06:00.000-0700
2020-06-24 17:06:07.079 [vent.ItemStateChangedEvent] - ABW_Outdoor_Temp changed from 104.9 to 105.2
2020-06-24 17:06:26.843 [me.event.ThingUpdatedEvent] - Thing 'zwave:device:5368c88b:node18' has been updated.
2020-06-24 17:06:29.107 [hingStatusInfoChangedEvent] - 'zwave:device:5368c88b:node18' changed from ONLINE: Node initialising: REQUEST_NIF to ONLINE
2020-06-24 17:06:32.416 [me.event.ThingUpdatedEvent] - Thing 'zwave:device:5368c88b:node18' has been updated.
2020-06-24 17:07:06.990 [vent.ItemStateChangedEvent] - ABW_Indoor_Obs changed from 2020-06-24T17:06:00.000-0700 to 2020-06-24T17:07:00.000-0700
2020-06-24 17:07:06.991 [vent.ItemStateChangedEvent] - ABW_Outdoor_Temp changed from 105.2 to 104.9
2020-06-24 17:07:21.861 [vent.ItemStateChangedEvent] - astro_moon_local_phase_agePercent changed from 12.63879200855104 % to 12.65058073186834 %
2020-06-24 17:07:21.868 [vent.ItemStateChangedEvent] - astro_moon_local_position_azimuth changed from 231.0884925488944 ° to 233.2235849388844 °
2020-06-24 17:07:31.152 [vent.ItemStateChangedEvent] - Wthr_H_Temp changed from 110.93 to 110.61
2020-06-24 17:07:31.153 [vent.ItemStateChangedEvent] - Wthr_H_TempMin_1 changed from 78.49 to 77.92
2020-06-24 17:07:31.153 [vent.ItemStateChangedEvent] - Wthr_H_TempMax_0 changed from 111.65 to 111.52
2020-06-24 17:07:50.426 [vent.ItemStateChangedEvent] - Sensor_SF_MB_TriSenseTemp changed from NULL to 83.4
2020-06-24 17:08:10.203 [vent.ItemStateChangedEvent] - ABW_Indoor_Obs changed from 2020-06-24T17:07:00.000-0700 to 2020-06-24T17:08:00.000-0700
2020-06-24 17:08:10.209 [vent.ItemStateChangedEvent] - ABW_Outdoor_Temp changed from 104.9 to 104.7
2020-06-24 17:08:30.232 [vent.ItemStateChangedEvent] - SE11400_CurrentYear changed from 10310.506 kWh to 10311.934 kWh
2020-06-24 17:08:30.234 [vent.ItemStateChangedEvent] - SE11400_CurrentDay changed from 33.732 kWh to 35.16 kWh
2020-06-24 17:08:30.238 [GroupItemStateChangedEvent] - SolarData changed from 8.317 to 8.295 through SE11400_CurrentPower
2020-06-24 17:08:30.246 [vent.ItemStateChangedEvent] - SE11400_CurrentPower changed from 8.317 kW to 8.295 kW
2020-06-24 17:08:30.248 [vent.ItemStateChangedEvent] - solaredge_generic_se11400_aggregate_month_production changed from 1668303.0 Wh to 1669731.0 Wh
2020-06-24 17:08:30.249 [vent.ItemStateChangedEvent] - SE11400_CurrentMonth changed from 1668.303 kWh to 1669.731 kWh
2020-06-24 17:09:07.576 [vent.ItemStateChangedEvent] - ABW_Indoor_Obs changed from 2020-06-24T17:08:00.000-0700 to 2020-06-24T17:09:00.000-0700
2020-06-24 17:09:07.583 [vent.ItemStateChangedEvent] - ABW_Indoor_Temp changed from 79.5 to 79.7
2020-06-24 17:09:07.587 [vent.ItemStateChangedEvent] - ABW_Outdoor_Temp changed from 104.7 to 104.9
2020-06-24 17:10:07.011 [vent.ItemStateChangedEvent] - ABW_Indoor_Obs changed from 2020-06-24T17:09:00.000-0700 to 2020-06-24T17:10:00.000-0700
2020-06-24 17:10:07.018 [vent.ItemStateChangedEvent] - ABW_Outdoor_Temp changed from 104.9 to 104.7
2020-06-24 17:11:07.247 [vent.ItemStateChangedEvent] - ABW_Indoor_Obs changed from 2020-06-24T17:10:00.000-0700 to 2020-06-24T17:11:00.000-0700
2020-06-24 17:12:06.972 [vent.ItemStateChangedEvent] - ABW_Indoor_Obs changed from 2020-06-24T17:11:00.000-0700 to 2020-06-24T17:12:00.000-0700
2020-06-24 17:12:06.972 [vent.ItemStateChangedEvent] - ABW_Outdoor_Temp changed from 104.7 to 104.5
2020-06-24 17:12:21.878 [vent.ItemStateChangedEvent] - astro_moon_local_phase_agePercent changed from 12.65058073186834 % to 12.662369848156182 %
2020-06-24 17:12:21.888 [vent.ItemStateChangedEvent] - astro_moon_local_position_azimuth changed from 233.2235849388844 ° to 235.23333062999183 °
2020-06-24 17:13:08.955 [vent.ItemStateChangedEvent] - ABW_Indoor_Obs changed from 2020-06-24T17:12:00.000-0700 to 2020-06-24T17:13:00.000-0700
2020-06-24 17:13:55.073 [vent.ItemStateChangedEvent] - Sensor_SF_MB_TriSenseTemp changed from 83.4 to 84.2
2020-06-24 17:14:04.048 [me.event.ThingUpdatedEvent] - Thing 'zwave:device:5368c88b:node24' has been updated.
2020-06-24 17:14:04.205 [hingStatusInfoChangedEvent] - 'zwave:device:5368c88b:node24' changed from ONLINE: Node initialising: REQUEST_NIF to ONLINE
2020-06-24 17:14:06.915 [vent.ItemStateChangedEvent] - ABW_Indoor_Obs changed from 2020-06-24T17:13:00.000-0700 to 2020-06-24T17:14:00.000-0700
2020-06-24 17:14:06.919 [vent.ItemStateChangedEvent] - ABW_Indoor_Hum changed from 44.0 to 45.0
2020-06-24 17:15:08.099 [vent.ItemStateChangedEvent] - ABW_Indoor_Obs changed from 2020-06-24T17:14:00.000-0700 to 2020-06-24T17:15:00.000-0700
2020-06-24 17:15:08.100 [vent.ItemStateChangedEvent] - ABW_Outdoor_Temp changed from 104.5 to 104.3
2020-06-24 17:16:07.169 [vent.ItemStateChangedEvent] - ABW_Indoor_Obs changed from 2020-06-24T17:15:00.000-0700 to 2020-06-24T17:16:00.000-0700
2020-06-24 17:16:07.170 [vent.ItemStateChangedEvent] - ABW_Indoor_Temp changed from 79.7 to 79.8
2020-06-24 17:16:58.206 [vent.ItemStateChangedEvent] - Sensor_SF_MB_TriSenseTemp changed from 84.2 to 84.5
2020-06-24 17:17:07.344 [vent.ItemStateChangedEvent] - ABW_Indoor_Obs changed from 2020-06-24T17:16:00.000-0700 to 2020-06-24T17:17:00.000-0700
2020-06-24 17:17:21.859 [vent.ItemStateChangedEvent] - astro_moon_local_phase_agePercent changed from 12.662369848156182 % to 12.674158885849916 %
2020-06-24 17:17:21.861 [vent.ItemStateChangedEvent] - astro_moon_local_position_azimuth changed from 235.23333062999183 ° to 237.12754625812957 °
2020-06-24 17:17:31.150 [vent.ItemStateChangedEvent] - Wthr_H_Temp changed from 110.61 to 110.37
2020-06-24 17:17:31.150 [vent.ItemStateChangedEvent] - Wthr_H_Cond_Icon_Map changed from clear-day to partly-cloudy-day
2020-06-24 17:17:31.151 [vent.ItemStateChangedEvent] - Wthr_H_Cond changed from Clear to Partly Cloudy
2020-06-24 17:17:31.158 [vent.ItemStateChangedEvent] - Wthr_H_Cond_Icon changed from clear-day to partly-cloudy-day
2020-06-24 17:18:07.173 [vent.ItemStateChangedEvent] - ABW_Indoor_Obs changed from 2020-06-24T17:17:00.000-0700 to 2020-06-24T17:18:00.000-0700
2020-06-24 17:18:55.081 [GroupItemStateChangedEvent] - Portals changed from CLOSED to OPEN through Sensor_MF_K_BackDoor
2020-06-24 17:18:55.082 [vent.ItemStateChangedEvent] - Sensor_MF_K_BackDoor changed from CLOSED to OPEN
2020-06-24 17:18:55.083 [.event.RuleStatusInfoEvent] - 11af0b24-c11a-47bf-a225-0b29276a6546 updated: RUNNING
2020-06-24 17:18:55.099 [.event.RuleStatusInfoEvent] - 11af0b24-c11a-47bf-a225-0b29276a6546 updated: IDLE

While the rule does show up in the PaperUI list (with the indicator that it is initializing) it, not surprisingly doesn’t yet exist in any other context so I can’t access it through the api.

tSend is just a simple wrapper to my telegram actions.

That’s just it. The rule never even gets a chance to start. It never finishes initializing; it never reports as idle; it just freezes at the initializing step. If the cron trigger for this rule is present, the process stops there. The second rule never even gets to the initializing step (and therefore, also ceases to exist) because the whole initialization process freezes at rule 1.

It sounds like the scheduler is choking on the Trigger. What if you use this instead?

@when("Time cron 0 0 3 ? * SUN")

I’ll do some digging later today.

1 Like

Yep, that seems to isolate the problem. SUN appears to work where 1 does not when parsing the cron expression.

With all the checking and testing I did, I’m a little embarrassed I didn’t do that one.

I can reproduce this and have created an issue in GH…

This is not an issue with Jython or the helper libraries.@hilbrand, I pinged you in the issue, but doing it again for good measure :wink:.

Thanks Scott.

1 Like

Yep, it’s a bug. Made possible by the world because it could not decide if Sunday is the first or last day of the week. I’ve created a fix, but it will be not be available until openHAB 3. Until then one should use SUN for Sunday instead of the number 1.

The problem also happens with cron expressions like second Sunday of the Month: 2#1 use: 2#SUN.

1 Like