Sometimes rules don't get executed?

Thanks for your help! I was able to get rid of the errors when it runs, by removing the System started trigger from the my other rules, where that error was referring to.
I have added a switch to it, if I trigger it with it, it runs well, without errors and calculates the right time of day. I will see if the removed ‘System started’ trigger makes sense…

With the removed System started trigger, you will have to wait until the next scheduled time trigger for vTimeOfDay to be updated when OH restarts.

If the errors only occur during the System started trigger, then the errors are almost certainly caused by the previously mentioned bug in OH.

It seems that removing the System started trigger did the trick. Now it is working as normal…

Unfortunately I’m back again with this problem. However it is now clear that this rule is not only not executed triggered by astro events, but by cron events also. Sometimes it works as it should sometimes not. Now I’m on 2.4 M4 (the latest Milestone build) and I have experienced not only this, but (which was always there in my setup):

  • Randomly one or more rules don’t get executed. Moreover not only don’t get executed but the event won’t trigger my rule. I have some rule which has a logInfo in the beginning of the rule and when an event should trigger the rule I can only see in the events.log that the corresponding trigger Item state changed, but not the log which the rule should produce
  • I have increased the execution threads to 7 so now I always have at least 3-4 thread which is available.
  • When these ‘rules don’t get executed’ errors happen I can’t see any change on hardware side (CPU load is around 5-10%, memory is not full, everything works as before).
  • Moreover I even don’t have to restart openHab or RPi, just wait a couple of hours and everything is back to normal. The logs also don’t produce any significant error or any unfamiliar thing for me.

Both Astro and cron triggered Rules are triggered from a separate thread pool.

Do you know for sure this is the case? See the post I replied with above, particularly post 12. Look at the quartz thread pool to see if you are using those up. You cannot have more than two cron triggered Rules run at the same time.

Also, do you see “Scheduled Astro event-jobs for thing astro:sun:home” for each of your Astro Things? If not then the events are not even being scheduled which points to a problem with the Astro binding.

Look to see if all your RuleEngine threads are running too.

That might be one problem. I have another rule which is triggered by cron exactly at 11pm. I will change that and see if it helps

Dear @rlkoshak!

I have tried simplyfing my rules and removing any unnecessary cron triggers from rules. Now my system mostly depends on the Time Of Day rule if the rule is based on day-states.

However I still have the original problem. My rule which turns on and off the LEDs (I have the same rule as in the original post), sometimes don’t get executed. I have made sure that there is enough rule execution thread is available all-time, so it might be some other problem. Funny though that I can see the log produced by the rule along all day, but not at night… (where this rule really works…)

That rule triggers on changed i.e. OFF to ON. It won’t trigger if you get repeated updates of ON if it is already ON. Confident you are actually getting changes logged for the Item?

Yes. I can see in the log that ‘Zone4 changed from OFF to ON’. If the rule works I can immediately see the log. If not, the ‘Zone4 changed from OFF to ON’ still there, but there are no other logs.

Okay, its a simple trigger and you’re confident about trigger event actually happening.
Nothing weird about Item naming.
First line is a logInfo. All good stuff so far.

Suggestion: change title of your rule (non-obviously, these must be system-wide unique) in case there is a duplicate cached.
Suggestion: add another rule triggered by Item update to ON and see how that behaves

Thanks! I have tried it. Renamed my rule and add another rule (to the same file, maybe try adding it to a new file?)
However I can’t see neither of the two rule running (no logging). Plus now I can’t even see any of my rule producing logs…
One thing to add (I don’t know when it happens or it has anything related to it…) I have these switches in a group (OR) , and that group is displayed in a sitemap… However sometimes the group label says that it is OFF, but there are items in it which is ON…

Edit2: for example for the Time Of Day rule (day_state) I have a dummy switch for debugging which can trigger this rule. If I press that switch the rule gets triggered (I can see it in the logs), but not other rules running now… So it seems that the triggers which has to happen ‘automatically’ are not triggering the rules.

What are you using to trigger these other Rules? If you are triggering them using changed but you manually trigger the ToD Rule then those Rules won’t trigger because day_state isn’t changing. Make sure that the Rules are using the appropriate trigger.

Over all I don’t have any ideas. If you are not out of Rule threads and the Item is changing states then the Rule should trigger.

I have around 40 rules, which is triggered by mostly Item changes, so it should work at any time…
Any good method to track available/used rule execution threads?

I think you miss my point. Rules triggered by changes will only trigger when the Item changes state.

If you trigger your time of day Rule by hand, the day_state will not change. Thus those Rules that trigger on changes to day_state will not trigger.

See post 12 by 5iver in the link I posted above.

I see your point, but for eg.: mobile phone’s location changes perodically and I have some rules which is triggered by that change. Sometimes all of these rules stops working, since it has logging and it won’t update the info…
Secondly I have rules which are triggered for eg.: light switches. They also stop working at these times…

Okay, that’s clear and we’re wasting time concentrating on this one rule.

How-to monitor thread usage is here somewhere

Yes and sorry for that, I’m also just found that this happens… hard to spot these errors. So for me (not sure) seems that sometimes all rules stops working and after an hour or so (not sure…) all rules starts to work again… no reboot or restart is required to work again.
Thanks, I’ll try to set-up thread monitor and maybe that helps us to investigate this error.

Oh yes. “Everything stopping” is not at all as clear-cut as it sounds, in an event driven system !

You might add a little rule that logs at ten-minute intervals or suchlike, to try and see what events lead up to seizure.

Yes this is what is strange to me also. No errors or any other sign of problem, just like the ruleEngine stops and everything else works as expected (bindings, etc…)

I will make a rule and try to monitor the threads

Thanks!

So I had a little time to play around with this. I have been watching the ruleEngine with this command:

shell:threads --list |grep "RuleEngine" |wc -l

If I’m right it should return how many threads are used by the rule engine, right? It is really unpredictable. Sometimes none of my rules are running and it returns 6 (which is 5 running threads if you distract the 1 from this caused by this call).

I have also set-up a test rule which runs every 10 minutes. For example, now the status is the following:

  • The test rule is working, every 10 minute, it produces some log.
  • The command above returns only 1 thread is used.
  • My other rules don’t want to work…

After 5 mins:

  • It seems that somewhere the execution is stuck, because all of the rules are executed together (around 100 log entry in my events.log that my rules ran around 20 times) which should ran in the previous 10 minutes. So it seems that it “knew” that the rule should run, but unable to execute them and now executes together.

For example the rule which I inserted in the first post, ran together, which should ran when it detects motion in one of my room:

2018-10-13 20:03:26.006 [INFO ] [smarthome.model.script.wifiled.rules] - Night Light rule started!

2018-10-13 20:03:26.379 [INFO ] [smarthome.model.script.wifiled.rules] - Night Light rule started!

2018-10-13 20:03:26.480 [INFO ] [smarthome.model.script.wifiled.rules] - Night Light rule started!

2018-10-13 20:03:26.490 [INFO ] [smarthome.model.script.wifiled.rules] - Night Light rule started!

2018-10-13 20:03:26.492 [INFO ] [smarthome.model.script.wifiled.rules] - Night Light rule started!

2018-10-13 20:03:26.566 [INFO ] [smarthome.model.script.wifiled.rules] - Night Light rule started!

2018-10-13 20:03:26.888 [INFO ] [smarthome.model.script.wifiled.rules] - Night Light rule started!

2018-10-13 20:03:26.931 [INFO ] [smarthome.model.script.wifiled.rules] - Night Light rule started!

2018-10-13 20:03:26.954 [INFO ] [smarthome.model.script.wifiled.rules] - Night Light rule started!

2018-10-13 20:03:30.285 [INFO ] [smarthome.model.script.wifiled.rules] - Night Light rule started!

2018-10-13 20:03:30.402 [INFO ] [smarthome.model.script.wifiled.rules] - Night Light rule started!

…and many more…

After this “catching up” everything works as it should… and it happens after a random period…

Hope it is understandable what I’m trying to say.