Rules stop executing after a while

Well it took me forever as well. I have just changed crashplan to verify every 7 days at 15:00, so if I have no problems until then, that will validate that this is the culprit.

If this is the case, then it is worth noting that if iowait goes above 40% (at least on my system), rules stop triggering.

Thanks for the help.

OK,

2 days now and no issues. I have disabled crashplan altogether. It seems that this was the culprit.

Thanks to all for the help.

Nicholas - I am having a similar problem with rules that just stop working. How do I disable crashplan?

Depends on your OS, I’m using Ubuntu 14.04, so I use:

sudo service crashplan stop

But it depends on your OS version as to whether it uses upstart or not to start services.
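For anyone unsure which form applies, here is a rough Python sketch of the choice. It assumes the service is registered under the name crashplan (as in the command above) and that a systemd machine has the /run/systemd/system directory. The systemctl form is the usual systemd equivalent and was not tested in this thread; the function names are made up for illustration.

```python
import os


def stop_command(service, systemd):
    """Return the command (as an argument list) that stops a service."""
    if systemd:
        # Usual systemd form; the unit name is assumed to match the service name.
        return ["sudo", "systemctl", "stop", service]
    # upstart / SysV-style systems such as Ubuntu 14.04
    return ["sudo", "service", service, "stop"]


def uses_systemd():
    """Guess the init system: systemd creates this directory at boot."""
    return os.path.isdir("/run/systemd/system")


# Show the command that would be used on this machine.
print(" ".join(stop_command("crashplan", uses_systemd())))
```

On Ubuntu 14.04 this should print the same command as above.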

Nick Waterton P.Eng.

National Support Leader - Nuclear Medicine, PET and RP

GE Healthcare

M+1 416 859 8545

F +1 905 248 3089

E nick.waterton@med.ge.com

2300 Meadowvale Boulevard

Mississauga, ON L5N 5P9 Canada

GE imagination at work

Thank you for the quick reply. Crashplan is not running on my Pi. Time to keep looking :frowning:

Crashplan has ended their home plans so it is unlikely that this would be causing your problems.

Have you reviewed “Why have my Rules stopped running? Why Thread::sleep is a bad idea”? I know you are using a bunch of locks and long-running commands (sendHttp*Request), which can also cause problems.

Just to explain further what I found.

It’s not so much crashplan running that is the issue, it’s when there are a lot of changes in the filesystem. Crashplan scans the filesystem for changes every so often. If there are a lot of changes, then your iowait starts to climb. If the iowait goes over 40% or so, then OH2 rules stop executing.
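If you want to check the iowait figure on your own system, the following Python sketch computes it from two samples of /proc/stat (Linux only). It assumes the standard field order on the aggregate cpu line (user, nice, system, idle, iowait, irq, softirq, steal, ...). The 40% threshold is only what I saw on my system, and the function names are made up for illustration.

```python
import time


def read_cpu_times(path="/proc/stat"):
    """Return the counters from the aggregate 'cpu' line of /proc/stat."""
    with open(path) as f:
        for line in f:
            fields = line.split()
            if fields and fields[0] == "cpu":
                return [int(x) for x in fields[1:]]
    raise RuntimeError("no aggregate cpu line found")


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples."""
    deltas = [b - a for a, b in zip(before, after)]
    # Only the first 8 fields are summed: guest time is already counted in user/nice.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is iowait


def watch(interval=5, threshold=40.0):
    """Print the iowait percentage every `interval` seconds until interrupted."""
    previous = read_cpu_times()
    while True:
        time.sleep(interval)
        current = read_cpu_times()
        pct = iowait_percent(previous, current)
        flag = "  <-- above threshold" if pct > threshold else ""
        print("iowait %.1f%%%s" % (pct, flag))
        previous = current
```

Calling watch() in a terminal while crashplan is scanning should show whether iowait climbs at the same time the rules stop.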

The way around this is to exclude files that change a lot (like .log files) from crashplan. If that doesn’t help, then what I do is stop OH2 for a few hours and let crashplan run. It will scan the filesystem (which may take a while) and start backing up. Once the backup starts you can restart OH2, and you should be OK (until you have large changes in your filesystem again)…

This may have something to do with crashplan running at a nice of 19 (i.e. super low priority), so its I/O requests start clogging everything up. I have tried changing the nice of crashplan, but that doesn’t really seem to help.

Now why OH2 rules stop executing if iowait goes over 40% - I don’t know.

I have set up a rule that stops crashplan if rules stop executing (on a timer – timers still execute), and restarts OH2 if things don’t get going with just stopping crashplan. It’s a band-aid though.
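The escalation logic of that band-aid looks roughly like the Python sketch below. This is not the actual rule, only the decision it makes each time the timer fires. The thresholds (5 and 15 minutes) and the action names are placeholders chosen for illustration.

```python
def watchdog_action(seconds_since_last_rule, crashplan_stopped,
                    stale_after=300, restart_after=900):
    """Decide what the timer should do next.

    seconds_since_last_rule: age of the last sign that a normal rule ran.
    crashplan_stopped: True once the watchdog has already stopped crashplan.
    """
    if seconds_since_last_rule < stale_after:
        return "ok"                  # rules are still firing
    if not crashplan_stopped:
        return "stop_crashplan"      # first, mild step
    if seconds_since_last_rule >= restart_after:
        return "restart_openhab"     # stopping crashplan alone did not help
    return "wait"                    # give rules time to recover


# Example: rules silent for 7 minutes, crashplan still running.
print(watchdog_action(420, crashplan_stopped=False))
```

The point of the two stages is to avoid restarting OH2 when stopping crashplan alone is enough.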


Rich,

I do not use any Thread::sleep commands and do not use HTTP requests in any of the scripts. I do use locks. None of the rules I am using takes more than 10-20 milliseconds to execute (going by the timestamps in the log file).

Maybe I’m remembering someone else I’m helping.

Though with locks, let’s say you have 10 events that take place at nearly the same time and the Rule takes 20 msec to run. Then it will take 180 msec for the last event to start processing. And for 100 msec no other Rules can run. If the events keep coming in fast and furiously, you can easily hit a starvation situation as the queue of Rules waiting to run grows longer and longer. I don’t think how OH behaves in this circumstance is well understood; it might kill the Rules engine entirely.

Also, I don’t think the order in which the queued Rule instances run (i.e. the order in which they acquire the lock) is guaranteed.
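The lock arithmetic above can be checked with a few lines of Python. This is a toy model, not how the openHAB rules engine is implemented: it assumes one lock, a fixed run time per event, and first-come-first-served ordering purely for simplicity (as noted, the real acquisition order is not guaranteed).

```python
def start_times(arrivals_ms, run_ms):
    """Start time of each event when a lock lets only one run at a time."""
    starts = []
    lock_free_at = 0
    for arrival in arrivals_ms:
        # An event starts when it has arrived AND the lock has been released.
        start = max(arrival, lock_free_at)
        starts.append(start)
        lock_free_at = start + run_ms
    return starts


# 10 events arriving together, 20 msec each: the last one starts at 180 msec.
burst = start_times([0] * 10, 20)
print(burst[-1])

# Events arriving every 10 msec but taking 20 msec each: the wait keeps growing.
arrivals = [i * 10 for i in range(100)]
steady = start_times(arrivals, 20)
print(steady[-1] - arrivals[-1])
```

In the second case each new event waits 10 msec longer than the one before it, which is the starvation situation: the backlog never drains while events arrive faster than the Rule can process them.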

I get the feeling that doing anything much more than if then else in a rule is asking for problems.

Well, after months of running OK (with crashplan stopped from time to time), I am now back to the issue of rules simply ceasing to work. There is no real pattern: they could run for a day, or for a few minutes. This now happens even with crashplan stopped.

None of the suggestions here have helped so far.

Anyone have any idea of how to troubleshoot this? It’s extremely frustrating. My band-aid of having a timer detect when rules stop triggering, and restart OH2 is keeping things limping along.

I have been using OH for a long time; it’s only since 2.2 (now 2.3) that this problem has started.

Hey Nicholas,

I am pretty sure that there is some specific rule that causes all the mess.
At least that’s what happened on my system.
Have you tried disabling all rules and enabling them step by step to identify the rule that is causing the trouble?

Are all rules not working or only cron related?

I would set up a rule that creates a log entry every minute.
With this rule enabled, enable your first “real” rule and let the system run. Or maybe enable only the rule that is already your suspect.

This is very time consuming but maybe it helps narrowing it down…
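To make the once-a-minute log entry easier to evaluate, a small script can scan the log for gaps. The Python sketch below assumes each log line starts with a timestamp of the form YYYY-MM-DD HH:MM:SS.mmm and that the heartbeat rule logs a message containing the word Heartbeat. Both are assumptions, so adjust the marker and format to whatever your rule and log actually produce.

```python
from datetime import datetime, timedelta


def find_gaps(lines, marker="Heartbeat", max_gap=timedelta(seconds=90)):
    """Return (previous, next) timestamp pairs where heartbeats are too far apart."""
    stamps = []
    for line in lines:
        if marker in line:
            # Assumed layout: the first 23 characters are 'YYYY-MM-DD HH:MM:SS.mmm'.
            stamps.append(datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S.%f"))
    return [(a, b) for a, b in zip(stamps, stamps[1:]) if b - a > max_gap]


def report(path):
    """Print every gap found in the given log file."""
    with open(path) as f:
        for previous, following in find_gaps(f):
            print("no heartbeat between %s and %s" % (previous, following))
```

Calling report() with the path to your openhab.log lists the periods in which the heartbeat rule did not fire, which you can then compare against the time you enabled each rule.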

It would be nice to know what the threads are doing when everyone’s rules execution stops. Maybe you can provide us with some detailed thread information next time the issue occurs?

To get the thread info you can issue the following command on the Console:

threads --monitors --locks

I’ve had a very similar problem: I created a simple test rule that toggles a test item between ON and OFF every minute. I also created a second rule that triggers on changes of this test item and a third rule that triggers on received updates. After some time openHAB 2 randomly stopped triggering the second and third rules: the Item changed and Item received update triggers simply stopped firing without any obvious reason.

After some research I found that the MySQL Persistence was causing this issue. I replaced the MySQL Persistence with MapDB Persistence and RRD4J Persistence using the “Design Pattern: Group Based Persistence”. Immediately after eliminating MySQL Persistence the problem was gone.

I hope this helps.

Now I’m on this train as well.

The openhab setup with rules and everything has been working fine for over a week now, maybe even two.

And now all of a sudden, yesterday, all rules stopped working - all I see is the dummy item that’s supposed to trigger a rule getting its update, but no rule actions…

I have 6 rules, all of which are similar in a sense that they listen to whether a dummy item received a command, and then in turn send several commands (2-4).

I have no timed triggers or anything.

Please start a new thread and post your rules.

I’ve run into this dreaded problem too.

It actually came up once long ago on 2.3 (I think?) and I never really resolved it.

We moved to a new house, so I started completely fresh with 2.5 where everything was great for about 5 months. Recently, I upgraded to 2.5 m2 and noticed the rules engine would periodically stop running again. Any new suggestions on how to best debug this?

Thanks!!

@roy_liao

It’s more efficient to start a new thread, as Rich mentioned.

For a first step when you do start a new thread, this needs clarifying. I would guess that there is no actual evidence that the rules engine has “stopped”, but instead that rules that you expected to run, didn’t run?

An outline of why you expected them to run would help. Triggered by time, or by events? The events.log and perhaps openhab.log preceding the problem would be good, but you would need to say what might be missing or unexpected.