RuleEngine stops working after a while

Hello,

actually I have some major problems with the openhab rule-engine:
After a while (ranges from 10min to serveral hours) suddenly none of my rules are getting executed.

I still can see the logging from RuleEngine.execute (line 305: logger.debug("Executing rule '[…]) in my log but they did not get executed and I dont see their logs anymore (“start rule:” - see my example rule at the end).

When the rule excecution is broken I can see in debug that the xtext method call returns an empty list as result but no error message; result.getResult() returns ‘null’, no exception is thown (class: ScriptImpl, method: execute(IEvaand no error message.

I already tried to update from openhab 1.6 to 1.7, but the problem is still the same.

I also noticed that the FolderOberserver most the time is not working anymore when the rule problem occurs.

I think that the problem occurs much faster, if an action triggers multiple rules.
A restart fixes the problem for a while.

My rules looks like this:

import org.openhab.core.library.types.*
import java.util.Date
import java.text.SimpleDateFormat
import org.joda.time.*

rule "Lamp on"

when
Item item_000000001 received update
then
logInfo("start rule:", "Lamp on" )
if(item_000000001.state.toString().equals( "ON" ) ) {
logInfo("execute rule", "Lamp on" )
sendCommand(item_000000002, ON)
}
end

Has anybody have an idea how to solve this?
Thanks!
Tobias

I’ve not seen the Rule engine stop working, but openHAB versions prior to 1.7.0 had series Threading/Concurrency issues IF you had multiple Rules triggering from the same Item event.

See issue 2362 for details.

Versions after 1.7.0 still have issues, but they’re far more isolated in nature, and depend upon the specific Bindings you have (and how they interact with DateTime objects)

Once you’re running 1.7.0 (or newer), you’ll want to paste (formatted) the Exception stacks you’re seeing, along with any Rules that could be triggering them… like whatever’s making result.getResult() calls if that happens to be in one-or-more of your rules.

Hi, thanks for your reply.
I am now constantly testing on 1.7.0.

I can just see, although log level is set to debug, even for xtext:

13:58:51.000 DEBUG o.o.m.r.i.engine.RuleEngine- Executing rule 'Taster - Toggle - Aus'
13:58:51.002 DEBUG o.o.m.r.i.engine.RuleEngine- Executing rule 'Taster - Toggle - An'

instead of:
13:58:51.000 DEBUG o.o.m.r.i.engine.RuleEngine- Executing rule 'Taster - Toggle - Aus'
13:58:51.002 DEBUG o.o.m.r.i.engine.RuleEngine- Executing rule 'Taster - Toggle - An'
13:58:51.008 INFO  o.o.model.script.start rule:- Taster - Toggle - An
13:58:51.008 INFO  o.o.model.script.start rule:- Taster - Toggle - Aus
13:58:51.017 INFO  o.o.model.script.execute rule- Taster - Toggle - An

This are the rule files i try to execute:

rule "Taster - Toggle - An"    
when
    Item item_000000001 received update
then
    logInfo("start rule:", "Taster - Toggle - An" )
    if(item_000000001.state.toString().equals( "ON" )) {
        if(item_000000002.state.toString().equals( "OFF" )) {         
            logInfo("execute rule", "Taster - Toggle - An" )
            sendCommand(item_000000002, ON)
        }
    }
end

and

rule "Taster - Toggle - Aus"

when
    Item item_000000001 received update
then
    logInfo("start rule:", "Taster - Toggle - Aus" )
    if(item_000000001.state.toString().equals( "ON" )) {
        if(item_000000002.state.toString().equals( "ON" )) {         
            logInfo("execute rule", "Taster - Toggle - Aus" )
            sendCommand(item_000000002, OFF)
        }
    }
end

Thanks for your help.

Hi!
Some questions:

  • Which Java Version are you using?
  • What filesystem is the installation on?
  • Are there any backup scripts or something similiar, which touch the rules files?
  • Why do you import joda time there?
  • Maybe, loglevel TRACE would be more verbose?

Maybe one solution would be to write the rules in Python or JavaScript to get other log messages, because with your current rules you reach the XText beast, where logging seems to get very silent. :-/ Look at JSR233 in the openHAB Wiki how to do this. Hopefully, jython or nashorn/rhino will produce more helpful error messages.

please also add some more logging in order to get closer. I would (at least) add trace logging at:

  • SriptExecutionThread before and after L44
  • ScriptImpl before L59
  • ScriptImpl L61
  • ScriptImpl before 67 (log result)

Hi, I will build in the trace logging and post the logs as soon I was able to break the rule engine again. For now, this is what I remember from debugging:

It gets executed, the result is null when the rules are broken

The message is null and we exit with “return result.getResult()” (= null)

if(result==null) never gets true

result.getResult() is always null when broken

So it might be somewhere in xtext line 59:interpreter.evaluate(xExpression, evaluationContext, CancelIndicator.NullImpl):wink:
xExpression and evaluationContext looked quite good, but the structure is too big for me to judge any difference to the correct working ones.

1.8.0u60

ext4

no backups or similar, the folder observer also did not log any file refresh when the rule execution stops

I’ve tried some rules with time based trigger, but I faced the error before with a clean import section as well.

I am going to try TRACE logging now.
Thanks for your help.

I was not able to get any xtext logging

see

which might be related …

I am using JSR233 with Nashorn JavaScript Rules now.
It looks way better.

I will keep you updated.

JSR233 with Nashorn JavaScript Rules now works perfectly fine for me.
Thanks for giving me this advice.

1 Like

I just now had this problem. So I put an old rules file and add my change one by one and… A little mistake into my rules file, a var declaration outside a rule with an uppercase letter, Var. No error in log file of openhab and no rule starting. So you could verify that.