Rule engine does not start after openHAB restart due to large number of rules

  • Platform information:
    • Hardware: Raspberry Pi 4 Model B Rev 1.1/4 GB/32GB
    • OS: Raspbian GNU/Linux 11 (bullseye)
    • Java Runtime Environment: 17.0.13
    • openHAB version: 4.3.0
  • Issue of the topic: Rule engine does not start after openHAB restart due to large number of rules
  • Please post configurations (if applicable):
    • Rules code related to the issue:
      • 256 DSL rules in 28 text files
      • All rules run without issues or errors
      • All rules are loaded - no warning messages appear while the rules load during openHAB service start
      • But the rule engine is not started (maybe due to the long loading time of the rules?) - the log entry saying that the rule engine has started is missing, and no trigger executes any rule
      • The issue first appeared after updating from 4.2.X to 4.3.0
      • I can work around the rule engine not starting by:
        • removing the rule text files before openHAB is started
        • => the rule engine starts
        • copying the rule text files back after a successful start
        • => the rules are loaded and executed
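
For anyone who wants to script this workaround, here is a minimal sketch. It assumes the rules live in /etc/openhab/rules (the usual folder on apt/openHABian installs) and uses a hypothetical holding folder, /var/tmp/openhab-rules-parked; both can be overridden. It only moves files and does not start or stop openHAB itself.

```shell
#!/bin/sh
# Move every *.rules file from one directory to another.
# Usage: move_rules <from_dir> <to_dir>
move_rules() {
    from=$1
    to=$2
    mkdir -p "$to" || return 1
    for f in "$from"/*.rules; do
        # If nothing matches, the glob stays literal; skip it.
        [ -e "$f" ] || continue
        mv -- "$f" "$to"/ || return 1
    done
}

# Assumed locations (not taken from the thread): adjust to your installation.
RULES_DIR=${RULES_DIR:-/etc/openhab/rules}
PARK_DIR=${PARK_DIR:-/var/tmp/openhab-rules-parked}

# Before starting openHAB:       move_rules "$RULES_DIR" "$PARK_DIR"
# After the rule engine is up:   move_rules "$PARK_DIR" "$RULES_DIR"
```

The holding folder is deliberately outside the openHAB configuration tree so that the parked files are not picked up while they are parked.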

Does anyone know how I can get the rule engine to start automatically again without removing the rule files first?

Thanks and best regards
Tim

Hi,
to delay DSL rules at system start I use one of these… Cleaning up the startup process / renaming rules (windows possible)
Greets

I can’t think of anything that changed between those two versions that would cause this. There was a PR to compile managed rules during startup to reduce the time the first run of a rule takes, but that shouldn’t change file-based rules, which, IIRC, have always been compiled at load time instead of waiting for that first run.

How long do you wait? Rules DSL rules can take a surprisingly long time to load and parse, particularly if you use primitives or otherwise force the type of variables (I can write a single line of code that requires minutes to parse even on an x86 machine). I think there was an upgrade to the underlying Xtend libraries between those versions. Perhaps that introduced a regression or exacerbated an existing limitation.

What’s the CPU doing while waiting for the rules engine to start?

If you reduce the number of rules, at which point does the problem appear? Or does each rule seem to add a little to the time it takes for the rule engine to start, until it doesn’t start at all? Maybe there is just one file causing problems?
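
One way to narrow this down is to park half of the rules files, restart, and see whether the engine comes up, then repeat on the half that still fails. The helper below is only a sketch of that halving step; the function name and both directory arguments are made up for illustration, and restarting openHAB and checking the log is left to you.

```shell
#!/bin/sh
# Move the first half (in glob order) of the *.rules files in <rules_dir>
# to <park_dir> and print the names of the files that were moved.
# Usage: park_first_half <rules_dir> <park_dir>
park_first_half() {
    rules_dir=$1
    park_dir=$2
    mkdir -p "$park_dir" || return 1
    # Load the matching files into the positional parameters so they can be counted.
    set -- "$rules_dir"/*.rules
    [ -e "$1" ] || return 0      # no rules files at all
    half=$(( $# / 2 ))
    i=0
    for f in "$@"; do
        [ "$i" -lt "$half" ] || break
        mv -- "$f" "$park_dir"/ || return 1
        basename "$f"
        i=$((i + 1))
    done
}
```

With 28 files this moves 14 on the first call; about five rounds are enough to isolate a single file, if a single file is in fact the cause.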

Thanks for the quick reply!
The first time it was running for some hours. I only noticed that my rules were not executed anymore. I think the rule engine was not started at all.
Afterwards I tried it several times again and waited ~5 min. All Rule-files were loaded successfully and events were received. But the rule engine was not started.
I only tried to remove all rule files - not single ones. Maybe I need to try that …
The CPU load was not really high. Therefore I think the rule engine will never be started when I run into this issue.

@Baschtlwaschtl thanks for the link! It seems to be a good and easy solution.

My system shows the very same behaviour:

openHAB Version:

4.3.1 (Build)

Environment:

Virtualization Station on QNAP
Firmware version: QTS 5.2.2.2950
Intel(R) Celeron(R) CPU J3455, up to 2300 MHz (4 cores, 4 threads)
32 GB (29 GB usable)

Guest-System:
Ubuntu 24.04.1 LTS, memory: 4GiB

49 rules in 21 rules files

It seems that the rule engine is never started when I start/restart my entire system.

If I only restart openHAB, the rule engine is started about every second time without errors.

Exactly as Tim wrote:

“… All rules will be loaded … but the Rule engine will not be started …”

I conclude that this is a race condition bug.

It is possible that the system is still busy with other things during a restart and gives less resources to the start of openHAB.

Renaming the *.rules files before and after the start of openHAB is an (annoying) workaround.

Is it possible to start/restart the rule engine from openhab-cli?

Restarting the Automation bundle might do it. But since the rule engine isn’t actually starting in the first place, I doubt that will make a difference. It seems to get stuck before the rule engine even starts.

In another thread which may be related we might have identified a problem with bogus cron expressions in a rule trigger (e.g. Feb 31st).
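
To check your own files for this, the sketch below lists every `Time cron "..."` trigger and tests whether an expression names a day the month never has. It assumes the usual Quartz field order (seconds, minutes, hours, day of month, month, day of week) and only looks at plain numeric day and month fields, so ranges, lists and month names are not checked. Both function names are made up for this example, and whether such an expression is really what blocks the engine is still unconfirmed.

```shell
#!/bin/sh
# Print "file:line: expression" for every Time cron trigger in the given files.
# Usage: list_cron_triggers /etc/openhab/rules/*.rules
list_cron_triggers() {
    grep -HnE 'Time[[:space:]]+cron[[:space:]]+"[^"]*"' "$@" 2>/dev/null |
        sed -E 's/^([^:]*:[0-9]+):.*Time[[:space:]]+cron[[:space:]]+"([^"]*)".*$/\1: \2/'
}

# Exit status 0 if the expression names an impossible date such as Feb 31st,
# 1 otherwise (including when day or month is not a plain number).
# Usage: is_impossible_date "0 0 0 31 2 ?"
is_impossible_date() {
    printf '%s\n' "$1" | awk '{
        dom = $4; mon = $5
        if (dom !~ /^[0-9]+$/ || mon !~ /^[0-9]+$/) exit 1
        max = 31
        if (mon == 4 || mon == 6 || mon == 9 || mon == 11) max = 30
        if (mon == 2) max = 29
        if (dom + 0 > max) exit 0
        exit 1
    }'
}
```

Running `list_cron_triggers` over the rules folder and passing each expression to `is_impossible_date` gives a quick first pass; anything with ranges or names still needs a manual look.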

I created an issue to merge the three threads where this problem is being reported and hopefully get the attention of a maintainer.

Version: 4.3.2 (Build)

When I restart the automation bundle via the openhab-cli console, I get the message
[ERROR] [e.automation.internal.RuleEngineImpl] - Failed to compile rule '<rulename>' with status 'UNINITIALIZED' for every single one of my rules.

The good news: afterwards the rules are working as expected!

Unfortunately I can not use a rule to restart the Automation bundle :wink:
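
For reference, the bundle restart could be triggered from outside openHAB roughly like this. This is a sketch under assumptions: it connects to the console over SSH on the default port 8101 as user openhab (you will be asked for the console password), and it addresses the bundle by the symbolic name org.openhab.core.automation. If your console does not accept the name, look up the bundle ID with `bundle:list` and use that instead. The function name and the DRY_RUN switch are made up for this example.

```shell
#!/bin/sh
# Restart the automation bundle through the openHAB console over SSH.
# Usage: restart_automation_bundle [host]
# Set DRY_RUN=1 to print the command instead of running it.
restart_automation_bundle() {
    host=${1:-localhost}
    # Assumed symbolic name of the automation bundle.
    cmd="bundle:restart org.openhab.core.automation"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "ssh -p 8101 openhab@$host $cmd"
    else
        ssh -p 8101 "openhab@$host" "$cmd"
    fi
}
```

The same `bundle:restart` command can be typed by hand after opening the console with `openhab-cli console`.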

Maybe you can use a rule in a different language from Rules DSL or a managed rule?

@rlkoshak thanks for creating the issue!

Today I was able to restart openHAB with the rule engine starting automatically.
Beforehand I had been fine-tuning some of my DSL rules that showed an unknown type/object error whenever I saved changes in another rule file.
I explicitly specified the type for items that do not have any channel assigned, like:

...
if (my_rule_item.state as OnOffType == ON) {
...

I do not know if this is the root cause. I would need to restart openHAB more often to be sure.
But I have the feeling that rule syntax checking is fussier in 4.3.X…

I just updated to 4.3.2 and openHAB started smoothly without any (unusual) warning message.
But I have only restarted once so far…

Update: I restarted multiple times today and the issue is not solved. I was just lucky after the upgrade yesterday. The only thing that helps for me is renaming the rules at startup and putting them back when start level 50 is reached, similar to the guide linked in the second post.
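
The "put them back once startup has progressed far enough" part needs some way to wait. Below is a small polling helper as a sketch; it does not know anything about start levels itself, so the readiness check is whatever command you pass in. The function name is made up for this example.

```shell
#!/bin/sh
# Run a check command once per second until it succeeds or <timeout> seconds
# have passed. Returns 0 on success, 1 on timeout.
# Usage: wait_for <timeout_seconds> <command> [args...]
wait_for() {
    timeout=$1
    shift
    waited=0
    until "$@"; do
        [ "$waited" -lt "$timeout" ] || return 1
        sleep 1
        waited=$((waited + 1))
    done
}
```

A possible check is a `grep -q` for a line that your own openhab.log prints once the desired start level is reached (on apt installs the log is usually /var/log/openhab/openhab.log); the exact text to look for is something to take from your own log, and after `wait_for` returns successfully the renamed rules files can be put back.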