Rule engine does not start after openhab restart due to high amount of rules

  • Platform information:
    • Hardware: Raspberry Pi 4 Model B Rev 1.1/4 GB/32GB
    • OS: Raspbian GNU/Linux 11 (bullseye)
    • Java Runtime Environment: 17.0.13
    • openHAB version: 4.3.0
  • Issue of the topic: Rule engine does not start after openhab restart due to high amount of rules
  • Please post configurations (if applicable):
    • Rules code related to the issue:
      • 256 DSL rules in 28 text files
      • All rules are running without issues and errors
      • All rules will be loaded - no warning messages when rules are loaded during openhab service start
      • but the Rule engine will not be started (maybe due to long loading time of rules?) - log entry missing that rule engine is started and rules will not be executed by any trigger
      • Issue appeared the first time after updating from 4.2.X to 4.3.0
      • I am able to avoid the issue of the not-started rule engine by:
        • removing the Rule-text-files before openhab is started
        • => rule engine will be started
        • Import rule-text-files after successful start
        • => rules will be loaded and executed

Does anyone know how I can enable the rule-engine to start automatically again without removing the rule-files before?

Thanks and best regards
Tim

Hi,
to delay DSL rules at system start I use one of these… Cleaning up the startup process / renaming rules (windows possible)
Greets

I can’t think of anything that changed between those two versions that would cause this. There was a PR to compile managed rules during startup to reduce the time the first run of a rule takes, but that shouldn’t change file-based rules which, IIRC, have always compiled the rules at load time instead of waiting for that first run.

How long do you wait? Rules DSL rules can take a surprisingly long time to load and parse, particularly if you use primitives and otherwise force the type of variables (I can write a single line of code that can require minutes to parse even on an x86 machine). I think there was an upgrade to the underlying Xtend libraries between those versions. Perhaps that includes a regression or exacerbated an already existing limitation.

What’s the CPU doing while waiting for the rules engine to start?

If you reduce the number of rules, at which point does the problem appear? Or does each rule seem to add a little to the amount of time it takes for the rules engine to start, until it doesn’t start at all? Maybe there is just one rule causing problems?

Thanks for the quick reply!
The first time it was running for some hours. I only noticed that my rules were not executed anymore. I think the rule engine was not started at all.
Afterwards I tried it several times again and waited ~5 min. All Rule-files were loaded successfully and events were received. But the rule engine was not started.
I only tried to remove all rule files, not single ones. Maybe I need to try that …
The CPU load was not really high. Therefore I think the rule engine will never be started when I run into this issue.

@Baschtlwaschtl thanks for the link! It seems to be a good and easy solution.

My system shows the very same behaviour:

openHAB Version:

4.3.1 (Build)

Environment:

Virtualization Station on QNAP
Firmware version: QTS 5.2.2.2950
Intel(R) Celeron(R) CPU J3455, up to 2300 MHz (4 cores, 4 threads)
32 GB (29 GB usable)

Guest-System:
Ubuntu 24.04.1 LTS, memory: 4GiB

49 rules in 21 rules files

It seems that the rule engine is never started when I start/restart my entire system.

If I only restart openHAB, the rule engine is started about every second time without errors.

Exactly as Tim wrote:

“… All rules will be loaded … but the Rule engine will not be started …”

I conclude that this is a race condition bug.

It is possible that the system is still busy with other things during a restart and gives less resources to the start of openHAB.

Renaming the *.rules files before and after the start of openHAB is an (annoying) workaround.

Is it possible to start/restart the rule engine from openhab-cli?

Restarting the Automation bundle might do it. But since the rule engine isn’t actually starting in the first place, I doubt that will make a difference. It seems to get stuck before the rule engine even starts.

In another thread which may be related we might have identified a problem with bogus cron expressions in a rule trigger (e.g. Feb 31st).
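
A quick way to look for such triggers is to scan the cron expressions in the rule files for day and month combinations that can never occur. Below is a minimal sketch in plain JavaScript. It assumes the Quartz-style field order (seconds, minutes, hours, day-of-month, month, day-of-week, optional year) and only checks plain numeric day and month fields. `isImpossibleDate` is a hypothetical helper written for illustration, not part of openHAB.

```javascript
// Maximum number of days per month (February counted as 29 to allow leap years).
const MAX_DAYS = [31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31];

// Hypothetical helper: returns true when a cron expression names a fixed
// day-of-month/month pair that never exists (e.g. Feb 31st).
// Only plain numeric fields are checked; wildcards, ranges, lists and
// month names are ignored and reported as "not impossible".
function isImpossibleDate(cron) {
  const fields = cron.trim().split(/\s+/);
  if (fields.length < 6) return false; // not a 6- or 7-field expression
  const day = fields[3];
  const month = fields[4];
  if (!/^\d+$/.test(day) || !/^\d+$/.test(month)) return false;
  const d = Number(day);
  const m = Number(month);
  if (m < 1 || m > 12) return true; // month out of range
  return d < 1 || d > MAX_DAYS[m - 1];
}

console.log(isImpossibleDate("0 0 0 31 2 ?"));  // Feb 31st -> true
console.log(isImpossibleDate("0 30 6 15 4 ?")); // Apr 15th -> false
```

Anything this flags is only a candidate for a closer look; whether such a trigger really blocks the rule engine is still an open question in that thread.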

I created an issue to merge the three threads where this problem is being reported and hopefully get the attention of a maintainer.

Version: 4.3.2 (Build)

When I restart the automation bundle via the openhab-cli console (bundle:restart "openHAB Core :: Bundles :: Automation"), I get the message
[ERROR] [e.automation.internal.RuleEngineImpl] - Failed to compile rule '<rulename>' with status 'UNINITIALIZED' in openhab.log for every single one of my rules.

The good news: Afterwards the rules are working as expected!

Unfortunately I can not use a rule to restart the Automation bundle :wink:

Maybe you can use a rule in a different language from Rules DSL or a managed rule?

@rlkoshak thanks for creating the issue!

Today I was able to restart Openhab with a RuleEngine starting automatically.
Beforehand I had been fine-tuning some of my DSL rules, which were showing an unknown type/object error that occurred whenever I saved changes in another rule file.
I explicitly cast the state of Items that do not have any channel assigned, like:

...
if (my_rule_item.state as OnOffType == ON) {
...

I do not know if this is the root cause. I would need to restart my openhab more often.
But I have the feeling rule syntax checking is fussier with 4.3.X…

I just updated to 4.3.2 and openHAB started smoothly without any unusual warning messages.
But I have only tried one restart so far…

Update: I restarted multiple times today and the issue is not solved. I was just lucky after the upgrade yesterday. The only thing that helps for me is renaming the rule files at startup and putting them back once start level 50 is reached, similar to the guide linked in the second post.

Same issue here, just updated from 4.2.1 to 4.3.2.

Thanks for this suggestion.
I have written a rule in javascript and wanted to restart the automation bundle with this rule.
I used as trigger:

triggers: [ 
		triggers.SystemStartlevelTrigger(20),
		triggers.SystemStartlevelTrigger(40),
		triggers.SystemStartlevelTrigger(50),
		triggers.SystemStartlevelTrigger(70),
		triggers.SystemStartlevelTrigger(80),
		triggers.SystemStartlevelTrigger(100),
	]

Unfortunately the rule does not fire when openHAB is started.

If I restart the automation bundle via openhab-cli console after starting openHAB, the javascript rule is fired, but then I no longer need it because the DSL rules work again after restarting the automation bundle as described above.

I.e. I can not overcome the issue with another rule.

That is not really doing anything. Rules don’t know whether an Item is linked or not. They just know that an Item has a state.

The rule engine doesn’t start until 40, so anything below that is meaningless. You can’t run a rule before the rule engine is able to run rules.

Also, as far as I remember, there is no run level 80. Are you seeing any errors in the logs? I imagine two invalid triggers would cause issues to be reported.

Also, you should really only have the one trigger. It will be very disruptive to restart a bundle at each runlevel.

Also, I don’t know whether restarting the rule engine at runlevel 40 interferes with the later runlevels, or whether the runlevel 20 trigger causes a rule to start before runlevel 40 is actually reached, which can mess things up.

Just trigger the rule the one time at runlevel 70. That will make sure the rule engine is fully up before you try to restart it and it’s late enough that it shouldn’t interfere with anything.
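
A rough sketch of what that single-trigger version could look like. It only uses `rules.JSRule` and `triggers.SystemStartlevelTrigger`, both of which already appear in this thread. `buildStartupRule` is a hypothetical wrapper added so the configuration can be inspected outside openHAB, and the action merely logs the event because the thread does not show how to restart the bundle from a rule.

```javascript
// Hypothetical wrapper: builds the rule configuration with exactly one
// start level trigger. `triggersApi` is expected to be openhab-js's
// `triggers` namespace; `log` is any function that accepts the event.
function buildStartupRule(triggersApi, log) {
  return {
    name: "Rule SystemStartlevelTrigger (70 only)",
    triggers: [triggersApi.SystemStartlevelTrigger(70)],
    execute: function (event) {
      // Placeholder action: just log the event that fired the rule.
      log(event);
    }
  };
}

// Register the rule only when running inside openHAB's JS Scripting,
// where `rules` and `triggers` are provided as globals.
if (typeof rules !== "undefined" && typeof triggers !== "undefined") {
  rules.JSRule(buildStartupRule(triggers, (event) => console.log(event)));
}
```

With only one trigger, any log line from this rule can be attributed to start level 70 being reached rather than to one of the earlier levels.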

If I trigger a DSL rule when a start level, e.g. 70, is reached, it is somehow triggered again at each of the following start levels. It is also triggered again if I modify and save the rule file. So I think "system reached start level" means:
Current level >= start level X

I am happy now loading most of the rule-files directly after the rule engine started. Even if the problem is solved I think I will keep that.

That’s exactly what it means.
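
Put differently, the start level trigger behaves like a threshold rather than an exact match. A minimal sketch of that reading follows; `startLevelTriggerFires` is a hypothetical function written for illustration and is not openHAB's implementation.

```javascript
// Hypothetical model of the behaviour described above: a trigger configured
// for level X fires whenever the reported start level is at or above X.
function startLevelTriggerFires(configuredLevel, reportedLevel) {
  return reportedLevel >= configuredLevel;
}

// Start levels mentioned in this thread, in ascending order.
const reported = [20, 30, 40, 50, 70, 80, 100];
const firedAt = reported.filter((level) => startLevelTriggerFires(70, level));
console.log(firedAt); // [ 70, 80, 100 ]
```

This is why a rule with a trigger for level 70 runs again when levels 80 and 100 are reported.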

At first I just wanted to figure out what happens (regarding the rules) at which start level, what the status of the rules is, and whether they are detected at all.
I did not implement a restart of the automation bundle yet. (And in the meantime I know that a restart by the rule is useless because the rule is not triggered.)

startlevels according to /usr/share/openhab/runtime/services.cfg:

# Start level definitions
startlevel:20=dsl:items,managed:item,dsl:things,managed:thing,managed:itemchannellink,dsl:persist,managed:metadata
startlevel:30=persistence:services,persistence:restore,automation:scriptEngineFactories
startlevel:40=dsl:rules,managed:rule,rules:refresh,rules:dslprovider
startlevel:50=ruleengine:start
startlevel:70=dsl:sitemap
startlevel:80=things:handler
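
To see at a glance which level a given marker belongs to, the definitions above can be turned into a lookup table. This is a small sketch in plain JavaScript; `parseStartLevels` is a hypothetical helper, and the input is just the lines quoted above.

```javascript
// The start level definitions quoted above from services.cfg.
const cfg = `
# Start level definitions
startlevel:20=dsl:items,managed:item,dsl:things,managed:thing,managed:itemchannellink,dsl:persist,managed:metadata
startlevel:30=persistence:services,persistence:restore,automation:scriptEngineFactories
startlevel:40=dsl:rules,managed:rule,rules:refresh,rules:dslprovider
startlevel:50=ruleengine:start
startlevel:70=dsl:sitemap
startlevel:80=things:handler
`;

// Hypothetical helper: map each marker (e.g. "ruleengine:start") to its level.
function parseStartLevels(text) {
  const levels = new Map();
  for (const line of text.split("\n")) {
    const match = /^startlevel:(\d+)=(.*)$/.exec(line.trim());
    if (!match) continue; // skip comments and blank lines
    for (const marker of match[2].split(",")) {
      levels.set(marker.trim(), Number(match[1]));
    }
  }
  return levels;
}

const levels = parseStartLevels(cfg);
console.log(levels.get("dsl:rules"));        // 40
console.log(levels.get("ruleengine:start")); // 50
```

According to this file, the DSL rules are expected to be loaded at level 40 and the rule engine start is tied to level 50.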

My JavaScript rule (I like it anyway ;-) and perhaps it will be of interest in other scenarios):


rules.JSRule({
    name: "Rule SystemStartlevelTrigger",
    triggers: [ 
		triggers.SystemStartlevelTrigger(20),
		triggers.SystemStartlevelTrigger(40),
		triggers.SystemStartlevelTrigger(50),
		triggers.SystemStartlevelTrigger(70),
		triggers.SystemStartlevelTrigger(80),
		triggers.SystemStartlevelTrigger(100),
	],
    
  execute: function(event) {
    console.log(event);
    
    // https://www.openhab.org/javadoc/latest/org/openhab/core/automation/ruleregistry
    const { automationManager, ruleRegistry } = require('@runtime/RuleSupport');
    const RuleManager = osgi.getService('org.openhab.core.automation.RuleManager');

    const allRules = ruleRegistry.getAll()
    const content = [];
    allRules.forEach(function(rule) {
        content.push(
            [   rule.getUID()
              , rule.getName()
              , rule.getActions()[0].getConfiguration().getProperties().type ?? "undefined"
              , RuleManager.isEnabled(rule.getUID()).toString()
              , RuleManager.getStatus(rule.getUID())
            ]
        )
    });

    content.sort( function(a, b) {
        const nameA = a[0].toUpperCase(); // to avoid case while sort
        const nameB = b[0].toUpperCase();
        if(nameA > nameB)
            return 1;
        else if(nameB > nameA)
            return -1;
        else
            return 0;     
    })

    /*
        user@ubuntu24-qnap:/etc/openhab/automation/js/node_modules/colors/lib/system$ diff supports-colors.js.org supports-colors.js 
        28a29,32
        >   // Failed to execute rule Rule-SystemStartlevelTrigger-ef725908-5555-410b-9bf6-c4d8acb5b527: TypeError: undefined has no such function "indexOf": TypeError: undefined has no such function "indexOf"
        > 	// at exports (/etc/openhab/automation/js/node_modules/colors/lib/system/supports-colors.js:29)
        >   return false; // colors do not matter for me. "return false" because of exception above.
        >   
    */

    const Table = require('cli-table');
    const table = new Table({
          head:      ['#',     'id',   'name', 'type', 'enabled', 'status']
        , colWidths: [ 6,       64,     80,     34,     9,         15]
        , colAligns: ['right', 'left', 'left', 'left', 'left',    'left']
    });

    let i=0
    content.forEach( function(row) {
        row.unshift(++i)
        table.push(row)
    })

    console.log("\n"+table.toString());
  }
});

On restart of openHAB it does not fire at all.
After a manual restart of the automation bundle I get


{
  "raw": {},
  "eventClass": "org.openhab.core.events.system.StartlevelEvent",
  "payload": {
    "startlevel": 40
  }
}
2025-02-09 10:17:44.116 [INFO ] [.automation.script.file.test_rule.js] - 
┌──────┬────────────────────────────────────────────────────────────────┬────────────────────────────────────────────────────────────────────────────────┬──────────────────────────────────┬─────────┬───────────────┐
│    # │ id                                                             │ name                                                                           │ type                             │ enabled │ status        │
├──────┼────────────────────────────────────────────────────────────────┼────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────┼─────────┼───────────────┤
│    1 │ 86d89b4908                                                     │ Test jruby                                                                     │ application/x-ruby               │ true    │ IDLE          │
├──────┼────────────────────────────────────────────────────────────────┼────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────┼─────────┼───────────────┤
│    2 │ aquaClean-1                                                    │ user is sitting                                                                │ application/vnd.openhab.dsl.rule │ true    │ IDLE          │
├──────┼────────────────────────────────────────────────────────────────┼────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────┼─────────┼───────────────┤
| ...  | ...                                                            | ...                                                                            | ...                              | ...     | ...           |
├──────┼────────────────────────────────────────────────────────────────┼────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────────┼─────────┼───────────────┤
│   39 │ windbreak-1                                                    │ update knx sensor (switch button)                                              │ application/vnd.openhab.dsl.rule │ true    │ IDLE          │
└──────┴────────────────────────────────────────────────────────────────┴────────────────────────────────────────────────────────────────────────────────┴──────────────────────────────────┴─────────┴───────────────┘

The rule is currently free of side effects. It doesn’t help, but it doesn’t hurt, and the result is nice to look at.

Interesting, never checked that out before.
Wouldn’t it make sense to differentiate and eventually reorder that a little?
*.things should be there before *.items refer to their channels, shouldn’t they?
So why not 20->21 (or 30) for *.items?
And why is ruleengine:start AFTER all the rules entries?

I have not made any changes to this file.
See https://github.com/openhab/openhab-distro/blob/4.3.2/launch/app/runtime/services.cfg

Give it a try, then.
You should be able to create some file x.cfg in /etc/openhab/services/ and put your reordered set of start level definitions there.
Note that I’m not sure whether it changes anything about the actual order on startup, or whether it only changes when a specific start level is considered to be reached.