Rules stop being triggered

This is OH 3.2 running on Linux in a VM (to allow for more RAM and CPU power than a Raspberry Pi 3B can offer).
The install is file-based (.things, .items, etc.) and the rules are Jython JSR223. Sensors and actuators are connected via MQTT.

After OH starts, everything works nicely and as expected. After a number of days, things may get weird. The symptom is that some behaviours of my system stall:

Some lights in the bathroom won’t turn on, but not all are affected. The rules for pir1 and pir2 are ignored: I can see the MQTT message, it gets read and parsed, and an item changes. I see the event with those commands and the change event in events.log, but the rules don’t seem to receive the event.

And at the same time pir4, which is very similar, turns on another light, and has its rule in the same script file, still works as it should.

When that happens, a number of rules in different script files are typically affected. I couldn’t find a common property among them, though. A lot of the breadcrumb logging that usually goes into openhab.log stops, but again, not all of it.

When I now edit and save a script file, I don’t get the "Loading script ‘filePath’" line in the log file, and the changes I made are not applied.

RAM usage looks normal (1 GB) and the java process is in the single-digit percentages when I look at top.

Basically, whatever I try, the only measure that sets things right again (including the changes I just made) is to restart openHAB. That takes only a couple of minutes, but it is not satisfying.

I don’t know what is happening and, worse, I have no idea where to look for pointers. I guess I should enable debug logging for some parts, but which, and how? And can I direct that debug output into a dedicated file so it does not spill into openhab.log?

Thanks in advance for any idea or pointer!

I don’t have a whole lot of suggestions except to ask whether these rules are long running or not. Each rule gets its own thread, which means that a given rule can only have one instance of itself running at a time. Subsequent triggers are queued and worked off in turn, but if for some reason that rule gets stuck and blocks forever, that rule is effectively disabled forever.
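To illustrate what I mean by a blocked rule, here is a rough plain-Python sketch. It is not openHAB code: `SingleInstanceRule`, `demo` and the event strings are names I made up, and the real rule engine will differ in detail. It only models "one worker per rule, triggers queued behind it".

```python
import threading

try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2.7 / Jython


class SingleInstanceRule(object):
    """Hypothetical stand-in for a rule that handles one trigger at a time."""

    def __init__(self, action):
        self.action = action
        self.pending = queue.Queue()   # triggers waiting for their turn
        self.handled = []              # triggers whose run has returned
        worker = threading.Thread(target=self._run)
        worker.daemon = True
        worker.start()

    def trigger(self, event):
        self.pending.put(event)

    def _run(self):
        while True:
            event = self.pending.get()
            self.action(event)         # if this never returns, nothing below runs
            self.handled.append(event)
            self.pending.task_done()


def demo():
    release = threading.Event()
    stuck = threading.Event()

    def action(event):
        if event == "pir1 ON #2":
            stuck.set()
            release.wait()             # simulates a call that blocks

    rule = SingleInstanceRule(action)
    for n in (1, 2, 3):
        rule.trigger("pir1 ON #%d" % n)

    stuck.wait(5)                      # run #2 is now blocked
    handled_while_stuck = list(rule.handled)
    release.set()                      # unblock; in real life this may never happen
    rule.pending.join()
    return handled_while_stuck, list(rule.handled)


if __name__ == "__main__":
    print(demo())
```

While the second run is blocked, the third trigger just sits in the queue. In this sketch I release the block so the script can finish; a rule that is truly stuck would never get to the later triggers.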

So one of the first things to try would be to add enter and exit log statements to the rules. Make sure they are exiting. That will at least tell us whether or not the rule is blocked and never returned or something else is going on.
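One way to get consistent enter and exit lines without editing every rule body is a small decorator. This is a minimal sketch using only the Python standard library; `log_entry_exit`, the logger name and `pir1_light_on` are hypothetical, and in your Jython setup you would swap in whatever logger your rules already use. How it stacks with your rule decorators is something you would need to check.

```python
import functools
import logging

# Hypothetical logger name; replace with the logger your rules already use.
log = logging.getLogger("rules.breadcrumbs")


def log_entry_exit(func):
    """Log one line when the function is entered and one when it leaves."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        log.info("ENTER %s", func.__name__)
        try:
            return func(*args, **kwargs)
        except Exception:
            # Record the traceback, then let the error propagate as before.
            log.exception("ERROR in %s", func.__name__)
            raise
        finally:
            # Runs on normal return and on exceptions alike.
            log.info("EXIT %s", func.__name__)

    return wrapper


@log_entry_exit
def pir1_light_on(event):
    # Placeholder body standing in for the real rule logic.
    return "Bathroom_Light ON"
```

An ENTER line without a matching EXIT line in the log would point to a rule that is blocked and never returned. No ENTER line at all would mean the rule was never called.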

I’m not sure what to increase the logging on. You can change the level of openhab.event.RuleStatusInfoEvent and openhab.event.RuleAddedEvent to DEBUG which will add rule events to your events.log. That can be a good way to correlate events and rules running.

The file loading issue might be related to something that was recently merged that changes how files are loaded for the JS Scripting addon.

I filed an issue: "[automation] ScriptFileWatcher breaks Jython reloading on change" (Issue #2727, openhab/openhab-core on GitHub).

Thanks for writing so fast, Rich!

I do have breadcrumb logging in those rules, and it does look like they don’t get called. Most of them are quite short. The logging is just not in a strict entry/exit fashion, so it may be worth adding that.

Two of them are longish though, and on one occasion I saw them remain in the “Running” state when I checked in the UI. That could fit your description. But they sit in a different script file and are triggered by events completely unrelated to those of the other affected scripts.

I will try with those DEBUG settings.
Again, thanks for your help!

If they are long running and being spammed it would only impact that one rule, not other rules.