Sporadic delayed rule starting: Thread pool too small?

My openHAB instance (Raspberry Pi 3+), running openhabian produces some sporadic extreme delay in rule execution.

Some data:

  • 120GB SSD
  • fan added, cpu-temp is below 50°
  • cpu usage is around 20%, io wait is about 1%.

It is mostly annyoing with the motion detectors, if the light turns on 2 minutes after i walked by the detector it is quite useless. Sometimes it works instantly (that’s how i enjoy it), sometimes a delay of a few to 20-something seconds and sometimes even minutes (this annoys me).

So i read about the threadpool (just 5 concurrent threads :open_mouth: ). Thinking about it, this could be some reason.

Motion detection itself triggers 4 rules:

  • Light->On
  • Messaging
  • ActivityCounter
  • MotionActivityCounter

Combining them is a no-go, because these simple specialized rules (3-liner each, one if, no loop or lambda) each are shared by several subsets of other items. I already increased the threadpool to 10, but that didn’t help a bit. Question is, is the threadpool the reason for delay?

BTW, which is the correct one, the description differs? How can i check the number of threads is currently used?

org.eclipse.smarthome.threadpool:RuleEngine=10
org.eclipse.smarthome.threadpool:ruleEngine=10

How can i see log entries of the threadpool? I already tried:

log:set DEBUG org.eclipse.smarthome.ThreadPoolManager 
log:set DEBUG org.eclipse.smarthome.core.common.ThreadPoolManager
log:set DEBUG smarthome.core.common.ThreadPoolManager

Which logger do i need to adjust to see any message of the Threadpool? Where do i find any documentation about this?

Thanks for any hint.

Looks like i found at least a logger:

2018-09-27 14:45:28.350 [DEBUG] [rthome.core.common.ThreadPoolManager] - Created scheduled thread pool 'discovery' of size 5
2018-09-27 14:45:30.448 [DEBUG] [rthome.core.common.ThreadPoolManager] - Created scheduled thread pool 'file-processing' of size 5
2018-09-27 14:45:31.683 [DEBUG] [rthome.core.common.ThreadPoolManager] - Created thread pool 'ConfigStatusService' with size 5
2018-09-27 14:45:32.308 [DEBUG] [rthome.core.common.ThreadPoolManager] - Updated scheduled thread pool 'discovery' to size 10
2018-09-27 14:45:32.464 [DEBUG] [rthome.core.common.ThreadPoolManager] - Created thread pool 'safeCall' with size 20
2018-09-27 14:48:01.558 [DEBUG] [rthome.core.common.ThreadPoolManager] - Created thread pool 'lsp' with size 5
2018-09-27 14:48:02.836 [DEBUG] [rthome.core.common.ThreadPoolManager] - Created scheduled thread pool 'RuleEngine' of size 5
2018-09-27 14:48:23.916 [DEBUG] [rthome.core.common.ThreadPoolManager] - Created scheduled thread pool 'thingManager' of size 5
2018-09-27 14:48:23.971 [DEBUG] [rthome.core.common.ThreadPoolManager] - Created scheduled thread pool 'thingHandler' of size 15
2018-09-27 14:48:24.402 [DEBUG] [rthome.core.common.ThreadPoolManager] - Created thread pool 'items' with size 5
2018-09-27 14:48:28.222 [DEBUG] [rthome.core.common.ThreadPoolManager] - Created scheduled thread pool 'thingLinkManager' of size 5
2018-09-27 14:48:36.675 [DEBUG] [rthome.core.common.ThreadPoolManager] - Created scheduled thread pool 'upnp-io' of size 5
2018-09-27 14:48:38.735 [DEBUG] [rthome.core.common.ThreadPoolManager] - Created scheduled thread pool 'homematicGateway' of size 5
2018-09-27 14:48:39.313 [DEBUG] [rthome.core.common.ThreadPoolManager] - Created thread pool 'homematicRpc' with size 5

Question is, how do i adjust the thread pool for the rule engine? From the log entries it should be the variant with the uppercase ‘R’.

runtime.cfg can be changed during runtime, it’s at least reread. But the RuleEngine-Threadpool-property is ignored. :frowning:

Almost certainly not if your description is accurate. Of course, this is also assuming you don’t have other Rules in your system that are consuming threads for extended periods of time. But if you have ANY Rules that take a long time to run then it will impact all your Rules.

You probably have to restart OH for that field to be taken into account. That property is probably only processed during OH startup when it initially creates the thread pool.

You can easily check to see how many of the thread pools are being used from the command line. See post 12 on Why have my Rules stopped running? Why Thread::sleep is a bad idea.

Be sure to look at events.log and verify that the events are occurring in a timely fashion. If not then the problem is with the binding, not Rules.

What kind of accuracy are you interested in? That one motion sensor command alone triggers 4 rule executions? That’s accurate. You can be sure many other things are also happening.

  • About 30 temperature sensors are reporting their temperature and calculating a difference about every 7 minutes, so about every 14 seconds another thread is used.
  • shortly after the frist motion is detected, a second and a third motion detector should trigger:another 8 threads. All short, but it sums up.

I did. As i said, i see the ThreadPoolManager pool create message, but whatever value i put in, it is ignored. It always creates the pool with 5.

A short check showed 4 threads was maximum, but it is always at least 1 thread running. If the grepping by “rule” is correct.

Yes they are. The state change events are nicely synchronized, it’s just the rule which doesn’t start accordingly.

Still, the question is, how to enlarge the rule engine thread pool.

Everything else that is going on also matters. But based on what you describe, it seems unlikely that running out of Threads is your problem. The Rules simply don’t run long enough or often enough to cause a 20 second delay unless you have a feedback loop or long running commands that haven’t been mentioned yet.

All I know is that parameter in runtime.cfg.

Here is another posting taking about increasing the pool.

@5iver, do you have any ideas?

When changing the runtime.cfg based on this info… Cron based rules suddenly stop executing, I see an immediate change using the Karaf command in that post. I’ve used these settings for about a year, and my server has never had an issue with load…

openhab> config:list "(service.pid=org.eclipse.smarthome.threadpool)"
----------------------------------------------------------------
Pid:            org.eclipse.smarthome.threadpool
BundleLocation: null
Properties:
   discovery = 55
   ruleEngine = 55
   safeCall = 55
   service.pid = org.eclipse.smarthome.threadpool
   thingHandler = 55

But no matter how many threads I added to the pool, I always only saw 5 threads.

Before running rules:

openhab> shell:threads --list |grep "RuleEngine"
82224 │ pipe-grep "RuleEngine"                                                                         │ RUNNABLE      │ 56       │ 40

After running rules:

openhab> shell:threads --list |grep "RuleEngine"
82223 │ pipe-grep "RuleEngine"                                                                         │ TIMED_WAITING │ 59       │ 40
82249 │ ESH-RuleEngine-26                                                                              │ TIMED_WAITING │ 1        │ 0
82250 │ ESH-RuleEngine-27                                                                              │ TIMED_WAITING │ 1        │ 0
82251 │ ESH-RuleEngine-28                                                                              │ TIMED_WAITING │ 1        │ 0
82252 │ ESH-RuleEngine-29                                                                              │ TIMED_WAITING │ 1        │ 0
82253 │ ESH-RuleEngine-30                                                                              │ TIMED_WAITING │ 1        │ 0

Rules (repeated 10 times)

rule "Test1"
when
    Item Virtual_Switch_1 received command
then
    logDebug("Rules","Test1: Start")
    Thread::sleep(10000)
    logDebug("Rules","Test1: End")
end

Log:

2018-09-27 23:00:49.340 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - Test2: Start
2018-09-27 23:00:49.343 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - Test3: Start
2018-09-27 23:00:49.345 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - Test6: Start
2018-09-27 23:00:49.346 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - Test10: Start
2018-09-27 23:00:49.348 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - Test9: Start
2018-09-27 23:00:59.342 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - Test2: End
2018-09-27 23:00:59.346 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - Test3: End
2018-09-27 23:00:59.352 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - Test4: Start
2018-09-27 23:00:59.349 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - Test9: End
2018-09-27 23:00:59.348 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - Test6: End
2018-09-27 23:00:59.347 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - Test10: End
2018-09-27 23:00:59.359 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - Test5: Start
2018-09-27 23:00:59.346 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - Test8: Start
2018-09-27 23:00:59.360 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - Test7: Start
2018-09-27 23:00:59.361 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - Test1: Start
2018-09-27 23:01:09.359 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - Test4: End
2018-09-27 23:01:09.361 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - Test8: End
2018-09-27 23:01:09.362 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - Test1: End
2018-09-27 23:01:09.361 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - Test5: End
2018-09-27 23:01:09.362 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - Test7: End

So, for some reason OH is not using the configured number of thread pools. Looks like a bug to me, so I opened an issue. I’m using snapshot 1372.

In case anyone is curious, JSR223-Jython does not have the same thread pool limitations, I setup 10 of these rules in JSR223-Jython (note the cool new decorators)…

@rule("Test: Test 1")
@when("Item Virtual_Switch_1 received command")
def threadTest1(event):
    log.debug("JSR223: Test 1: start")
    sleep(10)
    log.debug("JSR223: Test 1: end")

And here is the log:

2018-09-27 22:48:58.129 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - JSR223: Test 2: start
2018-09-27 22:48:58.130 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - JSR223: Test 9: start
2018-09-27 22:48:58.132 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - JSR223: Test 4: start
2018-09-27 22:48:58.134 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - JSR223: Test 8: start
2018-09-27 22:48:58.134 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - JSR223: Test 1: start
2018-09-27 22:48:58.135 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - JSR223: Test 10: start
2018-09-27 22:48:58.136 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - JSR223: Test 6: start
2018-09-27 22:48:58.137 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - JSR223: Test 7: start
2018-09-27 22:48:58.140 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - JSR223: Test 5: start
2018-09-27 22:48:58.141 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - JSR223: Test 3: start
2018-09-27 22:49:08.131 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - JSR223: Test 9: end
2018-09-27 22:49:08.132 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - JSR223: Test 2: end
2018-09-27 22:49:08.133 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - JSR223: Test 4: end
2018-09-27 22:49:08.134 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - JSR223: Test 8: end
2018-09-27 22:49:08.135 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - JSR223: Test 1: end
2018-09-27 22:49:08.135 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - JSR223: Test 10: end
2018-09-27 22:49:08.137 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - JSR223: Test 6: end
2018-09-27 22:49:08.137 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - JSR223: Test 7: end
2018-09-27 22:49:08.141 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - JSR223: Test 5: end
2018-09-27 22:49:08.141 [DEBUG] [org.eclipse.smarthome.model.script.Rules] - JSR223: Test 3: end
2 Likes

Thanks. The lower case “r” in ruleEngine helped. Now i have a larger pool, but like you, i just see 5 threads all the time.

I think i should try the JSR223 stuff. Do I need the snapshot or would 2.3 be ok to use these decorators?

These properties are case insensitve. Latest property wins.

With all the changes in the ESH Automation API, you should use the latest snapshot if you want to use JSR223. It can be used along side the Rules DSL. Those decorators are a preview :wink:! I hope to get a PR into the OH-Jython-Scripters repo this weekend. Everything is working, but there’s one more thing I’d like to improve.

Did you check the issue? Looks like the runtime.cfg entry should be…

org.eclipse.smarthome.threadpool:RuleEngine=NN

… but still does not work for me. Actually, the case doesn’t seem to matter… Karaf always shows the update, but only 5 rules can run at a time.

So, issue is reopened.

Thanks, I will test the JSR223 stuff when i move to the snapshot then.

1 Like

I’ll update the why have my rules stopped post. Thanks for chiming in.

Check the PR first. Looks like it is not case sensitive, and they’ll be changing the format to match the other pools.

1 Like

I’ve subscribed for updates and will watch it with interest. Thanks!

@job, do you have ruleEngine or RuleEngine in your /userdata/config/org/eclipse/smarthome/threadpool.config? As you’ll see in this comment, I changed mine to RuleEngine, restarted OH, and was then able to modify the rule engine threadpool size through the /conf/services/runtime.cfg.

It’s not clear how the capitalization issue happened, but I wonder how many people have changed their runtime.cfg and never realized that the changes hadn’t actually gone through.

1 Like

And figured that their system was running better for the change. Placebo is pretty powerful. :smiley:

1 Like

@5iver, it was the same on my side. Lower case ‘r’ in the internal config file. Changing that to uppercase made the propertywork.

This will be fun when the dev change the case of the ‘r’ to lowercase to be in line with the other options. :wink:

Now my threadpool is enlarged and i will observe the stuff a bit.

2 Likes