CPU Load - suddenly high from a single rule

NCO · March 23, 2019, 9:45am

All,

I recently changed some rules and uploaded them to my OH 2.4 stable running on openhabian.
After the reload of the rules, the CPU load went up to about 40 to 80% (usually 5 to 15% with some peaks up to 40%).

For troubleshooting I removed all rules and copied them back one by one.
It turned out, that a single rule within a rule file causes this CPU load jump:

rule "Retrieve car response"
when
Item CarUpdate received command OFF or
Time cron "15 */5 * * * ?"
then
CarResponse.postUpdate("updating...")
var String json = executeCommandLine("/usr/local/bin/voc -u 'username' -p 'password' dashboard", 20000)
CarResponse.postUpdate(json)
if(CarUpdate.state != ON) CarUpdate.postUpdate(ON)
end

For verification, I move this rule to another rule file, but the error migrated accordingly.

It’s a simple rule, so I wonder, where this might come from.
Is there anything obvious in this rule, which might cause higher CPU load?

OMR · March 25, 2019, 1:21pm

Start your rule body with:
logInfo("Test", "Car response1")
and just after the executecommandLine as well
logInfo("Test", "Car response2")

and check with:
tail -F /var/log/openhab2/openhab.log

Verify that the rule is not called more often than it should, and that your script actually returns.

NCO · March 27, 2019, 12:31pm

Thanks, I will try.
But actually without the temporary postUpdate:

The CPU load is on a regular level.
(in the consecutive rule i used to have the following, whicih I also removed:

if(CarResponse.state == "updating...") {
logInfo("car_state.rules", "json - updating... processing skipped!")
return; // failing fast
}

NCO · March 27, 2019, 1:24pm

After adding your suggestions and using the temporary postUpdate again the CPU load went up from ~12% to 45%.

The log shows accordingly every 5 minutes the following sequence, which looks ok to me:

2019-03-27 14:10:15.027 [INFO ] [arthome.model.script.car_state.rules] - request CarResponse...
2019-03-27 14:10:15.037 [INFO ] [.eclipse.smarthome.model.script.Test] - Car response1
2019-03-27 14:10:15.112 [INFO ] [arthome.model.script.car_state.rules] - json - updating... processing skipped!
2019-03-27 14:10:28.580 [INFO ] [.eclipse.smarthome.model.script.Test] - Car response2
2019-03-27 14:10:28.652 [INFO ] [arthome.model.script.car_state.rules] - CarResponse valid - processing started.

I also verified that no other rule is triggered with “CarResponse changed / updated”

rossko57 · March 27, 2019, 1:42pm

So it takes about 13 seconds to fetch the json, during which time the rule is paused, waiting for it to complete.

This all seems normal enough.

Tell us more about this ‘consecutive’ rule, presumably it analyses the json … but why does it kick off before json fetch complete?

This all sounds like it might benefit from setting up with exec binding, where you trigger a Thing to do the the work, and get a state change when its done.

NCO · March 27, 2019, 1:47pm

the consecutive rule looks like this:

rule "Evaluate car response"
when
Item CarResponse changed // received update
then
if(previousState == NULL) return;
if(CarResponse.state == "updating...") {
logInfo("car_state.rules", "json - updating... processing skipped!")
return; // failing fast
}
// if car offline -&gt; return
if(CarResponse.state.toString.contains("Could not connect to the server")) {
if(CarThing.state != OFF) CarThing.postUpdate(OFF)
logInfo("car_state.rules", "json - Could not Connect - Pulpo Status processing skipped!")
return; // failing fast
}
if(CarResponse.state == "") {
if(CarThing.state != OFF) CarThing.postUpdate(OFF)
logInfo("car_state.rules", "json - empty - Pulpo Status processing skipped!")
return; // failing fast
}
if(CarResponse.state === null) {
if(CarThing.state != OFF) CarThing.postUpdate(OFF)
logInfo("car_state.rules", "json - null - Pulpo Status processing skipped!")
return; // failing fast
}
logInfo("car_state.rules", "CarResponse valid - processing started.")
...

rossko57 · March 27, 2019, 2:38pm

Okay, that seems straightforward.

Your original rule seems to be doing what it is supposed to do, hang up and wait for a long external process.

If you don’t want the rule to hang up, I would recommend using the exec binding to offload the job asynchronously.

NCO · March 28, 2019, 5:18am

So you mean the hang-up (waiting) will cause the cpu load?

rossko57 · March 28, 2019, 10:39am

I don’t know; you can see whatever it is you are measuring and when that occurs. What do you think?

NCO · March 28, 2019, 11:37am

I will try to replace my approach with an external execute commendline and will see what happens.
Thanks for your help digging into this issue.