executeCommandLine very slow

jjmeseguer · March 20, 2023, 2:40pm

Platform information:
- Hardware: x86_64/16 GB
- OS: Ubuntu 22.04.2 LTS
- Java Runtime Environment: OpenJDK 64-Bit Server VM Zulu11.62+17-CA (build 11.0.18+10-LTS, mixed mode)
- openHAB version: 3.4.2

Hi,
Since I upgraded from OH 2.5 to OH3 I’m experiencing a extreme slowness when executing the first command in a rule with executeCommandLine. A single Linux “echo” command takes from 20 seconds to a minute and a half (variable), but subsequent commands in the same rule run normally fast.

As an example:

rule "Slow"
when
    Item Slow changed
then
    logInfo("SLOW_CMD","Start echo SLOW 1")
    logInfo("SLOW_CMD",executeCommandLine(Duration.ofSeconds(300),"time","echo","SLOW"))
    logInfo("SLOW_CMD","End echo SLOW 1")

    logInfo("SLOW_CMD","Start echo SLOW 2")
    logInfo("SLOW_CMD",executeCommandLine(Duration.ofSeconds(300),"time","echo","SLOW"))
    logInfo("SLOW_CMD","End echo SLOW 2")
end

I’m executing “time echo SLOW”, which prints “SLOW” and tells me how long it took. In the Linux console I get this:

SLOW

real    0m0,000s
user    0m0,000s
sys     0m0,000s

As you can see, the “echo” command takes almost no time to execute.

But in openHAB I’m getting results like this:

2023-03-20 15:33:28.639 [INFO ] [g.openhab.core.model.script.SLOW_CMD] - Start echo SLOW 1
2023-03-20 15:34:17.151 [INFO ] [g.openhab.core.model.script.SLOW_CMD] - SLOW
0.00user 0.00system 0:00.00elapsed 91%CPU (0avgtext+0avgdata 2080maxresident)k
0inputs+0outputs (0major+88minor)pagefaults 0swaps

2023-03-20 15:34:17.151 [INFO ] [g.openhab.core.model.script.SLOW_CMD] - End echo SLOW 1
2023-03-20 15:34:17.151 [INFO ] [g.openhab.core.model.script.SLOW_CMD] - Start echo SLOW 2
2023-03-20 15:34:17.158 [INFO ] [g.openhab.core.model.script.SLOW_CMD] - SLOW
0.00user 0.00system 0:00.00elapsed 86%CPU (0avgtext+0avgdata 2060maxresident)k
0inputs+0outputs (0major+87minor)pagefaults 0swaps

2023-03-20 15:34:17.159 [INFO ] [g.openhab.core.model.script.SLOW_CMD] - End echo SLOW 2

The first echo command was immediate according to time (and the time command output is different, I don’t know why) but it took openHAB almost 50 seconds to return from execution.

The second echo command was fast.

Anybody else is experiencing this? Is there any way to debug it?

Thanks.

rlkoshak · March 20, 2023, 3:06pm

I’ve not seen any other reports of this behavior. What’s the status of your machine? It’s not out of RAM and using swap?

It is known that in OH 3 the first time a rule executes can take longer than subsequent runs but 20-60 seconds is way outside of the times I’ve seen reported.

What else is OH doing when this rule first runs? Is OH fully up and settled or do you trigger it while it’s still loading Things and such?

jjmeseguer · March 20, 2023, 3:20pm

Hi Rich,
No, there’s RAM and swap available:

cat /proc/meminfo
MemTotal:       16327316 kB
MemFree:         1070828 kB
MemAvailable:    5470904 kB
Buffers:          682948 kB
Cached:          3891080 kB
SwapCached:       194636 kB
Active:          4794716 kB
Inactive:        9133252 kB
Active(anon):    1810636 kB
Inactive(anon):  7829364 kB
Active(file):    2984080 kB
Inactive(file):  1303888 kB
Unevictable:          96 kB
Mlocked:              96 kB
SwapTotal:      16777212 kB
SwapFree:       15358460 kB
Dirty:               156 kB
Writeback:             0 kB
AnonPages:       9280356 kB
Mapped:          1302840 kB
Shmem:            295612 kB
KReclaimable:     447504 kB
Slab:             763136 kB
SReclaimable:     447504 kB
SUnreclaim:       315632 kB
KernelStack:       46192 kB
PageTables:        97028 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    24940868 kB
Committed_AS:   28261932 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      123564 kB
VmallocChunk:          0 kB
Percpu:            48128 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
FileHugePages:         0 kB
FilePmdMapped:         0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:               0 kB
DirectMap4k:     1761048 kB
DirectMap2M:    12867584 kB
DirectMap1G:     3145728 kB

And OH is settled. It always happens, not only from time to time.
How can I know what OH is doing? Lots of items get updated an rules run in the background.
But, why does it take time the first time executeCommandLine is run and not the others?
Thank you.

rlkoshak · March 20, 2023, 4:07pm

It’s not how much swap is available. If swap is being used at all that’s going to result in a slow machine because it’s paging stuff that should be in RAM out to disk and disk is slow.

The logs. Also check the log4j2.xml file where a lot of logging is disabled. You can turn up the logging by setting logger to DEBUG or TRACE to see more.

I’ve never seen this and I’ve never seen it reported by others. All I can go on are generic ideas on things that could cause it.

jjmeseguer · March 20, 2023, 8:12pm

Hi,
I’ve figured it out!
I rebooted, turned swap off and everything was the same.
So I disabled all the rules except this one and it worked well, so it had to be a problem with the rules.
I enabled them one by one until the problem reappeared.
Turns out I had some rules from a long time ago to measure network latencies that were launched every minute and performed pings for a whole minute, so they were always running, and maybe some of them more than once.
I’ve changed that to the latency channel in the network:pingdevice things and everything is back to normal.
Thank you very much!

rlkoshak · March 20, 2023, 8:42pm

In OH 3 that can’t happen any more. Only one instance of any given rule can run at a given time. Subsequent triggers of the rule will be queued up and worked off in sequence.

I’m glad you figure it out!