Is Groovy the most native scripting language for OH?

@Johnno

I added a hint to the README :slight_smile:

1 Like

Here are all 3 tests in Ruby:

  • With a local variable (passed through the closure)
  • With a Ruby instance variable (the functional equivalent of "private cache", but Ruby-native)
  • With shared_cache

I didn't try method pointer caching; I suspect it could be worse than just the normal way.

gemfile do
  source "https://rubygems.org"
  require "benchmark"
end

counter = 0.2
rate = 3.54409

tests = {
  direct_var: proc do
    counter = 0.2
    1_000_000.times { counter = rate * counter * (1 - counter) }
  end,

  # This is the equivalent of "private cache" in Ruby
  with_instance_var: proc do
    @counter = 0.2
    1_000_000.times { @counter = rate * @counter * (1 - @counter) }
  end,

  with_shared_cache: proc do
    counter = 0.2
    shared_cache["ruby_counter"] = counter
    100_000.times do
      counter = shared_cache["ruby_counter"]
      counter = rate * counter * (1 - counter)
      shared_cache["ruby_counter"] = counter
    end
  end
}

tests.each do |name, test|
  times = []
  10.times do
    results = Benchmark.measure(&test)
    times << (results.real * 1000)
  end

  min = times.min
  max = times.max
  avg = times.sum / times.size
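  # sample standard deviation (n - 1 in the denominator)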
  stddev = Math.sqrt(times.map { |t| (t - avg)**2 }.sum / (times.size - 1))

  stats = "Min: #{min}, Max: #{max}, Avg: #{avg}, StdDev: #{stddev}"
  logger.info("Performance Test: JRuby #{name} over 10 runs - #{stats}")
end

The results on my system:

| Test | Min (ms) | Max (ms) | Avg (ms) | StdDev (ms) |
|---|---|---|---|---|
| JRuby direct_var | 59.69169 | 118.49086 | 72.26167 | 17.30205 |
| JRuby with_instance_var | 46.64773 | 127.97687 | 65.01156 | 25.70937 |
| JRuby with_shared_cache | 86.90746 | 304.18391 | 121.51239 | 65.35198 |

I'd be interested to see the comparison for the other 3 scripting languages!

I ran the tests a few times to get a good pool of data:

Performance Test: JRuby direct_var over 10 runs - Min: 59.77494700027819, Max: 972.2754539998277, Avg: 224.44624529994144, StdDev: 343.3435493896444
Performance Test: JRuby with_instance_var over 10 runs - Min: 51.71431000007942, Max: 148.5308750002332, Avg: 70.18336950009143, StdDev: 28.021690845482276
Performance Test: JRuby with_shared_cache over 10 runs - Min: 93.13241100016967, Max: 1418.0885140003738, Avg: 252.0667751000019, StdDev: 416.2802546432774

Performance Test: JRuby direct_var over 10 runs - Min: 56.88069500001802, Max: 318.3457830000407, Avg: 148.63553480004157, StdDev: 109.19852351433106
Performance Test: JRuby with_instance_var over 10 runs - Min: 65.13245900032416, Max: 180.67353800006458, Avg: 78.52545400014606, StdDev: 36.072321257767264
Performance Test: JRuby with_shared_cache over 10 runs - Min: 94.63462199983042, Max: 380.751489999966, Avg: 126.22948079992966, StdDev: 89.60836106405517

Performance Test: JRuby direct_var over 10 runs - Min: 61.96413600036976, Max: 348.3102499999404, Avg: 137.58803659998193, StdDev: 115.31943290868999
Performance Test: JRuby with_instance_var over 10 runs - Min: 70.8377850000943, Max: 115.04310500004067, Avg: 77.41909719998148, StdDev: 13.877835612044093
Performance Test: JRuby with_shared_cache over 10 runs - Min: 95.3076380001221, Max: 350.4971600000317, Avg: 123.05286670002715, StdDev: 79.98753543054225

Performance Test: JRuby direct_var over 10 runs - Min: 58.436179000182165, Max: 140.45948000011776, Avg: 69.13643700004286, StdDev: 26.111232353872058
Performance Test: JRuby with_instance_var over 10 runs - Min: 65.80606899979102, Max: 123.53140100003657, Avg: 73.08309289996942, StdDev: 17.99344749515333
Performance Test: JRuby with_shared_cache over 10 runs - Min: 92.99507200012158, Max: 349.2820279998341, Avg: 124.77302929996767, StdDev: 80.01232656292358

Performance Test: JRuby direct_var over 10 runs - Min: 58.49517900014689, Max: 332.3918980004237, Avg: 87.5094650000392, StdDev: 86.17082447108734
Performance Test: JRuby with_instance_var over 10 runs - Min: 51.03279000013572, Max: 101.47395400008463, Avg: 61.68899290000809, StdDev: 15.460840480939908
Performance Test: JRuby with_shared_cache over 10 runs - Min: 93.92476900029578, Max: 365.690060000361, Avg: 133.12143970006218, StdDev: 83.13551406585765

Performance Test: JRuby direct_var over 10 runs - Min: 56.0829940000076, Max: 125.28985500011913, Avg: 65.2007956000034, StdDev: 21.859571930349905
Performance Test: JRuby with_instance_var over 10 runs - Min: 54.50900599998931, Max: 95.18252099996971, Avg: 65.27544320001653, StdDev: 12.460975017093016
Performance Test: JRuby with_shared_cache over 10 runs - Min: 99.36349000008704, Max: 367.04925999993065, Avg: 137.9926334999709, StdDev: 81.8816941356595
1 Like

Also for fairness, these are the times for the other scripts today:

As before, runs with no cache reads/writes are looped 1,000,000x, while cache runs loop 100,000x.

JS cache 253ms
JS no cache 116ms

Python cache 360ms
Python no cache 617ms

Groovy cache 465ms
Groovy no cache 5-9ms

I think the big takeaway is that cache reads are expensive no matter the language, and probably other API calls are too. The no-cache runs give an idea of the raw math performance.

1 Like

I like that you all are doing these tests, but I want to mention something for future readers who might over-interpret these results.

In an average rule, the cache will only be accessed a handful of times, while these tests measure hundreds of thousands of accesses per run. So the practical impact on an average rule will be much less than the numbers imply.

For example, while JS seems to run twice as slowly when using the cache, the difference between an average rule that uses it and an average rule that doesn't will be much less pronounced.

But I do like the idea of updating the docs to mention that, if one is experiencing a performance problem, assigning a cached entry to a local variable is a good thing to try first. A few accesses of the cache in the same rule isn't going to suddenly make it slow beyond belief.
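To make that concrete, here is a minimal sketch of the pattern, in Ruby to match the scripts above (the cache key and loop are purely illustrative):

# Read the cached entry once into a local variable, work on the local copy,
# then write it back once at the end, instead of touching the cache on every iteration.
counter = shared_cache["ruby_counter"] || 0.2  # single cache read
10.times { counter = 3.54409 * counter * (1 - counter) }
shared_cache["ruby_counter"] = counter         # single cache write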

4 Likes

I was wondering why the instance variable version is faster than the non-instance-var version, and I suspected that closure resolution might have something to do with it. So here's a revised version that avoids constantly updating a variable from the enclosing closure scope, and yes, it's now faster than the instance var version, albeit only by a bit.

Also adding a warmup run makes a big difference in the stddev and the max times.

gemfile do
  source "https://rubygems.org"
  require "benchmark"
end

tests = {
  direct_var: proc do
    rate = 3.54409
    counter = 0.2
    1_000_000.times do
      counter = rate * counter * (1 - counter)
    end
  end,

  # This is the equivalent of "private cache" in Ruby
  with_instance_var: proc do
    @counter = 0.2
    rate = 3.54409
    1_000_000.times { @counter = rate * @counter * (1 - @counter) }
  end,

  with_shared_cache: proc do
    rate = 3.54409
    shared_cache["ruby_counter"] = 0.2
    100_000.times do
      counter = shared_cache["ruby_counter"]
      counter = rate * counter * (1 - counter)
      shared_cache["ruby_counter"] = counter
    end
  end
}

tests.each do |name, test|
  times = []

  test.call # warmup

  10.times do
    results = Benchmark.measure(&test)
    times << (results.real * 1000)
  end

  min = times.min
  max = times.max
  avg = times.sum / times.size
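  # sample standard deviation (n - 1 in the denominator)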
  stddev = Math.sqrt(times.map { |t| (t - avg)**2 }.sum / (times.size - 1))

  stats = "Min: #{min}, Max: #{max}, Avg: #{avg}, StdDev: #{stddev}"
  logger.info("Performance Test: JRuby #{name} over 10 runs - #{stats}")
end

The results on my system:

| Test | Min (ms) | Max (ms) | Avg (ms) | StdDev (ms) |
|---|---|---|---|---|
| JRuby direct_var | 40.75746 | 52.68585 | 44.56999 | 3.88265 |
| JRuby with_instance_var | 50.24885 | 67.74754 | 58.10986 | 6.17004 |
| JRuby with_shared_cache | 85.04925 | 155.85034 | 109.55998 | 21.09122 |

This is from the test without warmup (done by @Johnno in this post and this):

| Language | With Cache | Without Cache |
|---|---|---|
| Groovy | 465 ms | 5–9 ms |
| JRuby | 123 ms | 78 ms |
| JS | 253 ms | 116 ms |
| Python | 360 ms | 617 ms |

The whole "warmup" concept is interesting too. The data points I've given are mostly from after a few runs, once the script has settled into a consistent result.

I booted up the system and ran the 10^6-loop no-cache scripts a few times for each language until the execution time settled (all times in ms):

| Language | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 |
|---|---|---|---|---|---|
| JS | 150 | 90 | | | |
| Python | 1300 | 850 | 620 | | |
| Groovy | 1100 | 500 | 7 | | |
| JRuby | 1350 | 640 | 720 | 250 | 130 |

So I still have a lot to learn about what is going on under the hood, in terms of the script being loaded into memory and whether the JVM optimises repeatedly executed code. E.g. Groovy is quite slow for the initial runs, but then shows a 50-100x improvement in speed.

I haven't used JRuby as much since I haven't coded in Ruby before, but I will probably read up on it more now, since it does seem to have pretty good performance, which I am guessing is because it is Ruby compiled to run on the JVM?

Going on from what Rich was saying, and speaking from personal experience: when I first started using OH there was very little guidance on why there are multiple scripting environments, or on what to use. Hence I wrote everything in the default install option of JS. Later I figured out the reason for multiple scripting languages is "because the add-on developer thinks it's awesome and wants to make one".

For newcomers it may end up being a choice lost in the sea of choices, so hopefully this thread illustrates a few particulars of each. JRuby seems to have quite efficient cache access and generally good computational performance. Groovy very much excels at computation, but lacks the higher-level built-in functions that are present in JS or Python. Hopefully that is a fair assessment.

Is that the execution time? Weird that JRuby is doing poorly there. That doesn’t match your earlier results.

To do a "warmup" you have to run the same code path within the same engine instance. That's what my last script above did. If you reload the script, it creates a whole new scripting engine instance all over again, so that's not a warmup.
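As a rough sketch (the work block here is made up purely for illustration), a warmup within a single script run just means timing the same proc more than once and comparing the first, cold pass against a later, already-warm one:

require "benchmark"

work = proc { 1_000_000.times { |i| Math.sqrt(i) } }

cold = Benchmark.measure(&work).real * 1000  # first pass pays the warm-up cost
warm = Benchmark.measure(&work).real * 1000  # second pass runs in the same, already-warm engine
logger.info("cold: #{cold.round(2)} ms, warm: #{warm.round(2)} ms")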

JRuby was written from the ground up in Java. It is now developed "in parallel", keeping up with the changes in "C-Ruby" (the one written in C, a.k.a. the standard Ruby, a.k.a. MRI).

Many things in JRuby and the openHAB helper library are direct interfaces to the actual openHAB Java objects instead of wrappers.
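For example (the item name here is hypothetical), the object you get back from the helper library is tied to the real Java item rather than a copy:

item = items["Kitchen_Light"]  # the helper library's item object
logger.info("#{item.name} is currently #{item.state}")  # name and state come straight from the underlying Java item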

Indeed. JRuby was created by Rubyists, people who LOVE the Ruby language. So it was designed to make a Rubyist feel most at home, instead of making it just a 1-to-1 port of the Jython helper library at the time. For every feature "we" asked ourselves, "how should this be done in the Ruby world?".

When the JRuby helper library was first introduced, I had never programmed in Ruby and didn't know anything about it. But I fell in love with the language and joined the team to help maintain the library along with @ccutrer, who is a master of Ruby and a great programmer in general. I've learned a lot since then, even though I still have a lot more to learn.

The performance is not why we use Ruby, although it is an important factor in the design of the helper library. Ruby is why we want to use Ruby :slight_smile:

The JRuby helper library, combined with the Ruby language itself, makes writing openHAB automation so much fun and so much easier. I think having to learn Ruby, if you're not familiar with it, is a price very much worth paying. I now feel depressed when having to code in another language, missing all the niceties offered by Ruby.

1 Like

Regarding the large spread of execution times, I had noticed it with your script. I guessed that the script pulling in remote code from rubygems.org might be a culprit, even though from the script structure it shouldn't affect the code being benchmarked.

I changed it to go off the system clock and right away it runs between 85-95 ms, with the occasional run or two going up to 120-170 ms. I have noticed that behaviour with other scripting languages too. The technical term would be that OH is doing 'other stuff' :grinning_face_with_smiling_eyes:

counter = 0.2
rate = 3.54409
start_time = Time.now.to_f
1_000_000.times { counter = rate * counter * (1 - counter) }
end_time = Time.now.to_f
logger.info("Performance Test JRuby: #{(end_time - start_time)*1000} milliseconds")

For JS and Python, we use GraalJS and GraalPython respectively, both from Oracle's GraalVM project. Both languages are reimplementations in Java, so both benefit from warm-up, as this allows the JVM to optimise the code.
Performance at the moment is not at the theoretical maximum: because openHAB runs on stock OpenJDK builds and not the GraalVM JVM, those languages run in interpreter mode only. They are actually built to be compiled by the GraalVM compiler. According to Oracle, GraalVM performance can exceed Node.js in some cases.
If you are curious you could try running on GraalVM and benchmark there.
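If you want to check which JVM your openHAB instance is actually running on before benchmarking, the standard Java system properties are readable from any of the scripting languages; for instance, in Ruby to match the snippets above:

vm_name = java.lang.System.get_property("java.vm.name")        # distinguishes a stock OpenJDK VM from GraalVM
vm_version = java.lang.System.get_property("java.vm.version")
logger.info("Running on #{vm_name} #{vm_version}")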

Everyone who is able to create an automation add-on may do so; this is how we ended up with so many languages. I think this is a great development, as people can use what they are used to or what they like, and they aren't forced to learn a specific language. But for new users this might be quite confusing.
For my part, I use JS as it's similar to Java and it's one of my major languages, but I have to admit other languages have nice features to offer (either in the language itself or in the helper library).

BTW: A few days ago a PR was opened that adds support for using Java as an automation/scripting language as well. Kind of funny that we supported so many other languages before supporting Java, even though openHAB is written in Java.

1 Like

Updated table please :folded_hands:

I just re-ran the script many times, around once per second. As you can see, 85-95 ms is probably a fair baseline, with intermittent spikes.

I may have to pick this up tomorrow after a cold boot to see if the script exhibits the same behaviour on first run.

122, 84, 88, 87, 160, 86, 87, 86, 90, 126, 88, 87, 124, 88, 193, 138, 89, 111, 93, 133, 110, 99

@Johnno How is it possible that Python is slower without the cache than with the cache?

In terms of "native", may I interest you in the java223 automation bundle :wink:?

You cannot be more native than that.
(And as such, I would be very surprised if it doesn't rank #1 in performance.)

I made a PR yesterday and hope to make it an official openHAB JSR223 script.

3 Likes

Cached runs are looped 100,000x, while non-cache runs are looped 1,000,000x. So I think the no-cache run is now coming up against the native computational performance of Python itself.
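A quick back-of-the-envelope check with the Python numbers posted earlier (360 ms cached, 617 ms uncached) shows the per-iteration cost still points the expected way:

logger.info("Cached: #{(360.0 / 100_000 * 1000).round(2)} microseconds per iteration")      # => 3.6
logger.info("Uncached: #{(617.0 / 1_000_000 * 1000).round(3)} microseconds per iteration")  # => 0.617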

I’ll do some logging today with the Pi 5 freshly started.

| Language | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Run 6 | Run 7 | Run 8 | Run 9 | Run 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| No Cache, 10^6 loops | | | | | | | | | | |
| Groovy | 1036 | 516 | 6 | 8 | 8 | 8 | 9 | 9 | 10 | 9 |
| JS | 137 | 94 | 95 | 94 | 94 | 104 | 126 | 98 | 94 | 94 |
| Python | 616 | 587 | 602 | 631 | 619 | 645 | 581 | 578 | 606 | 618 |
| JRuby | 230 | 203 | 164 | 119 | 169 | 266 | 130 | 198 | 108 | 160 |
| Cache, 10^5 loops | | | | | | | | | | |
| Groovy | 674 | 1183 | 729 | 721 | 648 | 699 | 673 | 649 | 679 | 639 |
| JS | 414 | 298 | 244 | 307 | 276 | 245 | 248 | 275 | 240 | 303 |
| Python | 516 | 316 | 340 | 315 | 398 | 376 | 326 | 319 | 312 | 358 |
| JRuby | 593 | 240 | 181 | 194 | 167 | 173 | 170 | 193 | 211 | 170 |

Here's the data for the first 10 runs of each script on a newly booted system (all times in ms). I also modified the cache script for Ruby to remove the remote code dependency and run off the system clock, just like the other script.

1 Like

I would group/sort the table by cache and no cache.

1 Like