I added a hint to the README.
Here are all 3 tests in Ruby:
- With local variable (passed through closure)
- With Ruby instance var (the functional equivalent of "private cache", but Ruby-native)
- With shared_cache
I didn't try method pointer caching; I suspect it could be worse than just the normal way.
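(For context, the update expression `rate * counter * (1 - counter)` is the logistic map, so the loop body is pure floating-point work.) A minimal sketch of what that untried method-pointer variant might look like — purely hypothetical, the names are made up and this was not benchmarked:

```ruby
# Hypothetical "method pointer" variant (not benchmarked in this thread):
# look up a bound Method object once, then call it inside the loop
# instead of inlining the expression.
def logistic_step(rate, counter)
  rate * counter * (1 - counter)
end

stepper = method(:logistic_step) # the "method pointer", resolved once
counter = 0.2
rate = 3.54409
1_000_000.times { counter = stepper.call(rate, counter) }
```

Here is the benchmark script itself: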
```ruby
gemfile do
  source "https://rubygems.org"
  require "benchmark"
end

counter = 0.2
rate = 3.54409

tests = {
  direct_var: proc do
    counter = 0.2
    1_000_000.times { counter = rate * counter * (1 - counter) }
  end,
  # This is the equivalent of "private cache" in Ruby
  with_instance_var: proc do
    @counter = 0.2
    1_000_000.times { @counter = rate * @counter * (1 - @counter) }
  end,
  with_shared_cache: proc do
    counter = 0.2
    shared_cache["ruby_counter"] = counter
    100_000.times do
      counter = shared_cache["ruby_counter"]
      counter = rate * counter * (1 - counter)
      shared_cache["ruby_counter"] = counter
    end
  end
}

tests.each do |name, test|
  times = []
  10.times do
    results = Benchmark.measure(&test)
    times << (results.real * 1000)
  end
  min = times.min
  max = times.max
  avg = times.sum / times.size
  stddev = Math.sqrt(times.map { |t| (t - avg)**2 }.sum / (times.size - 1))
  stats = "Min: #{min}, Max: #{max}, Avg: #{avg}, StdDev: #{stddev}"
  logger.info("Performance Test: JRuby #{name} over 10 runs - #{stats}")
end
```
The results on my system:
| Test | Min (ms) | Max (ms) | Avg (ms) | StdDev (ms) |
|---|---|---|---|---|
| JRuby direct_var | 59.69169 | 118.49086 | 72.26167 | 17.30205 |
| JRuby with_instance_var | 46.64773 | 127.97687 | 65.01156 | 25.70937 |
| JRuby with_shared_cache | 86.90746 | 304.18391 | 121.51239 | 65.35198 |
I'd be interested to see the comparison for all 3 scripting languages!
I ran the tests a few times to get a good pool of data:
Performance Test: JRuby direct_var over 10 runs - Min: 59.77494700027819, Max: 972.2754539998277, Avg: 224.44624529994144, StdDev: 343.3435493896444
Performance Test: JRuby with_instance_var over 10 runs - Min: 51.71431000007942, Max: 148.5308750002332, Avg: 70.18336950009143, StdDev: 28.021690845482276
Performance Test: JRuby with_shared_cache over 10 runs - Min: 93.13241100016967, Max: 1418.0885140003738, Avg: 252.0667751000019, StdDev: 416.2802546432774
Performance Test: JRuby direct_var over 10 runs - Min: 56.88069500001802, Max: 318.3457830000407, Avg: 148.63553480004157, StdDev: 109.19852351433106
Performance Test: JRuby with_instance_var over 10 runs - Min: 65.13245900032416, Max: 180.67353800006458, Avg: 78.52545400014606, StdDev: 36.072321257767264
Performance Test: JRuby with_shared_cache over 10 runs - Min: 94.63462199983042, Max: 380.751489999966, Avg: 126.22948079992966, StdDev: 89.60836106405517
Performance Test: JRuby direct_var over 10 runs - Min: 61.96413600036976, Max: 348.3102499999404, Avg: 137.58803659998193, StdDev: 115.31943290868999
Performance Test: JRuby with_instance_var over 10 runs - Min: 70.8377850000943, Max: 115.04310500004067, Avg: 77.41909719998148, StdDev: 13.877835612044093
Performance Test: JRuby with_shared_cache over 10 runs - Min: 95.3076380001221, Max: 350.4971600000317, Avg: 123.05286670002715, StdDev: 79.98753543054225
Performance Test: JRuby direct_var over 10 runs - Min: 58.436179000182165, Max: 140.45948000011776, Avg: 69.13643700004286, StdDev: 26.111232353872058
Performance Test: JRuby with_instance_var over 10 runs - Min: 65.80606899979102, Max: 123.53140100003657, Avg: 73.08309289996942, StdDev: 17.99344749515333
Performance Test: JRuby with_shared_cache over 10 runs - Min: 92.99507200012158, Max: 349.2820279998341, Avg: 124.77302929996767, StdDev: 80.01232656292358
Performance Test: JRuby direct_var over 10 runs - Min: 58.49517900014689, Max: 332.3918980004237, Avg: 87.5094650000392, StdDev: 86.17082447108734
Performance Test: JRuby with_instance_var over 10 runs - Min: 51.03279000013572, Max: 101.47395400008463, Avg: 61.68899290000809, StdDev: 15.460840480939908
Performance Test: JRuby with_shared_cache over 10 runs - Min: 93.92476900029578, Max: 365.690060000361, Avg: 133.12143970006218, StdDev: 83.13551406585765
Performance Test: JRuby direct_var over 10 runs - Min: 56.0829940000076, Max: 125.28985500011913, Avg: 65.2007956000034, StdDev: 21.859571930349905
Performance Test: JRuby with_instance_var over 10 runs - Min: 54.50900599998931, Max: 95.18252099996971, Avg: 65.27544320001653, StdDev: 12.460975017093016
Performance Test: JRuby with_shared_cache over 10 runs - Min: 99.36349000008704, Max: 367.04925999993065, Avg: 137.9926334999709, StdDev: 81.8816941356595
Also, for fairness, these are the times for the other scripts today:
As before, runs with no cache reads/writes are looped 1,000,000x; cache runs are looped 100,000x.
JS cache 253ms
JS no cache 116ms
Python cache 360ms
Python no cache 617ms
Groovy cache 465ms
Groovy no cache 5-9ms
I think the big takeaway is that cache reads are expensive no matter the language. Probably other API calls too. The no-cache runs give an idea of the raw math performance.
I like that you all are doing these tests, but I want to mention something for future readers who might over-interpret these results.
In an average rule, the cache will only be accessed a handful of times. These tests are measuring hundreds of thousands of accesses per run. So the practical impact on an average rule will be much less than the numbers imply.
For example, while the benchmark seems to run twice as slowly in JS when using the cache, the difference between an average rule that uses the cache and one that doesn't will be much less pronounced.
But I do like the idea of updating the docs to mention that if one is experiencing a performance problem, assigning a cached entry to a local variable is a good thing to try first. A few accesses of the cache in the same rule isn't going to suddenly make it slow beyond belief.
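A minimal sketch of that pattern, reusing the names from the benchmarks in this thread (the loop body is just illustrative):

```ruby
# Read the cached entry into a local once, do the repeated work against
# the local, then write the result back in a single cache access.
rate = 3.54409
counter = shared_cache["ruby_counter"]   # one cache read
10.times { counter = rate * counter * (1 - counter) }
shared_cache["ruby_counter"] = counter   # one cache write
```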
I was wondering why the instance variable version is faster than the local-variable version, and I suspected that closure resolution might have something to do with it. So here's a revised version that doesn't constantly update a variable from the closure scope, and yes, it's now faster than the instance var, albeit only by a bit.
Also, adding a warmup run makes a big difference in the stddev and the max times.
```ruby
gemfile do
  source "https://rubygems.org"
  require "benchmark"
end

tests = {
  direct_var: proc do
    rate = 3.54409
    counter = 0.2
    1_000_000.times do
      counter = rate * counter * (1 - counter)
    end
  end,
  # This is the equivalent of "private cache" in Ruby
  with_instance_var: proc do
    @counter = 0.2
    rate = 3.54409
    1_000_000.times { @counter = rate * @counter * (1 - @counter) }
  end,
  with_shared_cache: proc do
    rate = 3.54409
    shared_cache["ruby_counter"] = 0.2
    100_000.times do
      counter = shared_cache["ruby_counter"]
      counter = rate * counter * (1 - counter)
      shared_cache["ruby_counter"] = counter
    end
  end
}

tests.each do |name, test|
  times = []
  test.call # warmup
  10.times do
    results = Benchmark.measure(&test)
    times << (results.real * 1000)
  end
  min = times.min
  max = times.max
  avg = times.sum / times.size
  stddev = Math.sqrt(times.map { |t| (t - avg)**2 }.sum / (times.size - 1))
  stats = "Min: #{min}, Max: #{max}, Avg: #{avg}, StdDev: #{stddev}"
  logger.info("Performance Test: JRuby #{name} over 10 runs - #{stats}")
end
```
Results on my system:
| Test | Min (ms) | Max (ms) | Avg (ms) | StdDev (ms) |
|---|---|---|---|---|
| JRuby direct_var | 40.75746 | 52.68585 | 44.56999 | 3.88265 |
| JRuby with_instance_var | 50.24885 | 67.74754 | 58.10986 | 6.17004 |
| JRuby with_shared_cache | 85.04925 | 155.85034 | 109.55998 | 21.09122 |
This is from the tests without warmup (run by @Johnno in earlier posts):
| Language | With Cache | Without Cache |
|---|---|---|
| Groovy | 465 ms | 5-9 ms |
| JRuby | 123 ms | 78 ms |
| JS | 253 ms | 116 ms |
| Python | 360 ms | 617 ms |
The whole "warmup" concept is interesting too. The data points I've given are mostly from after a few runs, once the script has settled into a consistent result.
I booted up the system and ran the 10^6-loop no-cache scripts a few times for each language until the execution time settled (times in ms):
| | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 |
|---|---|---|---|---|---|
| JS | 150 | 90 | | | |
| Python | 1300 | 850 | 620 | | |
| Groovy | 1100 | 500 | 7 | | |
| JRuby | 1350 | 640 | 720 | 250 | 130 |
So I still have a lot to learn about what is going on under the hood, in terms of the script being loaded into memory and whether the JVM does some repetitive-code optimisation. E.g. Groovy is quite slow for the initial runs, but then shows a 50-100x improvement in speed.
I haven't used JRuby as much since I haven't coded in Ruby before, but I will probably read up on it more now since it does seem to have pretty good performance, which I am guessing is because it is Ruby that compiles to run on the JVM?
Going on from what Rich was saying, and speaking from personal experience: when I first started using OH, there was very little guidance on why there are multiple scripting environments, or which to use. Hence I wrote everything in JS, the default install option. Later I figured out the reason for multiple scripting languages is "because the addon developer thinks it's awesome and wants to make one".
For newcomers, it may end up being a choice lost in the sea of choices, so hopefully this thread illustrates a few particulars of each. JRuby seems to have quite efficient cache access and generally good computational power. Groovy very much excels at computation, but lacks the higher-level built-in functions that are present in JS or Python. Hopefully that is a fair assessment.
Is that the execution time? Weird that JRuby is doing poorly there. That doesn't match your earlier results.
To do a "warmup" you have to run the same code path within the same engine instance; that's what my last script above did. If you reload the script, it creates a whole new scripting engine instance all over again, so that's not a warmup.
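In other words, something along these lines (a minimal sketch; `Benchmark.realtime` is plain Ruby stdlib, and the workload is just a placeholder):

```ruby
require "benchmark"

test = proc { 1_000_000.times { Math.sqrt(2.0) } }

test.call                            # warmup: same code path, same engine instance
elapsed = Benchmark.realtime(&test)  # only this second run is measured
logger.info("measured: #{(elapsed * 1000).round(2)} ms")
```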
JRuby was written from the ground up in Java. It is now developed "in parallel", keeping up with the changes in "C Ruby" (the one written in C, a.k.a. the standard Ruby, a.k.a. MRI).
Many things in JRuby and the openHAB helper library are direct interfaces to the actual openHAB Java objects instead of wrappers.
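For example (a hedged illustration; the item name is made up): calling a method on an item resolves straight to the underlying Java object's method via JRuby's Java integration, rather than going through a wrapper layer.

```ruby
item = items["Kitchen_Light"]  # hypothetical item name
item.state                     # resolves to the Java Item's getState()
item.name                      # resolves to the Java Item's getName()
```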
Indeed. JRuby was created by Rubyists, people who LOVE the Ruby language. So it was designed to make a Rubyist feel most at home, instead of making it just a 1-to-1 port of the Jython helper library of the time. For every feature, "we" asked ourselves, "how should this be done in the Ruby world?".
When the JRuby helper library was first introduced, I had never programmed in Ruby and didn't know anything about it. But I fell in love with the language and joined the team to help maintain the library along with @ccutrer, who is a master of Ruby and a great programmer in general. I've learned a lot since then, even though I still have a lot more to learn.
The performance is not why we use Ruby, although it is an important factor in the design of the helper library. Ruby is why we want to use Ruby.
The JRuby helper library, combined with the Ruby language itself, makes writing openHAB automation so much fun and so much easier. I think having to learn Ruby, if you're not familiar with it, is a price very much worth paying. I now feel depressed when having to code in another language, missing all the niceties offered by Ruby.
Regarding the large spread of execution times, I had noticed it with your script. I guessed that the script pulling in remote code from rubygems.org might be a culprit, even though from the script structure it shouldn't affect the code being benchmarked.
I changed it to time off the system clock, and right away it runs between 85-95 ms, with a regular occurrence of the script going up to 120-170 ms for a run or two. I have noticed that behaviour with other script languages too. The technical term would be: OH is doing "other stuff".
```ruby
counter = 0.2
rate = 3.54409

start_time = Time.now.to_f
1_000_000.times { counter = rate * counter * (1 - counter) }
end_time = Time.now.to_f

logger.info("Performance Test JRuby: #{(end_time - start_time) * 1000} milliseconds")
```
For JS and Python, we use GraalJS and GraalPython, respectively, from Oracle's GraalVM project. Both languages are reimplementations in Java, so both benefit from warm-up, as this allows the JVM to optimise the code.
Performance at the moment is not at the theoretical maximum because, running on stock OpenJDK builds and not the GraalVM JVM, those languages run in interpreter mode only. They are actually built to be compiled by the GraalVM compiler. According to Oracle, GraalVM performance can exceed Node.js in some cases.
If you are curious you could try running on GraalVM and benchmark there.
Everyone who is able to create an automation add-on may do so; this is how we ended up with so many languages. I think this is a great development, as people can use what they are used to or what they like, and they aren't forced to learn a specific language. But for new users this might be quite confusing.
I, for my part, use JS as it's similar to Java and it's one of my major languages, but I have to admit other languages have nice features to offer (either in the language itself or the helper library).
BTW: A few days ago a PR was opened that adds support for using Java as an automation/scripting language as well. It's kind of funny that we supported that many languages before supporting Java, even though openHAB is written in Java.
Updated table, please!
I just re-ran the script many times, around once per second. As you can see, 85-95 ms is probably a fair baseline, with intermittent spikes.
I may have to pick this up tomorrow after a cold boot to see if the script exhibits the same behaviour on first run.
122, 84, 88, 87, 160, 86, 87, 86, 90, 126, 88, 87, 124, 88, 193, 138, 89, 111, 93, 133, 110, 99
@Johnno How is it possible that Python is slower without cache than with cache?
In terms of "native", may I interest you in the java223 automation bundle? You cannot get more native than that. (And as such, I would be very surprised if it didn't rank #1 in performance.)
I made a PR yesterday and hope to make it an official openHAB JSR223 script.
Cached runs are looped 100,000x while non-cache runs are looped 1,000,000x, so I think it's now coming up against the native computational performance of Python.
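To put numbers on it using the earlier Python results: 617 ms / 1,000,000 iterations ≈ 0.6 µs per no-cache iteration, versus 360 ms / 100,000 iterations ≈ 3.6 µs per cache iteration. Per iteration, the cache version is still roughly 6x slower; the totals only look inverted because the no-cache loop runs 10x as many iterations.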
I'll do some logging today with the Pi 5 freshly started.
| (ms) | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Run 6 | Run 7 | Run 8 | Run 9 | Run 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| No Cache 10^6 Runs | | | | | | | | | | |
| Groovy | 1036 | 516 | 6 | 8 | 8 | 8 | 9 | 9 | 10 | 9 |
| JS | 137 | 94 | 95 | 94 | 94 | 104 | 126 | 98 | 94 | 94 |
| Python | 616 | 587 | 602 | 631 | 619 | 645 | 581 | 578 | 606 | 618 |
| JRuby | 230 | 203 | 164 | 119 | 169 | 266 | 130 | 198 | 108 | 160 |
| Cache 10^5 Runs | | | | | | | | | | |
| Groovy | 674 | 1183 | 729 | 721 | 648 | 699 | 673 | 649 | 679 | 639 |
| JS | 414 | 298 | 244 | 307 | 276 | 245 | 248 | 275 | 240 | 303 |
| Python | 516 | 316 | 340 | 315 | 398 | 376 | 326 | 319 | 312 | 358 |
| JRuby | 593 | 240 | 181 | 194 | 167 | 173 | 170 | 193 | 211 | 170 |
Here's the data for the first 10 runs of each script on a newly booted system. I also modified the cache script for Ruby to remove the remote code dependency and run off the system clock, just like the other script.
I would group/sort the table by cache and no cache.