Possible performance of bindings?

Does anybody know of an existing binding that is very fast and could be used for testing?

We could test with MQTT, but normally I would expect MQTT to be much slower than Modbus-TCP. The fastest would be shared memory, or just a dummy binding which returns a constant. Is there one?

Hmm, I do not think the Modbus implementation works like that here. In keeping with Modbus’ history (originating as a shared serial bus), it waits for each poll query/response to complete (or fail) before executing the next one.
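
For what it’s worth, the binding even seems to expose that pacing as a configuration knob. A minimal things-file sketch, assuming the TCP bridge parameter timeBetweenTransactionsMillis (the minimum gap between consecutive transactions, as I remember it from the docs) and a made-up host:

Bridge modbus:tcp:plc [ host="192.168.1.10", port=502, timeBetweenTransactionsMillis=60 ]

This would enforce at least a 60 ms gap between consecutive Modbus transactions on that connection.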

If you are considering a multimaster setup, bear in mind that some Modbus-TCP slaves only support one TCP connection at a time. This may lead to the TCP disconnect/reconnect overhead I’ve already hinted at.

Truthfully, I think what we’re trying to say is that what you propose should work. It’d be worth doing some more diagnostic work before adding complexity to try to get around it.
At the very least, simplify to find out what works. Shorter lengths? A reduced poll rate?
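
To make that concrete, here is a minimal things-file sketch of such a simplified baseline; host, addresses and names are all made up. A single poller reads just two holding registers every 5 seconds:

Bridge modbus:tcp:testSlave [ host="127.0.0.1", port=502 ] {
    // one small read: 2 holding registers, polled every 5000 ms
    Bridge poller holding [ start=0, length=2, refresh=5000, type="holding" ] {
        // registers 0-1 interpreted as one 32-bit counter
        Thing data counter [ readStart="0", readValueType="uint32" ]
    }
}

If that runs cleanly, scale the length and poll rate back up step by step until the problem reappears.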

There is a very long-running thread by @ssalonen which includes how to get extra debugging info.

Wait … do you mean you are hitting the slaves from other Masters at the same time as OpenHAB?

Yes and no: it makes no difference. The Modbus slave opens a new thread for each new socket connection.

My colleague Felix is doing most of the tests, and he is also reading here but might not be registered yet. I will not be available until Monday.

Sounds like you might be suffering from the issue explained in this post

What do you think? Do you get warnings logged because of a large queue size?

Best,
Sami

Yes, I think this could be the problem. My colleague will provide more details later. On the Raspberry Pi there were errors logged about lost events or similar, and a queue size of 560 (events or threads, I cannot remember exactly). Sometimes the system can catch up, and from then on there is no delay anymore, at least on the faster PC.
It seems that no counter values are lost, but the values arrive at the item randomly.

Please try out the new version; for more details, see this post: Modbus openHAB2 binding available for alpha testing

Sami

Hey, so I can now provide some additional information.

The following applies to the Raspberry Pi setup.
I upgraded to the newest version of the binding (2.4.0.201806291551, I hope).
I reduced the number of items to 1 per thing, just for now.
I started the server with 120 things and items and let it run for about 30 min.
Here is the logfile for these 30 min: openhab.log (json, google drive)

The first 20 items got initialized, but the other 100 didn’t. The number of tasks in the callbackThreadPool was at 2200.

CPU usage is at about 10-15%, memory at about 20%.
So that’s very different from what we experienced before, where the CPU usage would always go up to 350-400% but the callbackThreadPool would consist of 500-600 tasks after 30 min.

I will do additional testing with the tips @rlkoshak provided and keep you updated.
There is also going to be testing on how the number of items per poll affects the behaviour.
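
To illustrate what I mean by items per poll, a things-file sketch with made-up names and addresses: one poller fetches a block of six registers in a single Modbus transaction, and several data things (one linked item each) are carved out of that block:

Bridge modbus:tcp:plc2 [ host="192.168.1.10", port=502 ] {
    // a single transaction fetches registers 0-5 once per second
    Bridge poller block [ start=0, length=6, refresh=1000, type="holding" ] {
        Thing data c1 [ readStart="0", readValueType="uint16" ]
        Thing data c2 [ readStart="1", readValueType="uint16" ]
        Thing data c3 [ readStart="2", readValueType="uint16" ]
    }
}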

Felix

@awiflx, I can confirm that you are using the latest binding.

The cache is mostly not applied, since the REFRESH commands come more than 50 ms (the default) apart:

$ grep 'received refresh() but the cache is not applicable. Polling new data' ~/Downloads/openhab.txt|wc -l
201
$ grep 'cache was recent enough (age at most 50 ms). Reusing old response' ~/Downloads/openhab.txt|wc -l
37
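
As an aside, if the 50 ms window ever turns out to be too tight, it should be adjustable per poller via the cacheMillis parameter (if I remember the parameter name correctly); a sketch with made-up names:

Bridge poller holding [ start=0, length=2, refresh=1000, type="holding", cacheMillis=200 ]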

Also, the overall number of REFRESH commands does not explain the large queue; only 201 poll tasks are queued this way:

$ grep 'Scheduling one-off poll' ~/Downloads/openhab.txt|wc -l
201

So it must be something else :thinking:

I don’t see any log messages from the binding. Are you sure you have configured the logging as explained in logs: Enable verbose logging?

Please

  • ensure the log configuration
  • also enable TRACE logging for org.eclipse.smarthome.core.common.QueueingThreadPoolExecutor
  • share item and thing definitions (I can try to reproduce)

Best,
Sami

@ssalonen, alright, I will try and go through that step by step.
First off, thank you for taking the time and being so patient about this.

I got the logging configured like this in org.ops4j.pax.logging.cfg
(I did this through Karaf with log:set TRACE org.openhab.binding.modbus, and all the others accordingly):

#Modbus logger
log4j2.logger.org_openhab_binding_modbus.level = TRACE
log4j2.logger.org_openhab_binding_modbus.name = org.openhab.binding.modbus
log4j2.logger.org_openhab_io_transport_modbus.level = TRACE
log4j2.logger.org_openhab_io_transport_modbus.name = org.openhab.io.transport.modbus
log4j2.logger.net_wimpi_modbus.level = TRACE
log4j2.logger.net_wimpi_modbus.name = net.wimpi.modbus
log4j2.logger.org_eclipse_smarthome_core_common_queueingthreadpoolexecutor.level = TRACE
log4j2.logger.org_eclipse_smarthome_core_common_queueingthreadpoolexecutor.name = org.eclipse.smarthome.core.common.QueueingThreadPoolExecutor

The definitions are like this:
dThing.json (10.0 KB)
dItem.json (14.1 KB)
and are created through a bash script.

I will now run the test again and post the resulting log file.
I guess 10 to 15 min should be enough time?

Felix

Edit:
Tasks at ~900
CPU at 10-15%, Mem at 15-20%, Active threads at 5
Here is the log
openhab.log
And that’s what it looked like in PaperUI’s Control.

Thanks!

A few minutes should be enough!

I edited the post above, so everything is packed together.

Thanks… still a mystery though.

Can you paste here the output of the threads Karaf console command?

Like this?
threads.json (65.5 KB)

Thank you! I think I found the root cause: it’s a good old deadlock situation which halts all handling of the data, eventually resulting in an ever-growing queue.

I actually managed to reproduce the issue, although not consistently (it must be a timing thing).

I’ve now pushed a fix, and it should be brewing here

Ok, thank you big time!

I will get on with testing the fix tomorrow and hit you up with anything I come up with.

Keep up the amazing work!

So I finally did some testing on your fix. Performance is much better now: all items get initialized and the CPU is taking load (up to 70-80%, which runs really smoothly on the Pi compared to what happened before).

What I did notice, though, is that some values still arrive “late”. There is a counter incrementing once per second, and the poll fetching the data once per second, so the item should show the counter increasing steadily. But it seems the commands in the queue get a little mixed up, so that some “old” values arrive after the “new” ones. Which is, I guess, not intended?

I will check if I can work around it with a more intelligent setup design next week; I just wanted to point it out.
I am very grateful for what you did there, and it really helps us out right now.

Wishing you a fantastic Friday,
Felix

Thanks, awesome to hear we are heading in the right direction!

Where is this counter? In case it is in openHAB, perhaps you need to adjust the autoupdate parameter?

For more info, consult these posts
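
For illustration, a sketch of an items-file entry with autoupdate disabled; the item name and channel UID are made up. With autoupdate="false" the item state only changes when an actual polled value arrives, not when openHAB predicts the outcome of a command:

Number ModbusCounter "Counter [%d]" { channel="modbus:data:plc:block:c1:number", autoupdate="false" }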

From the earlier conversation: @awiflx has set up test counters incrementing in his Modbus-TCP slaves.

As I understand the Modbus binding, it uses a strictly one-at-a-time query/response process with the slave(s).
Even if the outgoing query queue got jumbled, the responses would still arrive in the correct sequence as far as the test counter goes.

So any jumbling of the order of arrival ought to be on the openHAB event bus side?
Speculation: if a multithreaded task processing incoming data is interrupted by the next data arrival, the second piece of data may be put on the event bus before the first?

It’s possible the event bus queueing is not FIFO; nobody would normally notice or care much.