Connection pooling in modbus binding

@ssalonen still no improvement and still there’s this strange performance discrepancy between normal and debug mode:

openhab.cfg: http://pastebin.com/YvjU58j2
logback confs are unchanged.
logs, normal mode: http://pastebin.com/2hj6i6np
logs, debug mode: http://pastebin.com/xWWTzsbP

measured performance, normal mode: 182 transactions per minute
measured performance, debug mode: 1666 transactions per minute

Best regards,
M

@mbs38

EDIT 2: There was still a bug with the delays for serial connections, which basically meant that the delay (50ms) was not exercised in practice (from the logs you can see Waited 0ms (interBorrowDelayMillis 50ms) before giving returning connection SerialConnection). Please try the updated version (v5).

If that does not work, can you please try a larger interTransactionDelay (the last number, which is currently 50ms).

50ms might be too fast for the device, at least it was too much for my siemens plc (which should be industrial grade) which is using modbus tcp (not serial!).

Edit: Small thing, but can you please enable net.wimpi.modbus INFO level logging in logback.xml. It basically just makes it clear whether transactions eventually succeeded…

Hi Ssalonen,

It works with v5.
I set the delay to 500ms
Now I have no more errors.

Thank you very much for your time and energy.

:muscle:

It didn’t work. With version 5 and this configuration for all devices

modbus:poll=50
connection=/dev/ttyS0:38400:8:none:1:rtu:280:80

I get a stable performance of 745 transactions per minute. Which sort of makes sense: 60*1000/80 = 750
I’ve added INFO level to the logback.xml. Not a single [ERROR] in the logs.
Although 745 is about four times better than the version openhab currently ships with, it’s still very slow. I’m convinced we can do better than that.

With 70ms transaction delay the performance goes below 300 transactions per minute.
Btw the difference between debug mode and normal mode is gone.

[quote=“ssalonen, post:65, topic:5246”]
50ms might be too fast for the device, at least it was too much for my siemens plc (which should be industrial grade) which is using modbus tcp (not serial!).
[/quote] That can’t be the problem here. The devices I’ve got here have achieved >100 transactions per second (!) when they were connected to a PLC. Of course it also depends on the number of registers/bits requested by the PLC, but this still rules out that the problem is with the devices.

In order to show what is possible with a single device I’ve changed my configuration as follows:
modbus:poll=5
modbus:serial.agsBusMasterHolding.connection=/dev/ttyS0:38400:8:none:1:rtu:280:10
modbus:serial.agsBusMasterHolding.id=247
modbus:serial.agsBusMasterHolding.start=42
modbus:serial.agsBusMasterHolding.length=14
modbus:serial.agsBusMasterHolding.type=holding

I get a stable 1864 transactions per minute. Again no [ERROR] in the logs.
Just to clarify: 14 registers is by far the biggest chunk of data I’m requesting among all the devices I have configured. The bottleneck must therefore be somewhere else.
We should make multiple devices work just as well as a single device. ~1900 transactions per minute is what we should be aiming for (at least :slightly_smiling:).

If you need any further debug logs with multiple and single devices please say so. I’m willing to help in any way I can.

Thanks for your work!
M


Thank you for the cold hard facts. I’m really satisfied that the binding now at least “works” even though the performance is not yet maximised…

With 70ms transaction delay the performance goes below 300 transactions per minute.

If I understand correctly, you start to get the EOF error with a delay of 70ms? Or do you have any other idea why the performance decreases?

Now that the basic functionality works, could you try to decrease the read timeout to a lower value, say 50ms? I’m thinking it could improve the performance even though the first read might sometimes time out; perhaps it helps to just retry aggressively. Currently the binding retries the transactions 3 times.
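
For illustration only, something like this minimal sketch of "retry aggressively with a shorter per-attempt timeout". This is not the binding’s actual code; the Exchange interface and all names are made up:

    import java.io.IOException;

    /** Hypothetical sketch of retrying a modbus transaction. Not the binding's
     *  actual code: Exchange.run() stands for one request/response round trip
     *  that throws IOException on timeout or EOF. */
    class RetrySketch {
        interface Exchange { byte[] run(int readTimeoutMillis) throws IOException; }

        static byte[] executeWithRetries(Exchange exchange, int readTimeoutMillis)
                throws IOException {
            final int maxRetries = 3;          // the binding currently retries 3 times
            IOException lastError = null;
            for (int attempt = 1; attempt <= maxRetries; attempt++) {
                try {
                    // each attempt blocks at most readTimeoutMillis (e.g. 50 ms)
                    return exchange.run(readTimeoutMillis);
                } catch (IOException e) {
                    lastError = e;             // timed out or EOF; try again
                }
            }
            throw lastError;                   // all attempts failed
        }
    }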

You could also try to lower the inter transaction delay to the value matching the maximum performance you have witnessed, i.e. 10ms (100 transactions per second).

Looking forward to hearing how these tips work!

Now I have properly understood your single device benchmark. Really illuminating.

Can you share trace level logs for both the single slave and the multiple slave configurations? How many slaves do you have exactly in the multi slave configuration? I would like to investigate whether the extra loss is due to overhead caused by the frameworks etc. I’m using… We are already talking about quite high performance…

It might also help if I remove the trace level logging statements; they do incur a small overhead. I’ll have a look at this.

single device conf: just the six lines I’ve posted in my previous post (1 slave)
single device logs, debug: trace - Pastebin.com (1871 transactions / minute)

multiple device conf: conf multiple - Pastebin.com (13 slaves)
multiple device logs, debug: multipeltrace - Pastebin.com (730 transactions / minute)

The trace level logging doesn’t seem to hurt performance in any way.

I’ll try lowering the timeouts later.

M


[Lowering the read timeout to 50ms] makes things worse. Long transactions, like reading 14 registers, never succeed.

M

@mbs38

OK, to summarize the situation, we now have the following (documenting it for future generations as well… perhaps we can snip out something for the wiki in the future).

Summary

1. single slave in modbus.cfg, low inter transaction delay (10ms), with generous read timeout (280ms), poll period practically minimal (5ms)

/dev/ttyS0:38400:8:none:1:rtu:280:10 (config) (as reported here and here)

  • performance good (logs), 1864 transactions per minute. The log snippet you pasted shows that on average we execute 1 transaction in 32ms (median 24ms), i.e. ~1875 transactions/minute [1]
  • Performance is slower than the 5ms poll period, but that is quite expected and perfectly all right – the system reads as fast as it can, in the sense that there are no “extra” waits between transactions.

2. Many slaves (13) in modbus.cfg, high inter transaction delay (80ms), with generous read timeout (280ms), slower poll period (50ms)

(config)
modbus:serial.agsBusMasterHolding.connection=/dev/ttyS0:38400:8:none:1:rtu:280:80 (as reported here)

  • performance poor, average 1 transaction in 80ms, i.e. ~750 transactions/minute (logs)
  • performance is poor because we artificially ask the system to wait between transactions; on average 43ms is waited after each transaction [2]
  • no errors since the delays are really conservative

3. Many slaves (13) in modbus.cfg, high inter transaction delay (80ms), with tight read timeout (50ms), slower poll period (50ms)

(same as this config but with different read timeout)
(as reported here)

  • poor performance, EOF errors
  • since the same timeout applies to both short reads and long reads [3] (e.g. 14 registers), we get timeouts for the long reads.

[1] grepped timestamps with: grep "to allow delay between transactions" single_device.log|cut -d' ' -f1
[2] grepped average wait time using: grep "to allow delay between transactions" multi-device.txt|cut -d' ' -f8
[3] Here’s the corresponding place in the code which makes the read block until 1) the timeout occurs or 2) all message bytes are read; a simplified sketch of that behaviour follows.
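
Roughly, [3] corresponds to a loop like this simplified sketch (not the actual net.wimpi.modbus code, just an illustration of "block until the timeout expires or all expected bytes have been read"):

    import java.io.IOException;
    import java.io.InputStream;

    /** Simplified illustration of footnote [3]: keep reading from the serial
     *  input stream until the expected number of bytes has arrived or the
     *  read timeout expires. Not the actual binding/jamod code. */
    class BlockingReadSketch {
        static byte[] readResponse(InputStream in, int expectedBytes, long timeoutMillis)
                throws IOException {
            byte[] buffer = new byte[expectedBytes];
            int read = 0;
            long deadline = System.currentTimeMillis() + timeoutMillis;
            while (read < expectedBytes) {
                if (System.currentTimeMillis() > deadline) {
                    // with a too tight timeout, long responses (e.g. 14 registers)
                    // never complete in time -> surfaces as a timeout/EOF error
                    throw new IOException("Read timed out after " + timeoutMillis + " ms");
                }
                if (in.available() > 0) {
                    int n = in.read(buffer, read, expectedBytes - read);
                    if (n > 0) {
                        read += n;
                    }
                } else {
                    try {
                        Thread.sleep(1);       // wait for more bytes to arrive
                    } catch (InterruptedException e) {
                        throw new IOException("Interrupted while waiting for response", e);
                    }
                }
            }
            return buffer;
        }
    }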

Steps forward

Decreasing the read timeout was a bit of poor advice, and that is definitely not the issue, since the single device configuration works extremely fast with a read timeout of 280ms.

Based on the observations I think this might really work in the multi device scenario (these are good general configuration instructions as well):

  • short poll period, e.g. 5ms. The binding will simply block until the previous read is done, and immediately start the new poll round (checked this from the openHAB core code as well). If there is no requirement to update item status fast, this can of course be increased.

  • generous read timeout, e.g. 280ms. When the line works, the read returns much faster. In case of errors (e.g. a lost bit in transmission for whatever reason), the system waits 280ms before giving up. The default of 500ms is actually a pretty good default. With a lot of registers, the read timeout likely needs to be increased. If we get EOF errors, the value might be too low.

  • shorter inter transaction delay, e.g. 10ms. As you have proven in the single device scenario, even a delay of ~30ms or so seems to work just fine with your device. For high performance requirements, this should be set as low as possible. If we get EOF errors, the value might be too low. This is likely the hardest of all the parameters to tune, in the sense that it depends on the device performance as well.

Summarizing the above advice, can you try this kind of connection string with the multi-slave setup

 # - Short poll period (5ms).  Binding will simply block until previous read is done, and immediately start the new poll round
 #
modbus:poll=5
 # - Read timeout (280ms) needs to be enough for long queries to work out. When the line works, the read returns much faster.
 # - Short inter transaction delay (10ms). For high performance requirements, this should be set as low as possible. If we get EOF errors, the value is too low
 #
modbus:serial.X.connection=/dev/ttyS0:38400:8:none:1:rtu:280:10


Best,
Sami

@ssalonen the summary is a good idea.

I have just tried: http://pastebin.com/5qfiMYG0

There is again a huge performance difference between start.sh and start_debug.sh:
normal: 195 transactions/minute, logs: http://pastebin.com/uPniRYmC
debug: 530 transactions/minute, logs: http://pastebin.com/6kquYwjj

However, there is obviously no improvement - in fact the opposite has happened.

Best regards,
M

Hi @mbs38

Can you try to increase the inter transaction delay? Try first with 25ms (the binding ensures that there is at least that much time between transactions; 25ms worked with the single slave as explained in the summary) and if you still get errors try an even higher value, e.g. 35ms.

I would expect it to work eventually; we need to find the smallest value that works. If the delay is too small, it seems some bits might be lost and eventually we get an EOF error. Performance naturally decreases, since each EOF error means that we have waited the full read timeout…
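
To make "each EOF error costs the full read timeout" concrete, here is a back-of-the-envelope model (the 5% error rate below is an assumed number purely for illustration; the 32ms and 280ms figures are the ones from the summary):

    /** Back-of-the-envelope model of how EOF errors eat throughput.
     *  The error rate is an illustrative assumption, not a measurement. */
    class EofCostSketch {
        public static void main(String[] args) {
            double successMillis = 32;       // observed average for a clean transaction
            double readTimeoutMillis = 280;  // time wasted per failed attempt
            double errorRate = 0.05;         // assume 5% of transactions hit one EOF

            // a failing attempt costs the full read timeout before the retry runs
            double avgMillis = successMillis + errorRate * readTimeoutMillis;
            System.out.printf("avg %.1f ms/transaction -> %.0f transactions/minute%n",
                    avgMillis, 60_000 / avgMillis);
            // ~46 ms/transaction, i.e. ~1300 transactions/minute instead of ~1875
        }
    }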

Best
Ssalonen

I have tried 25, 35, 45, 55, 65, 75, 85 and 90. The performance is crap below 75 (errors start to appear at 74). The best performance is at 75ms (783 transactions / minute), with no errors. Higher transaction delays make the performance decrease again; the reasons are obvious.

There must be a bug or design error in the binding. There is no technical reason why multiple slaves shouldn’t work just as fast as a single slave.

M

Thank you for trying the different delays! I truly appreciate this and would certainly hope that I could test the thing myself and not bother you…

I’m sorry to hear the performance is not yet there for you, although I’m also happy to hear how much the situation has improved compared to the official version; after all, we do get a throughput of 13 transactions per second!

But your point is quite valid, and definitely the same performance should be expected with many slaves… I will try to investigate more.

Edit: could you please describe what kind of device(s) we are talking about? I noticed that you are using different ids.

Could you paste the logs for the 75ms and 35ms inter transaction delays… I could compare the logs to the single slave version…

EDIT: Got new information on performance: I just got a “virtual null modem” to work on my Linux PC (see pymodbus for the null modem implementation). As the modbus slave I used diagslave (./diagslave -m rtu -a 1 -b 38400 -d 8 -s 1 -p none -4 10 /dev/pts/7). openHAB acts as the modbus master.

The openhab.cfg is here, items config here.

I have TRACE level logging enabled, and get an average performance of 12ms (time between consecutive modbus responses, the Response: xx yy ... log from ModbusRTUTransport). Most of that time (11ms out of 12ms) goes to the actual reading of the response from the server.

This is of course not really comparable to your situation where there is a real serial line, but it shows that there is little overhead on the binding side itself. Compared to real life, the Java serial library openHAB uses might not be as performant as it could be, for example.

@mbs38 If you can provide the logs I can find out where time goes with your specific case.
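
For reference, this is roughly how the time between consecutive responses can be computed from such logs (a sketch that assumes the HH:mm:ss.SSS timestamp at the start of each line, as in the pastes, and the "Response:" marker from ModbusRTUTransport):

    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.time.LocalTime;
    import java.time.temporal.ChronoUnit;
    import java.util.ArrayList;
    import java.util.List;

    /** Sketch: average time between consecutive "Response:" TRACE lines.
     *  Assumes every log line starts with a HH:mm:ss.SSS timestamp. */
    class ResponseIntervalSketch {
        public static void main(String[] args) throws Exception {
            List<LocalTime> stamps = new ArrayList<>();
            for (String line : Files.readAllLines(Paths.get(args[0]))) {
                if (line.contains("Response:")) {
                    stamps.add(LocalTime.parse(line.substring(0, 12)));
                }
            }
            if (stamps.size() < 2) {
                System.out.println("Not enough Response lines found");
                return;
            }
            long totalMillis = 0;
            for (int i = 1; i < stamps.size(); i++) {
                totalMillis += ChronoUnit.MILLIS.between(stamps.get(i - 1), stamps.get(i));
            }
            System.out.printf("%d responses, avg %.1f ms between responses%n",
                    stamps.size(), (double) totalMillis / (stamps.size() - 1));
        }
    }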

@ssalonen:

13 slaves, settings: 50, 280, 35, start_debug.sh, performance 640
http://pastebin.com/9b8HC9Dn

13 slaves, settings: 50, 280, 35, start.sh, performance 224
http://pastebin.com/dxyNWxBz
(again huge performance difference between debug and non debug…)

13 slaves, settings: 50, 280, 75, start_debug.sh, performance 783
http://pastebin.com/YGqc252E

13 slaves, settings: 50, 280, 75, start.sh, performance 783
logs only contain state updates.

Thank you for trying to make this work better :slight_smile:

@mbs38 I can’t seem to find DEBUG log statements from net.wimpi.modbus. They would show better how much of the time goes to IO.

  • Can you please configure your logging to include those and then modify your pastes?
  • Can you also please keep the poll period at 5ms so at least that is not introducing any artificial delays?
  • Can you also repeat your one-slave test with the extended logging so I can compare the multi-slave results against that.

Using those we can see how much time goes into reading the response. Thanks!

If these don’t give us more information about the root cause, we can try to make up a multi-slave configuration with all the openHAB slave definitions querying the same slave id & FC etc., similar to my experiment (config). It would be interesting to see whether there is any difference in performance then, since the configurations should do exactly the same thing…

I have added the net.wimpi.modbus logger to my logback_debug.xml and changed modbus:poll to 5.

start_debug.sh: http://pastebin.com/ZCcnTJkE
start.sh: http://pastebin.com/L4GQtGMZ

The performance is now a lot better in debug mode than in normal mode (760 vs 240).

Hi @mbs38! Can you please paste your logback_debug.xml configuration? I’m still missing some log statements.

Done. I think there is something peculiar going on here:

With this configuration: conf - Pastebin.com
Performance, start.sh: 1864, logs: no errors
Performance, start_debug.sh: 1864 logs: 19:29:39.915 [TRACE] [dbusSlaveConnectionFactoryImpl:108 ] - Waited 0ms (interB - Pastebin.com

With this configuration: modbus:poll=5#agsBus-Mastermodbus:serial.agsBusMasterHolding.connection=/dev - Pastebin.com
Performance, start.sh: 240 (!), logs: nondebug - Pastebin.com
Performance, start_debug.sh: 1870 (!) logs: 19:36:53.667 [TRACE] [dbusSlaveConnectionFactoryImpl:135 ] - Validating endpoin - Pastebin.com

How can we debug something that only works well once you start debugging… -.-

logback_debug.xml: <configuration scan="true"> <appender name="STDOUT" class="ch.qos.logback.co - Pastebin.com
logback.xml: <configuration scan="true"> <appender name="STDOUT" class="ch.qos.logback.co - Pastebin.com

Best regards,
M

Thank you so much @mbs38!

fixing logging

Regarding your logging configuration, can you please remove the multiple definitions for the same logger. Currently you have:

    <logger name="net.wimpi.modbus" level="DEBUG" />
    <logger name="net.wimpi.modbus" level="TRACE" />
    <logger name="net.wimpi.modbus" level="INFO" />

Leave just the TRACE:

    <logger name="net.wimpi.modbus" level="TRACE" />

This explains the missing log statements.

why debug and non-debug are different

(everything is a theory of course… unless proven otherwise :slight_smile: )

Indeed, that feels weird, but I believe the explanation is still the natural one: the delay introduced by all the logging makes the duration between transactions larger, which again reduces the errors we get with the serial device. The difference between the “non-debug” and “debug” configurations vanishes when the inter transaction delay is increased from 10ms, since then the modbus binding itself waits a couple of milliseconds to ensure enough delay between transactions. It should be highlighted that the inter transaction delay does nothing if enough time has already passed since the previous transaction – it only waits if the previous transaction was just executed. Perhaps I should consider renaming the parameter…
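
To make those semantics concrete, a simplified sketch of the idea (not the actual binding code):

    /** Simplified sketch of how the inter transaction delay can be enforced:
     *  it only sleeps if the previous transaction finished "too recently".
     *  Not the actual binding code. */
    class InterTransactionDelaySketch {
        private long lastTransactionEndMillis = 0;

        synchronized void waitBeforeNextTransaction(long interTransactionDelayMillis)
                throws InterruptedException {
            long elapsed = System.currentTimeMillis() - lastTransactionEndMillis;
            long remaining = interTransactionDelayMillis - elapsed;
            if (remaining > 0) {
                Thread.sleep(remaining);   // previous transaction was just executed
            }
            // if enough time has already passed (e.g. because logging slowed things
            // down), nothing is waited at all, which is why heavy TRACE logging can
            // mask a too small delay
        }

        synchronized void markTransactionFinished() {
            lastTransactionEndMillis = System.currentTimeMillis();
        }
    }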

As mentioned in the summary, the best performance you got with a single slave was ~24ms/transaction – even though the inter transaction delay parameter was set to 10ms. This means that, due to the read/write taking time, roughly 14ms goes just to that. In fact, a sort of ideal performance was witnessed in my “virtual null modem” test, where IO took basically all the time (11ms to read the modbus response, ~1ms of overhead in the binding to facilitate the next transaction).
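
As a rough sanity check of that ~14ms figure, the pure wire time of a 14-register read at 38400 baud can be estimated from standard Modbus RTU frame sizes (this ignores slave processing time and the inter-frame silent intervals):

    /** Rough wire-time estimate for reading 14 holding registers over RTU at
     *  38400 baud with 8 data bits, no parity, 1 stop bit (10 bits/character).
     *  Ignores slave processing time and inter-frame silent intervals. */
    class WireTimeSketch {
        public static void main(String[] args) {
            int registers = 14;
            int requestBytes = 8;                   // addr + FC + start + count + CRC
            int responseBytes = 5 + 2 * registers;  // addr + FC + byte count + data + CRC
            double msPerByte = 10.0 / 38400 * 1000; // ~0.26 ms per character
            double wireMillis = (requestBytes + responseBytes) * msPerByte;
            System.out.printf("~%.1f ms on the wire per transaction%n", wireMillis);
            // ~10.7 ms, which fits the observed ~14 ms once turnaround time is added
        }
    }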

Still, it beats me why multiple slaves give different performance and different issues than the single slave configuration – even though the time between transactions is the same (for example, 25ms). This is why I asked you to clarify the devices you are serving behind /dev/ttyS0. If the devices are different, the multi-slave test might not be comparable with the single-slave test.

Let me suggest next steps

1. please run single slave configuration with logging fixed (see above). Use this configuration:

modbus:poll=5
modbus:serial.slave1.connection=/dev/ttyS0:38400:8:none:1:rtu:280:35
modbus:serial.slave1.id=247
modbus:serial.slave1.start=42
modbus:serial.slave1.length=14
modbus:serial.slave1.type=holding

Do not have any rules in openhab. Have this item only, nothing else:

Number Slave1Item1 "slave1 (modbus) [%d]" {modbus="slave1:0"}

2. please run multiple slave configuration, with logging fixed, repeating same slave definition with different names

Use this configuration (13 identical slaves with 35ms inter transaction delay). Do not have any rules. Have these items only, nothing else:

Number Slave1Item1 "slave1 (modbus) [%d]" {modbus="slave1:0"}
Number Slave2Item1 "slave2 (modbus) [%d]" {modbus="slave2:0"}
Number Slave3Item1 "slave3 (modbus) [%d]" {modbus="slave3:0"}
Number Slave4Item1 "slave4 (modbus) [%d]" {modbus="slave4:0"}
Number Slave5Item1 "slave5 (modbus) [%d]" {modbus="slave5:0"}
Number Slave6Item1 "slave6 (modbus) [%d]" {modbus="slave6:0"}
Number Slave7Item1 "slave7 (modbus) [%d]" {modbus="slave7:0"}
Number Slave8Item1 "slave8 (modbus) [%d]" {modbus="slave8:0"}
Number Slave9Item1 "slave9 (modbus) [%d]" {modbus="slave9:0"}
Number Slave10Item1 "slave10 (modbus) [%d]" {modbus="slave10:0"}
Number Slave11Item1 "slave11 (modbus) [%d]" {modbus="slave11:0"}
Number Slave12Item1 "slave12 (modbus) [%d]" {modbus="slave12:0"}
Number Slave13Item1 "slave13 (modbus) [%d]" {modbus="slave13:0"}

In both cases, it is enough to store debug level logs only. Please let it run for at least 10 seconds so we get a good picture of the performance.

For the tests it is important that the slave definitions are identical in both single-slave and multi-slave configurations. This should make situations more comparable.

If you do these steps, we should have a much better understanding of what is going on.