Modbus stops and polling times

DDS90 · January 9, 2021, 1:33pm

Hi!
I’m qiute new about OpenHAB world and this is my first post in this community.
I’ve installed openHABian v1.6.2 with Stable 3.0.0 version into a RaspberryPi4, 1GB RAM, 64GB SDXC.
I had already had a first approach with an old notebook with Ubuntu 20.04 and OpenHAB 2.5.5-1 and the same problems described below already occurred.

I’m trying to polling some (4 or more) modbus RTU devices with some (256 contacts and 30 analogs) registers to read.

At the beginning I’ve try to configure the *.things files for each slave like this:

Bridge modbus:serial:Scheda_IN_01 [port="/dev/ttyUSB0",baud=38400,id=1,dataBits=8,parity="none",stopBits="1.0",encoding="rtu",timeBetweenTransactionsMillis="0",updateUnchangedValuesEveryMillis="1000000" ] {
	Bridge poller S1_Digital_IN [ start=16, length=256, refresh=20, type="coil" ] {
		Thing data di_R_1 [ readStart="16", readValueType="bit", writeStart="16", writeValueType="bit", writeType="coil" ]
		Thing data di_R_2 [ readStart="17", readValueType="bit", writeStart="17", writeValueType="bit", writeType="coil" ]
		...
		...
		Thing data di_R_256 [ readStart="273", readValueType="bit", writeStart="273", writeValueType="bit", writeType="coil" ]
	}

	Bridge poller S1_Analog [ start=300, length=30, refresh=20, type="holding"] {
		Thing data A_IN_1  [ readStart="300", readValueType="int16", readTransform="JS(analog_to_percent.js)" ]
		Thing data A_IN_2  [ readStart="301", readValueType="int16", readTransform="JS(analog_to_percent.js)" ]
		...
		...
		Thing data A_IN_30  [ readStart="330", readValueType="int16", readTransform="JS(analog_to_percent.js)" ]
	}
}

but polling is too slow for my application (140-150 ms between the end of the slave response and the start of the master’s query) and also it stopped after less than 20 minutes, apparently without any logs errors.
OpenHAB via mbusd

I’ve try to change refresh,timeBetweenTransactionsMillis and set updateUnchangedValuesEveryMillis but nothing has changed.

I thought at the design of Modbus protocol limits around 120 registers in one poll (though 256 contacts are 16 registers and they are less than 120 registers), so i split the poll into 2 sections… but nothing has changed.

After reading several posts about modbus related issues, I’ve decided to install mbusd (Modbus TCP to Modbus RTU gateway) https://github.com/3cky/mbusd and replace the:

Bridge modbus:serial:Scheda_IN_01 [port="/dev/ttyUSB0",baud=38400,id=1,dataBits=8,parity="none",stopBits="1.0",encoding="rtu",timeBetweenTransactionsMillis="0",updateUnchangedValuesEveryMillis="1000000" ]

with:

Bridge modbus:tcp:Scheda_IN_01 [ host="localhost", port=502, id=1, receiveTimeoutMillis="100", timeBetweenTransactionsMillis="1", updateUnchangedValuesEveryMillis="1000000" ]

Now at least the comunication is up from 3 days, but it still too slow.

There isn’t any parameter to set for reduce this timeout?
And is there no way to make communication more fault rugged?
Maybe am I making some errors?
Or maybe is an hardware limit?

Thank you

DDS90 · January 9, 2021, 1:40pm

I doubt it is an hardware limit because I’ve freshly tried to polling the same devices with the same hardware using “Modbus Poll” and mbusd like a gateway and this is the result:
Poll

rossko57 · January 9, 2021, 3:18pm

That is not possible (breaks Modbus rules) and I don’t know what you’d get instead. Either allow to default (35mS) or set something realistic.

By default, binding TCP mode opens/closes TCP connection at each poll, which will come with an overhead.
More detail here -

Please be aware if you edit Thing parameters you may need to restart the binding package (or openHAB) to get the new values used cleanly.

Please use DEBUG as shown in the binding docs, choose which connection method you want to investigate, and show us some logs so we can see where the time goes.

DDS90 · January 11, 2021, 8:57pm

Thank you @rossko57 for your answere.

Sorry, this was a copy from an old file with an extreme desperate attempt, because I can enter any value, but the polling time does not decrease beyond 100 ms.

I’ve tryed to edit Thing parameters for the TCP bindng as per your suggestion:

Bridge modbus:tcp:Scheda_IN_01 [ host="localhost", port=502, id=1, receiveTimeoutMillis="20", timeBetweenTransactionsMillis="35", updateUnchangedValuesEveryMillis="1000000", reconnectAfterMillis ="60000" ] {

…always above 100 ms.

Assuming to maintain an RTU connection, which I would prefer, what can be the best and fastest way to query the status of the different inputs on the slaves?

In particular I have to maintain a certain WAF (Wife Acceptance Factor)…
The time that passes between pressing the button (trigger) and the activation of the rules (or outs, or scenarios) is too long, and makes you think you have not pressed the button well, so, she’s press again, deactivating the output, and she press again, and again, and again…

I have already noticed this (pesky) particular with RTU binding. Is possible restart only the modbus binding without having to completely restart openhab or the raspberryPI??

I’m tryng to enable the DEBUG, as soon as i have somethings, I’ll share with you.

Thank you again

rossko57 · January 11, 2021, 11:17pm

Alright, there’s not much more we can say about that without further info. The binding is capable of hitting TCP faster than that,but something else in the system may be throttling it.

You’ve seen the “performance management” post. Such things are always a balancing act amongst stuff that is important to you and stuff maybe less so, but subdividing brings its own overheads, etc. etc.

Set up your best guess at one device, and lets see what debug info you get.

Yes, if you can get a karaf console up.

Hmm. Even at best, in this sort of usage you might end up with a half-second delay getting an event to openHAB over Modbus. That’s noticeable at human scale. Then if it’s supposed to “do something” via rules there will be further delay.

I have an on/off lighting switching system myself, linked to openHAB via Modbus. But made the deliberate design decision that wallswitches clonk relays directly; fast response.
openHAB is added on top, listening in to events and able to exert auto controls e.g. motion activated, but not in the manual control path. Seconds delay matters little.

ssalonen · January 12, 2021, 5:55am

Is possible to configure the devices to have higher baud rate?

rossko57 · January 12, 2021, 11:29am

I’d actually advise against raising baud rate.

The problem here is not about data transfer rates, it’s about excessive time between consecutive polls. Obviously a poll frequency refresh of 20mS is nonsense, but as I understood it the binding should be queuing/discarding too-frequent poll requests and then banging them out as fast as it can. 150mS gap between polls is poor.

I suppose there is overhead generated there managing the queue - maybe backing off poll period refresh configured to 100mS for an experiment?

There is also -

which may or may not be related, we do know there are some lurking issues with serial in some versions.

ssalonen · January 12, 2021, 3:21pm

Btw, with trace level logging you do get summary on where the time goes. Search the TRACE level logs with “Modbus operation ended, timing info”.

You can configure the logging with

log:set TRACE org.openhab.core.io.transport.modbus.internal.ModbusManagerImpl

DDS90 · January 17, 2021, 10:08am

Half a second would already be acceptable.
Why is everything so slow? The CPU load of the RaspberryPI is always less than 3%. I honestly expected the whole system to be much more responsive.

DDS90 · January 17, 2021, 10:20am

Yes, it is possible to change the Baud Rate. Although I prefer not to increase it too much and risk that the system becomes unstable due to the length of the bus.
Considering that KNX works at 9600 (if I remember correctly) … at 38400 it seems more than good as a speed.
Furthermore, the total time of each question/answer is already less than 50ms. The delay is due to the time between consecutive polls, like said @rossko57

DDS90 · January 17, 2021, 10:42am

Strangely, after setting the TRACE log, the RPi become very very slow in all operations, but the modbus polling (RTU) has never stopped.
I tried to replicate the system with another ubuntu based pc and in a short time I got these two errors:

I hope they can be useful.

This is my setup:
http://gofile.me/6MpyZ/O2kOgru8K

ssalonen · January 17, 2021, 10:51am

@DDS90 The “Modbus polling stopping” is completely unrelated. This discussed more in Modbus Binding not working on OH3

Tl;dr it’s a regression in the serial library used by openhab.

I cannot download the log files (keeps on “connecting”), perhaps you could just grep the timing info?

Trace level logging indeed introduced additional cpu burden you can try setting trace level logging only for the relevant logger with the following command

log:set TRACE org.openhab.core.io.transport.modbus internal.ModbusManagerImpl

With some effort, you might be able to configure logging to output only the timing info (regex filter combined with trace level logging of one particular logger)

The timing info should confirm what you already seem to know… I.e. How much time goes into read/write activity with the slave and how long poller things are “waiting for their turn” to communicate with the device.

rossko57 · January 17, 2021, 1:35pm

It’s not just CPU load here, you’re adding i/o traffic to your storage medium.

DDS90 · January 18, 2021, 9:05pm

Sorry, Now I’ve upload here: Filebin | hflg5btt4n52tuju

I try as soon as I’ve a little free time. I hope to be able

DDS90 · January 18, 2021, 9:05pm

Correct, I hadn’t thought of that. In fact, the PC with SSD is much faster than the RPi with only SD

DDS90 · January 18, 2021, 9:48pm

I would like to inform you something that I was forgetting to specify.
Currently, when I press a button connected to a slave, the corresponding bit is set, then the master resets the bit via rule once it has been correctly read.
I have implemented this strategy to ensure that the pressure is detected.
Do you think this could be a good thing? Is there a better way to reset the bit?

rossko57 · January 19, 2021, 12:46am

From a quick look at openhab.log sample, you’ve got DEBUG set for everything so I think this going to impact performance just by amount of log generated.

Well, I don’t suppose you are pressing buttons very often, so it’s not going to amount to a great deal of additional traffic compared to your polling.

From ERRORE.txt -

2021-01-14 22:58:08.377 [DEBUG] [t.wimpi.modbus.io.ModbusRTUTransport] - Sent: 03 01 00 10 01 00 3d bd 
2021-01-14 22:58:08.379 [TRACE] [t.wimpi.modbus.io.ModbusRTUTransport] - Managed to read at least one byte
2021-01-14 22:58:08.396 [DEBUG] [t.wimpi.modbus.io.ModbusRTUTransport] - Response: 03 01 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c8 20

A normal transfer seems to take place promptly.
2mS after logging read poll sent, we begin to get data back.
Within 20mS we have the whole packet available, seems about right 10mS for wire transfer plus some system overhead.
Looks good in turnaround time.

2021-01-14 22:58:08.417 [DEBUG] [bus.handler.ModbusPollerThingHandler] - Thing modbus:poller:Scheda_IN_03:S3_Digital_IN received response PollResult(result=AsyncModbusReadResult(request = ModbusReadRequestBlueprint@3a934cac[slaveId=3,functionCode=READ_COILS,start=16,length=256,maxTries=3], bits = BitArray(bits=0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000)))

Another 20mS to process bytes to 256 coil bits and pass up the chain, ready to update channels.

2021-01-14 22:58:08.493 [TRACE] [ernal.handler.ModbusDataThingHandler] - Channel switch will be updated to 'OFF' (type OnOffType). Input data: number value 0 (value type 'bit' taken into account) and bool value false. Transformation: <identity>

A further 85mS to update all those channels.
This is going to hammer your openHAB bus with Item updates,more on that in a minute.

2021-01-14 22:58:08.509 [DEBUG] [t.wimpi.modbus.io.ModbusRTUTransport] - Sent: 04 01 00 10 01 00 3c 0a

After about 130mS altogether, we’re ready to send off the next request to the serial modbus.
This timing seems to match up with the gaps you see in your polling traffic trace.
The bottleneck seems to be the volume of data to be processed.

Okay, so returning to Item updates - we realized a while ago that that generating Item updates on the openHAB event bus is costly in processing. Also can have knock-on effects by triggering rules, persistence records, etc.
It’s really easy with a non-trivial Modbus configuration like yours to generate thousands of Item updates per second - and all to an unchanged value.

@ssalonen provided the Modbus binding with a way to circumvent that, essentially update only changes.
Once again I’ll refer you to this thread -

and suggest looking hard at updateUnchangedValuesEveryMillis= parameter.

I don’t think this will make much difference to how much processing the binding actually has to do. But - I suspect it will greatly reduce the overheads on the rest of your system and so make available resources for faster polling.

ssalonen · January 19, 2021, 4:47am

Excellent analysis @rossko57 . Indeed, the polling has been intentionally coupled with rest of the openhab: if the system takes long time to handle the channel updates* then it also takes longer before binding pushes next channel update. In early versions modbus binding was throwing updates as fast it can but that does end well as the other side will not be able to keep up. Eventually you run out of memory and starve the rest of the openHAB.

The suggested parameter indeed should help here.

*(there can various reasons from this,e.g. persistence)

ssalonen · January 19, 2021, 4:56am

Sorry, the logs are not configured correctly for the transport bundle, it does not seem to have trace level logging enabled.

I see the documentation is outdated since the logger name changed in openHAB3 as the transport bundle was moved to core.

You can use the command above if needed.

For now I suggest following advice from @rossko57 as the next step.