Connection pooling in modbus binding

Yeah I think that’s it. Also: the three seconds I talked about is a bit different thing, it is associated with reads from the socket stream (SO_TIMEOUT) and I think there’s really no need for user to override that.

By using alternative way of creating the socket, specifying the connection timeout parameter (instead of OS default) should resolve the issue.

Would you like to create a feature request for this in the github?

Best
Sami

OH and github newbie… I don´t know how to create a feature request, would you mind openning it and send me the link so I can learn seeing what you have done?

@Botura: Posted the feature request here: https://github.com/openhab/openhab/issues/4595

Thank you Sami!

While we’re on the topic … some thoughts about retry behaviour ? Which also applies in the “missing slave” situation.
When I last checked, if retries were enabled, the binding will wait on all the try + retry timeout periods before polling other slaves.
After one poll, we do it all again, affecting performance badly.

“Nicer” behaviour would be to poll other slaves before retrying the failed one.
Basically, that would change the concept of retry to mean ‘number of failed polls before we declare that device broken’.

What exactly we do about a broken slave is not at all clear to me. Can we detect a broken slave via an OH rule? How could we take action to reduce the impact on survivors?
I fear there probably isn’t much that can be done easily, without restructuring - which should be an OH2 target. Just see that retries=0 and use the poll as the ‘retry’ mechanism?

My application has modbus slaves scattered around a large site with local power supplies, any of which could be switched off for maintenance or of course some fault. I haven’t really figured out how to deal with off-line slaves at all.

I think you did a great assesment there @rossko57. I would also agree that this kind of restructuring fits nicely into openhab2 changes… For now I would indeed go for the approach where retries are minimized. Still, I do understand the delay you will eventually see if there are many openhab slaves pointing to the same physical modbus slave.

Indeed there are many approaches of handling this nicely and in a flexible manner (user has control) but still with reasonable defaults (good performance by default). Would it help if the modbus requests are grouped by endpoint (e.g. ip/port), and then executed. If request would fail due to connection error or IO error, the other requests pointing to the same endpoints would not be tried out before next poll cycle.

I have tried to get feedback on the openhab2 implementation of modbus in the somehwat old pull request but so far no luck. We try to get to a situation where both openhab versions can be maintained with relative ease. There are plans on splitting the modbus binding to usable “transport” bundles, that other bindings could use. This would mean that other bindings could re-use basic modbus protocol to implement “higher level protocols”. Currently people are forced to fork the modbus binding to implement some manufacturer specific edition of modbus (I have seen examples of this in the forum). Unfortunately this kind of split requires quite much planning and expertise and certainly might affect these “robustness/performance requirements” we are discussing here.

Can we detect a broken slave via an OH rule?

Yes, actually now we can now. I’ve implemented the feature request from @mbs38:

So you can react to Undefined states in rules – basically telling there is something wrong with the slave. Please consult the wiki for details.

This got me thinking something quite hackish… what if we would use Config Admin Binding to modify the modbus configuration automatically in case of slave errors? Perhaps that would solve your immediate need?

There would be small benefit there, but I suspect not enough to justify the effort for most users. Given a means to manage underlying problems in a more complex environment, not necessary?

Damn, I thought I’d seen that somewhere and then couldn’t find it again. That does indeed allow sensible management.

Not sure about full auto - you can’t please everybody :slight_smile:
Never used Config Admin, not sure what is possible - could we say use a Rule to redirect a Modbus slave to some imaginary endpoint to cause a zero-time failure during polling? (rather than slave deletion)
Periodically a Rule can reset the faulty slave to its correct path for a retry/recovery - under our control.

Or maybe, much less complex, have a means to set individual slaves to less frequent polls - 1 in a 100, 1 in a 1000 ? But that is something new.

(I am shying away from OH2 for now, due the ‘beta’ status)

Yeah I was actually meaning handling the thing with rules like you purposed.

Did a small test and it seems to be possible. If you put too large port number, it will instantly fail (port number too large error from java). Other option is to connect localhost to non responsing port and you would get connection refused.

With config admin you need to define one item for each connection parameter and rest of the logic would be in rules.

We could also consider introducing ‘enabled’ parameter for slaves for easy enable/disable from rules but not sure if this is so common case that it is really necessary…

Would you like to give it a try? We could document it in wiki.

Yes I will have a play around with rules. Don’t have ready access to mess about with live hardware, so will take a while, but no hurry I guess.

I like the enable/disable option actually - with the wide variety of possible Modbus devices it may have more uses than just error management.

Maybe implement as variable poll rate - set a binding master poll period, then set individual slave rates as fractions - zero effectively disabled, 1 for standard polling (default), 2 for half rate, 10 for one-tenth rate etc.

Hi ssalonen

I try to read from my Solar inverter via Modbus TCP.

In windows i get no errors but on my rpi get a lot of errors?
Do you have any idea?

My config is http://pastebin.com/xhpsZcsv

and my logfile is http://pastebin.com/KWvFZukV

Well connection refused indicates that you might have an connection issues unrelated to modbus. Basically server does not accept the client connection.

What does the following output in linux:
nc -v 192.168.1.106 502

That basically just tries to establish connection to the server using netcat.

Please also ensure that there are no other connections to the modbus server (slave) since some of the servers can only serve single client at a time.

I think that enable/disable option and individual pool rate would be great enhancements. These and the timeout issue that ssalonen is working on, it should cover almost every situation.

Hi

My OH2b4 with modbus binding is not working. Log-level is set to TRACE in ops4j.pax.

Starting OH:

2016-10-29 21:29:34.867 [DEBUG] [org.openhab.binding.modbus          ] - BundleEvent STARTING - org.openhab.binding.modbus
2016-10-29 21:29:34.872 [DEBUG] [ding.modbus.internal.ModbusActivator] - Modbus binding has been started.
2016-10-29 21:29:34.877 [DEBUG] [org.openhab.binding.modbus          ] - BundleEvent STARTED - org.openhab.binding.modbus
2016-10-29 21:29:34.888 [DEBUG] [org.openhab.binding.modbus          ] - ServiceEvent REGISTERED - {org.openhab.model.item.binding.BindingConfigReader, org.openhab.binding.modbus.ModbusBindingProvider}={component.name=org.openhab.binding.modbus.genericbindingprovider, component.id=183, service.id=314, service.bundleid=200, service.scope=bundle} - org.openhab.binding.modbus
2016-10-29 21:29:34.891 [DEBUG] [org.openhab.binding.modbus          ] - ServiceEvent REGISTERED - {org.osgi.service.cm.ManagedService, org.osgi.service.event.EventHandler}={event.topics=openhab/*, service.pid=org.openhab.modbus, component.name=org.openhab.binding.modbus, component.id=182, service.id=315, service.bundleid=200, service.scope=bundle} - org.openhab.binding.modbus
2016-10-29 21:29:35.132 [DEBUG] [inding.modbus.internal.ModbusBinding] - modbusSlave 'slave1' instanciated
2016-10-29 21:29:35.840 [DEBUG] [inding.modbus.internal.ModbusBinding] - Parsed the following slave->endpoint configurations: {slave1=EndpointPoolConfiguration@10bb419[passivateBorrowMinMillis=35,interConnectDelayMillis=0,connectMaxTries=3,reconnectAfterMillis=-1,connectTimeoutMillis=0]}. If the endpoint is same, connections are shared between the instances.
2016-10-29 21:29:35.855 [DEBUG] [inding.modbus.internal.ModbusBinding] - Parsed the following pool configurations: {ModbusSerialSlaveEndpoint@11ddbf2[portName=/dev/ttyUSB0]=EndpointPoolConfiguration@10bb419[passivateBorrowMinMillis=35,interConnectDelayMillis=0,connectMaxTries=3,reconnectAfterMillis=-1,connectTimeoutMillis=0]}
2016-10-29 21:29:35.861 [DEBUG] [inding.modbus.internal.ModbusBinding] - config looked good

having 1 serial unit defined in service.item


poll=1000
serial.slave1.connection=/dev/ttyUSB0:9600:8:none:1:rtu
serial.slave1.id=1
serial.slave1.start=0
serial.slave1.length=5
serial.slave1.type=holding
serial.slave1.valuetype=int16

I captured the voltage on the rs485 with osciloscope. My slave sends las bit-answer in about 12ms after initial request by OH.

I get stuff like all seconds in openhab.log

2016-10-29 21:25:09.429 [TRACE] [ing.ModbusSlaveConnectionFactoryImpl] - Waited 0ms (passivateBorrowMinMillis 35ms) before giving returning connection SerialConnection@13f57cd[portName=/dev/ttyUSB0,port=/dev/ttyUSB0] for endpoint ModbusSerialSlaveEndpoint@1269b63[portName=/dev/ttyUSB0], to ensure delay between transactions.
2016-10-29 21:25:09.431 [TRACE] [ing.ModbusSlaveConnectionFactoryImpl] - Validating endpoint ModbusSerialSlaveEndpoint@1269b63[portName=/dev/ttyUSB0] connection SerialConnection@13f57cd[portName=/dev/ttyUSB0,port=/dev/ttyUSB0] -> true
2016-10-29 21:25:09.432 [TRACE] [.binding.modbus.internal.ModbusSlave] - ModbusSlave (slave1): borrowing connection (got SerialConnection@13f57cd[portName=/dev/ttyUSB0,port=/dev/ttyUSB0]) for endpoint ModbusSerialSlaveEndpoint@1269b63[portName=/dev/ttyUSB0] took 3 ms
2016-10-29 21:25:09.464 [TRACE] [ing.ModbusSlaveConnectionFactoryImpl] - Validating endpoint ModbusSerialSlaveEndpoint@1269b63[portName=/dev/ttyUSB0] connection SerialConnection@13f57cd[portName=/dev/ttyUSB0,port=/dev/ttyUSB0] -> true
2016-10-29 21:25:09.466 [TRACE] [ing.ModbusSlaveConnectionFactoryImpl] - Passivating connection SerialConnection@13f57cd[portName=/dev/ttyUSB0,port=/dev/ttyUSB0] for endpoint ModbusSerialSlaveEndpoint@1269b63[portName=/dev/ttyUSB0]...
2016-10-29 21:25:09.467 [TRACE] [ing.ModbusSlaveConnectionFactoryImpl] - (passivate) Connection SerialConnection@13f57cd[portName=/dev/ttyUSB0,port=/dev/ttyUSB0] (endpoint ModbusSerialSlaveEndpoint@1269b63[portName=/dev/ttyUSB0]) age (226536ms) is below the reconnectAfterMillis (-1ms) limit. Keep the connection open.
2016-10-29 21:25:09.468 [TRACE] [ing.ModbusSlaveConnectionFactoryImpl] - ...Passivated connection SerialConnection@13f57cd[portName=/dev/ttyUSB0,port=/dev/ttyUSB0] for endpoint ModbusSerialSlaveEndpoint@1269b63[portName=/dev/ttyUSB0]
2016-10-29 21:25:09.469 [TRACE] [.binding.modbus.internal.ModbusSlave] - ModbusSlave (slave1): returned connection for endpoint ModbusSerialSlaveEndpoint@1269b63[portName=/dev/ttyUSB0]

How must the binding be configured to log the request in hex and see the answer. It looks as the answer is valid (CRC Check is good), but OH is not parsing…?

Binding was loaded over Paper UI - binding-modbus - 1.9.0.SNAPSHOT.

Any help is highly appreciated.

having done a restart of OH2, the stream went into timeout:

2016-10-29 21:46:04.736 [ERROR] [t.wimpi.modbus.io.ModbusRTUTransport] - awaited 12 bytes, but received 11
2016-10-29 21:46:04.737 [ERROR] [t.wimpi.modbus.io.ModbusRTUTransport] - Last request: 01 03 00 00 00 05 85 c9
2016-10-29 21:46:04.739 [ERROR] [t.wimpi.modbus.io.ModbusRTUTransport] - failed to read: CRC Error in received frame: 12 bytes: 01 03 0a 00 00 00 00 00 08 70 16 30 

You ought to start a new thread for your problem.

Yup, you got bad data back from the slave. Simplify while fault finding, have a slave length of just one to read a single register. Try another utility to see if that can read the slave.

1 Like

I implemented the simpleModbus library in Arduino. It’s working with the Radzio! Modbus Master Simulator for 1 or even 10 holding registers, starting at 0. No error there. I used the serial settings from Radzio for the settings in OpenHAB binding.

I cannot allocate the fault. Sometimes i got EOF, I rised the timeout to 350ms at 9600 bps. Then the EOF is still there:

2016-10-30 09:40:14.351 [ERROR] [pi.modbus.io.ModbusSerialTransaction] - execute try 2/3 error: I/O exception - failed to read. Request: net.wimpi.modbus.msg.ReadMultipleRegistersRequest@d09f09 (unit id 1 & transaction 45411). Serial parameters: SerialParameters@146bbb8[portName=/dev/ttyUSB0,baudRate=9600,flowControlIn=none,flowControlOut=none,databits=8,stopbits=1,parity=none,encoding=rtu,echo=false,receiveTimeoutMillis=350]
2016-10-30 09:40:14.767 [ERROR] [t.wimpi.modbus.io.ModbusRTUTransport] - Last request: 01 03 00 00 00 05 85 c9
2016-10-30 09:40:14.768 [ERROR] [t.wimpi.modbus.io.ModbusRTUTransport] - failed to read: Error reading response (EOF)
2016-10-30 09:40:14.771 [ERROR] [pi.modbus.io.ModbusSerialTransaction] - execute try 3/3 error: I/O exception - failed to read. Request: net.wimpi.modbus.msg.ReadMultipleRegistersRequest@d09f09 (unit id 1 & transaction 45411). Serial parameters: SerialParameters@146bbb8[portName=/dev/ttyUSB0,baudRate=9600,flowControlIn=none,flowControlOut=none,databits=8,stopbits=1,parity=none,encoding=rtu,echo=false,receiveTimeoutMillis=350]
2016-10-30 09:40:14.775 [ERROR] [pi.modbus.io.ModbusSerialTransaction] - execute reached max tries 3, throwing last error: I/O exception - failed to read. Request: net.wimpi.modbus.msg.ReadMultipleRegistersRequest@d09f09. Serial parameters: SerialParameters@146bbb8[portName=/dev/ttyUSB0,baudRate=9600,flowControlIn=none,flowControlOut=none,databits=8,stopbits=1,parity=none,encoding=rtu,echo=false,receiveTimeoutMillis=350]
2016-10-30 09:40:14.780 [ERROR] [.binding.modbus.internal.ModbusSlave] - ModbusSlave (slave1): Error getting modbus data for request net.wimpi.modbus.msg.ReadMultipleRegistersRequest@d09f09. Error: I/O exception - failed to read. Endpoint ModbusSerialSlaveEndpoint@14537fc[portName=/dev/ttyUSB0]. Connection: SerialConnection@147df19[portName=/dev/ttyUSB0,port=/dev/ttyUSB0]

What do you get other times? It is not clear if you are still getting CRC fails

sometimes partly of the message:

2016-10-30 21:11:55.073 [ERROR] [t.wimpi.modbus.io.ModbusRTUTransport] - Last request: 01 03 00 00 00 05 85 c9
2016-10-30 21:11:55.074 [ERROR] [t.wimpi.modbus.io.ModbusRTUTransport] - failed to read: CRC Error in received frame: 13 bytes: 01 03 0a 00 00 00 00 00 08 08 7a 15 d6 
2016-10-30 21:11:55.076 [ERROR] [pi.modbus.io.ModbusSerialTransaction] - execute try 1/3 error: I/O exception - failed to read. Request: net.wimpi.modbus.msg.ReadMultipleRegistersRequest@b905a1 (unit id 1 & transaction 19626). Serial parameters: SerialParameters@146bbb8[portName=/dev/ttyUSB0,baudRate=9600,flowControlIn=none,flowControlOut=none,databits=8,stopbits=1,parity=none,encoding=rtu,echo=false,receiveTimeoutMillis=350]

and sometimes no error message:

2016-10-29 21:22:04.129 [TRACE] [.binding.modbus.internal.ModbusSlave] - ModbusSlave (slave1): borrowing connection (got SerialConnection@13f57cd[portName=/dev/ttyUSB0,port=/dev/ttyUSB0]) for endpoint ModbusSerialSlaveEndpoint@1269b63[portName=/dev/ttyUSB0] took 2 ms
2016-10-29 21:22:04.162 [TRACE] [ing.ModbusSlaveConnectionFactoryImpl] - Validating endpoint ModbusSerialSlaveEndpoint@1269b63[portName=/dev/ttyUSB0] connection SerialConnection@13f57cd[portName=/dev/ttyUSB0,port=/dev/ttyUSB0] -> true
2016-10-29 21:22:04.163 [TRACE] [ing.ModbusSlaveConnectionFactoryImpl] - Passivating connection SerialConnection@13f57cd[portName=/dev/ttyUSB0,port=/dev/ttyUSB0] for endpoint ModbusSerialSlaveEndpoint@1269b63[portName=/dev/ttyUSB0]...
2016-10-29 21:22:04.163 [TRACE] [ing.ModbusSlaveConnectionFactoryImpl] - (passivate) Connection SerialConnection@13f57cd[portName=/dev/ttyUSB0,port=/dev/ttyUSB0] (endpoint ModbusSerialSlaveEndpoint@1269b63[portName=/dev/ttyUSB0]) age (41232ms) is below the reconnectAfterMillis (-1ms) limit. Keep the connection open.
2016-10-29 21:22:04.164 [TRACE] [ing.ModbusSlaveConnectionFactoryImpl] - ...Passivated connection SerialConnection@13f57cd[portName=/dev/ttyUSB0,port=/dev/ttyUSB0] for endpoint ModbusSerialSlaveEndpoint@1269b63[portName=/dev/ttyUSB0]
2016-10-29 21:22:04.165 [TRACE] [.binding.modbus.internal.ModbusSlave] - ModbusSlave (slave1): returned connection for endpoint ModbusSerialSlaveEndpoint@1269b63[portName=/dev/ttyUSB0]
2016-10-29 21:22:05.166 [TRACE] [ing.ModbusSlaveConnectionFactoryImpl] - Waited 0ms (passivateBorrowMinMillis 35ms) before giving returning connection SerialConnection@13f57cd[portName=/dev/ttyUSB0,port=/dev/ttyUSB0] for endpoint ModbusSerialSlaveEndpoint@1269b63[portName=/dev/ttyUSB0], to ensure delay between transactions.
2016-10-29 21:22:05.167 [TRACE] [ing.ModbusSlaveConnectionFactoryImpl] - Validating endpoint ModbusSerialSlaveEndpoint@1269b63[portName=/dev/ttyUSB0] connection SerialConnection@13f57cd[portName=/dev/ttyUSB0,port=/dev/ttyUSB0] -> true
2016-10-29 21:22:05.167 [TRACE] [.binding.modbus.internal.ModbusSlave] - ModbusSlave (slave1): borrowing connection (got SerialConnection@13f57cd[portName=/dev/ttyUSB0,port=/dev/ttyUSB0]) for endpoint ModbusSerialSlaveEndpoint@1269b63[portName=/dev/ttyUSB0] took 1 ms
2016-10-29 21:22:05.199 [TRACE] [ing.ModbusSlaveConnectionFactoryImpl] - Validating endpoint ModbusSerialSlaveEndpoint@1269b63[portName=/dev/ttyUSB0] connection SerialConnection@13f57cd[portName=/dev/ttyUSB0,port=/dev/ttyUSB0] -> true
2016-10-29 21:22:05.199 [TRACE] [ing.ModbusSlaveConnectionFactoryImpl] - Passivating connection SerialConnection@13f57cd[portName=/dev/ttyUSB0,port=/dev/ttyUSB0] for endpoint ModbusSerialSlaveEndpoint@1269b63[portName=/dev/ttyUSB0]...
2016-10-29 21:22:05.200 [TRACE] [ing.ModbusSlaveConnectionFactoryImpl] - (passivate) Connection SerialConnection@13f57cd[portName=/dev/ttyUSB0,port=/dev/ttyUSB0] (endpoint ModbusSerialSlaveEndpoint@1269b63[portName=/dev/ttyUSB0]) age (42269ms) is below the reconnectAfterMillis (-1ms) limit. Keep the connection open.
2016-10-29 21:22:05.200 [TRACE] [ing.ModbusSlaveConnectionFactoryImpl] - ...Passivated connection SerialConnection@13f57cd[portName=/dev/ttyUSB0,port=/dev/ttyUSB0] for endpoint ModbusSerialSlaveEndpoint@1269b63[portName=/dev/ttyUSB0]
2016-10-29 21:22:05.201 [TRACE] [.binding.modbus.internal.ModbusSlave] - ModbusSlave (slave1): returned connection for endpoint ModbusSerialSlaveEndpoint@1269b63[portName=/dev/ttyUSB0]
2016-10-29 21:22:06.202 [TRACE] [ing.ModbusSlaveConnectionFactoryImpl] - Waited 0ms (passivateBorrowMinMillis 35ms) before giving returning connection SerialConnection@13f57cd[portName=/dev/ttyUSB0,port=/dev/ttyUSB0] for endpoint ModbusSerialSlaveEndpoint@1269b63[portName=/dev/ttyUSB0], to ensure delay between transactions.
2016-10-29 21:22:06.203 [TRACE] [ing.ModbusSlaveConnectionFactoryImpl] - Validating endpoint ModbusSerialSlaveEndpoint@1269b63[portName=/dev/ttyUSB0] connection SerialConnection@13f57cd[portName=/dev/ttyUSB0,port=/dev/ttyUSB0] -> true
2016-10-29 21:22:06.204 [TRACE] [.binding.modbus.internal.ModbusSlave] - ModbusSlave (slave1): borrowing connection (got SerialConnection@13f57cd[portName=/dev/ttyUSB0,port=/dev/ttyUSB0]) for endpoint ModbusSerialSlaveEndpoint@1269b63[portName=/dev/ttyUSB0] took 3 ms
2016-10-29 21:22:06.237 [TRACE] [ing.ModbusSlaveConnectionFactoryImpl] - Validating endpoint ModbusSerialSlaveEndpoint@1269b63[portName=/dev/ttyUSB0] connection SerialConnection@13f57cd[portName=/dev/ttyUSB0,port=/dev/ttyUSB0] -> true

looks like depending on timeout. using 50:350 in modbus.cfg

whatever I configure - no data are parsed in openhab binding, i.e. no value are shown in sitemap.

Is the binary expecting a ‘\n’ (EOL) or EOF bin-value at the end of the message?

[quote=“markuszelg, post:197, topic:5246”]
Is the binary expecting a ‘\n’ (EOL) or EOF bin-value at the end of the message?[/quote]
No, Modbus protocol is clearly specified and there is no EOL/EOF in it.[quote=“markuszelg, post:197, topic:5246”]
failed to read: CRC Error in received frame: 13 bytes: 01 03 0a 00 00 00 00 00 08 08 7a 15 d6[/quote]
That is indeed not a valid Modbus response, it is truncated and missing two CRC bytes from the end. However, I don’t know if the binding would normally report those - I reckon so, but maybe @ssalonen would confirm?

Hi

I hope you have found the configuration guide in wiki?

I would keep receiveTimeoutMillis as is, since usually the binding is able to read the response in 1500ms (default). With very long responses (reading over ten registers at once) you might need to increase that.

As the wiki says, most important parameter is the interTransactionDelayMillis. I recommend trying a larger value such as 100ms, just to make sure if this the root cause for you. As you may know, in modbus rtu, the duration of the “silent period” is significant. The parameter default should be enough according to spec, and it has worked for some users (in this thread) but I would not be surprised if that sometimes needs adjustment. For example, in my use case the modbus tcp slave cannot handle the speed and I needed to artificially limit the request rate by using this parameter.

Once you have tried out these, could you please provide the verbose logs (in pastebin.com), relevant item configurations and binding configurations?

Best
Sami

EDIT: note that having a large receiveTimeoutMillis will not make the binding any slower as long as there is no errors. Also note that you are guaranteed to get errors by reducing the timeout as the slave would not have enough time to communicate the data.