Modbus errors when upgrading to OH3. Weird

Symptoms:
I have approx. 10 Arduinos in the house on Modbus serial (RS485), connected to a Raspberry Pi 4.
After installing OH3 + the Modbus binding, only 1 Arduino works (id=11) … the rest give CRC errors and EOF ???
And everything works in OH2.5.9

Details:
(The wiring scheme is not optimal (star connection), so I get some errors, approx. 10% … but it’s OK. It is just for temperature sensors that I read every minute, so I don’t mind.)

SW:

Bridge modbus:serial:Arduino6 [port="/dev/ttyUSBarduino",baud=9600,id=6,dataBits=8,parity="none",stopBits="1.0",encoding="rtu",echo=false,receiveTimeoutMillis=5000, connectMaxTries=3, timeBetweenTransactionsMillis=100] {
	Bridge poller ArdId6 [ start=0, length=29, refresh=60006, maxTries=5, type="holding"] {
		Thing data ArdId6Pin5Temp [ readStart="5", readValueType="int16", readTransform="JS(divide10.js)" ] 
		Thing data ArdId6Pin5Hum [ readStart="17", readValueType="int16", readTransform="JS(divide10.js)" ] 
		//etc...
	}
}
Bridge modbus:serial:Arduino1 [port="/dev/ttyUSBarduino",baud=9600,id=11,dataBits=8,parity="none",stopBits="1.0",encoding="rtu",echo=false,receiveTimeoutMillis=5000, connectMaxTries=3, timeBetweenTransactionsMillis=100] {
	Bridge poller ArdWC01 [ start=0, length=29, refresh=25011, maxTries=5, type="holding"] {
		Thing data ArdId11Pin5Temp [ readStart="5", readValueType="int16", readTransform="JS(divide10.js)" ]
		Thing data ArdId11Pin5Hum [ readStart="17", readValueType="int16", readTransform="JS(divide10.js)" ] 
		//etc...
	}
}
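
For readers unfamiliar with the data Things above: readValueType="int16" treats one 16-bit holding register as a signed integer (Modbus sends the high byte first), and the transform then scales it. A minimal Python sketch of that decoding, assuming divide10.js simply divides by 10 (its contents are not shown in the post). The function name is made up for illustration, and the two sample byte pairs are only examples of the arithmetic.

```python
import struct


def decode_int16_tenths(register_bytes: bytes) -> float:
    """Interpret 2 bytes as a signed big-endian int16, then divide by 10.

    The division mirrors what divide10.js is ASSUMED to do.
    """
    (raw,) = struct.unpack(">h", register_bytes)
    return raw / 10


# 0x00d7 = 215 -> 21.5 ; 0xd8f0 = -10000 as int16 -> -1000.0
print(decode_int16_tenths(bytes.fromhex("00d7")))
print(decode_int16_tenths(bytes.fromhex("d8f0")))
```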

In OH3 the Things get errors 100% (except id=11):

2021-04-05 12:48:15.704 [WARN ] [rt.modbus.internal.ModbusManagerImpl] - Try 2 out of 5 failed when executing request (ModbusReadRequestBlueprint [slaveId=6, functionCode=READ_MULTIPLE_REGISTERS, start=0, length=29, maxTries=5]). Will try again soon. Error was I/O error, so reseting the connection. Error details: net.wimpi.modbus.ModbusIOException I/O exception: IOException CRC Error in received frame: 33 bytes: 06 03 3a 00 06 00 1c d8 e4 d8 e5 d8 ec c5 01 03 d8 f0 d8 e4 d8 e5 d8 00 31 d8 e2 d8 e2 d8 e2 75 22  [operation ID e47020e4-e64a-4a16-96cd-149bb57e5f14]

2021-04-05 12:48:31.983 [WARN ] [rt.modbus.internal.ModbusManagerImpl] - Try 4 out of 5 failed when executing request (ModbusReadRequestBlueprint [slaveId=6, functionCode=READ_MULTIPLE_REGISTERS, start=0, length=29, maxTries=5]). Will try again soon. Error was I/O error, so reseting the connection. Error details: net.wimpi.modbus.ModbusIOException I/O exception: IOException Error reading response (EOF) [operation ID e47020e4-e64a-4a16-96cd-149bb57e5f14]

2021-04-05 12:48:41.087 [INFO ] [openhab.event.ThingStatusInfoEvent  ] - Thing 'modbus:poller:Arduino6:ArdId6' updated: OFFLINE (COMMUNICATION_ERROR): Error with read: org.openhab.core.io.transport.modbus.internal.ModbusSlaveIOExceptionImpl: Modbus IO Error with cause=ModbusIOException, EOF=false, message='I/O exception: IOException CRC Error in received frame: 33 bytes: 06 03 3a 00 06 00 1c d8 e4 d8 e5 d8 ec c4 01 03 d8 f0 d8 e4 d8 e5 d8 00 31 d8 e2 d8 e2 d8 e2 a6 4d ', cause2=null

The usual suspect for CRC errors is the hardware wiring, but everything works OK in OH2.5.10, so it MUST be software.

I tried the normal solutions/checks:

  1. symbolic link
	(sudo nano /etc/udev/rules.d/99-com.rules )
		SUBSYSTEM=="tty", ATTRS{idVendor}=="1a86", ATTRS{idProduct}=="7523", SYMLINK+="ttyUSBwavin"           
		SUBSYSTEM=="tty", ATTRS{idVendor}=="0403", ATTRS{idProduct}=="6001", ATTRS{serial}=="A65HRBTV", SYMLINK+="ttyUSBcomfoair" 
		SUBSYSTEM=="tty", ATTRS{idVendor}=="10c4", ATTRS{idProduct}=="ea60", SYMLINK+="ttyUSBarduino"
	sudo nano /etc/default/openhab
		EXTRA_JAVA_OPTS="-Xms250m -Xmx350m -Dgnu.io.rxtx.SerialPorts=/dev/ttyUSBcomfoair:/dev/ttyUSBwavin:/dev/ttyUSBarduino"
  2. Permissions (same in OH2 and OH3)
    ls -l /dev/ttyUSBarduino
    ==> lrwxrwxrwx 1 root root 7 Mar 22 21:42 /dev/ttyUSBarduino -> ttyUSB0
    ls -l /dev/ttyUSB0
    ==> crw-rw---- 1 root dialout 188, 0 Apr 4 18:45 /dev/ttyUSB0

  3. Java network permissions
    openHAB on Linux | openHAB
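
The symlink and permission checks above can also be scripted. A minimal Python sketch, Linux only, that follows the udev symlink and reports the mode, group and read/write access of the real device. The function name is hypothetical, and the result only means something when the script runs as the same user as the openHAB service (assumed here to be a user such as openhab in the dialout group).

```python
import grp
import os
import stat


def describe_port(path: str) -> dict:
    """Resolve a (possibly symlinked) device path and report its permissions."""
    real = os.path.realpath(path)  # follow the udev symlink, e.g. -> /dev/ttyUSB0
    info = os.stat(real)
    try:
        group = grp.getgrgid(info.st_gid).gr_name
    except KeyError:
        group = str(info.st_gid)  # gid without a name entry
    return {
        "target": real,
        "mode": stat.filemode(info.st_mode),
        "group": group,
        # True only if the CURRENT user may open the device read/write
        "read_write": os.access(real, os.R_OK | os.W_OK),
    }


if __name__ == "__main__":
    port = "/dev/ttyUSBarduino"
    if os.path.exists(port):
        print(describe_port(port))
    else:
        print(f"{port} not found")
```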

Can anyone help?

Yup.

Note that software changes can affect timings. RS485 problems can act consistently weird, and lead you down various garden paths. For example a poorly terminated network can do better or worse depending on the time between transfers.

Note also that software changes can report things it used to ignore.

Focus on your CRC errors, it’s almost certainly a flaky network.

I had to implement a CCTV PTZ in star (RS485 but not Modbus). Cheap Chinese RS485 repeaters allowed subdivision into “legal” daisy chain segments without recabling, eliminating the star.

But I can’t rip up the house to run new cables…
And if things work just fine in OH2, it must be fixable in software, timings …
Also - if it is “just” hardware, the communication should work some of the time ?!

Any idea how to change timings?

I could set up a second Raspberry Pi with OH2, and pass the items via MQTT to OH3 on another Pi. (But it seems really silly)

Maybe that’s why I talked about segmenting with a repeater.

You could do some very basic stuff - what are your termination arrangements? Is there some active bias on the line?
You don’t have to do any of this stuff, of course.

Doesn’t it? You haven’t said. You did say it fails sometimes on OH2.5 too.

Everything accessible is described in the binding docs

Each Arduino has a max485 module + 120R + 10nF for damping reflections

On OH2.5.9 about 80-90% of all reads are OK. So out of 5 tries, it almost always succeeds
Modbus is supposed to handle occasional read failures.
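
To put numbers on "out of 5 tries, it almost always succeeds": a small Python sketch of the arithmetic, under the loudly stated assumption that read errors are independent of each other (real line noise often is not). The function name is made up for illustration.

```python
def all_tries_fail(p_success: float, tries: int) -> float:
    """Probability that every one of `tries` reads fails.

    ASSUMPTION: each read succeeds independently with probability p_success.
    """
    return (1.0 - p_success) ** tries


for p in (0.8, 0.9):
    print(f"per-read success {p:.0%}: all 5 tries fail with probability "
          f"{all_tries_fail(p, 5):.5f}")
```

With 80% per-read success that is about 3 polls in 10,000 failing completely, and with 90% about 1 in 100,000. Under the same assumption, five failures in a row on every poll would hint at something systematic rather than the same random error rate.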

On OH3 however, reads ALWAYS fail. 100%
(except for one Arduino (on the longest wire), that works 100%)

Timings - I tried adding a 100 ms delay … no difference

I would suspect it has to do with the change in serial in OH3
https://github.com/openhab/openhab-core/pull/2272
I tried installing the .jar, but no difference

How many are there, ten?
How’s this capacitor connected, in series with the R or across the line? You don’t want any capacitors across a transmission line, that just makes it worse.
More than two 120R terminators loads the signal down.
You might find it works better taking some of these away (or up to 470R).
Is everything on a star or are some daisy chained?
Is anything supplying bias? (USB or RS232 adaptors usually do)

Possibly, the issue there is that error recovery is mishandled.
It is better of course not to be in permanent error recovery.

7 to be exact

Caps are in series with the 120R, so that the resistors don’t drain power
I only do 9600 baud, so it should be OK

  • that makes it possible to have resistors on all 7 Arduinos (a compromise, since I have a star connection)

Yes, everything is star
The USB from the Raspberry is supplying power

But Modbus is BUILT to accept some communication loss now and then, and just read again
(I also have a Wavin floor heating system on Modbus (on another USB port). Only 1 slave here, only 1 m of cable. And it also has some errors once in a while)

Interesting! - how?

Don’t fool yourself here, Modbus is intended to work error-free; but of course tries to be tolerant to the kind of sporadic rubbish you’d get operating in a shipyard or factory.
Yes, in real life we lay cheap cables badly, don’t pay attention to shielding, use vacuum cleaners nearby etc. and rely on recovery - but really you should expect nothing but trouble from a 10-20% error rate, that’s truly dreadful :smiley:

Have to say I don’t think I’d do much different to make the best of what you’ve got.
You might remove the caps from the two 120R at ends of longest cables so that you have some conventional terminators.
Might be worth the experiment making them all 470R.

There’s not much more to do with a star apart from segmenting it properly.
USB adaptors are so cheap it’s probably cheaper to use several of those in a powered hub giving multiple host serial ports, instead of specialist repeaters.

I wouldn’t rule out dirty behaviour from Arduino libraries, slaves treading on others’ messages; you would really need a scope to identify that kind of issue.

As per the PR you read, the serial library has trouble with disconnect/reconnect the serial port.
The very specific symptom is “can’t connect” after a while - which you haven’t reported at all.
That reconnect gets invoked when error recovery takes place, so if you are laden with CRC problems you ought to be more prone to it, if the theory is correct.
But it may be more evident on some host architecture than others.

For background, the same serial library is in use in OH 2.5 and 3.0, although OH3 uses a later version to support Java 11. The Modbus binding has changed little.

Do you ever see timeout errors? From openHAB’s viewpoint, that should be the consequence of transmissions getting trashed, i.e. no response from slaves who don’t receive a valid message.
If you only get inbound CRC errors, that is significant, but we must also remember that OH transmissions are generally short compared to incoming poll responses.

EDIT - afterthoughts.
While the ideal UART samples data bits around the middle of bit time, they don’t all behave nice. Sometimes there’s fixed timing and it can be improved by changing line speed. 9600 seems a sensible enough baud rate, but counter-intuitively it’s worth a try at higher 19200. That would also change any ringing effects in your star. I realize that would be a total pain to set all Arduinos just for the experiment.

Another little trick for poor UARTs is 1.5 stopbit time, to force resynch of each byte.

wow … thanks for all the HW ideas!
(yes, the 120R+10nF was the trick to make things work (almost) like a charm on OH2)

to my mind, this MUST be solved on SW
(because it works fine on OH2, because of many slaves, and no obvious/simple HW solution)

stopBits="1.5" doesn’t change things

I do get very different errors (depending on how/when I turn the Modbus things file and the power on/off).

After reboot:

  • EOF
  • endpoint
  • permissions
[ERROR] [rt.modbus.internal.ModbusManagerImpl] - Last try 5 failed when executing request (ModbusReadRequestBlueprint [slaveId=12, functionCode=READ_MULTIPLE_REGISTERS, start=0, length=29, maxTries=5]). Aborting. Error was I/O error, so reseting the connection. Error details: net.wimpi.modbus.ModbusIOException I/O exception: IOException Error reading response (EOF) [operation ID 96e82809-aa47-4929-be1f-d48e47f6224f]

 [ERROR] [ing.ModbusSlaveConnectionFactoryImpl] - re-connect reached max tries 3, throwing last error: Could not get port identifier, maybe insufficient permissions. null. Connection SerialConnection [m_SerialPort=null, m_Parameters.getPortName()=/dev/ttyUSBarduino]. Endpoint ModbusSerialSlaveEndpoint [getPortName()=/dev/ttyUSBarduino]

poweroff/on:

  • conflicting parameters
  • CRC

[INFO ] [openhab.event.ThingStatusInfoEvent ] - Thing 'modbus:serial:Arduinos11' updated: OFFLINE (CONFIGURATION_ERROR): Endpoint 'ModbusSerialSlaveEndpoint [getPortName()=/dev/ttyUSBarduino]' has conflicting parameters: parameters of this thing (modbus:serial:Arduinos11 'Modbus Serial Slave') are different from some other thing's parameter. Ensure that all endpoints pointing to serial port '/dev/ttyUSBarduino' have same parameters.
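
The message above asks for identical parameters on every bridge that points at the same serial port. A rough Python sketch of that consistency check, assuming that everything except id has to match (the binding docs are the authority on exactly which parameters count). The function name and the second bridge in the example are hypothetical; only the parameter names come from the Things file in the first post.

```python
from collections import defaultdict


def find_conflicts(bridges: dict) -> dict:
    """Return {port: {param: [values]}} for params that differ on one port.

    ASSUMPTION: every parameter except 'id' must be identical for all
    bridges sharing a serial port.
    """
    by_port = defaultdict(list)
    for cfg in bridges.values():
        by_port[cfg["port"]].append(cfg)

    conflicts = {}
    for port, cfgs in by_port.items():
        differing = {}
        keys = set().union(*cfgs) - {"id", "port"}
        for key in sorted(keys):
            values = {cfg.get(key) for cfg in cfgs}
            if len(values) > 1:
                differing[key] = sorted(values, key=str)
        if differing:
            conflicts[port] = differing
    return conflicts


# Hypothetical example: one bridge was edited to stopBits 1.5, the other not.
bridges = {
    "Arduino6": {"port": "/dev/ttyUSBarduino", "id": 6, "baud": 9600,
                 "stopBits": "1.0", "timeBetweenTransactionsMillis": 100},
    "Arduinos11": {"port": "/dev/ttyUSBarduino", "id": 11, "baud": 9600,
                   "stopBits": "1.5", "timeBetweenTransactionsMillis": 100},
}
print(find_conflicts(bridges))
```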

To turn a phrase, it fails like a drain on OH2.
You can argue that the older serial transport was more tolerant, but the root cause is your high error rate.

You would need to do that in your Arduinos to affect how they transmit.

Are you going to fix that? It tells you what to do.

This is the known serial problem where error recovery eventually messes up and the port locks.

conflicting parameters - fixed

On OH3, back to:

  • one Arduino working almost 100%
  • the rest of the Arduinos working 0% (mainly CRC + some EOF)

But they keep polling, and get almost the same response in numbers
(except 1-2 numbers shifting, because the temperature sensors fluctuate in the decimals):

0c 03 3a 00 0c 00 1b d8 e4 d8 e5 d8 ec de d8 ec d3 00 d7 00 d6 d8 f0 d8 e4 d8 e5 d8 00 03 e7 03 e7 00 bf d8 e2 d8 e2 d8 e2
0c 03 3a 00 0c 00 1b d8 e4 d8 e5 d8 ec de d8 ec d3 00 d7 00 d6 d8 f0 d8 e4 d8 e5 d8 00 03 e7 03 e7 00 bf d8 e2 d8 e2 d8 e2
0c 03 3a 00 0c 00 1b d8 e4 d8 e5 d8 ec de d8 ec d3 00 d7 00 d6 d8 f0 d8 e4 d8 e5 d8 00 03 e7 03 e7 00 bf d8 e2 d8 e2 d8 e2 
0c 03 3a 00 0c 00 1b d8 e4 d8 e5 d8 ec de d8 ec d3 00 d6 00 d6 d8 f0 d8 e4 d8 e5 d8 00 03 e7 03 e7 00 c0 d8 e2 d8 e2 d8 e2 
0c 03 3a 00 0c 00 1b d8 e4 d8 e5 d8 ec de d8 ec d4 00 d6 00 d6 d8 f0 d8 e4 d8 e5 d8 00 03 e7 03 e7 00 bf d8 e2 d8 e2 d8 e2

How can they consistently read almost the same numbers from Arduinos, and still 100% conclude CRC?

Do you know what kind of CRC checksum the OH3 binding uses?
to check with: https://crccalc.com/

Probably it’s the end of the transmission messing up. Most serial implementations timeout after a period of no data, and say “reception complete”. Noise on the line ruins that.
This might be the area that differs a little from OH2 to OH3.
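
For a sense of scale on that "period of no data": Modbus RTU treats a silent gap of at least 3.5 character times as the end of a frame. A back-of-envelope Python sketch, assuming 10 bits per character on the wire (start + 8 data + 1 stop, matching the parity="none", stopBits="1.0" settings in the first post) and a 63-byte response (1 id + 1 function + 1 byte count + 58 data + 2 CRC for 29 registers). The function names are made up for illustration.

```python
def char_time_ms(baud: int, bits_per_char: int = 10) -> float:
    """Time to send one character, in milliseconds."""
    return 1000.0 * bits_per_char / baud


def frame_time_ms(n_bytes: int, baud: int, bits_per_char: int = 10) -> float:
    """Time to send n_bytes back to back, in milliseconds."""
    return n_bytes * char_time_ms(baud, bits_per_char)


for baud in (9600, 19200):
    t_char = char_time_ms(baud)
    print(f"{baud} baud: char {t_char:.3f} ms, "
          f"3.5-char gap {3.5 * t_char:.2f} ms, "
          f"63-byte response {frame_time_ms(63, baud):.1f} ms")
```

At 9600 baud that is roughly 1 ms per character, a gap of under 4 ms to end a frame, and about 66 ms for a full 29-register response, so a receiver that decides "reception complete" too early would hand over a short frame.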

Yes, it’s the one specified by the Modbus specification. Hopefully the same as your Arduinos?

I don’t think you have visibility of the CRC bytes via DEBUG.
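
For checking frames by hand: the Modbus RTU checksum is the variant usually listed as CRC-16/MODBUS (reflected polynomial 0xA001, initial value 0xFFFF, low byte sent first). A minimal Python sketch, not taken from the binding's code, that computes it and also compares a read response's length with its own byte-count field. The helper names are made up, and the sample frame is the one from the first WARN line in the first post.

```python
def crc16_modbus(data: bytes) -> int:
    """CRC-16/MODBUS: init 0xFFFF, reflected polynomial 0xA001, no final XOR."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc


def check_read_response(frame: bytes) -> dict:
    """Check a function 0x03 response: id, fc, byte count, data..., CRC lo, CRC hi."""
    declared_data_bytes = frame[2]
    body, tail = frame[:-2], frame[-2:]
    return {
        "expected_len": 3 + declared_data_bytes + 2,
        "actual_len": len(frame),
        # the CRC travels low byte first on the wire
        "crc_ok": crc16_modbus(body) == int.from_bytes(tail, "little"),
    }


logged = bytes.fromhex(
    "06 03 3a 00 06 00 1c d8 e4 d8 e5 d8 ec c5 01 03 d8 f0 d8 e4 "
    "d8 e5 d8 00 31 d8 e2 d8 e2 d8 e2 75 22"
)
print(check_read_response(logged))
```

Note the length fields for that frame: the byte count 0x3a declares 58 data bytes, so a complete response would be 63 bytes, while the log reports 33. If that is representative, the last two bytes being checked are probably data rather than the real CRC, which would fit the "end of the transmission" theory.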

I did a simple test, and bypassed the star:
(not connected to all the other lines/arduinos)
So raspberry is connected, through 1 single line, to 1 arduino.

As I suspected: same result, 100% CRC errors

Hence it cannot be a HW problem!

The Arduino software is built upon the following libraries:

#include <ModbusRtu.h>
#include <Wire.h>       //DHT:
#include <dht.h>
#include <OneWire.h>    //OneWire, temp:
#include <DallasTemperature.h>

You’ve also proved that OH3 works just fine by your standards with one of the Arduinos.

If you really want to nail this, you will need a line analyzer.

I will bet a beer you would find a glitch at the end of every Arduino transmission, probably as the Tx is disabled.

No, actually I proved that the OH3 binding doesn’t work,
since it can’t even handle the simplest, correct HW setup:
a line between the Raspberry and an Arduino

Okay.

RS485 problems are surprisingly slippery things, confounding logic.

Anyway, now that you’ve proved that OH3 is totally broken (but only for you), what are you going to do now?

Researched again. I am not the only one who has problems with the OH3 binding:

https://community.openhab.org/t/modbus-binding-not-working-on-oh3/111619
https://github.com/openhab/openhab-core/pull/2272

I have a zillion small ideas/tweaks. But no smoking gun
(thanks for your very interesting, educating points about Modbus)
But for now, it is sooooo much easier to turn to plan B:

  • use OH2, that works like a charm (at least for Modbus)
  • wait for an updated Modbus binding in the OH3.1 stable release
    I see they just installed a fix in the snapshot version (not in 3.1.M3 that I tried)

The fix only works around a deadlock situation that might happen when the serial port is closed and reopened. I would not expect it to help with CRC and other serial errors.

“I’ve noticed a number of owners of car model XYZ have fixed their flat tyre problem by changing the wheel. My XYZ is on fire, so I’ve ordered a spare wheel. Hope it arrives soon!”

I tried the snapshot yesterday, and yes: no success. (still CRC errors)

ssalonen - perhaps you have an idea, about where the problem is?

1. Raspberry Pi 4
2. CP2102 6 in 1 Multi-functional Serial Module Adapter
3. SINGLE cat5e cable. A+B in twisted pair (no star!)
4. max485 module (terminated with 120 ohm)
5. Arduino

And this setup works in OH2