Modbus errors when upgrading to OH3. Wierd

Any suggestions @rossko57 for troubleshooting?

Darn serial, we always seem to go very low-level with this one :slight_smile: Line level understanding would be probably the best insight into the topic. Is that what you meant with “old laptop as serial monitor”?

Based on javacomm serial library documentation (which is actually not used by OH but I guess the API is similar enough), we have “Note, however, that framing errors may cause the Timeout and Threshold values to complete prematurely without raising an exception.” – This is happening when we get “awaited 4 bytes, but received 2” log message. Does this give more hints?

I guess the socat suggestion might reveal something but then again it might hide the issues as socat would sit “between” the real serial port and openHAB.

I could try to make a special version of binding which tries to read bytes again if we timeout and not get all bytes at first time. It seems unlikely this would fix the issue though…

I don’t have a good enough understanding of what goes on between serial device and binding.
Something (the USB adaptor that I assume is in use) gets a byte at a time, something decides the bytes have stopped coming so “here’s a packet” and in our case it’s short.

A framing error pretty much means the stop bit was missed at the expected time, very low level error.
Poor bit timing by the transmitter or receiver can cause framing errors, and that could be sensitive to last data bit sent. That’d be the most significant bit, used as sign bit in signed integers…

I would expect bytes to be discarded at low level, but maybe not. That would leave the library to decide to pass along garbage or not, if it bothers checking for the framing error flag.

Can we tell if there really is a framing error at the hardware or do intervening libraries hide that?

Maybe @pete3 is able to try some read polls that include a larger proportion of bytes over hex80 value?

I think we are heading way over my skills here … But i will try to assist, if you need me to…
But i might need very specific linux terminal commands

  1. Socat. I installed. But have problems. input/output files are empty.
    Here are my commands:
sudo systemctl stop openhab
export PATH=/etc/openhab/modpoll/arm-linux-gnueabihf:$PATH
cd /tmp
socat /dev/ttyUSBarduino,raw,echo=0 SYSTEM:'tee input.txt | socat - "PTY,link=/dev/ttyUSBarduino,raw,echo=0,waitslave" | tee output.txt'
--> 2021/04/22 18:15:52 socat[2118] E unlink("/dev/ttyUSBarduino"): Permission denied

sudo socat /dev/ttyUSBarduino,raw,echo=0 SYSTEM:'tee input.txt | socat - "PTY,link=/dev/ttyUSBarduino,raw,echo=0,waitslave" | tee output.txt'
modpoll -b 9600 -a 7 -p none -c 4 -t 3 -1 /dev/ttyUSBarduino

3.I need very specific instructions, to carry that out.
I have mac + windows10.
and usb-interfaces: FTDI232 or CP2102

Hmm … can you make “programatically” inject the bytes from serialport, to see if openhab “alone” has similar behavior? That would also help isolating the problem.

Hmm… it’s possible in a way

openhab-addons/DEVELOPERS.md at main · openhab/openhab-addons · GitHub describes how to create a “null modem emulator”, creating two virtual serial ports, one for reading, one for writing.

One can then utilise diagslave (virtual modbus slave) and modpoll (modbus maste) to fill up the registers with the data you want to have. Pointing the openHAB to the other end of the null modem emulator allows you to read data from this local diagslave modbus serial slave

Many things are probably different to real slave and real serial communications… not sure how well it gives insights to the issues

EDIT: I tested this myself. Openhab can read the -10004=0xD8EC just fine over this “null modem emulator”, diagslave acting as modbus rtu slave. The issue can be something platform specific (I have 64bit linux x86_64) or related to something more lower level

@pete3 could I ask you to test modbus transport from [modbus] Modbus TCP robust against fragmentation/segmentation and more clear error message when all bytes are not received with Modbus/RTU by ssalonen · Pull Request #2315 · openhab/openhab-core · GitHub (see the download link)?

It will not fix your issue but instead of DEBUG level log message ("awaited 4 bytes, but received 2") & issueing CRC errors, error more loudly logged about the missing bytes.

This should help people in the future as well

1 Like

I am now on OH3.0.2

  • I uninstalled modbus binding
  • put the jar in /usr/share/openhab/addons,
    but get errors:
2021-04-27 08:57:50.587 [INFO ] [org.apache.felix.fileinstall        ] - Installing bundle org.openhab.core.io.transport.modbus / 3.1.0.202104240230
2021-04-27 08:57:50.774 [WARN ] [org.apache.felix.fileinstall        ] - Error while starting bundle: file:/usr/share/openhab/addons/org.openhab.core.io.transport.modbus-3.1.0-SNAPSHOT.jar
org.osgi.framework.BundleException: Could not resolve module: org.openhab.core.io.transport.modbus [234]
  Unresolved requirement: Import-Package: gnu.io; version="[3.12.0,6.0.0)"

Should i upgrade to 3.1.M3?

gnu.io refers to the serial support of openHAB. You can install it e.g. by feature:install openhab-transport-serial. I would expect then the modbus transport installation to work

that made it work:

Tried with same arduino script as above:

with -10024:

2021-04-29 13:27:38.879 [DEBUG] [t.wimpi.modbus.io.ModbusRTUTransport] - Sent: 07 03 00 00 00 04 44 6f 
2021-04-29 13:27:38.900 [DEBUG] [t.wimpi.modbus.io.ModbusRTUTransport] - Response (CRC OK): 07 03 08 00 01 00 02 00 03 d8 d8 48 05 
2021-04-29 13:28:09.049 [INFO ] [openhab.event.ThingStatusInfoEvent  ] - Thing 'modbus:data:Arduino9:poll9:ArdId9Pin3' updated: ONLINE
2021-04-29 13:28:09.051 [INFO ] [openhab.event.ItemStateEvent        ] - Item 'item3' updated to -10024

with -10004:

2021-04-29 13:39:14.639 [DEBUG] [t.wimpi.modbus.io.ModbusRTUTransport] - Sent: 07 03 00 00 00 04 44 6f 
2021-04-29 13:39:19.862 [DEBUG] [t.wimpi.modbus.io.ModbusRTUTransport] - Last request: 07 03 00 00 00 04 44 6f
2021-04-29 13:39:19.865 [DEBUG] [t.wimpi.modbus.io.ModbusRTUTransport] - failed to read: getResponse serial port exception
2021-04-29 13:39:19.867 [WARN ] [rt.modbus.internal.ModbusManagerImpl] - Try 1 out of 5 failed when executing request (ModbusReadRequestBlueprint [slaveId=7, functionCode=READ_MULTIPLE_REGISTERS, start=0, length=4, maxTries=5]). Will try again soon. Error was I/O error, so reseting the connection. Error details: net.wimpi.modbus.ModbusIOException I/O exception: IOException getResponse serial port exception [operation ID 73a108fe-0d7f-41cc-9932-5629a2ead15e]

with disconnected

2021-04-29 13:41:06.260 [DEBUG] [t.wimpi.modbus.io.ModbusRTUTransport] - Sent: 07 03 00 00 00 04 44 6f 
2021-04-29 13:41:11.266 [DEBUG] [t.wimpi.modbus.io.ModbusRTUTransport] - Last request: 07 03 00 00 00 04 44 6f
2021-04-29 13:41:11.268 [DEBUG] [t.wimpi.modbus.io.ModbusRTUTransport] - failed to read: Error reading response (EOF)
2021-04-29 13:41:11.270 [WARN ] [rt.modbus.internal.ModbusManagerImpl] - Try 1 out of 5 failed when executing request (ModbusReadRequestBlueprint [slaveId=7, functionCode=READ_MULTIPLE_REGISTERS, start=0, length=4, maxTries=5]). Will try again soon. Error was I/O error, so reseting the connection. Error details: net.wimpi.modbus.ModbusIOException I/O exception: IOException Error reading response (EOF) [operation ID f4d77860-2621-4739-aa63-74a11ce7c231]

Thank you! The error message is quite obfuscated, not as intended. Something to fix still :sweat_smile:

1 Like

Hi, just upgrade from 2.5.9 to 3 and now on snapshot to fix the locking problem (Thanks @ssalonen ) and am wondering if my CRC error issue is related to this subject. (On 2.5.9 had occasionally 1 of 3 try failed but never 2nd or 3rd).

Using a PI4, serial USB interface (Cheap Chinese), CAT5 cable with co-located 3xEastron SDM630, 2x SDM120 and about 10m away a WP8028ADAM. The SDM’s poll at 60,000ms and the WP8028 at 500ms. All ModBUS timings are OpenHAB default. The SDM630 & WP8028ADAM have zero errors. Both SDM120 however both have equal amount of excessive CRC resulting in approx. 100 OFFLINE (Est. 5% of the poll’s) events per 24hours with error as follows:

2021-05-26 01:47:06.886 [WARN ] [rt.modbus.internal.ModbusManagerImpl] - Try 1 out of 3 failed when executing request (ModbusReadRequestBlueprint [slaveId=53, functionCode=READ_INPUT_REGISTERS, start=0, length=80, maxTries=3]). Will try again soon. Error was I/O error, so reseting the connection. Error details: net.wimpi.modbus.ModbusIOException I/O exception: IOException CRC Error in received frame. 163 bytes of payload (35 04 a0 43 69 99 9a 00 00 00 00 00 00 00 00 3e 00 00 00 00 00 00 00 00 00 00 00 41 c8 cc cd 00 00 00 00 00 00 00 00 41 cf 8d ae 00 00 00 00 00 00 00 00 40 d0 00 00 00 00 00 00 00 00 00 00 3f 77 c6 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 40 00 00 00 42 47 cc cd 45 29 9d 3f 00 00 00 00 43 35 15 81 43 04 e6 e9 ) with invalid CRC of  [operation ID 091861e7-472d-49e7-b8b2-b18c83bf6653]
2021-05-26 01:47:07.286 [WARN ] [rt.modbus.internal.ModbusManagerImpl] - Try 2 out of 3 failed when executing request (ModbusReadRequestBlueprint [slaveId=53, functionCode=READ_INPUT_REGISTERS, start=0, length=80, maxTries=3]). Will try again soon. Error was I/O error, so reseting the connection. Error details: net.wimpi.modbus.ModbusIOException I/O exception: IOException CRC Error in received frame. 163 bytes of payload (35 04 a0 43 69 99 9a 00 00 00 00 00 00 00 00 3e 00 00 00 00 00 00 00 00 00 00 00 41 c8 cc cd 00 00 00 00 00 00 00 00 41 cf 8d ae 00 00 00 00 00 00 00 00 40 d0 00 00 00 00 00 00 00 00 00 00 3f 77 c6 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 42 47 cc cd 45 29 9d 3f 00 00 00 00 43 35 15 81 43 04 e6 e9 ) with invalid CRC of  [operation ID 091861e7-472d-49e7-b8b2-b18c83bf6653]
2021-05-26 01:47:07.689 [ERROR] [rt.modbus.internal.ModbusManagerImpl] - Last try 3 failed when executing request (ModbusReadRequestBlueprint [slaveId=53, functionCode=READ_INPUT_REGISTERS, start=0, length=80, maxTries=3]). Aborting. Error was I/O error, so reseting the connection. Error details: net.wimpi.modbus.ModbusIOException I/O exception: IOException CRC Error in received frame. 163 bytes of payload (35 04 a0 43 69 99 9a 00 00 00 00 00 00 00 00 3e 00 00 00 00 00 00 00 00 00 00 00 41 c8 cc cd 00 00 00 00 00 00 00 00 41 cf 8d ae 00 00 00 00 00 00 00 00 40 d0 00 00 00 00 00 00 00 00 00 00 3f 77 c6 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 40 00 00 00 00 42 48 00 00 45 29 9d 3f 00 00 00 00 43 35 15 81 43 04 e6 e9 ) with invalid CRC of  [operation ID 091861e7-472d-49e7-b8b2-b18c83bf6653]
20

If my issue is related and I can assist let me know.

There are actually small differences in the data read in each message. Can’t read too much into that - may indicate noise on the line, may just be that source data has actually changed in a few milliseconds.

But it does allow for a noise problem, particularly as we’re obviously in amongst power cabling here. Different devices may be more or less sensitive about noise, different transfer lengths give more or less opportunity for noise or wandering bitrate timing. All boils down to “not easy to pin down.”

You’ve not mentioned termination. The USB adaptor probably has 120R built in, and the other devices none, that often suffices for short runs. Multi slaves and 10m runs is getting beyond “short” though, I would recommend a remote terminator as well.

Ultimately you might need to use shielded cable around power devices.

Or of course the SDM120 might just be pants. It’s not unheard of for any given design to be a bit ropy at a particular line speed, a change to another baud rate can help (but is a pain because everyone has to be changed)

The USB adapter has indeed a terminator, we tried with and without a 120R at the end but no difference. The cable is however unshielded so let’s see if we can improve that.

Did find one remark googling in regards to the SDM120 unrelated to OH where they changed to a SDM230 due to CRC errors so that would be an other option.

In any case thanks for your feedback and we go back to the cable to see if we can find some improvement there.

@ssalonen Hi, is this resolved in build 2447 snapshot 3.20? i have the same problems upgrading from 2.5 and tried also moving to snapshot tree, but keep having errors.
thanks
CP2102, SDM120 and 20 cm of cable.

@execcr this is a long thread with multiple things discussed. What issue are you referring specifically?

Sorry, didn’t want to hijack this thread. I was tired when I wrote, please excuse me.
I’m referring this pull request: [modbus] Workaround for nrjavaserial issues: do not disconnect serial by ssalonen · Pull Request #2272 · openhab/openhab-core · GitHub

I get excessive retry on modbus. I’ve only a SDM120 to reader in the same electrical panel. With OH 2.5 was good, with OH 3.2 snaphost (latest as I write) i had to cut the cable to 20 cm and slow to 2400 bps to no get the poller offline.

That workaround is indeed in 3.1.0 release. You need to have the updated openHAB (core), simple binding update is not enough.

The workaround is not helping to reduce errors actually, so you can indeed still get errors. The workaround was introduced to avoid getting “stuck” and side-step a deadlock issue in the serial library used by openhab (Serial ports getting blocked after some re-connecting · Issue #1842 · openhab/openhab-core · GitHub). This deadlock issue was a regression introduced in openhab3

This real root cause might get eventually fixed in future versions of openHAB, after which we can remove the workaround.

What is reason for the retry? What is the error message?

I get this kind of messages:

2021-04-05 12:48:31.983 [WARN ] [rt.modbus.internal.ModbusManagerImpl] - Try 4 out of 5 failed when executing request (ModbusReadRequestBlueprint [slaveId=6, functionCode=READ_MULTIPLE_REGISTERS, start=0, length=29, maxTries=5]). Will try again soon. Error was I/O error, so reseting the connection. Error details: net.wimpi.modbus.ModbusIOException I/O exception: IOException Error reading response (EOF) [operation ID e47020e4-e64a-4a16-96cd-149bb57e5f14]

(I took it from the first post because I can’t get it). But I didn’t see any message in the log telling about CRC error, only EOF errors.

Now, it works @2400bps with openhab snapshot 3.2 and I shortened (from 32 cm to about 15 cm) the cable and redoing the connections. Modbus termination is done on the transceiver side. The cable is not shielded due to be very short,

The previous connection worked without any problem with Open Hab 2.5
Thank you

As you can read above, OH2.5 allowed some real errors to pass unnoticed.
OH3 should have both better error detection and reporting.

Let us know when you can share a real message of yours, other people’s aren’t much help.

1 Like

If you are refering to me, i did send all available info.
(I have no news, since i am stuck for now on 3.1.M4, since another vital experimental binding for me (zmartify), hasnt yet been upgraded to OH3.1 (to changed dependensies)