MODBUS Binding with SMA inverter missing lower byte

I think I understand what you’re saying - it’s not a magical multiple of 4 for length=, it’s just that the last register gets TCP fragmented.
(We’d still need to keep to multiple of 2 for 32-bit regs)
It does make more sense than magic lengths, really.

The device is behaving weird by fragmenting, but it is “allowed” to do that.

As I understood this, the binding should deal gracefully with fragmented packets. That’s really a function of the underlying host TCP stack, but I presume the binding should ask for TCP transfers in a certain way to get it.
Looks like this is not happening - I thought we’d previously validated all this, but maybe something got lost in the move from OH2 to OH3.
This is out of my depth, I will summon @ssalonen for comment.

Thanks for your help digging into this, I suspect we might ask you for more yet!

In truth, we could work around that. If length=6 or 8 can’t be used to shuffle the problem away, we can instead make another poller Thing that shuffles it away by starting at a different place. There is nothing to stop us having pollers with overlapping register ranges, and data Things that pick from only one “version” of the same register.
That’s just a circumvention of course.

Actually, quite recently it was found that the Modbus TCP implementation in the underlying Modbus library had a long-standing bug (since openHAB 1.x!!) related to TCP packet fragmentation. In certain situations some bytes are not read properly by the code, leading to CRC errors due to skipped bytes.

The fix has already been published and is waiting for openHAB maintainers to approve and merge it: Changed read to readFully in TCP transports to get rid of fragmented packets by denis-ftc-denisov · Pull Request #11 · openhab/jamod · GitHub
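To illustrate the difference between a single read and a "read fully" loop, here is a minimal Python sketch. It is not the jamod code; `FragmentedStream` and `read_exactly` are hypothetical names, and the frame bytes are made up for the example.

```python
class FragmentedStream:
    """Hypothetical stand-in for a TCP stream: each read() returns at most
    one queued chunk, the way a segmented response can arrive."""

    def __init__(self, chunks):
        self._chunks = [bytes(c) for c in chunks]

    def read(self, n):
        if not self._chunks:
            return b""  # peer closed the connection
        chunk = self._chunks.pop(0)
        head, tail = chunk[:n], chunk[n:]
        if tail:
            self._chunks.insert(0, tail)  # keep unread bytes for the next call
        return head


def read_exactly(stream, n):
    """Keep reading until n bytes have arrived (the idea behind readFully)."""
    buf = bytearray()
    while len(buf) < n:
        part = stream.read(n - len(buf))
        if not part:
            raise EOFError(f"stream closed after {len(buf)} of {n} bytes")
        buf += part
    return bytes(buf)


# A 13-byte Modbus TCP response (made-up values) arriving as 12 + 1 bytes.
frame = bytes.fromhex("000100000007030304000004d2")
single = FragmentedStream([frame[:12], frame[12:]]).read(13)
full = read_exactly(FragmentedStream([frame[:12], frame[12:]]), 13)
print(len(single), len(full))
```

The single `read(13)` comes back short because only the first chunk is available, while the loop keeps asking until the requested count is met.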

Golly now everything makes sense :crazy_face:

I must have imagined previously digging into fragmentation.
(We wouldn’t get CRC errors in “pure” Modbus-TCP, it passes unnoticed, that shows up only in ser2net use I think)

Sure, no problem, I need to keep my brains working :joy:

Thank you Ssalonen, I will have a look at Github.

Yeah, I think that seems to be the case, based on what I have observed. But I can imagine that mileage might vary depending on platform details (the socket defaults of that particular OS).

Continuing the discussion from MODBUS Binding with SMA inverter missing lower byte:

I have done some additional tests to investigate the “missing lower byte” issue.

Test conditions:

  • To exclude excessive load as a possible cause of problems, I used only one polling Thing (Current Power register 30775, length = 2, polling rate set to 60000 ms)

  • Wireshark was used for data capturing with its “packets reassembly feature” turned off

  • For test 1 and test 3, Wireshark was running on the Raspberry Pi (host of OPENHAB); the display was redirected (PuTTY / Xming) to an INTEL-NUC Windows computer.

  • For test 2, Wireshark and the Radzio! Modbus Master Simulator were running on the INTEL-NUC Windows computer.

Tests performed:

  • SunnyBoy SB2500TLST-21 inverter queried from OPENHAB

  • SunnyBoy SB2500TLST-21 inverter queried from the Radzio! Modbus Master Simulator

  • SunnyBoy SB1.5-1VL-40 inverter queried from OPENHAB

Conclusion:

  • the behaviour of SunnyBoy SB2500TLST-21 inverter is causing the “Missing Byte” problem as already stated before by Rossko57
  • there is no fragmentation at the IP level

For details see test results below.

1. Test result: SunnyBoy SB2500TLST-21 inverter queried from OPENHAB

Response packet sent by SMA inverter in reply to the query.
Wireshark capture, packet length: 66 bytes

o   Network interface:                    14 bytes
o   IP header:                            20 bytes
o   TCP header:                           20 bytes
o   Payload (MODBUS response message):    12 bytes
        Transaction ID           2 bytes
        Protocol ID              2 bytes
        Length                   2 bytes  (Length = 7)
        Unit ID                  1 byte
        Function Code            1 byte
        Byte Count               1 byte   (Byte Count = 4)
        MSByte register 30775    1 byte
        LSByte register 30775    1 byte
        MSByte register 30776    1 byte   (LSByte register 30776 missing!)

Wireshark reports:
Packet size limited during capture: Modbus truncated
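The arithmetic behind "Modbus truncated" can be sketched in a few lines of Python. This is only an illustration: the function names and the sample bytes are made up, and the only thing taken from the capture is the layout (Length = 7, 12 payload bytes received).

```python
import struct

MBAP_HEADER_LEN = 7  # transaction ID (2) + protocol ID (2) + length (2) + unit ID (1)


def expected_frame_size(data):
    """Return the total Modbus TCP frame size announced by the MBAP header."""
    if len(data) < MBAP_HEADER_LEN:
        raise ValueError("need at least 7 bytes to read the MBAP header")
    _transaction_id, _protocol_id, length, _unit_id = struct.unpack(">HHHB", data[:7])
    # The length field counts every byte after itself, including the unit ID.
    return 6 + length


def missing_bytes(data):
    """How many bytes are still outstanding for the announced frame."""
    return expected_frame_size(data) - len(data)


# 12 payload bytes as in the first capture; the values themselves are invented.
captured = bytes.fromhex("000100000007030304000004")
print(expected_frame_size(captured), missing_bytes(captured))
```

With Length = 7 the full frame is 6 + 7 = 13 bytes, so a 12-byte payload is one byte short, which matches the missing LSByte of register 30776.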

The response packet is followed by a [PUSH] packet sent by the SMA inverter.
Wireshark capture, packet length: 60 bytes

o   Network interface:    14 bytes
o   IP header:            20 bytes
o   TCP header:           20 bytes
o   Payload:               6 bytes
        LSByte register 30776    1 byte
        Padding bytes            5 bytes

The missing LSByte of register 30776 in the response message is not caused by fragmentation at the IP layer since:

  • The MF flag (More Fragments to come) in the IP Header of the response message is not set

  • The fragment offset value in the IP Header of the [PUSH] packet is zero
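The two checks above can be reproduced from a raw IPv4 header. The following Python sketch decodes the flags and fragment offset field; the function names and the sample headers (addresses, ID, zero checksum) are invented for illustration and are not taken from the capture.

```python
import struct


def ip_fragment_info(ip_header):
    """Decode the flags and fragment offset from a raw IPv4 header."""
    (flags_and_offset,) = struct.unpack(">H", ip_header[6:8])
    return {
        "dont_fragment": bool(flags_and_offset & 0x4000),
        "more_fragments": bool(flags_and_offset & 0x2000),
        "fragment_offset": (flags_and_offset & 0x1FFF) * 8,  # converted to bytes
    }


def is_ip_fragment(ip_header):
    """A packet is part of a fragmented datagram if MF is set or offset > 0."""
    info = ip_fragment_info(ip_header)
    return info["more_fragments"] or info["fragment_offset"] > 0


def make_header(flags_and_offset):
    """Build a made-up 20-byte IPv4 header (checksum left at zero)."""
    return struct.pack(
        ">BBHHHBBH4s4s",
        0x45, 0, 52, 0x1234, flags_and_offset, 64, 6, 0,
        bytes([192, 168, 1, 10]), bytes([192, 168, 1, 20]),
    )


unfragmented = make_header(0x4000)    # DF set, MF clear, offset 0
first_fragment = make_header(0x2000)  # MF set, offset 0
print(is_ip_fragment(unfragmented), is_ip_fragment(first_fragment))
```

A packet with MF clear and offset zero is a complete IP datagram, so a split seen between two such packets happened above the IP layer.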

It looks like OPENHAB is not processing the [PUSH] packet containing the LSByte of register 30776

Due to the missing LSByte, the plot of the actual current power is not a smooth line.

This missing LSByte is due to behavior of the SunnyBoy SB2500TLST-21.
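To show why a lost low byte makes the power curve jumpy, here is a small Python sketch. It assumes, purely for illustration, that the missing byte ends up as zero (what the binding really does with it is not established here); the readings and function names are made up.

```python
import struct


def decode_int32(registers):
    """Combine two 16-bit registers (high word first) into a signed 32-bit value."""
    high, low = registers
    return struct.unpack(">i", struct.pack(">HH", high, low))[0]


def drop_low_byte(registers):
    """Hypothetical model of the fault: the last byte is replaced by zero."""
    high, low = registers
    return (high, low & 0xFF00)


# Invented register pairs standing in for registers 30775 and 30776.
readings = [(0, 1234), (0, 1250), (0, 1279), (0, 1300)]
exact = [decode_int32(r) for r in readings]
truncated = [decode_int32(drop_low_byte(r)) for r in readings]
print(exact)
print(truncated)
```

Under that assumption the decoded value can only move in steps of 256, so small changes in power disappear and larger ones show up as jumps.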

I have no idea whether or not OPENHAB should be able to deal with this.

2. Test result: SunnyBoy SB2500TLST-21 inverter queried from the Radzio! Modbus Master Simulator

As with test 1, the LSByte of register 30776 is not contained in the response message but sent in a subsequent PUSH packet.

However, the Radzio! Modbus Master Simulator includes the LSByte, received with the PUSH packet, in the result.

3. Test result: SunnyBoy SB1.5-1VL-40 inverter queried from OPENHAB

Response packet sent by SMA inverter in reply to the query:
Wireshark capture, packet length: 67 bytes

o   Network interface:                    14 bytes
o   IP header:                            20 bytes
o   TCP header:                           20 bytes
o   Payload (MODBUS response message):    13 bytes
        Transaction ID           2 bytes
        Protocol ID              2 bytes
        Length                   2 bytes  (Length = 7)
        Unit ID                  1 byte
        Function Code            1 byte
        Byte Count               1 byte   (Byte Count = 4)
        MSByte register 30775    1 byte
        LSByte register 30775    1 byte
        MSByte register 30776    1 byte
        LSByte register 30776    1 byte

All queried data is included in the response message.
The values are as expected, as is the presentation in OPENHAB.

Typo :astonished:
The payload of the response message of test 3 must be 13 instead of 12

Edit: Post updated accordingly

This SMA is a naughty box, isn’t it.
Well done pinning it down.

No reason why OH should, given the breaking of TCP fragmentation rules.

It could however; the Modbus byte count is present so we can detect when there’s “more to come”.
Presumably this is what Radzio does, just waits until byte count is satisfied.

Providing we can play nicely with however the host TCP layer is handling this stuff, we have only to wait and see if the rest turns up. Whether the binding needs to interact with TCP to get the next “unexpected” data, or can just sit and wait, is way above my pay grade.
I guess the binding will get the padding as well, and need to discard it using the byte count.
As a dedicated diagnostic tool, Radzio may itself be taking some shortcuts that we should not follow in a shared production environment.

Obviously there would need to be a timeout in case the extra chunk never came.
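The "wait until the count is satisfied, with a timeout" idea can be sketched in Python as below. This is not how the binding or Radzio is implemented; `read_modbus_tcp_frame` is a hypothetical helper, the frame bytes are made up, and a local socket pair stands in for the inverter connection. The sketch uses the MBAP length field to know how many bytes to wait for, and the timeout applies to each receive call rather than to the whole frame.

```python
import socket
import struct


def read_modbus_tcp_frame(sock, timeout=1.0):
    """Collect one Modbus TCP frame, waiting up to `timeout` seconds
    per receive call for late segments."""
    sock.settimeout(timeout)
    buf = bytearray()
    needed = 6  # transaction ID, protocol ID and length field
    while len(buf) < needed:
        chunk = sock.recv(needed - len(buf))  # raises socket.timeout if nothing arrives
        if not chunk:
            raise ConnectionError("connection closed mid-frame")
        buf += chunk
        if len(buf) == 6:
            # Header prefix complete: the length field says how much follows.
            (length,) = struct.unpack(">H", buf[4:6])
            needed = 6 + length
    return bytes(buf)


# Made-up 13-byte response sent as 12 + 1 bytes over a local socket pair.
frame = bytes.fromhex("000100000007030304000004d2")
sender, receiver = socket.socketpair()
sender.sendall(frame[:12])
sender.sendall(frame[12:])
received = read_modbus_tcp_frame(receiver)
sender.close()
receiver.close()
print(received == frame)
```

Because the helper never requests more bytes than the header announces, anything that trails the frame is simply left unread, and a late byte that never arrives ends in a timeout instead of a silently short frame.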

It’s a bit of a decision about if we should be accommodating a “rule-breaking broken product”, which they’re in no hurry to fix, when there is a low-pain workaround which so far as I know always works.

We do need to make sure the binding still deals with “proper” fragmentation in other circumstances, again that depends on the interaction with host TCP layer. I think Sami is on top of that already.

What an excellent analysis.

Even though it is not TCP fragmentation, I would like you to test out the transport bundle with this fix: Upgrade jamod (to get rid of case with fragmented packets) by denis-ftc-denisov · Pull Request #2284 · openhab/openhab-core · GitHub

The fix was general in nature and actually might help here as well.

Edit: for what it’s worth, I think what we have here is TCP segmentation, which is very similar to IP fragmentation but still different, and shows up differently in Wireshark: Difference between IP fragmentation and TCP segmentation - Routing & Switching - NetworkLessons.com Community Forum

I am quite sure that the fix mentioned above actually is the missing piece of the puzzle. TCP is in the end “stream based”, so the application layer, such as Modbus parsing in Java, only sees a stream of bytes, not individual packets.

The fix actually just means we are not too impatient and wait for the bytes we need. On most systems the OS TCP buffering conveniently masked this issue for a long time, as all the bytes of a Modbus packet appear at once for the application layer.

Hi Ssalonen,
Thanks for the compliment, I am happy to contribute to the forum.
After the weekend I will give the transport bundle fix a try.

Have a nice weekend,

Egon van Os, aka sonave

Testing in your situation would be very useful; it’s just possible that fixing OH to handle “proper” frag could mess up the current SMA workaround by throwing an error on a truncated message.

I did a random check on the Wireshark logfile and found that the delay between the response packet and the PUSH packet is approx. 0.5 milliseconds.

Is this fix available in OPENHAB 3.1 Milestone 3?
I have another PI available to do a fresh OPENHAB 3.1 Milestone 3 install.

Or should I test the fix on the OPENHAB 3.0 install?

@sonave you can compare timestamps to know what is included in which milestone.

From github

@wborn wborn merged commit 2219705 into openhab:main 21 hours ago

Milestone 3 is 2 weeks old, and therefore does not contain these new fixes.

You would have to install the 3.1 snapshot of the modbus transport bundle; you probably want a snapshot of the modbus binding as well, to ensure compatibility.

It should be compatible with openhab 3.0 release to my knowledge.

@ssalonen
I have tried to test the updated transport bundle but I am not sure about my update procedure.
Here are the steps I carried out.

Step 1
A fresh install of OH 3.1.0 M3

Step 2
Before installing the MODBUS binding via Paper UI

openhab> bundle:list -s | grep transport.modbus

no instances found !

Step 3
Install MODBUS binding via Paper UI

openhab> bundle:list -s | grep transport.modbus

262 │ Active │  80 │ 5.2.1                   │ nrjavaserial
263 │ Active │  80 │ 3.7.2                   │ Apache Commons Net
264 │ Active │  80 │ 3.1.0.M3                │ openHAB Add-ons :: Bundles :: Modbus Binding
265 │ Active │  80 │ 3.1.0.M3                │ openHAB Add-ons :: Bundles :: E3DC Modbus Binding
266 │ Active │  80 │ 3.1.0.M3                │ openHAB Add-ons :: Bundles :: HeliosEasyControls Binding
267 │ Active │  80 │ 3.1.0.M3                │ openHAB Add-ons :: Bundles :: Modbus SBC Binding
268 │ Active │  80 │ 3.1.0.M3                │ openHAB Add-ons :: Bundles :: StiebelEltron Bundle
269 │ Active │  80 │ 3.1.0.M3                │ openHAB Add-ons :: Bundles :: Studer Binding
270 │ Active │  80 │ 3.1.0.M3                │ openHAB Add-ons :: Bundles :: SunSpec Bundle
271 │ Active │  80 │ 3.1.0.M3                │ openHAB Core :: Bundles :: Configuration USB-Serial Discovery
272 │ Active │  80 │ 3.1.0.M3                │ openHAB Core :: Bundles :: Configuration USB-Serial Discovery for Linux using sysfs
273 │ Active │  80 │ 3.1.0.M3                │ openHAB Core :: Bundles :: Configuration Serial
274 │ Active │  80 │ 3.1.0.M3                │ openHAB Core :: Bundles :: Modbus Transport
275 │ Active │  80 │ 3.1.0.M3                │ openHAB Core :: Bundles :: Serial Transport
276 │ Active │  80 │ 3.1.0.M3                │ openHAB Core :: Bundles :: Serial Transport for RXTX
277 │ Active │  80 │ 3.1.0.M3                │ openHAB Core :: Bundles :: Serial Transport for RFC2217 

Step 4
Uninstall Modbus Transport Bundle

openhab> bundle:uninstall 274
openhab> bundle:list

264 │ Waiting │  80 │ 3.1.0.M3                │ openHAB Add-ons :: Bundles :: Modbus Binding
265 │ Active  │  80 │ 3.1.0.M3                │ openHAB Add-ons :: Bundles :: E3DC Modbus Binding
266 │ Active  │  80 │ 3.1.0.M3                │ openHAB Add-ons :: Bundles :: HeliosEasyControls Binding
267 │ Active  │  80 │ 3.1.0.M3                │ openHAB Add-ons :: Bundles :: Modbus SBC Binding
268 │ Active  │  80 │ 3.1.0.M3                │ openHAB Add-ons :: Bundles :: StiebelEltron Bundle
269 │ Active  │  80 │ 3.1.0.M3                │ openHAB Add-ons :: Bundles :: Studer Binding
270 │ Active  │  80 │ 3.1.0.M3                │ openHAB Add-ons :: Bundles :: SunSpec Bundle
271 │ Active  │  80 │ 3.1.0.M3                │ openHAB Core :: Bundles :: Configuration USB-Serial Discovery
272 │ Active  │  80 │ 3.1.0.M3                │ openHAB Core :: Bundles :: Configuration USB-Serial Discovery for Linux using sysf
273 │ Active  │  80 │ 3.1.0.M3                │ openHAB Core :: Bundles :: Configuration Serial
275 │ Active  │  80 │ 3.1.0.M3                │ openHAB Core :: Bundles :: Serial Transport
276 │ Active  │  80 │ 3.1.0.M3                │ openHAB Core :: Bundles :: Serial Transport for RXTX
277 │ Active  │  80 │ 3.1.0.M3                │ openHAB Core :: Bundles :: Serial Transport for RFC2217

Step 5
File org.openhab.core.io.transport.modbus-3.1.0-SNAPSHOT.jar copied to /usr/share/openhab/addons folder

sudo reboot

Step 6
On completion of the reboot

openhab> bundle:list

275 │ Active │  80 │ 3.1.0.M3                │ openHAB Core :: Bundles :: Serial Transport
276 │ Active │  80 │ 3.1.0.M3                │ openHAB Core :: Bundles :: Serial Transport for RXTX
277 │ Active │  80 │ 3.1.0.M3                │ openHAB Core :: Bundles :: Serial Transport for RFC2217
278 │ Active │  80 │ 3.1.0.202103161959      │ openHAB Core :: Bundles :: Modbus Transport

The ModbusTransport Bundle has been updated and is active

Step 7

Next I queried the SMA inverter (Current Power, register 30775, length=2, 32 bit signed integer)
Unfortunately the result is as before :woozy_face:

Where have I gone wrong?

Hmm, this is still too old, check the timestamp (202103161959)

Please try with the jar from here: Changed read to readFully in TCP transports to get rid of fragmented packets by denis-ftc-denisov · Pull Request #11 · openhab/jamod · GitHub
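Checking the timestamp in a bundle version like the one in the listing above can be sketched in Python. This assumes the last dot-separated part of a snapshot version is a build time in yyyyMMddHHmm form, which is an assumption based on how 3.1.0.202103161959 looks; the function names are hypothetical, and the merge time has to be supplied by whoever runs the comparison.

```python
from datetime import datetime


def snapshot_build_time(bundle_version):
    """Extract the build timestamp from a version such as 3.1.0.202103161959.

    Raises ValueError for versions without a timestamp qualifier (e.g. 3.1.0.M3).
    """
    qualifier = bundle_version.rsplit(".", 1)[-1]
    return datetime.strptime(qualifier, "%Y%m%d%H%M")


def built_after(bundle_version, merge_time):
    """True if the bundle was built after the given merge time."""
    return snapshot_build_time(bundle_version) > merge_time


print(snapshot_build_time("3.1.0.202103161959"))
```

A bundle can only contain a fix if it was built after the fix was merged, unless it was built separately from a branch that already carried the change.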

That’s the one I have used, I downloaded it one more time and repeated all steps.
Same time stamp, same test result

279 │ Active │ 80 │ 3.1.0.202103161959 │

Ah, of course! It makes sense to have an older timestamp, as this was self-built by the developer who introduced the fix…

That’s a bummer then, it looks like the fix is not helping here :frowning:

Can you see the wrong bytes received if you enable verbose logging?

I had a look at the original fix again. It looks like while RTU-over-TCP has been fixed, the wrong method was fixed for pure TCP… An additional fix needs to be introduced.