MODBUS Binding with SMA inverter missing lower byte

Hi,

I started using openHAB 3 a few weeks ago and have been able to capture data from my SMA SB2500 and SMA SB1500 inverters using the MODBUS binding.
It all seems to work except for the accuracy of the data from the SMA SB2500 inverter.

System : Raspberry PI 4, openHAB version: OH3*

The Current Power value from the SMA SB2500 only changes in steps of 256, so it looks like the LSByte is missing.

SMA SB2500 data

When I query the SMA SB2500 from a Modbus Master Simulator the received data is OK.

I have Wireshark running on the Raspberry PI and captured the following data:

According to the log, the content of register 30776 = 0x0491.
I noticed that the TCP payload has been reassembled from

  • frame 209389, payload 12 bytes and
  • frame 209391, payload 1 byte which is the LSByte of register 30776

I am not an experienced Wireshark user, but I believe that the reassembly is done by Wireshark for logging purposes and has to be done by the receiver as well.

Every now and then the result seems to be OK; see the picture below and the attached event.logging and openhab.logging.

Any help with more in-depth logging at the receiver side is much appreciated!

Kind Regards,

Egon

Mostly Modbus works in 16-bit words, so that would be surprising.
Could you actually give some numbers, got this, expected that?
I don’t really know what you’re pointing to here.

My immediate suspicion is reading a float value as an integer, but I don’t think that’s what is going on.

Great! But what did you have configured, what were the results?

I think you’re saying the TCP from the slave is fragmented. That’s unusual, but normal TCP business.
Any reassembly of fragmented packets is low-level and up to your host, before the packet gets passed to openHAB.

We don’t know what “ok” is, do you mean non-zero? Trying to reconcile this with “multiple of 256”

Assuming that, looking at the binding trace log (I hates screenshots!), at that time there’s a what 50% success rate?
It should be easy to compare the ‘zero reads’ with more of your wireshark logs to see what was sent.

However … reviewing previous Sunny Boy threads, I am reminded that some of them at least are very badly behaved, and only send correct data if you poll with length=4 or a multiple.
I’d revise your secret poller Things with that, and restart the binding to ensure only most recent changes in use.

Hi Rossko57,

First of all thanks for your response.
I will come back to you later today and I will avoid screenshots :wink:

sonave

Hi rossko57,

I first tried your suggestion to increase the poll length to 4 and I am happy to say that I get correct readings now.

However, I think that there is an underlying “problem”.

The current power of the SB2500 is 1150W.

A query of register 30775 with length=2 gives the following result (Wireshark):

  • register 30775 : 0x0000
  • register 30776 : 0x037e (=1150W)

A query of register 30775 with length=4 gives the following result (Wireshark):

  • register 30775 : 0x0000
  • register 30776 : 0x037e
  • register 30777 : 0x0000 (duplicate of 30775)
  • register 30778 : 0x037e (duplicate of 30776)

If the last byte of queried data “gets lost” then

  • in case of length = 2, register 30776 will be affected:
    register 30775 : 0x0000
    register 30776 : 0x0300 (=1024W)

  • in case of length = 4, register 30778 will be affected:
    register 30775 : 0x0000
    register 30776 : 0x037e (=1150W)
    register 30777 : 0x0000
    register 30778 : 0x0300 (=1024W)

The binding uses registers 30775, 30776 which are not affected now by the “lost byte problem”

Kind Regards,

sonave

typo: the second query of the length of 30775 has length=4 :astonished:

What makes you think these are duplicates? If the Sunny Boy is behaving, which we think it does when length=4, then this is a different register with a different meaning. What do the makers claim to show in 30777/8?

Well, does it or doesn’t it? You can easily make another data Thing and examine the contents of this register too.

previous post

values 0x037e should be 0x047e

  • register 30775 : 0x0000
  • register 30776 : 0x047e (=1150W)

Duplicate of 30775

According to SMA_Modbus-TI-de-15 | Version 1.5:

30775/30776 Active power across all line conductors, in W, S32
30777/30778 Active power on line conductor L1, in W, S32
30779/30780 Active power on line conductor L2, in W, S32
30781/30782 Active power on line conductor L3, in W, S32
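
As a rough illustration of how an S32 value is spread over a register pair, here is a minimal Python sketch. It assumes the high word comes first (register 30775, then 30776), which matches the values quoted in this thread; the function name `decode_s32` is made up for the example.

```python
import struct


def decode_s32(high_word: int, low_word: int) -> int:
    """Combine two 16-bit registers (high word first) into a signed 32-bit value."""
    raw = struct.pack(">HH", high_word, low_word)
    return struct.unpack(">i", raw)[0]


# Registers 30775/30776 as quoted in the thread: 0x0000, 0x047e
print(decode_s32(0x0000, 0x047E))  # 1150
```

With this layout the low word carries everything below 65536 W, so for a small inverter the first register of the pair is normally 0x0000.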

Since I have a single inverter the values in 30775/30776 are identical to the values in 30777/30778.
Ok, not a duplicate … English is not my native language !

Last byte gets lost

Modbus Master Simulator query register of 30777 with length=2

  • register 30777 : 0x0000
  • register 30778 : 0x047e (=1150W)
    Wireshark:
  • register 30777 : 0x0000
  • register 30778 : 0x047e (=1150W)
    Current Power SB2500 thing:
  • reading 1024 = 0x400
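
The 1150 versus 1024 difference is what one would expect if the low byte of the last register is read as zero. A small Python sketch of that arithmetic, assuming the missing byte really ends up as 0x00 on the receiving side (the helper name `drop_low_byte` is hypothetical):

```python
def drop_low_byte(value: int) -> int:
    """Return the value a reader would see if the least significant byte arrived as zero."""
    return value & ~0xFF


# 0x047e (1150 W) and 0x0491 (1169 W) both collapse to 0x0400 (1024 W)
for watts in (0x047E, 0x0491, 1279, 1280):
    print(watts, "->", drop_low_byte(watts))
```

Every reading between 1024 and 1279 collapses to 1024 under this assumption, which would explain why the value only moves in steps of 256.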

I think that using a length of 4 instead of 2 is a bypass, not a solution.
I could be happy since it “solves” my problem.
As a retired senior system test engineer sitting back is not an option.

To be continued !

Yes. The SMA device is not behaving nicely.

I continued the investigation of the “lost byte problem”.

My power plant has 2 inverters:

    • SB 1.5-1VL-40
    • SB2500TLST-21

The “lost byte problem” shows up with the SB2500TLST-21 and not with the SB 1.5-1VL-40.

Without going into too much detail:

SB 1.5-1VL-40

If register 30775 is queried with length = 2, all 13 bytes are transmitted in one MODBUS/TCP RESPONSE frame:

    • Transaction ID 2 bytes
    • Protocol ID 2 bytes
    • Length 2 bytes
    • Unit ID 1 byte
    • Function code 1 byte
    • Byte count 1 byte
    • Register 30775 2 bytes
    • Register 30776 2 bytes
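
The 13-byte layout listed above can be unpacked as in the Python sketch below. The sample bytes are constructed for illustration, not taken from the capture: the transaction ID, unit ID 3 and function code 4 are placeholders, and `parse_read_response` is a made-up name.

```python
import struct


def parse_read_response(frame: bytes) -> dict:
    """Split a Modbus/TCP read-registers response into header fields and register values."""
    if len(frame) < 9:
        raise ValueError(f"short frame: need at least 9 header bytes, got {len(frame)}")
    transaction_id, protocol_id, length, unit_id, function_code, byte_count = struct.unpack(
        ">HHHBBB", frame[:9]
    )
    data = frame[9:9 + byte_count]
    if len(data) < byte_count:
        raise ValueError(f"short frame: expected {byte_count} data bytes, got {len(data)}")
    registers = list(struct.unpack(f">{byte_count // 2}H", data))
    return {
        "transaction_id": transaction_id,
        "protocol_id": protocol_id,
        "length": length,
        "unit_id": unit_id,
        "function_code": function_code,
        "registers": registers,
    }


# Illustrative 13-byte response: Length = 7, Byte Count = 4, registers 0x0000 and 0x047e
sample = bytes.fromhex("0001000000070304040000047e")
print(parse_read_response(sample)["registers"])  # [0, 1150]
```

Feeding only the first 12 bytes to this function raises `ValueError`, because the byte count announces 4 data bytes and only 3 are present.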

SB2500TLST-21

If register 30775 is queried with length = 2, the first 12 bytes are transmitted in a TCP [ACK] frame:

    • Transaction ID 2 bytes
    • Protocol ID 2 bytes
    • Length 2 bytes
    • Unit ID 1 byte
    • Function code 1 byte
    • Byte count 1 byte
    • Register 30775 2 bytes
    • Register 30776 MSByte

And the remaining byte is transmitted in the MODBUS/TCP RESPONSE frame

  • Register 30776 LSByte

SB2500TLST-21

If register 30775 is queried with length = 4, the first 16 bytes are transmitted in a TCP [ACK] frame:

    • Transaction ID 2 bytes
    • Protocol ID 2 bytes
    • Length 2 bytes
    • Unit ID 1 byte
    • Function code 1 byte
    • Byte count 1 byte
    • Register 30775 2 bytes
    • Register 30776 2 bytes
    • Register 30777 2 bytes
    • Register 30778 MSByte

and the remaining 1 byte is transmitted in the MODBUS/TCP RESPONSE frame:

    • Register 30778 LSByte

The result from register 30775 / 30776 is as expected when querying with length = 4.

There is, however, a serious downside of extending the query length.

Instead of register 30776, now register 30778 is affected.

Register 30777 / 30778 hold the Inverters current line 1 power.

This value inside OPENHAB may drop down, but will be restored during the next query of the current line 1 power, there will be just a glitch !

But what if it applies to a register that holds data used to determine whether your shutters should be open or closed ?

At a polling frequency of 1 minute ………. imagine for yourself :grimacing:

I think I understand what you’re saying - it’s not a magical multiple of 4 for length=, it’s just that the last register gets TCP fragmented.
(We’d still need to keep to multiple of 2 for 32-bit regs)
It does make more sense than magic lengths, really.

The device is behaving weird by fragmenting, but it is “allowed” to do that.

As I understood this, the binding should deal gracefully with fragmented packets. That’s really a function of the underlying host TCP stack, but I presume the binding should ask for TCP transfers in a certain way to get it.
Looks like this is not happening - I thought we’d previously validated all this, but maybe something got lost OH2 to 3.
This is out of my depth, I will summon @ssalonen for comment.

Thanks for your help digging into this, I suspect we might ask you for more yet!

In truth, we could work around that. If length=6 or 8 can’t be used to shuffle the problem away, we can instead make another poller Thing that shuffles it away by starting at a different place. Nothing to stop us having pollers with overlapping register ranges, and data Things that pick from only one “version” of the same register.
That’s just a circumvention of course.

Actually quite recently it was found out that TCP Modbus implementation in the underlying modbus library had a long-standing bug (since openHAB 1.x!!) related to tcp packet fragmentation. In certain situations some bytes are not read properly by the code, leading to CRC errors due to skipped bytes.

The fix has already been published and is waiting for the openHAB maintainers to approve and merge it: Changed read to readFully in TCP transports to get rid of fragmented packets by denis-ftc-denisov · Pull Request #11 · openhab/jamod · GitHub
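
To illustrate the difference between a single read and a "read fully" loop, here is a Python analogy. It is not the jamod code (that library is Java); `SegmentedStream` is a made-up stand-in for a socket that delivers a response in two pieces, and the sample bytes are constructed for the example.

```python
class SegmentedStream:
    """Fake stream that never returns more than one segment per read call."""

    def __init__(self, segments):
        self._segments = [bytes(s) for s in segments]

    def read(self, n: int) -> bytes:
        """Return at most n bytes, never crossing a segment boundary."""
        if not self._segments:
            return b""
        head = self._segments[0]
        chunk, rest = head[:n], head[n:]
        if rest:
            self._segments[0] = rest
        else:
            self._segments.pop(0)
        return chunk


def read_fully(stream, n: int) -> bytes:
    """Keep reading until exactly n bytes have arrived or the stream ends."""
    buf = bytearray()
    while len(buf) < n:
        chunk = stream.read(n - len(buf))
        if not chunk:
            raise EOFError(f"stream ended after {len(buf)} of {n} bytes")
        buf += chunk
    return bytes(buf)


# A 13-byte response delivered as 12 + 1 bytes
response = bytes.fromhex("0001000000070304040000047e")
single_read = SegmentedStream([response[:12], response[12:]]).read(13)
full_read = read_fully(SegmentedStream([response[:12], response[12:]]), 13)
print(len(single_read), len(full_read))  # 12 13
```

A single read hands back whatever has arrived so far, so the last byte is silently missing; the loop keeps asking until the requested count is complete.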

Golly now everything makes sense :crazy_face:

I must have fantasized previously digging on fragmentation.
(We wouldn’t get CRC errors in “pure” Modbus-TCP, it passes unnoticed, that shows up only in ser2net use I think)

Sure, no problem, I need to keep my brains working :joy:

Thank you Ssalonen, I will have a look at Github.

Yeah, I think that seems to be the case, based on what I have observed. But I can imagine that mileage might vary depending on platform details (the socket defaults of that particular OS).

Continuing the discussion from MODBUS Binding with SMA inverter missing lower byte:

I have done some additional tests to investigate the “missing lower byte” issue.

Test conditions:

  • To exclude excessive load as a possible cause of problems, I used only one polling Thing (Current Power, register 30775, length = 2, polling rate set to 60000 ms)

  • Wireshark was used for data capturing with its “packets reassembly feature” turned off

  • For test 1 and test 3, Wireshark was running on the Raspberry PI (host of OPENHAB), with the display redirected (PuTTY / Xming) to an Intel NUC Windows computer.

  • For test 2, Wireshark and the Radzio! Modbus Master Simulator were running on the INTEL-NUC Windows computer.

Tests performed:

  • SunnyBoy SB2500TLST-21 inverter queried from OPENHAB

  • SunnyBoy SB2500TLST-21 inverter queried from the Radzio! Modbus Master Simulator

  • SunnyBoy SB1.5-1VL-40 inverter queried from OPENHAB

Conclusion:

  • the behaviour of the SunnyBoy SB2500TLST-21 inverter is causing the “Missing Byte” problem, as already stated by Rossko57
  • there is no fragmentation at the IP level

For details see test results below.

1. Test result: SunnyBoy SB2500TLST-21 inverter queried from OPENHAB

Response packet sent by SMA inverter in reply to the query.
Wireshark capture, packet length: 66 bytes

o   Network Interface:                   14 bytes
o   IP Header:                           20 bytes
o   TCP Header:                          20 bytes
o   Payload (MODBUS response message):   12 bytes
        Transaction ID            2 bytes
        Protocol ID               2 bytes
        Length                    2 bytes (Length = 7)
        Unit ID                   1 byte
        Function Code             1 byte
        Byte Count                1 byte  (Byte Count = 4)
        MSByte register 30775     1 byte
        LSByte register 30775     1 byte
        MSByte register 30776     1 byte  (LSByte register 30776 missing!)

Wireshark reports:
Packet size limited during capture: Modbus truncated

The response packet is followed by a [PUSH] packet sent by the SMA inverter.
Wireshark capture, packet length: 60 bytes

o   Network interface:                   14 bytes
o   IP Header:                           20 bytes
o   TCP Header:                          20 bytes
o   Payload:                              6 bytes
        LSByte register 30776     1 byte
        Padding bytes             5 bytes
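
One possible reading of the 5 padding bytes: Ethernet pads short frames up to a minimum of 60 bytes (64 including the frame check sequence, which captures usually do not show). The Python sketch below only checks that this assumption reproduces the two packet lengths reported here; it says nothing certain about what this particular inverter does, and `frame_layout` is a made-up helper.

```python
ETH_HEADER = 14             # Ethernet header, as listed in the capture
MIN_FRAME_WITHOUT_FCS = 60  # assumed minimum Ethernet frame size without the 4-byte FCS


def frame_layout(ip_header: int, tcp_header: int, tcp_payload: int) -> tuple:
    """Return (captured frame length, padding bytes) under the minimum-frame assumption."""
    unpadded = ETH_HEADER + ip_header + tcp_header + tcp_payload
    padding = max(0, MIN_FRAME_WITHOUT_FCS - unpadded)
    return unpadded + padding, padding


print(frame_layout(20, 20, 12))  # (66, 0)
print(frame_layout(20, 20, 1))   # (60, 5)
```

If that reading is right, the padding sits outside the IP datagram, and the TCP payload of the second packet is the single LSByte.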

The missing LSByte of register 30776 in the response message is not caused by fragmentation at the IP layer, since:

  • The MF flag (More Fragments to come) in the IP Header of the response message is not set

  • The fragment offset value in the IP Header of the [PUSH] packet is zero
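
The two checks above correspond to bytes 6 and 7 of the IPv4 header, which hold three flag bits followed by a 13-bit fragment offset. A minimal Python sketch of that check follows; the sample headers are constructed for illustration (addresses, ID and lengths are placeholders, not values from the capture) and `ip_fragment_info` is a made-up name.

```python
import struct


def ip_fragment_info(ip_header: bytes) -> tuple:
    """Return (more_fragments, fragment_offset_in_bytes) from an IPv4 header."""
    flags_and_offset = struct.unpack(">H", ip_header[6:8])[0]
    more_fragments = bool(flags_and_offset & 0x2000)  # MF bit
    fragment_offset = (flags_and_offset & 0x1FFF) * 8  # offset is stored in units of 8 bytes
    return more_fragments, fragment_offset


def build_header(flags_and_offset: int) -> bytes:
    """Build a 20-byte IPv4 header with placeholder field values for the demo."""
    return struct.pack(
        ">BBHHHBBH4s4s",
        0x45, 0, 41, 0x1234, flags_and_offset, 64, 6, 0,
        bytes([192, 168, 1, 10]), bytes([192, 168, 1, 20]),
    )


print(ip_fragment_info(build_header(0x4000)))        # (False, 0): not fragmented
print(ip_fragment_info(build_header(0x2000 | 185)))  # (True, 1480): a middle fragment
```

A packet with MF clear and offset zero is a complete IP datagram, so the split seen here happens one layer up, in how the sender cuts the TCP byte stream into segments.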

It looks like OPENHAB is not processing the [PUSH] packet containing the LSByte of register 30776

Due to the missing LSByte, the plot of the actual current power is not a smooth line, as shown below:

This missing LSByte is due to behavior of the SunnyBoy SB2500TLST-21.

I have no idea whether or not OPENHAB should be able to deal with this.

2. Test result: SunnyBoy SB2500TLST-21 inverter queried from the Radzio! Modbus Master Simulator

As with test 1, the LSByte of register 30776 is not contained in the response message but sent in a subsequent PUSH packet.

However, the Radzio! Modbus Master Simulator includes the LSByte, received with the PUSH packet, in the result.

3. Test result: SunnyBoy SB1.5-1VL-40 inverter queried from OPENHAB

Response packet sent by SMA inverter in reply to the query:
Wireshark capture, packet length: 67 bytes

o   Network interface:                   14 bytes
o   IP Header:                           20 bytes
o   TCP Header:                          20 bytes
o   Payload (MODBUS response message):   13 bytes
        Transaction ID            2 bytes
        Protocol ID               2 bytes
        Length                    2 bytes (Length = 7)
        Unit ID                   1 byte
        Function Code             1 byte
        Byte Count                1 byte  (Byte Count = 4)
        MSByte register 30775     1 byte
        LSByte register 30775     1 byte
        MSByte register 30776     1 byte
        LSByte register 30776     1 byte

All queried data is included in the response message.
The values are as expected, as is the presentation in OPENHAB.

Typo :astonished:
The payload of the response message of test 3 must be 13 instead of 12

Edit: Post updated accordingly

This SMA is a naughty box, isn’t it.
Well done pinning it down.

No reason why OH should, given the breaking of TCP fragmentation rules.

It could however; the Modbus byte count is present so we can detect when there’s “more to come”.
Presumably this is what Radzio does, just waits until byte count is satisfied.

Providing we can play nicely with however the host TCP layer is handling this stuff, we have only to wait and see if the rest turns up. This is way out of my pay grade about if the binding needs to interact with TCP to get the next “unexpected” data, or just sit and wait.
I guess the binding will get the padding as well, and need to discard it using the byte count.
As a dedicated diagnostic tool, Radzio may itself be taking some shortcuts that we should not follow in a shared production environment.

Obviously there would need to be a timeout in case the extra chunk never came.
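
A rough Python sketch of "wait until the announced length is satisfied, with a timeout", using the Length field of the Modbus/TCP header rather than the byte count. It is only an illustration of the idea, not how the binding or Radzio actually work; the function names, the 2-second timeout and the sample bytes are all assumptions, and a local socket pair stands in for the inverter.

```python
import socket
import struct
import threading
import time


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Receive exactly n bytes; each recv call is subject to the socket timeout."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))  # raises socket.timeout if nothing arrives in time
        if not chunk:
            raise ConnectionError("peer closed the connection mid-frame")
        buf += chunk
    return bytes(buf)


def recv_modbus_frame(sock: socket.socket, timeout_s: float = 2.0) -> bytes:
    """Read one Modbus/TCP frame, using the header's Length field to know when to stop."""
    sock.settimeout(timeout_s)
    header = recv_exact(sock, 6)     # transaction ID, protocol ID, length
    _, _, length = struct.unpack(">HHH", header)
    body = recv_exact(sock, length)  # unit ID, function code, byte count, register data
    return header + body


def demo() -> bytes:
    """Send a 13-byte response as 12 + 1 bytes and read it back as one frame."""
    response = bytes.fromhex("0001000000070304040000047e")
    device, client = socket.socketpair()

    def send_in_two_parts():
        device.sendall(response[:12])
        time.sleep(0.1)
        device.sendall(response[12:])

    sender = threading.Thread(target=send_in_two_parts)
    sender.start()
    try:
        return recv_modbus_frame(client)
    finally:
        sender.join()
        device.close()
        client.close()


print(demo().hex())
```

Note that the timeout here applies to each individual receive, not to the frame as a whole; an overall deadline would need extra bookkeeping.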

It’s a bit of a decision about if we should be accommodating a “rule-breaking broken product”, which they’re in no hurry to fix, when there is a low-pain workaround which so far as I know always works.

We do need to make sure the binding still deals with “proper” fragmentation in other circumstances, again that depends on the interaction with host TCP layer. I think Sami is on top of that already.

What an excellent analysis.

Even though it is not tcp fragmentation, I would like you to test out the transport bundle with this fix: Upgrade jamod (to get rid of case with fragmented packets) by denis-ftc-denisov · Pull Request #2284 · openhab/openhab-core · GitHub

The fix was general in nature and actually might help here as well.

Edit: for what it’s worth I think what we have here is tcp segmentation, very similar to ip fragmentation but still different, and shows differently in wireshark. Difference between IP fragmentation and TCP segmentation - Routing & Switching - NetworkLessons.com Community Forum

I am quite sure that the fix mentioned above actually is the missing piece of the puzzle. Tcp is in the end “stream based”, so the application layer, such as modbus parsing in java, only sees a stream of bytes, not individual packets.

The fix actually just means we are not too impatient and wait for the bytes we need to have. On most systems the OS tcp buffering conveniently masked this issue for a long time, as all the bytes of a modbus packet appear at once for the application layer.

Hi Ssalonen,
Thanks for the compliment, I am happy to contribute to the forum.
After the weekend I will give the transport bundle fix a try.

Have a nice weekend,

Egon van Os, aka sonave