Some MQTT messages irregularly do not arrive at openHAB

Hello there,

I have a problem with my openHAB 3.2 installation and the MQTT binding. openHAB runs on a Raspberry Pi 3 together with a Mosquitto broker.
From an external server I send an MQTT message to the Mosquitto broker every 20 minutes (DSL speed test values). In MQTT Explorer (and also with mosquitto_sub on the CLI) I can see the messages, so they do reach the broker.
At first everything works fine in openHAB. I have a generic MQTT Thing with a channel listening to the specific MQTT topic. In the openHAB logs I see the updates of the corresponding items. After 1 or 2 days, the messages no longer appear in the openHAB logs and the items are no longer updated. In MQTT Explorer the messages still show up, so they are being sent correctly.
In the karaf console I restart this bundle:
257 │ Active │ 81 │ 3.2.0 │ openHAB Add-ons :: Bundles :: MQTT Things and Channels
After that, everything works as usual.
And the strange thing: during the outage, other MQTT messages on other channels from other systems work fine!
Does anyone have a suggestion for how I can debug this, or do you think it's a bug in openHAB?

Many thanks and greetings,
Mario

Are these messages retained, or have you just left MQTT Explorer running? If you’ve just left it running, MQTT Explorer will show you the last message it received, but that message is gone. It doesn’t actually exist on the broker.

If you mean these messages are still being sent and MQTT Explorer is still receiving them, that’s something different.

Assuming the latter, when the messages stop being received by OH, what is the state of the Broker Thing? What do you see in the logs?

Thank you for your answer!

No, these are live messages in MQTT Explorer. Each MQTT message contains a timestamp, so I am sure it's live.
The state of the broker is always “online”. There is nothing in the log, not even in DEBUG. (But this seems to be normal. Incoming messages are never logged, only outgoing messages are logged.)
But the strange thing is that other messages are received successfully at the same time.

Do you have any transformations on these Channels that stop?

All transformations or profiles are deactivated on this channel :confused:

An obvious watershed for investigation would be to determine whether Mosquitto “forgets” that openHAB is subscribed, or whether openHAB ignores messages on the subscribed topic.

There was a post some time ago that looked at a way messages could get “lost” within openHAB, or more strictly in the Paho library, I think. But that was associated with a fast message rate. It doesn’t sound like you would get that here.
So while it is unlikely you have machine-gun messages, you might find some of the investigative methods useful.

Of course it is possible something else is machine-gunning and causing random losses, and it is just this one infrequent message that you notice being dropped.

If it’s Paho, that wouldn’t apply to OH any more. They switched to HiveMQ’s library some time ago (the 3.0 release maybe?)

But much of what is discussed in that thread is still relevant.

Once they’ve stopped, do they ever get through again? I.e., is it intermittent? It sounds like the subscription has been lost, which can only really happen through an unsubscribe or through the client dropping its connection, after which it might reconnect automatically. It would have to resubscribe too, though, unless it had a ‘persistent’ connection type.

Many thanks for all the advice and suggestions. I will try this out and get back to you.

No, once the messages have stopped, I have to restart that bundle. Even manually sent messages did not come through on this topic before restarting.

What is the content of the very last message that makes it through?

What is the content of the first message that does not make it through?

All the messages are nearly the same. They are created every 20 minutes automatically. Like this one:

{
  "type": "result",
  "timestamp": "2022-01-10T21:35:29Z",
  "ping": {
    "jitter": 0.27,
    "latency": 5.424
  },
  "download": {
    "bandwidth": 13377673,
    "bytes": 141854400,
    "elapsed": 11014
  },
  "upload": {
    "bandwidth": 4271161,
    "bytes": 15432480,
    "elapsed": 3615
  },
  "packetLoss": 0,
  "isp": "Vodafone Germany DSL",
  "interface": {
    "internalIp": "192.168.42.50",
    "name": "eth0",
    "macAddr": "DC:DC:DC:DC:DC:DC",
    "isVpn": false,
    "externalIp": "x.y.z.143"
  },
  "server": {
    "id": 28624,
    "host": "speedtest.hk-net.de",
    "port": 8080,
    "name": "Händle & Korte GmbH",
    "location": "Dusseldorf",
    "country": "Germany",
    "ip": "185.6.69.66"
  },
  "result": {
    "id": "xxx-9895-4a6b-add4-xxx",
    "url": "https://www.speedtest.net/result/c/xxx",
    "persisted": true
  }
}

Right, but maybe, just maybe, the content of a particular message is causing something to fail permanently? Might be interesting to see if there’s a pattern with the last good and first bad message.
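
If you do capture both payloads, a structural comparison can be quicker than eyeballing them. Here is a minimal sketch in Python, assuming you have the last good and the first bad payload as JSON strings. The helper names `flatten` and `structural_diff` and the two sample payloads are made up for illustration; they are not taken from openHAB or from the real messages.

```python
import json

def flatten(obj, prefix=""):
    """Flatten nested JSON objects into {"a.b.c": value} pairs."""
    flat = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            path = f"{prefix}.{key}" if prefix else key
            flat.update(flatten(value, path))
    else:
        # Lists and scalars are treated as leaf values.
        flat[prefix] = obj
    return flat

def structural_diff(good_payload, bad_payload):
    """List fields that are missing, new, or of a different type in the bad payload."""
    good = flatten(json.loads(good_payload))
    bad = flatten(json.loads(bad_payload))
    findings = []
    for path in sorted(good.keys() - bad.keys()):
        findings.append(f"missing in bad: {path}")
    for path in sorted(bad.keys() - good.keys()):
        findings.append(f"new in bad: {path}")
    for path in sorted(good.keys() & bad.keys()):
        if type(good[path]) is not type(bad[path]):
            findings.append(
                f"type changed: {path} "
                f"({type(good[path]).__name__} -> {type(bad[path]).__name__})"
            )
    return findings

# Hypothetical payloads, trimmed down from the structure shown above.
last_good = '{"ping": {"jitter": 0.27, "latency": 5.424}, "packetLoss": 0}'
first_bad = '{"ping": {"jitter": 0.27, "latency": null}}'
for finding in structural_diff(last_good, first_bad):
    print(finding)
```

Note that a number switching between integer and float (say `0` versus `0.5`) is also reported as a type change, which may or may not matter for whatever consumes the value.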

Thanks for the hint. I will keep that in mind and check that!

Sorry for my late reply. It was very hard to debug, as the error occurs only randomly, and it has not occurred at all in the last few weeks!
I’m pretty sure the error is caused by a particular format of the MQTT message, which makes the openHAB MQTT binding crash. I’ve added more logging to my code, so I will be notified if the error occurs again.
Many thanks for all your suggestions!
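
For anyone who wants to pin down when an outage like this starts, one option is to collect the timestamps of the messages that actually reached the item and look for gaps longer than the 20-minute sending interval. The following is only a rough sketch, not the logging mentioned above: it assumes timestamps in the same format as the `timestamp` field of the payload, and the names `parse_ts`, `find_gaps` and the `slack` factor are illustrative choices.

```python
from datetime import datetime, timedelta, timezone

# From the thread: one message is sent every 20 minutes.
EXPECTED_INTERVAL = timedelta(minutes=20)

def parse_ts(text):
    """Parse a timestamp such as "2022-01-10T21:35:29Z" as UTC."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)

def find_gaps(timestamps, expected=EXPECTED_INTERVAL, slack=1.5):
    """Return (previous, next) pairs spaced further apart than expected * slack."""
    parsed = sorted(parse_ts(t) for t in timestamps)
    limit = expected * slack
    return [(a, b) for a, b in zip(parsed, parsed[1:]) if b - a > limit]

# Hypothetical list of timestamps seen on the openHAB side.
received = [
    "2022-01-10T21:35:29Z",
    "2022-01-10T21:55:30Z",
    "2022-01-10T23:15:31Z",
]
for before, after in find_gaps(received):
    print(f"gap of {after - before} between {before} and {after}")
```

The last timestamp before a gap points at the last good message, which is the one worth comparing against the first message that was dropped.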

Did your logging identify the cause? If so, could you tell us what it was? I have a similar problem, but it only appears to affect one “Thing”, which appears to stop subscribing to the message topic, while MQTT continues to work for other “Thing/Item/topic” combinations.

To the wider group: is there a built-in MQTT message flood protection in OH3 that I may be hitting? I am sending the log messages for an app which presents 37 messages over 20 seconds during initialisation. I wouldn’t have thought that level was over the top, though.

There are some limits and some settings that could cause messages to be dropped.

QOS 0 means the client or broker will only try to deliver the message once. If it makes it, great! If not, the message is just lost. QOS 1 and QOS 2 guarantee delivery (at least once for QOS 1 and exactly once for QOS 2). However, there is a maximum message queue size, so if the message rate fills up the queue, messages will be dropped even with QOS 1 or 2.

Note that the QOS only applies between the client and the broker.
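
To make the queue point concrete, here is a toy model in Python of a bounded queue that is filled faster than it is drained. It is a simplified illustration, not how Mosquitto or the openHAB binding is implemented; the function name `simulate` and all the numbers are assumptions chosen for the example.

```python
from collections import deque

def simulate(arrivals_per_tick, drained_per_tick, max_queued, ticks):
    """Toy model: count messages delivered and dropped by a bounded queue."""
    queue = deque()
    delivered = dropped = 0
    for _ in range(ticks):
        # Messages arriving in this tick.
        for _ in range(arrivals_per_tick):
            if len(queue) >= max_queued:
                dropped += 1  # queue is full: the message is discarded
            else:
                queue.append(object())
        # The receiver drains up to drained_per_tick messages in this tick.
        for _ in range(min(drained_per_tick, len(queue))):
            queue.popleft()
            delivered += 1
    return delivered, dropped

# A receiver that keeps up never loses anything ...
print(simulate(arrivals_per_tick=2, drained_per_tick=2, max_queued=10, ticks=10))
# ... but a slow receiver with a small queue drops most of the burst.
print(simulate(arrivals_per_tick=5, drained_per_tick=1, max_queued=3, ticks=4))
```

At roughly two messages per second (37 messages over 20 seconds) a queue would only overflow in this model if the limit were very small or the receiver nearly stalled, so it is worth checking the actual limits on your broker before suspecting flood protection.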

So I haven’t identified the cause with certainty, but it probably has to do with the Quality of Service setting. I tested it with QOS 1 and everything worked. But now I am back to QOS 0 and, again, there is no message loss. :exploding_head: