MQTT errors on OH2

Hi all,

I’m using openHABian (OH2) on an RPi 3B with the MQTT binding. MQTT is set up and working, but I see some errors in my logs (see below). I have a rule that sends a message to all of my MQTT devices (around 10) every half hour, and the error appears to be correlated with this rule. I have a 100 ms delay after each sendCommand in the rule, but the error is still there. Single actions that send one MQTT message do not trigger the error, so I guess some message queue is filling up. Are there any settings I should try?

2017-09-06 08:30:06.724 [ERROR] [g.mqtt.internal.MqttMessagePublisher] - Error publishing…
Too many publishes in progress (32202)
at org.eclipse.paho.client.mqttv3.internal.ClientState.send(ClientState.java:496)[192:org.openhab.io.transport.mqtt:1.10.0]
at org.eclipse.paho.client.mqttv3.internal.ClientComms.internalSend(ClientComms.java:132)[192:org.openhab.io.transport.mqtt:1.10.0]
at org.eclipse.paho.client.mqttv3.internal.ClientComms.sendNoWait(ClientComms.java:156)[192:org.openhab.io.transport.mqtt:1.10.0]
at org.eclipse.paho.client.mqttv3.MqttTopic.publish(MqttTopic.java:107)[192:org.openhab.io.transport.mqtt:1.10.0]
at org.openhab.io.transport.mqtt.internal.MqttBrokerConnection$2.publish(MqttBrokerConnection.java:426)[192:org.openhab.io.transport.mqtt:1.10.0]
at org.openhab.binding.mqtt.internal.MqttMessagePublisher.publish(MqttMessagePublisher.java:175)[186:org.openhab.binding.mqtt:1.10.0]
at org.openhab.binding.mqtt.internal.MqttItemBinding.internalReceiveCommand(MqttItemBinding.java:45)[186:org.openhab.binding.mqtt:1.10.0]
at org.openhab.core.binding.AbstractBinding.receiveCommand(AbstractBinding.java:97)[189:org.openhab.core.compat1x:2.1.0]
at org.openhab.core.events.AbstractEventSubscriber.handleEvent(AbstractEventSubscriber.java:45)[189:org.openhab.core.compat1x:2.1.0]
at org.apache.felix.eventadmin.impl.handler.EventHandlerProxy.sendEvent(EventHandlerProxy.java:415)[6:org.apache.karaf.services.eventadmin:4.0.8]
at org.apache.felix.eventadmin.impl.tasks.HandlerTask.run(HandlerTask.java:90)[6:org.apache.karaf.services.eventadmin:4.0.8]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)[:1.8.0_121]
at java.util.concurrent.FutureTask.run(FutureTask.java:266)[:1.8.0_121]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)[:1.8.0_121]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)[:1.8.0_121]
at java.lang.Thread.run(Thread.java:745)[:1.8.0_121]

Is there anything that might be informative in the broker’s logs?

I’m using mosquitto as broker, no errors in mosquitto.log.

Try upping the sleep to 1000.

If you have control over the client end, can you have them subscribe to the same topic and have OH just publish the one message?

There is no setting or configuration you can change to deal with this problem.

In mqtt.cfg I have qos = 1 and asynch = false (I also tested asynch = true, with the same result).

I have found this:

As I understand it, the library used by the MQTT binding can handle a maximum of 10 messages without an ack. Since I have set qos to 1, maybe it reaches this limit. But the mosquitto server is on the same Raspberry Pi, so there should be no problems. Maybe qos = 1 is not needed when the broker is on the same machine?
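A toy model may make that limit easier to picture. The sketch below is not the Paho implementation; it only assumes what the post describes: a client-side cap of 10 unacknowledged QoS 1 publishes, with anything beyond that rejected. The class names and the `home/device{i}/cmd` topics are hypothetical.

```python
class TooManyPublishes(Exception):
    """Stands in for the 'Too many publishes in progress' error in the log."""


class InflightWindow:
    """Toy model of a client-side cap on unacknowledged QoS 1 publishes."""

    def __init__(self, max_inflight=10):  # 10 is the limit quoted in the post
        self.max_inflight = max_inflight
        self.pending = set()  # message ids still waiting for an ack
        self.next_id = 1

    def publish(self, topic, payload, qos):
        if qos == 0:
            return None  # fire and forget: never occupies a slot
        if len(self.pending) >= self.max_inflight:
            raise TooManyPublishes(f"{len(self.pending)} publishes in progress")
        msg_id = self.next_id
        self.next_id += 1
        self.pending.add(msg_id)
        return msg_id

    def ack(self, msg_id):
        """An acknowledgement frees the slot held by that message."""
        self.pending.discard(msg_id)


def burst(window, count, qos):
    """Publish `count` messages back to back; return how many were rejected."""
    rejected = 0
    for i in range(count):
        try:
            window.publish(f"home/device{i}/cmd", "ON", qos)  # hypothetical topic
        except TooManyPublishes:
            rejected += 1
    return rejected
```

In this model, a burst of 12 QoS 1 messages with no acks in between has 2 rejected, while the same burst at QoS 0 has none. A slow ack (even from a broker on the same machine) is enough to keep slots occupied.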

OK, I changed the sleep to 1 s and will see if it helps.

There is no problem with publishing one message at a time, only when a rule publishes a couple of messages.

What happens when a message is published with qos=1 and a client has subscribed to the topic, also with qos=1, but for some reason is not receiving messages (for example, the receiving client hangs)? Does this mean that the publishing client will keep this message in its queue forever?

Correct, but if the devices receiving the MQTT message all subscribed to the same topic, OH would only need to publish one message.
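To illustrate the difference in publish count, here is a minimal in-memory sketch. `ToyBroker`, the topic names, and the `REFRESH` payload are all hypothetical; this is not mosquitto or the openHAB binding, just a counter with fan-out.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a broker: counts publishes and fans them out."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # topic -> list of inboxes
        self.publishes_received = 0

    def subscribe(self, topic, inbox):
        self.subscribers[topic].append(inbox)

    def publish(self, topic, payload):
        self.publishes_received += 1
        for inbox in self.subscribers[topic]:
            inbox.append(payload)


def per_device_topics(n_devices):
    """Each device has its own topic, so the sender publishes n times."""
    broker = ToyBroker()
    inboxes = [[] for _ in range(n_devices)]
    for i, inbox in enumerate(inboxes):
        broker.subscribe(f"home/device{i}/cmd", inbox)
    for i in range(n_devices):
        broker.publish(f"home/device{i}/cmd", "REFRESH")
    return broker.publishes_received, inboxes


def shared_topic(n_devices):
    """All devices subscribe to one topic, so the sender publishes once."""
    broker = ToyBroker()
    inboxes = [[] for _ in range(n_devices)]
    for inbox in inboxes:
        broker.subscribe("home/all/cmd", inbox)
    broker.publish("home/all/cmd", "REFRESH")
    return broker.publishes_received, inboxes
```

With 10 devices, both variants leave one message in every inbox, but the sender issues 10 publishes in the first case and 1 in the second.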

QoS 1 means at least once, so as long as that client has not acked receipt of the message, it will continue to receive that message until it acks it or a new message is sent which overrides the old one… I think.

See the following for a primer on QOS.

All devices subscribe to different topics; I can’t change that.

Now I’m waiting to see if the 1 s delay helps; if not, I will try setting qos=0.

With 1 s delays it is much better, but it still happens. Now trying with qos=0.

Tried qos=0 with no delays between consecutive sendCommand calls, and no errors so far :) So it seems that qos=1 is the problem. I have two MQTT clients on the same machine as the broker (the MySensors gateway and openHAB, alongside the mosquitto broker) and 4 clients on other machines connected over a wired network. First question: do I need qos=1? Second: why does qos=1 cause those errors?

If you had read the article on the link I provided you would know.

QoS 1 means at least once. Any message that you send will continue to be resent over and over (with delays, of course) until the receiver gets the message and acknowledges that it got it. The receiver may receive the message more than once, but it is guaranteed to get the message. The sender appears to wait at least a little while for the message to be acknowledged before closing out the transaction.

Qos 0 is fire and forget. There is no guarantee the message gets received and the sender doesn’t wait around for an acknowledgement.

When you use qos 1, it takes some time, more time than you were allowing, for the message to be received and acked, so you ran into errors. With qos 0 it doesn’t wait, so you avoid errors.
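The two delivery guarantees described above can be sketched as a toy model. It assumes a deliberately simplified channel in which the message itself always arrives and only acknowledgements get lost; real clients decide when to retransmit based on the protocol version and implementation, so treat the function names and the retry loop as illustrative only.

```python
def deliver_at_least_once(acks_lost, max_attempts=5):
    """QoS 1 style: resend until an ack arrives.

    acks_lost: how many of the first acknowledgements are lost on the way back.
    Returns (attempts, copies_received_by_subscriber).
    """
    copies_received = 0
    for attempt in range(1, max_attempts + 1):
        copies_received += 1  # in this model the message always arrives
        if attempt > acks_lost:  # this ack makes it back to the sender
            return attempt, copies_received
    raise TimeoutError("no ack after max_attempts")


def deliver_at_most_once(message_lost):
    """QoS 0 style: one attempt, no ack, no retry. Returns copies received."""
    return 0 if message_lost else 1
```

If the first two acks are lost, the sender needs three attempts and the receiver sees three copies, which is the "may receive the message more than once" case. The QoS 0 function never retries, so a lost message simply stays lost.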

All of this is explained thoroughly there.

I know what qos=1 means. My question is: is reliable communication necessary in my case?
If the broker and a client are on the same machine, what could cause a qos=0 message not to be delivered? Something like an openHAB crash or a mosquitto crash, and in both cases qos=1 wouldn’t help. The second case is when the broker and client are on different machines but connected over a wired network. Unless I pull the Ethernet cable, qos=0 should be sufficient? I’m assuming here that there are no problems with the hardware, because if there are, they should be resolved first.

Regarding the errors: if they are expected when using qos=1, then they are not errors. I mean, software shouldn’t throw errors under normal conditions. If the message queue is too small, there should at least be a configuration option for the queue length.

Only you can answer that. You have not provided nearly enough information for anyone else here to recommend one qos over another. In general, I would expect qos=0 to be perfectly reasonable in almost all home automation circumstances except in cases where one has a client that has a really poor wireless connection.

The MQTT client failed to send the message. When software fails to do something I’ve asked of it I consider that an error. Now the error may be my fault (i.e. trying to send too many messages too fast) but it is still an error. It is not a normal condition, the message was never sent. The reason the message was never sent was because it wasn’t done sending the previous messages yet and that was because of the qos=1.
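One way a sender could avoid "too many messages too fast" is to hold extra messages in a backlog and release them only as acks come in. The sketch below shows that idea under stated assumptions: `PacedPublisher` and its `max_inflight` default of 10 are hypothetical, and nothing here reflects how the openHAB binding or Paho actually behave.

```python
from collections import deque


class PacedPublisher:
    """Toy sender that queues messages instead of failing when the window is full."""

    def __init__(self, max_inflight=10):  # assumed limit, for illustration
        self.max_inflight = max_inflight
        self.inflight = set()   # ids sent but not yet acked
        self.backlog = deque()  # messages waiting for a free slot
        self.sent = []          # (id, topic, payload) in the order actually sent
        self._next_id = 1

    def publish(self, topic, payload):
        self.backlog.append((topic, payload))
        self._drain()

    def on_ack(self, msg_id):
        self.inflight.discard(msg_id)
        self._drain()

    def _drain(self):
        # Send from the backlog only while a slot is free.
        while self.backlog and len(self.inflight) < self.max_inflight:
            topic, payload = self.backlog.popleft()
            msg_id = self._next_id
            self._next_id += 1
            self.inflight.add(msg_id)
            self.sent.append((msg_id, topic, payload))
```

Publishing 12 messages in a row sends the first 10 and parks the other 2; each ack then releases one more. The trade-off is that a receiver that never acks makes the backlog grow without bound, so a real implementation would need a cap or a timeout.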

I don’t think this is a matter of the message queue length. It has to do with how long it takes for one client to publish a message and an inability of the broker to receive more than one message (or more than N messages perhaps) from a single client at the same time. And even if it were a matter of queue length, that is a parameter you would set on the broker, not in OH. So it might be worth looking into your broker and researching the error a bit more.

There might be something that could be changed in the parameters used by the MQTT client to address this problem or there might be something one could change on the broker side.

A quick google search found this: