MQTT EventBUS "Overload"

Hi all of you top-notch programmers. (wink wink)

I have a little problem caused by the MQTT Event-BUS.
Since I installed two RPi 3s and started using the Event-BUS, I get the errors below.

2018-01-19 17:52:55.033 [ERROR] [g.mqtt.internal.MqttMessagePublisher] - Error publishing...
org.eclipse.paho.client.mqttv3.MqttException: Too many publishes in progress
	at org.eclipse.paho.client.mqttv3.internal.ClientState.send(ClientState.java:496) [236:org.openhab.io.transport.mqtt:1.11.0.201712060210]
	at org.eclipse.paho.client.mqttv3.internal.ClientComms.internalSend(ClientComms.java:132) [236:org.openhab.io.transport.mqtt:1.11.0.201712060210]
	at org.eclipse.paho.client.mqttv3.internal.ClientComms.sendNoWait(ClientComms.java:156) [236:org.openhab.io.transport.mqtt:1.11.0.201712060210]
	at org.eclipse.paho.client.mqttv3.MqttTopic.publish(MqttTopic.java:107) [236:org.openhab.io.transport.mqtt:1.11.0.201712060210]
	at org.openhab.io.transport.mqtt.internal.MqttBrokerConnection$2.publish(MqttBrokerConnection.java:426) [236:org.openhab.io.transport.mqtt:1.11.0.201712060210]
	at org.openhab.binding.mqtt.internal.MqttMessagePublisher.publish(MqttMessagePublisher.java:175) [272:org.openhab.binding.mqtt:1.11.0.201712060210]
	at org.openhab.binding.mqtt.internal.MqttEventBusBinding.receiveUpdate(MqttEventBusBinding.java:128) [272:org.openhab.binding.mqtt:1.11.0.201712060210]
	at org.openhab.core.events.AbstractEventSubscriber.handleEvent(AbstractEventSubscriber.java:39) [201:org.openhab.core.compat1x:2.2.0.201712062218]
	at org.apache.felix.eventadmin.impl.handler.EventHandlerProxy.sendEvent(EventHandlerProxy.java:415) [3:org.apache.karaf.services.eventadmin:4.1.3]
	at org.apache.felix.eventadmin.impl.tasks.HandlerTask.run(HandlerTask.java:70) [3:org.apache.karaf.services.eventadmin:4.1.3]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:?]
	at java.lang.Thread.run(Thread.java:748) [?:?]

I have several options to solve or do a work-around, but I’m not sure what would be the best thing to do, hence this post.

  1. I could install a 3rd RPI3, to handle some of the Items and MQTT Event-BUS communication
  2. Make all the Items MQTT-dependent, and stop using the Event-BUS
  3. I could stop using MQTT, but is this really necessary?

If I go with a 3rd PI, is it possible to connect all 3 PI’s to the same “My Openhab” account? (to use with Echo DOT)

What do you guys think? What is “best practice”?

Option 4: switch to QOS 0? With QOS 1 or 2 a publish is transactional, meaning it stays open (in flight) until the broker acks receipt of the message. QOS 0 is more like UDP: fire and forget. The publishes complete immediately, which may well address the congestion issue.

You will have to watch for lost messages though as QOS 0 does not guarantee delivery. Most people never have a problem though.
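To make the mechanism concrete, here is a toy model in Python of a client-side in-flight limit. It is only a sketch of the idea, not Paho's actual code: the class and function names are made up for illustration, and the limit of 10 is the Paho default mentioned later in this thread. QOS 1/2 publishes hold a slot until acked, QOS 0 publishes never wait.

```python
class TooManyPublishes(Exception):
    """Toy stand-in for Paho's 'Too many publishes in progress' error."""


class InflightWindow:
    """Hypothetical model of a client-side in-flight limit (not Paho's code)."""

    def __init__(self, max_inflight=10):
        self.max_inflight = max_inflight
        self.pending = set()  # ids of publishes still waiting for a broker ack
        self._next_id = 1

    def publish(self, qos):
        # Refuse the publish when every slot is held by an unacked message.
        if len(self.pending) >= self.max_inflight:
            raise TooManyPublishes("Too many publishes in progress")
        msg_id = self._next_id
        self._next_id += 1
        if qos > 0:
            # QOS 1/2: the slot stays occupied until ack() is called.
            self.pending.add(msg_id)
        # QOS 0: fire and forget, nothing is left waiting.
        return msg_id

    def ack(self, msg_id):
        # The broker acknowledged the message, so its slot is free again.
        self.pending.discard(msg_id)


def burst(qos, count, max_inflight=10):
    """Publish `count` messages while no acks arrive; return how many got through."""
    window = InflightWindow(max_inflight)
    accepted = 0
    for _ in range(count):
        try:
            window.publish(qos)
            accepted += 1
        except TooManyPublishes:
            break
    return accepted
```

In this model `burst(1, 50)` returns 10 and `burst(0, 50)` returns 50, which is the behaviour described above: a burst of Event-BUS updates at QOS 1 or 2 fills the window before the broker has acked anything, while QOS 0 never leaves anything in flight.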

If you are already at QOS 0, then you can experiment with the async property in mqtt.cfg. Set it to false and see if that helps. It isn’t well documented, but I suspect that with async set to false the messages are sent one at a time instead of asynchronously, so you would only ever have one in-flight publish at a given time, which would eliminate this error. Latency might increase as the messages queue up waiting to be sent. I might also be misunderstanding this parameter entirely, in which case setting it to false could make things even worse.
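For reference, the two settings would look roughly like this. This is a sketch that assumes the openHAB 2 layout (services/mqtt.cfg) for the 1.x MQTT binding; `mybroker` is a placeholder for whatever broker name you already use, and on openHAB 1.x (openhab.cfg) the keys carry an `mqtt:` prefix instead.

```
# services/mqtt.cfg ("mybroker" is a placeholder broker name)
mybroker.url=tcp://localhost:1883
mybroker.qos=0
mybroker.async=false
```

Change one setting at a time and watch the log, so you can tell which of the two actually makes the difference.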

The problem might be on the broker side as well. Try playing with the max_inflight_messages and max_queued_messages parameters (or their equivalent in your broker if you are not using Mosquitto).
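On Mosquitto those two options go in mosquitto.conf. The values below are purely illustrative, not recommendations; as far as I know both limits concern QOS 1 and 2 messages, so check the mosquitto.conf man page for your version before relying on them.

```
# mosquitto.conf (illustrative values only)
max_inflight_messages 100
max_queued_messages 1000
```

Restart the broker after editing the file so the new limits take effect.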

If none of that works, then I’m not sure what the best option would be. I personally would file an issue on the openhab1-addons repo to make the max inflight parameter one of the options in mqtt.cfg or mqtt-eventbus.cfg. The Paho library should let you set the max to something in the 6400 range; the default is 10. I’m not sure if that will get changed though, as I think the main focus is on the MQTT 2.0 binding.

I would not go down the path in any of your three suggestions if you have a lot of Items.

It is not possible to connect multiple OH instances to the same myopenhab.org account.

Thanks a lot for the quick reply.
Option number 4 it is :slight_smile:

But sometimes, fiddling with a system can cause a lot of problems.
Right now, after changing QoS to 0 and persistent to false, I get this:

2018-01-19 19:14:55.166 [ERROR] [org.apache.felix.configadmin        ] - Cannot use configuration org.openhab.mqtt for [org.openhab.core.scriptengine.action.ActionService, org.osgi.service.cm.ManagedService, id=311, 
bundle=270/mvn:org.openhab.action/org.openhab.action.mqtt/1.11.0-SNAPSHOT]: No visibility to configuration bound to 
mvn:org.openhab.io/org.openhab.io.transport.mqtt/1.11.0-SNAPSHOT

??? I can’t get anything useful out of the error, can you?

The best I can figure is that there is something non-parsable in your mqtt.cfg or mqtt-eventbus.cfg.

It does seem to be complaining about the MQTT Action though. Perhaps uninstalling and reinstalling the binding and action might help. I’m grasping at straws but there are some known odd issues when both the action and binding are installed and a reinstall of both seems to work.

Does changing the config back to the way you had it before make the error go away?

If I comment out the QoS and the Persistent settings, the problem goes away, so I guess that an uninstall and reinstall could solve it.
That’s what I will do next.

Thanks for your time Rich

That problem (too many publishes in progress) had been bugging me for a few months now and seems to be gone since I have set qos = 0 and async = false. (I needed both.)

It has been going strong for a few days now, nothing came up in the logs.

Thank you Rich ! :slight_smile:

It is my understanding that the qos setting only applies between OH and my broker. In that case, I shouldn’t lose any messages with qos = 0 versus 2, because they are both running on the same box, on a machine that is far from overloaded.

I’m not absolutely certain about that. Based on the descriptions I have read, QOS applying only between the sender and the broker doesn’t make much sense to me.

QOS 2 means deliver exactly once. To me, that doesn’t mean anything if it only applies between the broker and OH. But I’ll never claim to be an MQTT expert. I could be wrong.

Looking a bit deeper I’ve found this article to be informative:

So it looks like the QOS is indeed only between the broker and the client. Furthermore, the publisher can choose a different QOS from the subscriber. So, for example, you can configure QOS 0 on OH and, to quote HiveMQ’s article, “the messages will be no more or less reliable than the underlying TCP”. One caveat: a subscriber’s QOS is only an upper bound. The broker delivers each message at the lower of the QOS it was published with and the QOS the subscriber asked for, so a message OH published at QOS 0 still reaches a QOS 2 subscriber at QOS 0.
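How the two QOS levels combine on the subscriber’s leg can be written down in one line. This is a sketch of the rule as I read it in the MQTT 3.1.1 spec (the broker delivers at the lower of the publish QOS and the subscription QOS); the function name is made up for illustration.

```python
def delivered_qos(publish_qos, subscription_qos):
    """QOS the broker uses towards a subscriber: the lower of the two levels."""
    return min(publish_qos, subscription_qos)


# OH publishes at QOS 0, a client subscribes at QOS 2.
oh_to_qos2_subscriber = delivered_qos(0, 2)

# A device publishes at QOS 2, a client subscribes at QOS 1.
qos2_to_qos1_subscriber = delivered_qos(2, 1)
```

So a higher subscription QOS cannot upgrade a message that was published at QOS 0; it only caps how messages published at a higher QOS are delivered to that client.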

Given this new understanding, I agree, using a QOS of 0 is perfectly reasonable. In fact, one would need to provide convincing justification before choosing anything higher.

Unfortunately, there does not appear to be a way to set the QOS for OH MQTT subscribers. Someone would have to look at the code to figure out what is used (probably 0).

Hurray, I learned something new today! :man_student:
