MQTT: why do some topics have an instant response when queried?

  • openHABian v3 on rPi 4 4GB

This is a generic MQTT query… (hence, I did not state the environment in detail).

When I subscribe to a topic, I can see messages pop in when they are sent to the topic.

I also have a topic that instantly gives me a message that has been published in the past.
It is this topic that has me puzzled, as to why that is.

More specific: I have a tank level meter that:

  • reports the water level, which it publishes every 60 seconds
  • subscribes to a minimum level
  • subscribes to a “permit advertising my water demand” topic (AllowDemand [on|off])
  • publishes a demand message (when permitted to do so) (WaterDemand [on|off])

This controller has been working flawlessly for almost ten years. (No firmware or other system changes where made.)
That is, when the tank level is under the demand level, it advertises its demand and fills the tank up to 100%. Done. Easy.

However, in recent days, it may occasionally fill the tank to any random level above the minimum demand level and stops filling.

I have analysed the controller code, and for the life of me, I cannot find a fault in the software.

However, what I did observe is, that whenever I subscribe to AllowDemand, it shows OFF; even when it was just switched ON, and the controller has received ON.

Here are three terminals I have used.
The first to issue ON

# [2023-10-20 15:34] maxg@x570 ~ $ 
mosquitto_pub -h -t ArgyleCourt/Property/IrrigationTank/AllowDemand -m ON

The second was subscribed to the same topic:

# [2023-10-20 15:31] maxg@x570 ~ $ 
mosquitto_sub -h -v -t ArgyleCourt/Property/IrrigationTank/AllowDemand/# | xargs -d$'\n' -L1 sh -c 'date "+%Y-%m-%d %T.%3N $0"'
2023-10-20 15:32:50.007 ArgyleCourt/Property/IrrigationTank/AllowDemand OFF
2023-10-20 15:34:43.760 ArgyleCourt/Property/IrrigationTank/AllowDemand ON
2023-10-20 15:45:00.423 ArgyleCourt/Property/IrrigationTank/AllowDemand OFF

The third was subscribed to, a minute after publishing ON:

# [2023-10-20 15:36] maxg@x570 ~ $ 
mosquitto_sub -h -v -t ArgyleCourt/Property/IrrigationTank/AllowDemand | xargs -d$'\n' -L1 sh -c 'date "+%Y-%m-%d %T.%3N $0"'
2023-10-20 15:37:09.735 ArgyleCourt/Property/IrrigationTank/AllowDemand OFF
2023-10-20 15:45:00.423 ArgyleCourt/Property/IrrigationTank/AllowDemand OFF

The OFF at 15:45 was published by openHAB (as programmed.

To answer the question “why?” … we run electricity as an off-grid system (though with the grid as backup), and the tank is only allowed to publish its demand between 14:00 and 15:45, when we have the most excess solar PV generation.

Here an example log output from OH, where filling water was swithced off at a random level, and I issued the AllowDemand via command line publish to keep the filling going.

2023-10-20 14:01:00.768 [INFO ] [openhab.core.model.script.Water.4.01] - Tank_Irrigation_Demand_Allow........: ON
2023-10-20 14:01:50.314 [INFO ] [openhab.core.model.script.Water.8.01] - TheHub: Irrigation tank valve is....: ON (at 84 %)
2023-10-20 14:01:50.377 [INFO ] [penhab.core.model.script.Water.10.01] - Bore Pump: total starts.............: 1747
2023-10-20 14:30:50.294 [INFO ] [openhab.core.model.script.Water.6.01] - Tank_Irrigation_Demand..............: OFF
2023-10-20 14:30:50.298 [INFO ] [openhab.core.model.script.Water.8.01] - TheHub: Irrigation tank valve is....: OFF (at 93 %)
2023-10-20 14:30:50.312 [INFO ] [penhab.core.model.script.Water.11.01] - Bore Pump...........................: stopped
2023-10-20 15:26:50.250 [INFO ] [openhab.core.model.script.Water.8.01] - TheHub: Irrigation tank valve is....: ON (at 93 %)
2023-10-20 15:26:50.300 [INFO ] [penhab.core.model.script.Water.10.01] - Bore Pump: total starts.............: 1748
2023-10-20 15:45:00.769 [INFO ] [openhab.core.model.script.Water.5.01] - Tank_Irrigation_Demand_Allow........: OFF
2023-10-20 15:45:20.222 [INFO ] [openhab.core.model.script.Water.6.01] - Tank_Irrigation_Demand..............: OFF
2023-10-20 15:45:20.222 [INFO ] [openhab.core.model.script.Water.8.01] - TheHub: Irrigation tank valve is....: OFF (at 99 %)
2023-10-20 15:45:20.237 [INFO ] [penhab.core.model.script.Water.11.01] - Bore Pump...........................: stopped

So the question is: why is AllowDemand always OFF when I subscribe in a terminal to it?

While I make heavy use of MQTT, I use it in its basic config, no last will testament, or quality of service, other than the defaults.

Any hints where to look for the culprit would be much appreciated.

My theory is, if what I am seeing, that AllowDemand is always OFF, then what publishes OFF? But, why does the controller keep filling, if “I” see OFF, but the controller sees ON. I don’t know that for sure, but it must be that way, because the controller reacts to OFF (as published at 15:45).

Yes, this is doing my head in; and after spending two days on solving this puzzle, I thought I post here. Thanks.

Maybe I don’t understand, because here it shows as ON

Assuming that the question is actually

Then I would investigate both the hardware and openHAB as my first port of call (based on the information in this post).

How old is the hardware? Could the sensors be failing? Or other electronics in this system?

Is there anything else in openHAB which is interacting with the hardware? Any test code that is still running? Is openHAB exposed to the internet, and someone is messing with it?

Note that when you see OFF via your terminal, you’re seeing the MQTT information that your MQTT Broker has. Maybe there’s a communications issue between your broker and the water hardware?

What does the rule look like?

What if you subscribe to

mosquitto_sub -h -v -t ArgyleCourt/Property/IrrigationTank/#

That way you will see both WaterDemand and AllowDemand

1 Like

Retain may be the cue.

If sending a message to the broker, you can set the message to be retained. Mosquitto will then save the message. If a client subscribes to this topic, the retained message is delivered instantly.

If using an mqtt sniffer such as MQTT Explorer, you can see if a topic is retained.

Yes, you seem to be onto something, what I suspected, but can’t figure out. Any MQTT message only shows up when a msg has been sent. This one, AllowDemand is always OFF when I connect to it, as it did just now, opening MQTT Explorer.

Select AllowDemand and take a look at the right side :slight_smile:

1 Like

Genius :slight_smile: I should learn to understand my tools better.
Yes, it has a RETAINED label.

Who the heck did set that? (A rhetorical question, of course).
But where do I unset it?

I just deleted the retained label/message.
This seems to have done it.

Maybe in my wisdom some umpteen years ago, I did this somehow to ensure, if the system ever goes down it would come with OFF, as a clean slate so to speak!? Best guess.

So to theorise, if the tank controller where to loose the connection, or would reconnect, it would get the retained OFF msg and of course, shut down.

I will monitor this for a few days and report back.

Thank you, Greg and T for having a go at this. I do appreciate it.

1 Like

In the MQTT settings of the device…

mosquitto_sub -h -v  --retained-only -t ArgyleCourt/Property/IrrigationTank/# 

Should show you the retained topics.

1 Like

Thanks… this is interesting; I found a few others, where I have no idea how the retained flag was set in the first place.
While the hot water system may follow the same logic as AllowDemand (which is only a guess), I have no idea how (and by what) the retain flag was set.

# [2023-10-22 07:46] maxg@x570 ~ $ 
mosquitto_sub -h -v  --retained-only -t ArgyleCourt/#
ArgyleCourt/Shed/RF433TX/Command 31
ArgyleCourt/Shed/RC41/HotWaterSystem OFF
ArgyleCourt/Shed/Modbus/Command 8051,0,0
ArgyleCourt/Property/PumpStation2/Command resetMeter
ArgyleCourt/Property/PumpStation1/Valve 0000

My controllers are all self-made, and have no settings that would allow any config parameter updates.

Anyway, I got rid of these flags.

A retained message is a normal MQTT message with the retained flag set to true. The broker stores the last retained message and the corresponding QoS for that topic. Each client that subscribes to a topic pattern that matches the topic of the retained message receives the retained message immediately after they subscribe. The broker stores only one retained message per topic.

So that means it is NOT SENDING an OFF for example it is just showing you the last setting for that topic as soon as you connect to the broker.

If something is sending an OFF command then it shouldn’t be because of the retained flag. That’s how I understand it.

1 Like

So, what does a controller (subscriber) do with this msg? Say, it rebooted, lost connection and subscribes. Act on it or not?
If not, how can it then distinguish between an actual command and a retained msg?

 * -----------------------------------------------------------------------------
 * @brief Reconnect to MQTT broker
 * if connected:
 *   subscribe to allow demand
 *   subscribe to demand level setting
 * @return MQTT broker connection state
 * -----------------------------------------------------------------------------
bool b_connected_to_broker()
    // trying to reconnect to MQTT broker

    #ifdef DEBUG
        Serial.println(F("MQTT: trying to connect to broker ..."));

    if (mqtt_client.connect(MQTT_CONNECT_ID))

    return mqtt_client.connected();

}   /*! b_connected_to_broker() */

You can’t.

You subscribe to a topic, you get the retained payload.
If the topic isn’t retained, you get the next payload when the broker receives it.

Thanks, that makes sense. Normal operation with ON|OFF commands can happen for however long, but when the controller subscribes again for whatever reason, it will get the retained OFF msg?!

If it works like this, it would allow to start in a desired (safe) state should interruptions in messaging occur.

I think you are only subscribing to the topic and you get the last payload of that topic. You are not publishing to it, unless maybe the topic is ALSO the command topic and the status topic?

My bet is still this. Something has changed. All the chat about retained messages might be a symptom, but it’s unlikely to be the cause.

Again, I would check that the hardware isn’t becoming faulty. 10 years is a long time.

And those MQTT topics aren’t necessity the exclusive domain of the controllers. As mentioned, there might be old openHAB rules or test scripts running somewhere which are interacting with these topics?

Or it’s open to the internet.

If something set those topics to retained, it’ll do it again.

I hear you :slight_smile:
I currently work with hope, that is: expecting a change without doing anything.
No seriously… you’re raising valid points.

I need another three of these tank level sensors; one of them modified to read bore water levels… which will be the same design and code. If the new ones show a similar problem then either my code or MQTT interactions are flawed.
If not, the current controller may well have a hardware problem.

Is there no way that you can log what your hardware is doing - sensor readouts etc - so you don’t have to fiddle with the hardware itself for the moment?

Try to log as ‘close’ to the hardware as possible - removing as many middlemen as possible!

1 Like

Yes… this is the plan… while writing the code I have lots of DEBUG msg (Serial.print) to spit out things of interest.

It only occurred to me a few days ago, while thinking about this problem, to add publish commands (instead of print statements) to the code, with the DEBUG triggered via a MQTT subscribe. That way I can enable this in times of trouble(-shooting). :slight_smile:

Just be careful that the MQTT interactions aren’t adding an extra fault layer!