(solved) How to ensure a regex provides the full multiline JSON data to JSONPATH?

Hi,

I’m at the beginning of my OH2 setup at home. The basics work pretty well now. Next, I am trying to use MQTT to get data from my Domoticz instance, which I want to keep for now.

All messages are sent to a common MQTT topic (domoticz/out). The value I want to read is always associated with the “svalue1” name, and sometimes additionally with “svalue2” and “svalue3”. The exported MQTT messages look like this:

{
"Battery" : 255,
"RSSI" : 12,
"description" : "",
"dtype" : "Current",
"id" : "10",
"idx" : 195,
"meterType" : "Energy",
"name" : "Teleinfo Courant",
"nvalue" : 0,
"stype" : "CM113, Electrisave",
"svalue1" : "12.0",
"svalue2" : "0.0",
"svalue3" : "0.0",
"unit" : 0
}

I want to provide the right values to each item. To get the right data to the right item, I can use the idx value, which is unique for each device. So I tried, without success, to use a regex before the JSONPATH to ensure that only the message with the right idx value is used by the item. There are no errors in the logs; the item just does not capture the values.

My item definition looks like this:

Number EnergyHousePercent "Teleinfo Pourcentage de Charge [%.1f %%]" {mqtt="<[nasmqtt:domoticz/out:state:JSONPATH($.svalue1):REGEX(.*\"idx\".*196,$)]"}

My presumption is that the regex returns only one line, and I could not identify any modifier to make the result multiline.
NB: I have another set of messages from another common MQTT topic that are only one line long, and there this method works well.

  • How can I confirm that presumption? Which log should I look at, and do I need to change the log level?
  • How can I get the full multiline JSON back from the regex instead?
  • Is there another way to get the same result?

Thanks for your kind and patient attention ^^

I believe your REGEX is incorrect.

First of all, since REGEX is the only transform that can be used for matching, the binding does not require you to use REGEX(), just put in the expression. See the examples in the README.

Next, the REGEX needs to match against the whole document, not just the one line you are looking for. So you need to provide something to match the rest of the String, not just the one line. So your REGEX should look something like:

.*\"idx\".*196,.*

for a binding config that looks something like:

mqtt="<[nasmqtt:domoticz/out:state:JSONPATH($.svalue1):.*\"idx\".*196,.*]"

If “196” is likely to appear elsewhere in the JSON after idx, you may want to use something like

.*\"idx\" *: *195,.*

or

.*\"idx\"\s*:\s*195,.*

The single-line case works because there your REGEX as written matches the entire message, since it is only one line.

events.log will tell you if the message is getting to your Item. You can put the MQTT binding into debug or trace level logging and you should see it parsing the incoming messages in openhab.log. I’m not sure exactly what gets logged, but it should be helpful.

I think your REGEX is not letting anything through because it does not match the full message. That part of the config really only provides a “true/false” to the binding, and if it is true, the whole message gets passed to the JSONPATH transform.
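
To make that filter-then-transform flow concrete, here is a minimal Python sketch. It is not the binding's code: re.fullmatch stands in for the whole-string match, json.loads stands in for the JSONPATH transform, and handle_message is a made-up name. The sample message is deliberately on a single line.

```python
import json
import re
from typing import Optional


def handle_message(payload: str, filter_regex: str, field: str) -> Optional[str]:
    """Return the requested field if the whole payload matches the filter."""
    # The filter only answers yes/no; it never trims the payload.
    if re.fullmatch(filter_regex, payload) is None:
        return None  # message skipped
    # On a match, the complete payload goes to the transform step.
    return json.loads(payload)[field]


one_line = '{"idx" : 196, "svalue1" : "12.0"}'

# Pattern covering only part of the string: no match, message skipped.
print(handle_message(one_line, r'"idx".*196,', "svalue1"))
# Pattern padded with .* on both sides: the whole message is passed on.
print(handle_message(one_line, r'.*"idx".*196,.*', "svalue1"))
```

The first call prints None and the second prints 12.0, which is the behaviour described above for a one-line message.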

You don’t really need another way. But an alternative approach would be to write a Rule that matches and parses the data to populate the correct Items. Only one of your Items would be bound to MQTT and the rest would get their values from this Rule.
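
The Rule-based alternative boils down to a single subscriber that parses every message and routes the value by idx. Here is a rough Python sketch of that routing logic only (a real openHAB Rule would be written in the Rules DSL). The ITEMS_BY_IDX table, the EnergyHouseCurrent Item name, and the route function are all made up for illustration.

```python
import json

# Hypothetical routing table: Domoticz idx -> name of the Item to update.
ITEMS_BY_IDX = {
    196: "EnergyHousePercent",
    195: "EnergyHouseCurrent",  # made-up Item name
}


def route(payload: str):
    """Return (item_name, value) for a known idx, or None to ignore the message."""
    message = json.loads(payload)
    item = ITEMS_BY_IDX.get(message.get("idx"))
    if item is None:
        return None
    return item, float(message["svalue1"])


print(route('{"idx" : 195, "svalue1" : "12.0"}'))
print(route('{"idx" : 7, "svalue1" : "1.0"}'))
```

Because the JSON is parsed rather than pattern-matched, line breaks inside the message do not matter with this approach.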

Hi Rich, thanks for your help !

Actually, I believe I tried almost every variation. There are several traps here. The first is that using the ‘:’ character in the regex breaks the item parsing.

With this item definition:

Number EnergyHousePercent "Teleinfo Pourcentage de Charge [%.1f %%]" {mqtt="<[nasmqtt:domoticz/out:state:JSONPATH($.svalue1):.*\"idx\" *: *196,.*]"}

I then get errors like:

2017-10-13 21:38:01.746 [ERROR] [el.item.internal.GenericItemProvider] - Binding configuration of type 'mqtt' of item 'EnergyHousePercent' could not be parsed correctly.
org.eclipse.smarthome.model.item.BindingConfigParseException: 
Configuration 'nasmqtt:domoticz/out:state:JSONPATH($.svalue1):.*"idx" *: *196,.*' is not a valid inbound configuration: Configuration requires 4 or 5 parameters separated by ':'
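
The parameter count in that error is easy to reproduce. The sketch below assumes the binding simply splits the configuration on ':' (this is suggested by the error text, not checked against the source); count_params is a hypothetical helper.

```python
def count_params(config: str) -> int:
    """Count the fields obtained by splitting a binding config on ':'."""
    return len(config.split(":"))


ok = r'nasmqtt:domoticz/out:state:JSONPATH($.svalue1):.*"idx".*196,.*'
bad = r'nasmqtt:domoticz/out:state:JSONPATH($.svalue1):.*"idx" *: *196,.*'

print(count_params(ok))   # 5 fields: within the accepted 4 or 5
print(count_params(bad))  # 6 fields: the ':' inside the regex adds one
```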

In addition, the “\s” generates another error. With:

Number EnergyHousePercent "Teleinfo Pourcentage de Charge [%.1f %%]" {mqtt="<[nasmqtt:domoticz/out:state:JSONPATH($.svalue1):.*\"idx\"\s*.\s*196]"}

I then get:

2017-10-13 21:43:31.195 [WARN ] [el.core.internal.ModelRepositoryImpl] - Configuration model 'Teleinfo.items' has errors, therefore ignoring it: [1,76]: mismatched character 's' expecting set null
[1,137]: mismatched input '*' expecting RULE_STRING
[1,140]: missing '}' at 's'
[1,141]: extraneous input '*' expecting RULE_ID

I also tried a double quote instead of \s, with the same result.

The only syntaxes that are parsed correctly are these:
.*\"idx\".*196,.*$ or .*\"idx\" . 196,.* or .*\"idx\"...196,.* or .*\"idx\".*196,.*, but then no data is acquired :frowning:

Agreed! At least it demonstrates that the MQTT binding configuration itself is OK, and that the issue is to be found somewhere else.

OK, that is very important for me to understand what is going on and how I should define my items!

Obviously the next step is to put the MQTT binding in debug logging and try to get more detail about the regex result.

So the error with the \s is probably because the \ is being interpreted before the expression gets to the REGEX engine. Try \\s.
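
The same two-level escaping exists in most languages with string literals. As an analogy in Python (the items file parser is a different piece of software, so this only illustrates the principle): a doubled backslash in the literal becomes a single backslash in the string that the regex engine finally sees.

```python
import re

# Written with doubled backslashes, as in a regular string literal.
literal = ".*\"idx\"\\s*.\\s*196.*"

# What the regex engine receives after the literal has been unescaped.
print(literal)  # .*"idx"\s*.\s*196.*

# The unescaped pattern matches a one-line message whatever the spacing.
print(re.fullmatch(literal, '{"idx" : 196, "unit" : 0}') is not None)
print(re.fullmatch(literal, '{"idx":196}') is not None)
```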

Well, if the binding doesn’t like the ‘:’ you can replace that with the punctuation character class (\pP in Java regex, written with a doubled backslash here).

.*\"idx\"\\s*\\pP\\s*196,.*

Clearly, it doesn’t like the spaces so we can’t replace the \s with just a space.

Can you guarantee the same number of spaces before and after the ‘:’? For example, always one space, the :, and one space? If so you could use something like

.*\"idx\".{3}196.*

Though the parser will probably not like the { }. Hmmmm.

.*\"idx\".?.?.?196.*

Obviously it is best to be as strict and exact as possible in a REGEX, but you know the format of the string you are matching against and can make assumptions that let you use a less strict REGEX. In this case I’m assuming there will be three or fewer characters, ANY characters, between the “idx” and the 196. You can add more or fewer .? if there can be more white space between them. The likelihood that that will mismatch seems pretty remote given the message format.

In fact, the following might be sufficient:

.*idx.*196.*

I suspect the likelihood that you will ever have a 196 after the idx line is pretty remote given the example message. I could be wrong though.
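
To see the trade-off between the loose and the bounded expression, here is a small Python check. The two sample messages are invented and compacted onto one line for brevity, and re.fullmatch is used as a stand-in for the whole-string match.

```python
import re

loose = r'.*idx.*196.*'
bounded = r'.*"idx".?.?.?196.*'

# Sample messages compacted onto one line (an assumption made for brevity).
wanted = '{"idx" : 196, "svalue1" : "12.0"}'
other = '{"idx" : 195, "svalue1" : "196.0"}'

for name, pattern in (("loose", loose), ("bounded", bounded)):
    print(name,
          re.fullmatch(pattern, wanted) is not None,
          re.fullmatch(pattern, other) is not None)
```

The loose pattern accepts both messages, because a 196 anywhere after idx is enough, while the bounded one accepts only the message whose idx is 196.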

Well, even this regex:

.*idx.*195.*

does not work:

2017-10-13 23:22:01.857 [DEBUG] [.mqtt.internal.MqttMessageSubscriber] - Skipped message '{
"Battery" : 255,
"RSSI" : 12,
"description" : "",
"dtype" : "Current",
"id" : "10",
"idx" : 195,
"meterType" : "Energy",
"name" : "Teleinfo Courant",
"nvalue" : 0,
"stype" : "CM113, Electrisave",
"svalue1" : "5.0",
"svalue2" : "0.0",
"svalue3" : "0.0",
"unit" : 0
}
because Message Filter '.*idx.*195.*' does not apply.

Hmmmm.

OK, let’s simplify and build up the expression bit by bit.

Does

.*195.*

work?

How about

.*idx.*

?

No, .*idx.* does not work.

Even .* does not work !!

I even tried to catch the svalue1 line with .*svalue1.*, but that does not work either. At least that confirms that the full message has to match. That’s why I still believe that the regex does not handle the multiple lines.

I cleaned up the items and sitemap to keep only two items: the one that does not work with multiple lines, and the one reading these messages:
{
"temperature" : "27.6"
}
with this item definition:
Number Enclosure1Temperature "Temperature [%.1f °C]" <temperature> {mqtt="<[nasmqtt:3DPrint/Enclosure1:state:JSONPATH($.temperature):.*\"temperature\".*]"}
and there I get the value, while the non-matching messages in the same topic are properly filtered:
2017-10-13 23:36:48.478 [DEBUG] [.mqtt.internal.MqttMessageSubscriber] - Skipped message '{ "airQualityIndex" : "20" }' because Message Filter '.*"temperature".*' does not apply.

A quick correction: the content of the working message is indeed on one line only. It is just the “json Pretty Format Decoder” of MQTT.fx that presents it on multiple lines.

Do you have the REGEX transformation installed?

That is very important because the REGEX works on the message as it is delivered. The pretty printer could add all sorts of extra characters and stuff to the message.

Take out the regex match and look in events.log to see exactly what the message you receive looks like.

Yes, REGEX is installed. I doubt the one-line case would work without it, given the item definition I mentioned just above:
Number Enclosure1Temperature "Temperature [%.1f °C]" <temperature> {mqtt="<[nasmqtt:3DPrint/Enclosure1:state:JSONPATH($.temperature):.*\"temperature\".*]"}

And looking at the DEBUG trace for the non matching of the second regex, you can see it’s “clean”. And I know how the message is delivered to the MQTT, it’s my own code :wink: (coming from this: https://github.com/ChrisP-Git/PartSense )

Well, I still don’t really know where to go to fix my issue, despite the huge amount of help from Rich (thanks a lot!).

Any other ideas to make a regex match multiple lines?

Well, here is the thing. Is your message actually coming across as multiple lines? You said the example you posted above is being pretty print formatted by MQTT.fx. The REGEX gets the original unformatted JSON message. At this point without seeing an actual example unmolested message we can’t even say that it is coming across as multiple lines. Sadly, I don’t have time to read through a bunch of Arduino code to parse out how your code is actually constructing the message.

And since the REGEX’s above were developed to match that pretty formatted version of the message they could be completely wrong.

I don’t know if this will help but you can try adding \g to the end of the REGEX pattern to tell it to search globally (I think it goes at the end, maybe it goes at the start).

Have you tried one of the online REGEX checkers? Make sure the REGEX works there first.

If the message is coming across on multiple lines, and since you have full control over the message, why not change it so it does come across on only one line? I’m not convinced that is the problem but it is worth trying.

If all else fails, drop the REGEX match and use a Rule to filter and forward the messages to the right Items.

Sorry, I think I was unclear and confusing.

I have two families of messages: one coming from Domoticz and one generated by my particle sensor device.

My device generates a collection of one-line JSON messages, and using a regex I can properly filter them. When I mentioned the MQTT.fx pretty formatter, I was talking about these messages: they look like three lines there but are really a single line.

Domoticz generates multi-line JSON messages. Those are the ones that cannot be filtered using a regex, and I cannot change their content.

I think I already tried the \g modifier, but I will try again. (First I need to rebuild my test openHAB device, I made some … stupid mistakes.)

Maybe in the end I should learn to write rules as you suggest, but I still wonder whether this is the expected behavior or whether I should raise an issue.

Again, thanks for your help !

OK. So the message is coming across exactly as listed in message 5 above.

I tested the .*idx\".?.?.?195.* regex here and it seems valid.

I looked in the source code and the binding uses java.lang.String.matches to apply the regex to the message. Some further searching around reveals there is a difference between Matcher.find and String.matches: find returns true if the regex matches anywhere in the String, while matches returns true only if the entire String matches the regex. So that confirms that the REGEX has to take into account the entire message, not just part of the message.
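
The difference between the two matching modes can be tried out quickly. The sketch below uses Python as a stand-in for Java: re.search behaves like "find" and re.fullmatch like "matches". The message is an invented one-line sample.

```python
import re

pattern = r'"idx" : 195'
message = '{"id" : "10", "idx" : 195, "unit" : 0}'

# "find" semantics: the pattern may occur anywhere in the string.
found = re.search(pattern, message) is not None
# "matches" semantics: the pattern must cover the string from start to end.
matched = re.fullmatch(pattern, message) is not None

print(found, matched)  # True False
```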

Further research is showing that . does not always automatically match newlines. I think that might be the problem.

We can enable the DOTALL flag (i.e. have . also match against newline characters) using:

(?s).*idx\".?.?.?195.*

See if that works.
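
For anyone who wants to check this outside openHAB, here is a small Python sketch. Python's re module treats '.' and the (?s) flag the same way as Java with respect to \n, and re.fullmatch stands in for String.matches; the payload is a shortened copy of the Domoticz message.

```python
import re

# Multi-line payload shaped like the Domoticz messages (shortened).
message = '{\n"idx" : 195,\n"svalue1" : "5.0",\n"unit" : 0\n}'

plain = r'.*idx".?.?.?195.*'
dotall = r'(?s).*idx".?.?.?195.*'

print(re.fullmatch(r'.*', message) is not None)   # False: '.' stops at the first newline
print(re.fullmatch(plain, message) is not None)   # False
print(re.fullmatch(dotall, message) is not None)  # True
```

This would also explain why even a bare .* was rejected earlier in the thread.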

You’re gonna have to eventually. :wink:

Yes! This one does the trick: now my values are properly filtered and I get the right svalue1 value from the message with the right idx!

Thank you so much, I would never have figured this one out!

Could someone please give me the full working line of code for the item? Whatever I try, I cannot get a temperature value from Domoticz into openHAB over MQTT.
The incoming message is below.

{
  "Battery" : 100,
  "RSSI" : 12,
  "description" : "",
  "dtype" : "Temp + Humidity",
  "id" : "20738",
  "idx" : 32,
  "meterType" : "Energy",
  "name" : "WS Temp-Humidity",
  "nvalue" : 0,
  "stype" : "WTGR800",
  "svalue1" : "17.5",
  "svalue2" : "70",
  "svalue3" : "3",
  "unit" : 0
}

thank you in advance

Hi John, here is my item definition for this type of humidity/temperature data:

Number OregonSDBTemperature "Température Salle de bain [%.1f °C]" {mqtt="<[nasmqtt:domoticz/out:state:JSONPATH($.svalue1):(?s).*idx\".?.?.?150.*]"}
Number OregonSDBHumidity "Humidité Salle de bain [%.1f %%]" {mqtt="<[nasmqtt:domoticz/out:state:JSONPATH($.svalue2):(?s).*idx\".?.?.?150.*]"}
Number OregonSDBConfort "Confort Salle de bain [%d]" {mqtt="<[nasmqtt:domoticz/out:state:JSONPATH($.svalue3):(?s).*idx\".?.?.?150.*]"}

So I guess in your example you should use:

Number WTGR800Temperature "Temperature [%.1f °C]" {mqtt="<[nasmqtt:domoticz/out:state:JSONPATH($.svalue1):(?s).*idx\".?.?.?32.*]"}
Number WTGR800Humidity "Humidity [%.1f %%]" {mqtt="<[nasmqtt:domoticz/out:state:JSONPATH($.svalue2):(?s).*idx\".?.?.?32.*]"}
Number WTGR800Comfort "Comfort [%d]" {mqtt="<[nasmqtt:domoticz/out:state:JSONPATH($.svalue3):(?s).*idx\".?.?.?32.*]"}
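
As a quick sanity check of that filter outside openHAB, here is a hedged Python sketch: re.fullmatch stands in for the binding's whole-string match, json.loads stands in for the JSONPATH transform, and the payload is a shortened copy of the message posted above. The strict_filter variant with a trailing comma is only a suggestion, not something tested in openHAB.

```python
import json
import re

payload = """{
  "Battery" : 100,
  "dtype" : "Temp + Humidity",
  "idx" : 32,
  "name" : "WS Temp-Humidity",
  "svalue1" : "17.5",
  "svalue2" : "70",
  "svalue3" : "3",
  "unit" : 0
}"""

item_filter = r'(?s).*idx".?.?.?32.*'     # as used in the Items above
strict_filter = r'(?s).*idx".?.?.?32,.*'  # hypothetical variant with a trailing comma

if re.fullmatch(item_filter, payload):
    data = json.loads(payload)
    print(data["svalue1"], data["svalue2"], data["svalue3"])  # 17.5 70 3

# A message from a device with idx 320 also passes the first filter.
other = payload.replace('"idx" : 32,', '"idx" : 320,')
print(re.fullmatch(item_filter, other) is not None)    # True
print(re.fullmatch(strict_filter, other) is not None)  # False
```

If several Domoticz devices have idx values that start with the same digits, keeping the comma after the number may avoid cross-talk between Items.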

Still nothing! I have installed the JSONPath Transformation. Do I need anything else?

Hi John,

Yes, the JsonPath transformation needs to be installed.

But how did you install the MQTT binding? Be aware that the MQTT binding is still a 1.x version, so it needs to be configured through the mqtt.cfg file. The broker name defined there must match the broker name used in the item definition (nasmqtt in my example).

How did you install openHAB? In my case, I used openHABian to install it on top of an existing Ubuntu-based ARM device. That way, the log viewer was installed as well, providing an easy way to consult the logs in real time in a browser at http://:9001 . This would be very helpful to understand what is not working on your side.

If you could provide some log content, it would definitely help to understand where the issue could be.

(modified ebus.cfg mention to mqtt.cfg)

Thanks for the reply.
My eventbus.cfg is below

# Name of the broker as it is defined in the openhab.cfg. If this property is not available, no event bus MQTT binding will be created.
broker=localbroker

# When available, all status updates which occur on the openHAB event bus are published to the provided topic. The message content will 
# be the status. The variable ${item} will be replaced during publishing with the item name for which the state was received.
statePublishTopic=/domoticz/in

# When available, all commands which occur on the openHAB event bus are published to the provided topic. The message content will be the 
# command. The variable ${item} will be replaced during publishing with the item name for which the command was received.
commandPublishTopic=/domoticz/in

# When available, all status updates received on this topic will be posted to the openHAB event bus. The message content is assumed to be 
# a string representation of the status. The topic should include the variable ${item} to indicate which part of the topic contains the 
# item name which can be used for posting the received value to the event bus.
stateSubscribeTopic=/domoticz/out

# When available, all commands received on this topic will be posted to the openHAB event bus. The message content is assumed to be a 
# string representation of the command. The topic should include the variable ${item} to indicate which part of the topic contains the 
# item name which can be used for posting the received value to the event bus.
commandSubscribeTopic=/domoticz/out

As for openHAB, it is installed with “sudo apt-get install openhab2” on a Raspberry Pi 3 running Raspbian. I also have access to the Karaf console, but with log:tail I don’t see any errors.