Could use some help with regex

regex

(Rune) #22

Thanks guys for trying to help me out. I never managed to get the time/date to work, so in the end I gave up. I’ve contacted the developer and suggested that he implement JSON instead, which seems to be much easier to work with :slight_smile:


(Rich Koshak) #23

Have him use the ISO 8601 date-time format, with only three digits for the milliseconds and the proper formatting for the UTC offset (with a + at the end). Otherwise, even with JSON formatting, you will have difficulty assigning the date to a DateTime Item.
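To make the target shape concrete, here is a minimal Java sketch of a timestamp with exactly three millisecond digits and a signed UTC offset. The class name, the sample date and the `+01:00` offset style are assumptions for illustration; which offset variants a DateTime Item actually accepts is not confirmed here.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

class IsoTimestampSketch {
    // Assumed target shape: date, 'T', time, three millisecond digits, signed UTC offset.
    static final DateTimeFormatter ISO_MILLIS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSSXXX");

    static String format(OffsetDateTime when) {
        return when.format(ISO_MILLIS);
    }

    public static void main(String[] args) {
        // Hypothetical reading taken at 09:28:01.123456789 in a UTC+1 zone.
        OffsetDateTime sample =
                OffsetDateTime.of(2018, 1, 1, 9, 28, 1, 123_456_789, ZoneOffset.ofHours(1));
        // The nanoseconds are truncated to three fractional digits.
        System.out.println(format(sample)); // 2018-01-01T09:28:01.123+01:00
    }
}
```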


(Aurelio Caliaro) #24

Hi everyone. I am new to regex, and I am trying to use a string from the serial1 binding that catches the temperature and the energy consumption at this moment. Every second, it returns the following string:

<msg><src>CC128-v1.48</src><dsb>01588</dsb><time>09:28:01</time><tmpr>23.8</tmpr><sensor>0</sensor><id>00079</id><type>1</type><ch1><watts>01247</watts></ch1></msg>

From this, I would like to extract the temperature (the value within <tmpr> and </tmpr>) and the energy consumption (the value within <watts> and </watts>). To extract the temperature, I used the following line:

String CurrentCostTemp				{ serial="/dev/ttyUSB0@57600, REGEX(.*<tmpr>(.*?)</tmpr>.*)" }

However, the Regex expression is not valid (unrecognized), and I don’t understand why. I also tried replacing the capture group by this: (\\d*.\\d*), but without success. Can anybody help?
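As a sanity check outside openHAB, the following minimal Java sketch runs both expressions against the sample message. The class and method names are made up for illustration, and it assumes the pattern has to match the whole line. Both variants compile and return the temperature in plain Java, which suggests the rejection comes from how the binding parses the item configuration rather than from the expression itself.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TmprRegexCheck {
    static final String SAMPLE =
            "<msg><src>CC128-v1.48</src><dsb>01588</dsb><time>09:28:01</time>"
            + "<tmpr>23.8</tmpr><sensor>0</sensor><id>00079</id><type>1</type>"
            + "<ch1><watts>01247</watts></ch1></msg>";

    // Returns capture group 1 if the pattern matches the whole input, else null.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstGroup(".*<tmpr>(.*?)</tmpr>.*", SAMPLE));        // 23.8
        System.out.println(firstGroup(".*<tmpr>(\\d*.\\d*)</tmpr>.*", SAMPLE));  // 23.8
    }
}
```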


(Vincent Regaud) #25

You should be able to use the xml transform on that string


(Aurelio Caliaro) #26

Thank you but the Serial binding does not support XSLT. I had tried with the following:

String CurrentCostTemp				{ serial="/dev/ttyUSB0@57600, XSLT(CurrentCostTemp.xsl)" }

But regardless of the content of the .xsl file, the error message was: Unrecognized transform: XSLT(CurrentCostTemp.xsl)

So it seems the Serial binding does not use a generic transformation API but only the ones it supports explicitly.


(Vincent Regaud) #27

Did you install the XSLT transform?


(Vincent Regaud) #28

With the XPATH transform

/msg/tmpr/text()

the result is 23.8

Bingo


(Vincent Regaud) #29

And for Watts the XPATH transform is /msg/ch1/watts/text()
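For anyone who wants to check these two expressions outside openHAB, here is a minimal sketch using the XPath API from the Java standard library. It is not the openHAB transform itself; the class and method names are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class CurrentCostXPath {
    static final String SAMPLE =
            "<msg><src>CC128-v1.48</src><dsb>01588</dsb><time>09:28:01</time>"
            + "<tmpr>23.8</tmpr><sensor>0</sensor><id>00079</id><type>1</type>"
            + "<ch1><watts>01247</watts></ch1></msg>";

    // Evaluates an XPath expression against an XML string and returns the result as text.
    static String evaluate(String expression, String xml) {
        try {
            return XPathFactory.newInstance().newXPath()
                    .evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("XPath evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(evaluate("/msg/tmpr/text()", SAMPLE));      // 23.8
        System.out.println(evaluate("/msg/ch1/watts/text()", SAMPLE)); // 01247
    }
}
```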


(Aurelio Caliaro) #30

OK, but of all the transformation plugins, the Serial binding only supports Regex…
And yes, I had installed all transformation addons.


(Aurelio Caliaro) #31

So after having tried other transformations, I see in the Serial binding’s source code (here) that indeed only Regex is supported, and everything else throws an error.
So I am still hoping someone can tell me the right regex and, so that I can learn, why the ones I tried don’t work…


(Aurelio Caliaro) #32

It would also be elegant to be able to transform one Item into another with one of the transformations. However, according to the documentation, transformations can only be used on the textual state. Since I am saving the values in a database (with persistence), I need the Item itself changed, not only the textual state…


(Jerome Luckenbach) #33

The problem is that this REGEX is based on the corresponding transformation service, but it is really a special feature of this binding.

You can’t use any other transformation service in a channel configuration the way regex can be used in the serial binding.
Usually transformations are for use in labels or in rules.
So you could add a rule which handles item updates and transforms the needed values into different virtual items.

About a direct REGEX:

I have no idea whether the regex engine openHAB uses supports lookahead and lookbehind statements,
so there’s no guarantee about this one, but you could give it a try:

/(?<=<tmpr>)\d{1,}\.\d{1,}(?=<\/tmpr>)/

This searches for the <tmpr> and </tmpr> tags and gets what is inside them.
I don’t know which values are possible with your device, so I have filtered for one or more digits, then a . and again one or more digits.
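Java's `java.util.regex` does support lookahead and lookbehind. The minimal sketch below (class and method names are hypothetical) shows the expression without the surrounding `/` delimiters: it finds the value when searching inside the string, but it does not count as a match for the whole string, which matters if the caller requires a full match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaroundCheck {
    static final String SAMPLE =
            "<msg><src>CC128-v1.48</src><dsb>01588</dsb><time>09:28:01</time>"
            + "<tmpr>23.8</tmpr><sensor>0</sensor><id>00079</id><type>1</type>"
            + "<ch1><watts>01247</watts></ch1></msg>";

    // Digits, a dot, digits, but only directly between <tmpr> and </tmpr>.
    static final Pattern TMPR = Pattern.compile("(?<=<tmpr>)\\d+\\.\\d+(?=</tmpr>)");

    // Searches anywhere in the input and returns the matched text, or null.
    static String findTemperature(String input) {
        Matcher m = TMPR.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(findTemperature(SAMPLE));       // 23.8
        // The same pattern does not cover the whole message.
        System.out.println(TMPR.matcher(SAMPLE).matches()); // false
    }
}
```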


(Aurelio Caliaro) #34

Thank you @Confectrician. It does not work ‘like this’ as it generates a mismatched input error in the .items file. So I’ll try to work out what it might be (maybe some escape characters are missing), and otherwise I will add a rule for item updates as you wrote. That seems a good workaround. Thanks again.


(Aurelio Caliaro) #35

Hmmm… just for documentation: the Regex gets accepted in an item definition in this format (backslashes escaped):

serial="/dev/ttyUSB0@57600,REGEX(/(?<=<tmpr>)\\d{1,}\\.\\d{1,}(?=<\\/tmpr>))"

However, it returns nothing. According to the source code in the Serial binding, the binding accesses the Java implementation of Regex, and that one supports lookahead and lookbehind. Anyway, one more step :slight_smile:
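One likely reason for the empty result can be reproduced in plain Java: the leading `/` is a delimiter in other regex flavours, but Java treats it as a literal character, and a literal `/` can never sit directly in front of a position that must be preceded by `<tmpr>`. The sketch below uses hypothetical names and assumes the binding sees the pattern with each `\\` reduced to a single `\`.

```java
import java.util.regex.Pattern;

class SlashDelimiterCheck {
    static final String SAMPLE =
            "<msg><src>CC128-v1.48</src><dsb>01588</dsb><time>09:28:01</time>"
            + "<tmpr>23.8</tmpr><sensor>0</sensor><id>00079</id><type>1</type>"
            + "<ch1><watts>01247</watts></ch1></msg>";

    // True if the regex matches anywhere in the input.
    static boolean foundIn(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Pattern from the item definition, including the leading "/" delimiter.
        String withSlash = "/(?<=<tmpr>)\\d{1,}\\.\\d{1,}(?=<\\/tmpr>)";
        // Same pattern with the delimiter removed.
        String withoutSlash = "(?<=<tmpr>)\\d{1,}\\.\\d{1,}(?=<\\/tmpr>)";
        System.out.println(foundIn(withSlash, SAMPLE));    // false
        System.out.println(foundIn(withoutSlash, SAMPLE)); // true
    }
}
```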


(Vincent Regaud) #36

Why don’t you create an item to receive the raw data and then use the XPATH transform to update the item values:

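// Items (.items file): one raw item fed by the serial binding, two items filled by the rule
String CurrentCostRaw				{ serial="/dev/ttyUSB0@57600" }
Number CurrentCostTemp
Number CurrentCostWatts

// Rule (.rules file): split each raw message into its values
rule "Serial input"
when
    Item CurrentCostRaw received update
then
    // Nothing to parse until the first message has arrived
    if (triggeringItem.state == NULL) return;
    val String rawString = triggeringItem.state.toString
    // Extract each value with the XPATH transform and update the matching item
    CurrentCostTemp.postUpdate(transform("XPATH", "/msg/tmpr/text()", rawString))
    CurrentCostWatts.postUpdate(transform("XPATH", "/msg/ch1/watts/text()", rawString))
end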

(Aurelio Caliaro) #37

Wow… fantastic! I tried to find this myself but continually ended up with some error. Your solution worked straight away (apart from the closing “)” after the updates :slight_smile: )
Thank you very much, @vzorglub, very very valuable. I learned a lot today about openhab.


(Vincent Regaud) #38

You’re welcome
This is a common approach when a large amount of data for several items comes in.
Please mark the solution post. Thanks


(Aurelio Caliaro) #39

I can’t set it as solved, probably because I didn’t open the topic…


(Kl8ter) #41

I would also need help in regex.
I spent some time trying to get the Aussentemperatur value from this HTML code with the help of this website http://regex101.com:

HTML code:

<html> <head><title>Modbus Mapping - Guntamatic</title>
 <meta charset="iso-8859-1"><style>table { border-collapse:collapse; text-align: center; }table, td, th { border:1px solid black; padding: 2px; }th {vertical-align: top; }
 </style></head><body><h1>Modbus Mapping</h1>
 <table> 
 <thead> <tr><th>Id</th><th>Register</th><th>Adresse</th><th>Typ</th><th>Einheit</th><th>Größe<br>(Byte)</th><th>Name</th><th>aktueller Wert</th></tr> </thead> <tbody> <tbody> <tr><td>0</td><td>0x4001</td><td>0x4000</td><td>string</td><td> </td><td>4</td><td>Betrieb</td><td><code>0x41555300</code>AUS</tr> <tr><td>1</td><td>0x4003</td><td>0x4002</td><td>float</td><td>°C</td><td>4</td><td>Aussentemperatur</td><td><code>0x40f1d8e2</code>7.56</td></tr> <tr><td>3</td><td>0x4007</td><td>0x4006</td><td>float</td><td>°C</td><td>4</td><td>Kesseltemperatur</td><td><code>0x4253c0e5</code>52.94</td></tr> </tbody></table> <h2>Erweiterte Texte</h2><table> <thead> <tr><th>Id</th><th>Register</th><th>Adresse</th><th>Größe<br>(Byte)</th><th>Name</th><th>aktueller Wert</th></tr> </thead> </tbody></table></body></html>

There I managed to read the Aussentemperatur value with: .*Aussentemperatur.*?<\/code>.*?([^<]*)

Items:

String Gmc_aussentemperatur "Außentemperatur [%s]" <temperature> {http="<[guntamatic:30000:REGEX(.*Aussentemperatur.*?</code>.*?([^<]*))]"}

The result in openHAB:

/ html>

although the result on http://regex101.com is different.
What am I doing wrong? I suspect that it could be due to the \ / sign …


(Rich Koshak) #42

How to ask a good question Item 3. This is probably best asked as a new topic rather than resurrecting a months-old topic.

You might have better luck using the Xpath transform if this is well-formed HTML.

Do not escape the /. It is not needed and changes the meaning of the /.

The regex in OH isn’t quite the same as what is used in regex101. You must generate a pattern that matches the entire document with the part you want to return to your item, and only that part in the ().
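The effect of whole-document matching can be reproduced in plain Java. The sketch below is only an approximation of what openHAB does: it assumes the pattern must cover the entire input and that `.` may cross line breaks, and it uses a shortened stand-in for the page. The class and method names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WholeDocumentMatch {
    // Shortened stand-in for the Guntamatic page; only the two table rows matter here.
    static final String HTML =
            "<html><body><table>\n"
            + "<tr><td>Aussentemperatur</td><td><code>0x40f1d8e2</code>7.56</td></tr>\n"
            + "<tr><td>Kesseltemperatur</td><td><code>0x4253c0e5</code>52.94</td></tr>\n"
            + "</table></body></html>";

    // Assumption: the pattern must cover the entire input and '.' may cross line breaks.
    static String wholeMatchGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex, Pattern.DOTALL).matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Original: the lazy ".*?" before the group must stretch to the last '<'
        // so that the group can reach the end of the document.
        System.out.println(wholeMatchGroup(".*Aussentemperatur.*?</code>.*?([^<]*)", HTML)); // /html>
        // Revised: capture right after </code>, then let a trailing ".*" absorb the rest.
        System.out.println(wholeMatchGroup(".*Aussentemperatur.*?</code>([^<]*).*", HTML)); // 7.56
    }
}
```

If the assumption holds, the revised expression could replace the original one inside REGEX(...) in the item definition; this has not been tested against the http binding.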