[SOLVED] REGEX Transformation

This RegEx performs as expected in the LogReader binding Thing definition - i.e., a log message with that string triggers the log reader.

Rich - I finally figured out how to modify the regular expression to add parentheses as you suggested. This code does what I was after.

rule "Test Rule"
    when
        Item testRule received command
    then
        var String myString = "2019-02-20 12:13:11.615 [WARN ] [t.internal.handler.NESTBridgeHandler] - Nest API error:blocked"
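        // REGEX returns the text captured by the parentheses; replace() then cuts it out of the original String
        myString = myString.replace(transform("REGEX", ".*(\\s\\[WARN.*NESTBridgeHandler\\]).*", myString), "")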
end

This replaces the match and the end result is
myString = "2019-02-20 12:13:11.615 - Nest API error:blocked"

Mike

With the REGEX Transform, the expression needs to match the whole String, and you put parens around the part you want to extract. regex101 doesn’t require this, so there you can match just part of a String and still get a match.

At a minimum, you need to add a .* to the front and back of your expression. Then you need to add parens around that part of the String you want to extract.
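
The difference between a partial match and a whole-String match can be sketched in plain Java. This assumes the REGEX transform behaves like Java's Matcher.matches() and hands back the first capture group, which is how it is described in this thread; the class and the extract helper are made up for illustration and are not openHAB API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FullMatchSketch {
    // Hypothetical helper (not openHAB API): the regex must match the whole
    // input; on success the first capture group is returned, otherwise null.
    static String extract(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String line = "2019-02-20 12:13:11.615 [WARN ] [t.internal.handler.NESTBridgeHandler] - Nest API error:blocked";

        // A partial pattern is found inside the line (regex101 style) ...
        String partial = "\\[WARN.*NESTBridgeHandler\\]";
        System.out.println(Pattern.compile(partial).matcher(line).find());    // true
        // ... but it does not match the whole line.
        System.out.println(Pattern.compile(partial).matcher(line).matches()); // false

        // With .* on both sides and parens around the wanted part, the whole
        // line matches and the group holds the bracketed text.
        System.out.println(extract(".*(\\[WARN.*NESTBridgeHandler\\]).*", line));
    }
}
```

In a rule, the pattern passed to transform("REGEX", ...) would play the role of the first argument to extract here.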

Rich,

Thanks for the response… but I don’t understand where/how to add the parentheses.

As you’ve surmised, I want the match to return "WARN ] [t.internal.handler.NESTBridgeHandler" (actually " [WARN ] [t.internal.handler.NESTBridgeHandler] "). I had originally modified my RegEx to be " \\[WARN.*NESTBridgeHandler\\] ".

In other words, sTmp would end up with the string that I would use to swap for nothing (i.e., .replace(sTmp, "")) so I would retain the timestamp and the logged message.

Mike

If all you want is the timestamp it would be easier to use split.

var String sTmp = myString.split("\\[").get(0).trim

The split breaks the String into parts using ‘[’ as the delimiter. split treats its argument as a regular expression, so the [ has to be escaped. The timestamp is the first part of the String, so we get(0). Finally, there is a space after the timestamp, so we use trim to remove the trailing space.

To extract the timestamp using REGEX you just need something like (.*?)\s\[.* (the ? keeps the group from running past the first [).

To extract the log message you can do the same: split on ] and get(2), as the message is the third part when splitting by ].
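
Both split variants can be tried out in plain Java, which is what the rule engine runs on. This is only a sketch: the class and method names are invented for illustration, and it assumes every log line has the same timestamp, [level], [class] layout as the example.

```java
final class SplitSketch {
    // Everything before the first '[' is the timestamp. String.split takes a
    // regular expression, so the literal '[' is escaped.
    static String timestamp(String line) {
        return line.split("\\[")[0].trim();
    }

    // Splitting on ']' gives three parts for this layout; the third one
    // (index 2) is the message, including its leading "- ".
    static String message(String line) {
        return line.split("]")[2].trim();
    }

    public static void main(String[] args) {
        String line = "2019-02-20 12:13:11.615 [WARN ] [t.internal.handler.NESTBridgeHandler] - Nest API error:blocked";
        System.out.println(timestamp(line)); // 2019-02-20 12:13:11.615
        System.out.println(message(line));   // - Nest API error:blocked
    }
}
```

A message that itself contains a ] would be cut short by the second helper, which is one reason to prefer a regular expression for anything less regular than this example.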

Yeah, I thought about split; I just thought that REGEX would allow me to grab that entire substring and cut it out in one fell swoop. I’m looking at this technique in general for multiple purposes, so I’m trying to figure out REGEX in general and with openHAB in particular.

To do it with REGEX you need to find markers in your original String that define the start and end of what you want to match. Then define the pattern of what you want to match and put that inside parens. Finally, you need to define the markers for where the part you want to match ends.

So, if you are looking to match the WARN and class name, your starting marker is " [" and your ending marker is "] -".

As mentioned previously, the REGEX needs to match the full text. So we end up with something like

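.*?(\s[.*])\s-.*

The first .*? matches everything up to the first instance of \s[ in the String. \s represents the space. The ? makes it lazy; a plain .* is greedy and would run on to the last \s[ that still lets the rest of the expression match, which here would skip past [WARN ].

The second .* matches all the text up to the ]\s- that follows the class name.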

The final .* matches everything from that point to the end of the String.

This gives us an expression that matches the entire text.

Now we need to tell it what part of the text we actually want. So we put a ( ) around the \s[.*].
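
One detail worth checking with a pattern like this is how greedy the leading .* is, because the example line contains two " [" markers. The sketch below compares a greedy and a lazy leading quantifier in plain Java, with the [ and ] escaped. The class and the capture helper are hypothetical, and the whole-String matching is an assumption about how the REGEX transform behaves, based on this thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GreedyVsLazySketch {
    // Hypothetical helper: whole-String match, return capture group 1 or null.
    static String capture(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String line = "2019-02-20 12:13:11.615 [WARN ] [t.internal.handler.NESTBridgeHandler] - Nest API error:blocked";

        // Greedy leading .* gives back as little as possible, so the group
        // starts at the LAST " [" and only the class name is captured.
        System.out.println(capture(".*(\\s\\[.*\\])\\s-.*", line));
        // prints " [t.internal.handler.NESTBridgeHandler]"

        // Lazy leading .*? stops at the FIRST " [", so the group holds both
        // the level and the class name.
        System.out.println(capture(".*?(\\s\\[.*\\])\\s-.*", line));
        // prints " [WARN ] [t.internal.handler.NESTBridgeHandler]"
    }
}
```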

Rich,

Thanks for breaking it down. I’d gotten it to work (I updated my original post w/ my solution for posterity)… but I honestly didn’t understand why. This helps.

Mike

Well you just implemented what I described, only your regular expression is a little more specific. My regular expression will work with any log statement that follows the pattern. Your regular expression does essentially the same thing but it only works with WARN level log statements from the NestBridgeHandler.

I do see that my regular expressions are missing the escapes for the [ and ] which I would need.

Put into words, your regular expression means:

  1. The first .* matches all the characters up to \s[WARN
  2. The second .* matches all the characters after \s[WARN until NESTBridgeHandler]
  3. The third .* matches the rest of the String.

The placement of the parens says we want to return everything that matches inside the parens. In this case that is \s[WARN, 2 above, and NESTBridgeHandler].

We still need 1 and 3 above, though, so that the expression matches the whole text.

Note that this is not standard REGEX behavior; the whole-String requirement comes from the REGEX transform, and that is why you didn’t see the same results as when you used regex101.
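
The whole capture-then-replace round trip from the solved rule can be reproduced outside openHAB as a rough check. The sketch below is plain Java with a made-up class and method name; it assumes the transform returns the first capture group of a whole-String match, and it simply leaves the line untouched when the pattern does not match, which may differ from what the rule engine does in that case.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class StripLevelAndClassSketch {
    private static final Pattern WARN_FROM_NEST =
            Pattern.compile(".*(\\s\\[WARN.*NESTBridgeHandler\\]).*");

    // Hypothetical helper: capture " [WARN ] [...NESTBridgeHandler]" and cut
    // it out of the line; a line that does not match is returned unchanged.
    static String strip(String line) {
        Matcher m = WARN_FROM_NEST.matcher(line);
        return m.matches() ? line.replace(m.group(1), "") : line;
    }

    public static void main(String[] args) {
        String line = "2019-02-20 12:13:11.615 [WARN ] [t.internal.handler.NESTBridgeHandler] - Nest API error:blocked";
        System.out.println(strip(line)); // 2019-02-20 12:13:11.615 - Nest API error:blocked
    }
}
```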
