Regex in rule

I’m trying to send a particular YouTube video to a Chromecast. I’ve got an Exec thing running youtube-dl to get the correct web address that the Chromecast will accept. My issue is that sometimes it returns the URL along with a “WARNING: unable to extract uploader nickname” line at the end. I figured I could just use regex to capture only the first line, but no matter what I try it seems to capture the entire string. Is there something simple I’m missing? I’m still new to making rules.

My rule currently:

rule "Youtube Switch On"
when
  Item VR_SW_Youtube changed to ON
then
  var url = transform("REGEX", "(https.+)[\\s\\S]*", Exec_YT_Output.state.toString())
end

First, it is all but impossible to help without seeing EXACTLY what string you are trying to match against.

From looking at your REGEX, it is matching: “(https plus one or more arbitrary characters) followed by zero or more characters of any kind”, since [\s\S] covers whitespace and non-whitespace alike. Because regex quantifiers are greedy, the capture group tries to match the largest segment it can, which in this case will be the full string.

What you want is a pattern that matches something like the following (I don’t use regex enough to type what this would look like, but there are tons of resources on Google):

“(https plus one or more arbitrary characters except for spaces) followed by zero or more spaces followed by zero or more additional characters”

Having said that, it would probably be easier to just use a simple String.split() call than a REGEX.

If the URL and the warning are separated by a space use:

val url = Exec_YT_Output.state.toString.split(' ').get(0)

Sorry, I didn’t post the whole string because it is long and messy, and I figured the intent was fairly straightforward from the regex. It starts with a URL beginning with https. Some of the time that is followed by a newline and the line "WARNING: unable to extract uploader nickname", repeated twice.

So the string ends up as something like:
WARNING: unable to extract uploader nickname
WARNING: unable to extract uploader nickname

I was using the \s not for space but rather any whitespace characters to handle newlines.

I read in the docs that the transform function wraps the regex in ^ and $, which is why I put [\s\S]* at the end. Dropped into an online regex tester, it seems like it should work, which is why I wondered whether I was using the transform function wrong rather than getting the regex itself wrong.

I’ll try the split method. I just wasn’t entirely sure if that would throw an error in the cases where the extra lines weren’t present.
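On the worry about the extra lines being absent: in plain Java, split() returns a one-element array when the delimiter never occurs, so taking element 0 does not throw. Here is a minimal sketch, assuming the rules DSL call ends up in java.lang.String.split. The class name, method name, and URL are made up for illustration. It splits on any whitespace run (so newlines count too) and trims first, because leading whitespace would otherwise produce an empty first element.

```java
class FirstTokenDemo {
    // Returns the first whitespace-delimited token of the command output.
    // trim() guards against leading whitespace, which would otherwise
    // make the first array element an empty string.
    static String firstToken(String output) {
        return output.trim().split("\\s+")[0];
    }

    public static void main(String[] args) {
        // Hypothetical URL, standing in for the real youtube-dl output.
        String url = "https://example.com/videoplayback?id=abc123";
        String warning = "WARNING: unable to extract uploader nickname";

        // No warning lines: split yields a single element, no exception.
        System.out.println(firstToken(url));
        // Two warning lines after the URL: only the URL is kept.
        System.out.println(firstToken(url + "\n" + warning + "\n" + warning));
    }
}
```

In a rule, the equivalent would presumably be split("\\s+") in place of split(' '), but I have only checked the behaviour in plain Java.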

Another way to go is just take the whole string and check if it contains the "WARNING: " text. If so, strip it out and repeat.
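A rough sketch of that strip-and-repeat idea in plain Java, assuming each warning occupies the rest of its line. The class and method names are hypothetical, and the same logic would need adapting to rules DSL syntax.

```java
class StripWarningsDemo {
    static final String MARKER = "WARNING: ";

    // Repeatedly removes everything from the marker to the end of its line
    // until no marker is left, then trims the surrounding whitespace.
    static String stripWarnings(String output) {
        String result = output;
        int idx = result.indexOf(MARKER);
        while (idx >= 0) {
            int eol = result.indexOf('\n', idx);
            // Keep the text before the marker and anything after its line.
            String rest = (eol >= 0) ? result.substring(eol + 1) : "";
            result = result.substring(0, idx) + rest;
            idx = result.indexOf(MARKER);
        }
        return result.trim();
    }

    public static void main(String[] args) {
        // Hypothetical URL, standing in for the real youtube-dl output.
        String url = "https://example.com/videoplayback?id=abc123";
        String warning = "WARNING: unable to extract uploader nickname";
        System.out.println(stripWarnings(url + "\n" + warning + "\n" + warning));
    }
}
```

This also copes with a warning that shows up before the URL, which the first-token approach would not.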

. matches everything.

I guess we’re in dotall mode then? Taking into account the anchors too, how about this:

var url = transform("REGEX", "(https[^\\s]+)[\\s\\S]*", Exec_YT_Output.state.toString)
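To see the difference outside openHAB, here is a small Java sketch. It assumes the transform behaves roughly as described in this thread, that is, the expression is wrapped in ^ and $, compiled with DOTALL, and the first capture group is returned. The helper below is my own approximation, not the service's actual code, and the URL is made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DotallDemo {
    // Approximation of the transform (assumption): anchored, DOTALL,
    // returns capture group 1, or null when the input does not match.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile("^" + regex + "$", Pattern.DOTALL).matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Hypothetical URL, standing in for the real youtube-dl output.
        String url = "https://example.com/videoplayback?id=abc123";
        String warning = "WARNING: unable to extract uploader nickname";
        String output = url + "\n" + warning + "\n" + warning;

        // With DOTALL, ".+" also consumes newlines, so the group takes everything.
        System.out.println(firstGroup("(https.+)[\\s\\S]*", output).equals(output));
        // "[^\\s]+" stops at the first whitespace, so the group is just the URL.
        System.out.println(firstGroup("(https[^\\s]+)[\\s\\S]*", output).equals(url));
    }
}
```

Both lines should print true: the original pattern captures the whole string, the revised one only the URL. Note that \S+ is equivalent to [^\s]+ if you prefer the shorter form.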

I was confused when I read this. I had to go and google “dotall”.

Do you know if this terrible behaviour is new from Java or if it was borrowed from one of the other regex variants out there?

Confirmed… the transform service has dotall enabled (line 60).

I updated my rule to use [^\s]+ inside the capture group. Haven’t had the warning today, but I think that should solve it. Thank you for that bit of insight. I use regex for work regularly and had never come across the dotall thing.