OH 3 - Profile - Transformations - regex

I am trying to scrape COVID-19 numbers from the RKI web page (RKI - Coronavirus SARS-CoV-2 - COVID-19: Fallzahlen in Deutschland und weltweit), but it doesn’t work as expected. When I test my expression on regex101,

expression to test: /Gesamt<\/strong><\/td><td class="right" colspan="1" rowspan="1"><strong>([\d\.]+)/gm

I get the result: 1 match, 1 group

That is what I want. I tried several alternatives to get this transformation working in a profile in the OH3 UI. This is what I entered in the “Profile Configuration - Regular Expression” field:

Gesamt<\\/strong><\\/td><td class=\"right\" colspan=\"1\" rowspan=\"1\"><strong>([\\d\\.]+)

Gesamt<\\/strong><\\/td><td class="right" colspan="1" rowspan="1"><strong>([\\d\\.]+)

Gesamt<\\\/strong><\\\/td><td class=\"right\" colspan=\"1\" rowspan=\"1\"><strong>([\\\d\\\.]+)

Gesamt<\/strong><\/td><td class="right" colspan="1" rowspan="1"><strong>([\d\.]+)

I also tried variants with a leading “/” and a trailing “/gm”. Unfortunately, none of them worked. The only thing I got working is the default profile with no transformation, which gives me the whole HTML result. I wonder what I am doing wrong and hope someone here can guide me.

Thanks in advance
Markus

Have you seen that a binding for this already exists here?

Yes, I did, and I already tried it. Unfortunately, it doesn’t fit my needs. Besides that, I have other use cases where a working regex profile transformation is needed. I just picked this one because the source is publicly available.

Nevertheless thanks for mentioning!

The openHAB REGEX Transformation Service is not a fully featured regex engine. One obvious limitation, given the uses it is intended for (e.g. manipulating a single Item state), is that it can only return one match - no arrays or similar.
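
As a rough illustration of why an expression that works on regex101 can still return nothing here: as far as I know, the REGEX transformation needs the expression to match the entire input and then returns the first capture group, which is why working examples are usually wrapped in `.*?` and `.*`. The Python sketch below only mimics that assumed behaviour with `re.fullmatch`. The `transform` helper and the HTML fragment are made up for illustration and are not openHAB code.

```python
import re

# Hypothetical fragment shaped like the RKI table row; the real page differs.
html = ('<tr><td><strong>Gesamt</strong></td>'
        '<td class="right" colspan="1" rowspan="1"><strong>2.345.678</strong></td></tr>')

# The expression from the question, without the regex101 delimiters /.../gm.
core = r'Gesamt</strong></td><td class="right" colspan="1" rowspan="1"><strong>([\d.]+)'


def transform(pattern, source):
    """Assumed behaviour: the whole input must match; return group 1 or None."""
    m = re.fullmatch(pattern, source.strip(), flags=re.DOTALL)
    return m.group(1) if m else None


# The bare expression only covers part of the input, so nothing is returned.
unwrapped = transform(core, html)
# Wrapped in .*? and .* the expression covers the whole input.
wrapped = transform(r'.*?' + core + r'.*', html)

print(unwrapped)  # None
print(wrapped)    # 2.345.678
```

If that assumption holds, the profile field would need the wrapped form, entered without the leading “/” and trailing “/gm”.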

Does this mean my match needs to be my capture, and that I need to rewrite my expression to fit this requirement? Currently I capture only one number value, but it is different from the match.

Maybe this will help.

I grab most numbers from a JSON source; only one of them is extracted with a regex.
My Corona setup is:

.things file:


Thing http:url:lubu "Lubu CoVid-19"   [baseURL="https://www.ludwigsburg.de/start/rathaus+und+service/",   refresh="600", timeout="3000"]   {
    Channels:
        Type string : Channel_Corona_S_Ludwigsburg_Incidence    "S LB 7d [%s]"         [ stateExtension="corona+7-tage-inzidenz.html", stateTransformation="REGEX:.*?<br>Stadt Ludwigsburg ([+-]?([0-9]*[,])?[0-9]+)?.*"]
}



Thing http:url:rki "RKI CoVid-19"   [baseURL="https://api.corona-zahlen.org/", commandMethod="GET", delay="500", refresh="600", timeout="1000"]   {
    Channels:
        Type number : Channel_Corona_LK_Ludwigsburg_Incidence    "LK LB 7d [%.1f]"         [ stateExtension="districts", stateTransformation="JSONPATH:$.data.08118.weekIncidence"]
        Type number : Channel_Corona_Berlin_Incidence    "ST Berlin Treptow 7d [%.1f]"        [ stateExtension="districts", stateTransformation="JSONPATH:$.data.11009.weekIncidence"]
        Type number : Channel_Corona_SK_Saarbruecken_Incidence    "LK SB 7d [%.1f]"        [ stateExtension="districts", stateTransformation="JSONPATH:$.data.10041.weekIncidence"]
        Type number : Channel_Corona_S_Weiden_Incidence     "Stadt Weiden 7d [%.1f]"     [ stateExtension="districts", stateTransformation="JSONPATH:$.data.09363.weekIncidence"]
        Type number : Channel_Corona_Deutschland_Incidence      "Deutschland 7d [%.1f]"    [ stateExtension="germany", stateTransformation="JSONPATH:$.weekIncidence"]
        Type number : Channel_Corona_Deutschland_R              "Deutschland R [%.2f]"     [ stateExtension="germany", stateTransformation="JSONPATH:$.r.value"]
}
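
To show what the two kinds of stateTransformation above pick out, here is a small Python sketch. The sample HTML and JSON are invented placeholders (the real responses are larger and may be structured differently), and `re.fullmatch` plus plain dictionary access only stand in for openHAB's REGEX and JSONPATH services.

```python
import json
import re

# Invented sample inputs, only shaped like what the channels above expect.
sample_html = "<p>7-Tage-Inzidenz<br>Stadt Ludwigsburg 123,4<br>Stand heute</p>"
sample_json = '{"data": {"08118": {"name": "Ludwigsburg", "weekIncidence": 98.7}}}'

# Same expression as in the lubu channel: the leading .*? and trailing .* let
# it cover the whole page; group 1 is the number with a decimal comma.
LUBU_REGEX = r".*?<br>Stadt Ludwigsburg ([+-]?([0-9]*[,])?[0-9]+)?.*"

match = re.fullmatch(LUBU_REGEX, sample_html, flags=re.DOTALL)
incidence_text = match.group(1) if match else None
print(incidence_text)

# The value uses a decimal comma, which is presumably why the channel is a
# string; converting it to a number needs the comma replaced first.
incidence_number = float(incidence_text.replace(",", "."))
print(incidence_number)

# Plain dict access equivalent to JSONPATH:$.data.08118.weekIncidence
week_incidence = json.loads(sample_json)["data"]["08118"]["weekIncidence"]
print(week_incidence)
```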

.items file

 
Group     gCorona                                                                      (gHaus)
String    Corona_S_Ludwigsburg_Incidence      "S Ludwigsburg 7d [%s]"       <line>     (gCorona)    [ "Corona","Measurement" ]    {channel="http:url:lubu:Channel_Corona_S_Ludwigsburg_Incidence", expire="12h"}
Number    Corona_LK_Ludwigsburg_Incidence     "LK Ludwigsburg 7d [%.1f]"    <line>     (gCorona)    [ "Corona","Measurement" ]    {channel="http:url:rki:Channel_Corona_LK_Ludwigsburg_Incidence", expire="12h"}
Number    Corona_Berlin_Incidence             "Berlin Treptow 7d [%.1f]"    <line>     (gCorona)    [ "Corona","Measurement" ]    {channel="http:url:rki:Channel_Corona_Berlin_Incidence", expire="12h"}
Number    Corona_SK_Saarbruecken_Incidence    "Saarbrücken 7d [%.1f]"       <line>     (gCorona)    [ "Corona","Measurement" ]    {channel="http:url:rki:Channel_Corona_SK_Saarbruecken_Incidence", expire="12h"}
Number    Corona_S_Weiden_Incidence           "Weiden 7d [%.1f]"            <line>     (gCorona)    [ "Corona","Measurement" ]    {channel="http:url:rki:Channel_Corona_S_Weiden_Incidence", expire="12h"}
Number    Corona_Deutschland_Incidence        "Deutschland 7d [%.1f]"       <line>     (gCorona)    [ "Corona","Measurement" ]    {channel="http:url:rki:Channel_Corona_Deutschland_Incidence", expire="12h"}
Number    Corona_Deutschland_R                "Deutschland R [%.1f]"        <chart>    (gCorona)    [ "Corona","Measurement" ]    {channel="http:url:rki:Channel_Corona_Deutschland_R", expire="12h"}

 

.sitemap file


sitemap corona label="Corona" {
  Default item= gCorona
}