Hello, I am running openHAB 2.5.11 on Debian 9.13 with Zulu Java 8.0.275-3
and am trying to fetch some data from a website (a small logger for my solar panels).
I have started by installing the http binding and the regex transformation addon.
I must say that I am a total beginner in openHAB, programming, and regex. I mostly get my information from examples here.
I am currently interested in the value beneath Gesamtenergie (here: 382988.548).
I have tried a bit with the help of the website Regex101 and figured out that this gets me what I want:
This may not be an elegant method, but it works on the website, and only there. openHAB does not accept it, and I get this error:
(note: this error may not be for exactly the regex above, but the system error remains the same for all attempts I have made)
Caused by: org.openhab.model.item.binding.BindingConfigParseException: bindingConfig ‘<[http://192.168.50.10/html/de/onlineAdmain.html:60000:REGEX((?<=Gesamtenergie)<.td>
…(.*)(?=<.td>))]’ doesn’t contain a valid binding configuration
I found here on the forum that regex in openHAB behaves a bit differently than on the website.
So I started over again with this regex:
It selects the correct word, but this is where I get lost, because all my attempts to get the value on the next line fail.
Edit: I removed the HTML code from the website because it was not displayed as intended, and took a picture instead.
Edit: the forum strips some of the formatting from my regex. The * is not shown, so I have added some spaces to correct that.
You would probably be wise to move to the HTTP 2 binding as the HTTP 1 binding is not compatible with OH 3 and will not be available when you decide to upgrade. Better to start out with it now instead of needing to change later.
In OH, REGEX works a little differently from normal. Your expression needs to match the entire string, and then the first matching group (the first set of parens) is what gets returned. So that’s what your second attempt did. The expression does indeed match the full string, and it returned the part of the pattern inside the first set of parens.
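To make the "match everything, return the first group" rule concrete, here is a rough Python imitation of it. This is only a sketch, not openHAB's actual code: the helper name `oh_regex` and the HTML snippet are made up for illustration, and it assumes that `.` is also allowed to match newlines.

```python
import re


def oh_regex(pattern, text):
    """Imitate the rule described above: the pattern has to match the whole
    input, and only the first capture group is returned."""
    # Assumption: '.' may also match newline characters (DOTALL).
    match = re.fullmatch(pattern, text, flags=re.DOTALL)
    if match is None or match.re.groups == 0:
        # No full match, or nothing in parens to return.
        return None
    return match.group(1)


# Hypothetical two-line page fragment, not the real logger output.
page = "<td>Gesamtenergie</td>\n<td>382988.548</td>"

# Matches only a fragment of the page, so nothing is returned.
fragment_only = oh_regex(r"Gesamtenergie", page)

# Matches the whole page and returns what is inside the first parens.
first_group = oh_regex(r".*(Gesamtenergie).*", page)
```

With these inputs, `fragment_only` is `None` and `first_group` is the word `Gesamtenergie`, which mirrors what the second attempt did.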
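You need to use code fences: put three backticks on their own line before and after a block of code, or wrap inline code in single backticks.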
code goes here
Some text `code` some more text.
I think the following will return what you are after.
Notice the escaping for the newline (the extra `\`). Also notice we put the parens around the part we want returned.
It is possible that the page uses Windows two-character newlines (`\r\n`), so if the above doesn’t work you might be able to use something like:
That will match one or more characters after the </td>.
In general you want to find unique markers for the start and end of the string you want. Put those two markers on either side of the parens.
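The marker idea can be tried outside openHAB first. The following is a small Python sketch under stated assumptions: the table layout is invented (only the label `Gesamtenergie` and the value come from this thread), the names `START` and `END` are just for illustration, and Python's `re.fullmatch` with `DOTALL` stands in for the whole-string matching described earlier.

```python
import re

# Hypothetical page fragment; the real page will look different.
html = (
    "<tr>\n"
    "  <td>Gesamtenergie</td>\n"
    "  <td>382988.548</td>\n"
    "</tr>\n"
)

# Unique text right before the value. \s* absorbs any line break style
# (\n or \r\n) plus indentation.
START = r"Gesamtenergie</td>\s*<td>"
# First thing that follows the value.
END = r"</td>"

# .* on both ends lets the pattern cover the whole page; the parens sit
# between the two markers. (.*?) is lazy so it stops at the first END.
pattern = r".*" + START + r"(.*?)" + END + r".*"

match = re.fullmatch(pattern, html, flags=re.DOTALL)
value = match.group(1) if match else None
```

Here `value` ends up as the string `382988.548`. If the start marker were not unique on the page, a different cell could be captured instead, which is why the markers need to be as specific as necessary.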
Thanks for the heads up, I will note that for my migration plan.
I used the <code> variant, but I will use the other instead now, thanks.
I think I am getting there with understanding the syntax. I have tried your code and there is no error in the log, but my String item shows a space or tab or something similar. On a sitemap it’s just blank. It is not NULL.
I tried that one too, and this time my String item shows:
My assumption is that it is matching whitespace in both cases. Am I correct?
Here is a part of the HTML code. This time I have copied it from the website directly and not from the String item:
Yes, it seems there is a bunch of white space, which clearly there is. But the `.+` should handle that.
Well, the original is indented, and it’s spaces that are used to implement that. So we need to add that to the “marker” before the value you want to return. `.*` should match zero or more of any character. `.+` should match one or more characters.
That should consume the white space and the open and close tags around the number, returning just the number.
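To see why a String item can end up holding only white space, compare a capture group that sits directly after the label with one that first consumes the indentation and tags. This is a Python sketch with an invented, indented snippet using Windows line endings. It is not the real page and not the exact expressions from this thread.

```python
import re

# Hypothetical indented fragment with \r\n line endings.
html = (
    "    <td>Gesamtenergie</td>\r\n"
    "    <td>\r\n"
    "        382988.548\r\n"
    "    </td>\r\n"
)

# The group starts right after the label, so it only picks up the
# line break and the indentation that follow it.
loose = re.fullmatch(r".*Gesamtenergie</td>(\s+).*", html, flags=re.DOTALL)
only_whitespace = loose.group(1)

# White space and tags are consumed outside the parens; the group itself
# only accepts digits and the decimal point.
tight = re.fullmatch(
    r".*Gesamtenergie</td>\s*<td>\s*([0-9.]+)\s*</td>.*",
    html,
    flags=re.DOTALL,
)
number = tight.group(1)
```

`only_whitespace` contains nothing but a line break and spaces, which would look blank on a sitemap, while `number` is the string `382988.548`.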
I just wanted to add something. As I went further and created the second and third item, I noticed that I have to be more specific with my regex. I had to go as far as the next unique entry.
So one of my next items looks like this: