Has anyone used a web data extractor with OpenHAB?

Hi,
I’m trying to figure out how to extract a number from a webpage, specifically the current electricity price for my area, and parse it into my openHAB system to compute the electricity costs of my apartment at any given time. I’ve found a Firefox extension called ParseHub that can easily get the data for me, but I’m not sure how to get the data from there and into openHAB.

So, has anyone tried this before? If so, how did you proceed?

From what I’ve seen most people use the HTTP Binding and an XSLT transform or regular expression to extract data from a web page. Because ParseHub is a browser extension there really isn’t any way for openHAB to use it.

I see! My issue is that the website in question is dynamic. I have to enter my zip code for the price to appear on the site. ParseHub makes it easy to set up a set of actions to retrieve the data, but as you said, it’s a browser extension.

If you really want a project, you can set up a web proxy to capture the URLs the dynamic page uses to fetch data and see if you can pull it directly. The website doesn’t appear to require a login and that is usually the really hard part. You may be able to pull just the data you care about without needing to deal with the rest of the page.

I suppose that would be possible, but quite far beyond my skill level at this point. Perhaps for the future.

Hi

I’ve been planning something similar and have also postponed it to “maybe sometime later”. While reading your post, I had an idea that introduces another level of complexity but also makes it possible to do anything with websites: using the Selenium framework (http://www.seleniumhq.org/) via the Exec binding. This will remove every possible limit (including dynamic websites, logins, …) but if you have no experience in that field…

If your current problem is “just” to find out what happens when you request the data via Firefox, you could use the included network analysis tool (Menu Tools|Web Developer). It lists every request and response involved. For your site there is a POST request to https://api.intele.com/connect/contactcentre/cow.aspx with a parameter ending with txtPostalCode (quite a long name, and Copy/Paste doesn’t work).

Regards
Dieter

Hi Jonas,

One of the ParseHub founders here. Once you’ve set up your project, the results are available via our API. All you have to do is make a request to:

https://www.parsehub.com/api/v2/projects/{PROJECT_TOKEN}/last_ready_run/data?api_key={API_KEY}&format=csv

to get the data. This is not a dynamic page, but it will update with the latest values of your data. Assuming you are extracting just a single value, it will look like

"price"
"$39.99"

which makes it very easy to parse with almost any tool, likely including openHAB.
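
As a rough illustration of how little parsing that CSV needs, here is a minimal Java sketch. The class and method names are hypothetical, the token and key are placeholders, and it assumes the response is exactly one header line followed by one value line, as in the example above. It does not perform the HTTP request itself.

```java
class ParseHubCsv {
    // Fills the URL template quoted above. Assumes the token and key
    // contain only URL-safe characters, so no escaping is done here.
    static String buildUrl(String projectToken, String apiKey) {
        return "https://www.parsehub.com/api/v2/projects/" + projectToken
                + "/last_ready_run/data?api_key=" + apiKey + "&format=csv";
    }

    // Expects a header line followed by one value line, e.g. "price" / "$39.99".
    static double parseSingleValue(String csv) {
        String[] lines = csv.trim().split("\\r?\\n");
        if (lines.length < 2) {
            throw new IllegalArgumentException("expected a header line and a value line");
        }
        // Keep only digits, the minus sign and the decimal point;
        // this drops the surrounding quotes and the currency symbol.
        String numeric = lines[1].replaceAll("[^0-9.\\-]", "");
        return Double.parseDouble(numeric);
    }

    public static void main(String[] args) {
        System.out.println(buildUrl("PROJECT_TOKEN", "API_KEY"));
        System.out.println(parseSingleValue("\"price\"\n\"$39.99\"\n"));
    }
}
```

In openHAB itself the same extraction would more likely be done with a regular expression transform on an HTTP binding item; the sketch only shows the shape of the data.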

Hope that helps.

-Serge

That’s a good suggestion! I tried using the developer tool you mention, but I don’t see the POST request you describe. I see one GetAreaInfo request with the parameter "postalcode":"0478". I suppose they do the same though? I guess I’m left with using the HTTP binding either to request it directly from the website or via the ParseHub API that Serge describes. Either way I have some reading up to do on the HTTP binding. :slight_smile:

Thanks for chiming in, Serge. You’ve created an appealing and easy to use tool in ParseHub! I will try to familiarize myself with the API.

- Jonas

Hi Jonas
You are right, I should’ve used an existing postal code :wink:

But when I try this it makes things a lot easier. When I open the request in the browser, it returns a JSON response, so there is a web service running and no need to extract data from the web page:

Sending this request: http://www.fjordkraft.no/Templates/Fjordkraft/webservices/PriceMap.asmx/GetAreaInfo?Postalcode=0478 leads to that answer: {"PostalCode":"0478","PostalCodeArea":"OSLO","County":"Oslo","PriceAreaName":"Oslo","PriceAreaId":1,"AreaPrice":"21,442"}.
This could be parsed without problems or even used directly in Openhab.
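
To make the "parsed without problems" part concrete, here is a small Java sketch that pulls the AreaPrice field out of the response shown above. The Java standard library has no JSON parser, so this uses a regular expression as a stand-in; a real JSON library or openHAB's JSONPATH transformation would be more robust. The class and method names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AreaPriceJson {
    // Matches "AreaPrice":"<value>" and captures the value between the quotes.
    private static final Pattern AREA_PRICE =
            Pattern.compile("\"AreaPrice\"\\s*:\\s*\"([^\"]*)\"");

    static String extractAreaPrice(String json) {
        Matcher matcher = AREA_PRICE.matcher(json);
        if (!matcher.find()) {
            throw new IllegalArgumentException("no AreaPrice field in response");
        }
        return matcher.group(1);
    }

    public static void main(String[] args) {
        String json = "{\"PostalCode\":\"0478\",\"PostalCodeArea\":\"OSLO\",\"County\":\"Oslo\","
                + "\"PriceAreaName\":\"Oslo\",\"PriceAreaId\":1,\"AreaPrice\":\"21,442\"}";
        System.out.println(extractAreaPrice(json));
    }
}
```

Note that the value comes back as text with a decimal comma, so it still has to be converted before any arithmetic.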

The only problem here is mapping the “AreaPrice”(?) value to the price shown on the website. But usually when there is a web service, there should be documentation for it. Give Google a try for this, as my knowledge of Norwegian is next to nothing…

Regards
Dieter

Hi Dieter,
That’s great!
So if I get this right, I can create an item using the HTTP binding as shown below? It’s probably a silly question, but I’m struggling with understanding the JSON transformation and how I can implement it in my item. Is there a one-line command I can input in below, or do I need to create a file in the “Transform” folder under Configurations?

Number areaPrice {http="<[http://www.fjordkraft.no/Templates/Fjordkraft/webservices/PriceMap.asmx/GetAreaInfo?Postalcode=0478:1000:]"}

Thanks for your help so far!

Best
Jonas

Hi Jonas

There is an example in the wiki (https://github.com/openhab/openhab/wiki/Transformations) which should solve your problem by expanding your item (haven’t tested it myself).

Something like this would do (even if I don’t want to do your “Homework” :wink: ):
Number areaPrice {http="<[http://www.fjordkraft.no/Templates/Fjordkraft/webservices/PriceMap.asmx/GetAreaInfo?Postalcode=0478:1000:JSONPATH($.AreaPrice)]"}

The only problem is that it will give you “21,442” and not your price of 26,80 ore/kwh. But for this I can’t really help you. There is a lot of javascript on that page that will probably do some calculation. It might be that there is VAT added to the price?

Regards
Dieter

Hi again!

Thanks again, this is really helping me out! And indeed the price is excl. VAT. Unfortunately what I pay isn’t. :wink:

I was able to get a transform working, but I had to go via XSL as the response seems to be JSON embedded in XML - to my limited knowledge.
Basically, my item looks like this:
String fjordkraft {http="<[http://www.fjordkraft.no/Templates/Fjordkraft/webservices/PriceMap.asmx/GetAreaInfo?Postalcode=0478:30000:XSLT(areaPrice.xsl)]"}

My areaPrice.xsl-file looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes" method="xml" encoding="UTF-8" omit-xml-declaration="yes" />
  <xsl:template match="/">
    <xsl:value-of select="."/>
  </xsl:template>
</xsl:stylesheet>
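
For readers unfamiliar with XSLT: the stylesheet above simply outputs the text content of the whole document, which strips the XML wrapper and leaves the embedded JSON. A rough Java equivalent is sketched below. The wrapper element name `string` in the sample is an assumption, since the thread does not show the raw XML, and the class name is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class XmlTextContent {
    // Returns the concatenated text of the document, like <xsl:value-of select="."/>
    // applied at the root.
    static String textOf(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getTextContent().trim();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical wrapper element; only the JSON inside it matters here.
        String xml = "<string>{\"AreaPrice\":\"21,442\"}</string>";
        System.out.println(textOf(xml));
    }
}
```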

Then I created a rule to apply a JSON transformation to this output. The rule currently looks like this:
import org.openhab.core.library.types.*
import org.openhab.core.persistence.*
import org.openhab.model.script.actions.*
import org.openhab.core.types.*

rule "getAreaPrice"
when
    Item fjordkraft received update
then
    var String json = fjordkraft.state.toString
    var priceString = transform("JSONPATH", "$.AreaPrice", json)
    var price = priceString as DecimalType

    postUpdate(areaPrice,price)
end

Now, where I get stuck is in casting the priceString as a DecimalType, or indeed any type that I can do some math on. If I try the approach above, I get the following error:
[ERROR] [o.o.c.s.ScriptExecutionThread ] - Error during the execution of rule 'getAreaPrice': Cannot cast java.lang.String to org.openhab.core.library.types.DecimalType

I’ve tried other approaches I’ve found through Googling, but strangely none seem to work. And I’ve also verified that the JSONPATH-transformation works through println-ing the priceString variable.

I realize it’s probably a silly error somewhere, but I must have tried every combination I can think of now to get the casting to work: :slight_smile:

Best,
Jonas

Try java.lang.Double::parseDouble(priceString). You have to parse from a String to a numerical value you can do math with. It isn’t smart enough to do it for you on the fly the way some loosely typed languages can (e.g. JavaScript).

Thanks for the advice, rlkoshak! I had actually tried parseDouble(priceString) in the following way:

var price = parseDouble(priceString)

With java.lang.* as an import.

But OpenHAB designer gives the following error:
Couldn't resolve reference to JvmIdentifiableElement 'parseDouble'.
And OpenHAB itself gives this error:
[ERROR] [o.o.c.s.ScriptExecutionThread ] - Error during the execution of rule 'getAreaPrice': The name 'parseDouble(<XFeatureCallImplCustom>)' cannot be resolved to an item or type.

Would I need to import any other libraries?

This

var price = parseDouble(priceString)

should be

var price = Double::parseDouble(priceString)

When you import java.lang.* you are only getting the classes. You have to refer to the class name to call a method.

Also, java.lang.* is imported by default. You don’t need to import it yourself, but needing to reference the class name still applies.
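
The same distinction exists in plain Java, where the static call is written with a dot instead of the `::` used in openHAB rules. The sketch below is only an illustration in Java, not rules code: importing a class (or a whole package) does not bring its static methods into scope by bare name; that would take an explicit static import. The class name is made up.

```java
import static java.lang.Double.parseDouble; // static import: makes the bare name usable

class ParseDoubleDemo {
    // Without any import: qualify the static method with its class name.
    static double viaClassName(String text) {
        return Double.parseDouble(text);
    }

    // Only compiles because of the static import at the top of the file.
    static double viaStaticImport(String text) {
        return parseDouble(text);
    }

    public static void main(String[] args) {
        System.out.println(viaClassName("21.442"));
        System.out.println(viaStaticImport("21.442"));
    }
}
```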

Hi rlkoshak,

Thanks, after some fiddling I got that working.

I was receiving the following error:
[ERROR] [o.o.c.s.ScriptExecutionThread ] - Error during the execution of rule 'getAreaPrice': For input string: "22,46896"

This had me stuck for a while, but in the end it occurred to me that the comma was the problem. By explicitly declaring the variable as a String, i.e. var String priceString, I was able to access the methods in the String class. That way I replaced the comma with a period in the following way:
priceString = priceString.replace(',','.')

The final conversion from the price excl. VAT to the price incl. VAT was combined with the String parsing in this way:

var Double price = Double::parseDouble(priceString)*1.25

It worked, albeit with OpenHAB designer giving me the following error:
Incompatible types. Expected java.lang.Double or double but was java.math.BigDecimal

I haven’t been able to get rid of this error, and I’m not sure if it’s even significant as it seems to work now.

I expected that the localization would have dealt with the comma versus the period.
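
Double::parseDouble always expects a period as the decimal separator, whatever the system locale is. If you would rather not replace the comma by hand, Java's DecimalFormat can be told which separator to expect. Here is a minimal sketch in plain Java (not rules code), with hypothetical class and method names; the separators are set explicitly so the result does not depend on the locale data installed on the machine.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.ParsePosition;
import java.util.Locale;

class CommaDecimal {
    // Parses numbers written with a decimal comma, e.g. "22,46896".
    static double parse(String text) {
        DecimalFormatSymbols symbols = new DecimalFormatSymbols(Locale.ROOT);
        symbols.setDecimalSeparator(',');
        symbols.setGroupingSeparator(' '); // keep it distinct from the decimal comma
        DecimalFormat format = new DecimalFormat("0.#", symbols);

        ParsePosition position = new ParsePosition(0);
        Number number = format.parse(text, position);
        // Reject input that was not consumed completely, such as "22.46896".
        if (number == null || position.getIndex() != text.length()) {
            throw new IllegalArgumentException("not a comma-decimal number: " + text);
        }
        return number.doubleValue();
    }

    public static void main(String[] args) {
        System.out.println(parse("22,46896"));
    }
}
```

For a single known format, the replace(',','.') approach from the post above is shorter and works just as well.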

The error isn’t a big deal. If you want to get rid of it you can change the code to:

var BigDecimal price = Double::parseDouble(priceString)*1.25

If you need to use it as a Double or double later, you can then use price.doubleValue. But if it isn’t broken, don’t fix it. :smile:
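
For reference, this is roughly what the whole conversion looks like when done with BigDecimal from the start, written as plain Java rather than rules code. The factor 1.25 is the VAT multiplier used earlier in the thread; rounding to two decimals and the class and method names are assumptions made for the sketch.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class VatPrice {
    // 25 % VAT, as used in the thread.
    static final BigDecimal VAT_FACTOR = new BigDecimal("1.25");

    // Takes the price text as delivered by the web service (decimal comma)
    // and returns the price including VAT, rounded to two decimals.
    static BigDecimal withVat(String priceExclVat) {
        BigDecimal net = new BigDecimal(priceExclVat.replace(',', '.'));
        return net.multiply(VAT_FACTOR).setScale(2, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        BigDecimal gross = withVat("21,442");
        System.out.println(gross);               // BigDecimal, exact decimal arithmetic
        System.out.println(gross.doubleValue()); // plain double if needed later
    }
}
```

With the "21,442" value from earlier in the thread this gives 26.80, which matches the 26,80 shown on the website.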

Came across this topic while searching for a similar approach to get live radiation readings into openhab.

Would like to request your help in getting the current levels from this page: https://remap.jrc.ec.europa.eu/GammaDoseRates.aspx