Hi Ben,
I’m using ParseHub to scrape the data from a webpage which unfortunately has no API, but is the only reliable source for pollen exposure in the area where I live.
ParseHub is a web-scraping tool with a free tier, and it is quite easy to use (just click on the data field you want to scrape): https://www.parsehub.com/
Via the ParseHub API, I first trigger a run (basically asking ParseHub to scrape the page again) from a script and then download the data into a txt file.
I then use a JSONPath transformation in the item definition to query the pollen exposure from the txt file (@vzorglub was a big help with that).
Scripts:
Trigger the run (parsehub_run.sh):
#!/bin/sh
curl -X POST 'https://www.parsehub.com/api/v2/projects/PROJECTKEY/run?api_key=APIKEY'
Download the data (parsehub_save.sh):
#!/bin/sh
curl -X GET 'https://www.parsehub.com/api/v2/projects/PROJECTKEY/last_ready_run/data?api_key=APIKEY&format=JSON' | gunzip > /srv/openhab2-conf/html/pollen.txt
The text file is saved into the html folder, so openHAB serves it under /static/ and the items can read it over HTTP.
ParseHub returns the data gzipped for me, so the script also unzips it.
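If the download is sometimes not gzipped, or comes back empty while a run is still in progress, piping straight into gunzip can leave a broken pollen.txt behind. Here is a minimal sketch of a more defensive save step. It assumes only standard gzip tools, and save_payload is a made-up helper name, not part of ParseHub or openHAB:

```shell
#!/bin/sh
# save_payload TARGET: read a download from stdin and write it to TARGET.
# Hypothetical helper; gunzips only when the payload really is gzip and
# replaces the old file only when the new content is non-empty.
save_payload() {
    target="$1"
    tmp="${target}.tmp.$$"
    cat > "$tmp" || return 1
    # gzip -t exits 0 only for a valid gzip stream
    if gzip -t < "$tmp" 2>/dev/null; then
        if gunzip -c < "$tmp" > "${tmp}.out"; then
            mv "${tmp}.out" "$tmp"
        else
            rm -f "$tmp" "${tmp}.out"
            return 1
        fi
    fi
    # keep the previous data if nothing usable arrived
    if [ ! -s "$tmp" ]; then
        rm -f "$tmp"
        return 1
    fi
    # rename within the same folder, so readers never see a half-written file
    mv "$tmp" "$target"
}

# Possible use in parsehub_save.sh (same URL as above):
# curl -s 'https://www.parsehub.com/api/v2/projects/PROJECTKEY/last_ready_run/data?api_key=APIKEY&format=JSON' \
#   | save_payload /srv/openhab2-conf/html/pollen.txt
```

Writing to a temporary file first means the items keep reading the old values until a complete new file is in place.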
Rule:
rule "Pollen"
when
    Time cron "0 0 0/4 1/1 * ? *" // fires every 4 hours
then
    var Timer ParseHubTimer = null
    // ask ParseHub to start a new run
    executeCommandLine("sh /srv/openhab2-conf/scripts/parsehub_run.sh")
    logInfo("POLLEN", "parseHub Data Run BIRCH initiated")
    ParseHubTimer = createTimer(now.plusMinutes(10))[|
        // give ParseHub some time to complete the run, then download the result
        executeCommandLine("sh /srv/openhab2-conf/scripts/parsehub_save.sh")
        logInfo("POLLEN", "parseHub Data File BIRCH saved")
    ]
end
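The fixed 10-minute wait works for me, but if runs take a varying amount of time, one alternative is to poll until the data is ready and only then save it. Below is a rough sketch of a generic polling helper. poll_until and run_is_ready are hypothetical names, and the actual readiness check (for example a curl call against ParseHub's run status) is left to you and should be verified against the ParseHub API docs:

```shell
#!/bin/sh
# poll_until MAX DELAY CMD [ARGS...]
# Hypothetical helper: run CMD up to MAX times, sleeping DELAY seconds
# between attempts. Returns 0 as soon as CMD succeeds, 1 if it never does.
poll_until() {
    max="$1"
    delay="$2"
    shift 2
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        # no need to sleep after the last attempt
        if [ "$attempt" -lt "$max" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}

# Possible use (run_is_ready is a check you would have to write yourself):
# poll_until 20 30 run_is_ready && sh /srv/openhab2-conf/scripts/parsehub_save.sh
```

With 20 attempts and a 30-second delay this gives up after roughly 10 minutes, which matches the timer in the rule.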
Items:
I’m querying the pollen exposure for today and the next two days:
Number PollenBirch_d0 "Birch today: [MAP(pollen.map):%s]" <birch> (Wetter) { http="<[http://192.xxx.xxx.xxx:8080/static/pollen.txt:60000:JSONPATH($.Birke[0].selection2[0].selection3[0].Belastung)]"}
Number PollenBirch_d1 "Birch tomorrow: [MAP(pollen.map):%s]" <birch> (Wetter) { http="<[http://192.xxx.xxx.xxx:8080/static/pollen.txt:60000:JSONPATH($.Birke[1].selection2[0].selection3[6].Belastung)]"}
Number PollenBirch_d2 "Birch day after tomorrow: [MAP(pollen.map):%s]" <birch> (Wetter) { http="<[http://192.xxx.xxx.xxx:8080/static/pollen.txt:60000:JSONPATH($.Birke[2].selection2[0].selection3[12].Belastung)]"}
I’m also transforming the queried values with the map file (pollen.map) that is referenced in the item labels above.
In the example above, I’ve translated it into English.
The result in the sitemap looks like this (in German, and showing birch and grass, as there is currently no birch exposure where I live).
Hope it helps. If you want any further guidance, let me know.
Kurt