[SOLVED] Help with XML parsing of Web data

Hi,

I would really like to get weather data from a local weather station. Its website shows the values in JavaScript animations, but I’ve found that if I use “view-source” I get the data in XML format.

Can someone help me with parsing the data to OH items?

view-source:https://www.scanmatic.no/vaeret-i-kilsund/

Thanks
Christian

Post the string please and describe what you want to get.

Hi,

From the link I pasted above, I would like to get the data-value attributes for:

1.  <label>Luft <small>(&deg;C)</small></label>
2.  <label>Vann <small>(&deg;C)</small></label>
3.  <label>Nedbør <small>(mm/t)</small></label>
4.  <label>Luftfuktighet <small>(%)</small></label>
5.  <div class="windmeter">
<div id="luft-level" class="level" data-value="24.2">

<div id="vann-level" class="level" data-value="21.2">
<div id="vann-label" class="indicator">21.2 &deg;C</div>

<div id="nedbor-level" class="level blue a" data-value="0">

<div id="luftfuktighet-level" class="level blue b" data-value="99.8">
<div id="luftfuktighet-label" class="indicator blue">99.8%</div>

<div class="windmeter">
<div class="info" title="23&ordm; NNØ">NNØ</div>
<div class="info b" title="0.7 m/s">0.7 m/s</div>

Thanks Harry, but how do I parse this into OH items?

Hi @hr3,

Thanks for bringing up my previous post again. I have the same issue and hope you can help me. I would like to get from (for example)

http://meteoalarm.eu/documents/rss/ee/EE002.rss

all the alerts for Today (so all the

alt="awt:5 level:3"

statements (the numbers depend on the alert and level) that appear before “Tomorrow”).

Or, much better, all alerts for Today AND Tomorrow.

Then, for example, if I have alt=“awt:5 level:3”, I would like to store the number 53 in an OH item (and then use a map to translate it into a description of the alert and level).

Does it make any sense?

Thanks
Andrea

@sjef86
Unfortunately, the HTML string is not valid XML,
so I used Node-RED.

The function node contains:

// Keep only the text before the first space, e.g. "21.2 °C" becomes "21.2"
msg.payload = msg.payload.trim().split(" ")[0];
return msg;
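
To make the effect of that function node concrete, here is a minimal sketch in plain JavaScript. It assumes the html node hands over the text of one indicator label, such as "21.2 °C". The helper names firstToken and leadingNumber are hypothetical, and the parseFloat variation is not part of the original flow; it is only shown for labels that have no space before the unit, such as "99.8%".

```javascript
// Hypothetical helper mirroring the function node: keep the text before the first space.
function firstToken(text) {
    return text.trim().split(" ")[0];
}

// Variation (an assumption, not in the original flow): parse the leading number,
// which also copes with labels that have no space before the unit.
function leadingNumber(text) {
    return parseFloat(text.trim());
}

console.log(firstToken(" 21.2 °C "));   // 21.2
console.log(firstToken("99.8%"));       // 99.8% (no space, so the unit stays)
console.log(leadingNumber("99.8%"));    // 99.8
```

Which variant fits depends on whether the OH item should receive a string or a number.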

Someone created a python script for my issue:

https://www.domoticz.com/forum/viewtopic.php?t=19519

mmm … trying to understand how it works, and to find a way to have it working in OH :frowning:

Hi Vincent,

I should have you on speed dial :smiley:
Seems like you solve all my problems regarding OH2 :wink:

But would you mind showing the properties window of each node in Node-RED, so I can replicate it?

Thanks
Christian

OK

The first node is just an inject node. You can add a schedule to it depending on how often you need the data polled.

The second node is an http request:

image

The third node is an html function node:

image

The fourth node is the function node as above
The fifth node is a debug node, but you can use an mqtt or an openhab node to send the value to OH.

Add other html nodes to the output of the http request node to get your other values.

See:
https://www.w3schools.com/cssref/css_selectors.asp
and
https://www.w3schools.com/cssref/trysel.asp

for how to use CSS selectors in the html node.

Good luck

Thank you very much!! :smiley:
This is awesome :+1:

Glad you like it
Please like and mark the thread as solved, thanks

@ariela, did you have any luck?

@vzorglub frankly speaking, not yet :frowning:

I’m not very good at this :frowning:

I found another solution:

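// .items file: fetch the page every 60 s into a String item
String s "s[%s]" {http="<[https://www.scanmatic.no/vaeret-i-kilsund/:60000:JS()]"}

// .rules file: cut the values out of the page source
rule s
when
	Item s changed
then
	var String s1 = s.state.toString
	// 47 is the length of the search string, so i1 points at the first character of the value
	var i1 = s1.indexOf('<div id="luft-level" class="level" data-value="')+47
	var i2 = s1.indexOf('"',i1)   // closing quote of data-value
//	logInfo("___",i1.toString + " "+ i2.toString)
	logInfo("___",s1.substring(i1-47,i2+2) +"|"+ s1.substring(i1,i2))

	// same for the water temperature, searching onwards from i2
	i1 = s1.indexOf('<div id="vann-level" class="level" data-value="',i2)+47
	i2 = s1.indexOf('"',i1)
	logInfo("___",s1.substring(i1-47,i2+2) +"|"+ s1.substring(i1,i2))
end

Log output: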
2018-07-31 12:59:35.145 [INFO ] [g.eclipse.smarthome.model.script.___] - <div id="luft-level" class="level" data-value="27.6">|27.6
2018-07-31 12:59:35.152 [INFO ] [g.eclipse.smarthome.model.script.___] - <div id="vann-level" class="level" data-value="21.3">|21.3
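
The same indexOf technique can be written as a small reusable helper, shown here as a sketch in JavaScript (for example for a Node-RED function node). The name valueAfter is hypothetical. Compared with the rule, it takes the offset from the length of the search string instead of the hard-coded 47, and it returns null when the search string is missing rather than cutting at a wrong position.

```javascript
// Hypothetical helper: return the text between `prefix` and the next double quote,
// or null when the prefix is not found (indexOf returns -1 in that case).
function valueAfter(html, prefix, from = 0) {
    const start = html.indexOf(prefix, from);
    if (start === -1) return null;
    const valueStart = start + prefix.length;   // replaces the hard-coded +47
    const valueEnd = html.indexOf('"', valueStart);
    if (valueEnd === -1) return null;
    return html.substring(valueStart, valueEnd);
}

// Sample markup copied from the page source quoted earlier in the thread.
const sample =
    '<div id="luft-level" class="level" data-value="24.2">' +
    '<div id="vann-level" class="level" data-value="21.2">';

console.log(valueAfter(sample, '<div id="luft-level" class="level" data-value="')); // 24.2
console.log(valueAfter(sample, '<div id="vann-level" class="level" data-value="')); // 21.2
console.log(valueAfter(sample, '<div id="missing" data-value="'));                  // null
```

Like the rule, this breaks as soon as the site changes the order or spelling of the attributes, so it is worth logging the null case.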

The :JS()]"} part produces a warning. Does anybody know a better way to transfer the value to the item state?

Here is an example of the HTML code that you would have to parse in a rule, in the same way as in my example:

			<link>https://www.meteoalarm.eu/ee_EE/0/0/EE002.html</link>
			<description><![CDATA[<table border="0" cellspacing="0" cellpadding="3"><tr><th colspan="3" align="left">Today</th></tr><tr><td width="28"><img border="1" src="https://www.meteoalarm.eu/documents/rss/wflag-l3-t5.jpg" alt="awt:5 level:3"></td><td><b>From: </b><i>31.07.2018 14:41 CET</i><b> Until: </b><i>01.08.2018 14:41 CET</i></td></tr><tr><td width="28"></td><td>Õhutemperatuuri maksimum sisemaal 30..33°C. Metsades on suur tuleoht!
Maximum air temperature in inland 30..33°C. High risk of forest fire!</td></tr><tr><td width="28"><img border="1" src="https://www.meteoalarm.eu/documents/rss/wflag-l2-t8.jpg" alt="awt:8 level:2"></td><td><b>From: </b><i>31.07.2018 14:41 CET</i><b> Until: </b><i>01.08.2018 14:41 CET</i></td></tr><tr><td width="28"></td><td>Õhutemperatuuri maksimum sisemaal 30..33°C. Metsades on suur tuleoht!
Maximum air temperature in inland 30..33°C. High risk of forest fire!</td></tr><tr><th colspan="3" align="left"><br />Tomorrow</th></tr><tr><td width="28"><img border="1" src="https://www.meteoalarm.eu/documents/rss/wflag-l3-t5.jpg" alt="awt:5 level:3"></td><td><b>From: </b><i>31.07.2018 14:41 CET</i><b> Until: </b><i>01.08.2018 14:41 CET</i></td></tr><tr><td width="28"></td><td>Õhutemperatuuri maksimum sisemaal 30..33°C. Metsades on suur tuleoht!
Maximum air temperature in inland 30..33°C. High risk of forest fire!</td></tr><tr><td width="28"><img border="1" src="https://www.meteoalarm.eu/documents/rss/wflag-l2-t8.jpg" alt="awt:8 level:2"></td><td><b>From: </b><i>31.07.2018 14:41 CET</i><b> Until: </b><i>01.08.2018 14:41 CET</i></td></tr><tr><td width="28"></td><td>Õhutemperatuuri maksimum sisemaal 30..33°C. Metsades on suur tuleoht!
Maximum air temperature in inland 30..33°C. High risk of forest fire!</td></tr></table>]]></description>
			<pubDate>Tue, 31 Jul 2018 12:41:11 +0200</pubDate>
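
As a rough sketch of how the alerts in such a description could be collected, the JavaScript below splits the text at the "Tomorrow" heading and gathers every awt/level pair on each side, joining the two numbers into one code (awt 5, level 3 gives "53") as requested earlier in the thread. The function names alertCodes and parseAlerts are hypothetical, and the sample string is a shortened version of the feed excerpt, not the full feed.

```javascript
// Hypothetical helper: collect "awt"+"level" codes (e.g. "53") from one section.
function alertCodes(sectionHtml) {
    const codes = [];
    const pattern = /alt="awt:(\d+) level:(\d+)"/g;
    let match;
    while ((match = pattern.exec(sectionHtml)) !== null) {
        codes.push(match[1] + match[2]);   // string concatenation: "5" + "3" -> "53"
    }
    return codes;
}

// Split the description at the "Tomorrow" heading; everything before it is Today.
function parseAlerts(descriptionHtml) {
    const split = descriptionHtml.indexOf("Tomorrow");
    const today = split === -1 ? descriptionHtml : descriptionHtml.substring(0, split);
    const tomorrow = split === -1 ? "" : descriptionHtml.substring(split);
    return { today: alertCodes(today), tomorrow: alertCodes(tomorrow) };
}

// Shortened sample based on the feed excerpt above.
const description =
    '<tr><th>Today</th></tr>' +
    '<tr><td><img src="wflag-l3-t5.jpg" alt="awt:5 level:3"></td></tr>' +
    '<tr><td><img src="wflag-l2-t8.jpg" alt="awt:8 level:2"></td></tr>' +
    '<tr><th><br />Tomorrow</th></tr>' +
    '<tr><td><img src="wflag-l3-t5.jpg" alt="awt:5 level:3"></td></tr>';

console.log(parseAlerts(description));
```

One caveat: the split relies on the first occurrence of the word "Tomorrow", so an alert text that happens to contain that word would shift the boundary.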

Did you try REGEX(.*)?

String s "s[%s]" {http="<[https://www.scanmatic.no/vaeret-i-kilsund/:60000:REGEX(.*)]"}
2018-07-31 14:23:58.971 [INFO ] [.internal.RegExTransformationService] - the given regular expression '^.*$' doesn't contain a group. No content will be extracted and returned!

and no data is transferred.

:REGEX((.*?))]"}

works
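
The log message above suggests why the group matters: the expression is anchored as '^…$' and only the content of a capture group is returned. The JavaScript below is a rough emulation of that behaviour, not the binding's actual code. The function name regexTransform is hypothetical, and letting the dot match line breaks (the "s" flag) is an assumption. The last call shows that a more specific expression could return a single value instead of the whole page.

```javascript
// Rough emulation (an assumption) of how the REGEX transformation applies an expression:
// anchor it to the whole input and return the first capture group.
function regexTransform(expression, input) {
    const match = new RegExp("^" + expression + "$", "s").exec(input);
    if (match === null || match.length < 2) return null;   // no match or no group
    return match[1];
}

// Sample markup based on the page source quoted earlier in the thread.
const page =
    '<html>\n<div id="luft-level" class="level" data-value="24.2">\n' +
    '<div id="vann-level" class="level" data-value="21.2">\n</html>';

console.log(regexTransform(".*", page));               // null: no capture group
console.log(regexTransform("(.*?)", page) === page);   // true: the whole page is returned
console.log(regexTransform('.*id="luft-level" class="level" data-value="(.*?)".*', page)); // 24.2
```

How the double quotes in such an expression have to be escaped inside an item definition is not covered by this sketch.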