How can I parse a web page and get information from a table? I have this web page: http://www.izsu.gov.tr/pages/wiregularoutage.aspx?b=4 which displays daily water outages by region, how long each one will take to end, etc. I want to parse this information from the HTML table and display only the rows that have “BORNOVA” under the “İLÇE” column. I then want to display this information in an openHAB webview or any other part of the UI. Parsing it with regular expressions is beyond my capability. Any help?
Have you had a look at this page?
Thanks for the reply, and yes, this is the page I based my work on. I managed to process Yahoo weather using an XSLT transformation, but that was already an XML file. Apart from regular expressions I do not know how to parse HTML tables, and regex seems a bit too complicated for me.
So, using jsoup I parsed the web site. The code is as follows:
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import java.io.IOException;

public class Izsu {
    public static void main(String[] args) throws IOException {
        String url = "http://www.izsu.gov.tr/pages/wiregularoutage.aspx?b=4";
        System.out.println("Fetching " + url + "...");
        Document doc = Jsoup.connect(url).get();
        // The outage table is selected by its class attribute.
        Element tablo = doc.select("table[class=gwiregulerouot]").first();
        if (tablo == null) {
            System.err.println("Outage table not found.");
            return;
        }
        Elements satirlar = tablo.select("tr[style]");
        for (int i = 1; i < satirlar.size(); i++) { // first row holds the column names, so skip it
            Elements kolon = satirlar.get(i).select("td");
            // Cell 3 is compared against the district name; cell 2 is printed for matching rows.
            if (kolon.size() > 3 && kolon.get(3).text().equals("GÜZELBAHÇE")) {
                System.out.println(kolon.get(2).text());
            }
        }
    }
}
Now, is there a way I can get this code into rules, transformations, scripts, or any other place in openHAB? Or should I run it under Linux, create a web page with the Java code, and use that as a webview under openHAB? Any suggestions?
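For the second option, here is a rough sketch of how the filtered rows could be turned into a small static HTML page that a webview might load. It uses only the standard library. The class name OutagePage, the render method, the output file name outages.html, and the sample rows are all made up for illustration. In practice the rows would be the td texts collected with jsoup, and the district is assumed to sit at index 3 as in the code above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class OutagePage {
    // Index of the district cell; 3 mirrors kolon.get(3) in the jsoup code (an assumption).
    static final int DISTRICT_COLUMN = 3;

    // Escape the characters that are special inside HTML text.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Build a minimal HTML page containing only the rows for one district.
    static String render(List<String[]> rows, String district) {
        StringBuilder html = new StringBuilder();
        html.append("<html><head><meta charset=\"UTF-8\"></head><body><table>\n");
        for (String[] row : rows) {
            if (row.length <= DISTRICT_COLUMN || !row[DISTRICT_COLUMN].equals(district)) {
                continue; // row is too short or belongs to another district
            }
            html.append("<tr>");
            for (String cell : row) {
                html.append("<td>").append(escape(cell)).append("</td>");
            }
            html.append("</tr>\n");
        }
        html.append("</table></body></html>\n");
        return html.toString();
    }

    public static void main(String[] args) throws IOException {
        // Placeholder rows, not real outage data.
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"1", "09:00", "Example street", "BORNOVA"});
        rows.add(new String[] {"2", "10:00", "Other street", "GÜZELBAHÇE"});
        // Hypothetical output path; pass a different one as the first argument.
        String out = args.length > 0 ? args[0] : "outages.html";
        Files.write(Paths.get(out), render(rows, "BORNOVA").getBytes(StandardCharsets.UTF_8));
    }
}
```

If this is run periodically (for example from cron) and the file is written somewhere a web server can serve it, a Webview element in the sitemap could point at that URL. Where exactly to put the file depends on the openHAB version and setup, so treat that part as something to verify.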