[HTTP Binding] - Need help to scrape website

Good day all,

For about a year I have used the HTTP Binding to scrape the garbage collection dates from the collector’s website, and this worked very well. As of last week, the website has been rebuilt with JavaScript, and the content is now loaded dynamically based on GET parameters in the URL. I can’t get it to work with the binding anymore, so I wanted to ask for your help to see whether these dates can still be extracted some way, despite the company’s choice of technology.

The website I am looking to scrape is: https://inzamelkalender.meerlanden.nl/modules/800bf8d7-6dd1-4490-ba9d-b419d6dc8a45/kalender/calendar.html?3000055233&Heereweg---4%20%20%20---2161AG---Lisse---Lisse

Disclaimer: the address used in the link is obviously not mine, but belongs to a hotel.

Any chance someone can help me with this?

Many thanks,
Casper.

The page you linked gets its data by sending a POST request to this URL:
https://wasteprod2api.ximmio.com/api/GetCalendar

You might be able to do the same thing to get the data directly. You’ll probably need to examine the POST that’s being sent and mimic some of the header values.
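To make the suggestion above concrete, here is a minimal Python sketch of mimicking such a POST outside openHAB. It only builds the request; the sending step is left commented out. The helper name `build_calendar_request` and the `extra_headers` parameter are hypothetical, the payload is a placeholder, and which headers the API actually requires is an assumption you would need to verify in the browser's network tab.

```python
import json
import urllib.request

API_URL = "https://wasteprod2api.ximmio.com/api/GetCalendar"


def build_calendar_request(payload, extra_headers=None):
    """Build (but do not send) a JSON POST request for the calendar endpoint.

    extra_headers is a hypothetical hook for any header values copied
    from the browser's request, should the API turn out to need them.
    """
    headers = {"Content-Type": "application/json"}
    headers.update(extra_headers or {})
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


# Placeholder payload: copy the real body from the intercepted POST.
example_payload = {"companyCode": "800bf8d7-6dd1-4490-ba9d-b419d6dc8a45"}
request = build_calendar_request(example_payload)

# To actually send it (this performs a network call):
# with urllib.request.urlopen(request, timeout=10) as response:
#     calendar = json.load(response)
```

Building the request separately from sending it makes it easy to compare the body and headers against what the browser sends before hitting the API.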

Hi namcaccr,

Many thanks for the push in the right direction - it was basically all I needed. I was able to intercept the POST message and recreate it with the entry below in my rules file:

rule "Update ophaaldagen"
when
    // Run once a day at midnight
    Time cron "0 0 0 * * ?"
then
    val String MY_URL = 'https://wasteprod2api.ximmio.com/api/GetCalendar'
    // JSON body copied from the intercepted POST; community and uniqueAddressID are redacted here
    val String myData = '{"companyCode":"800bf8d7-6dd1-4490-ba9d-b419d6dc8a45","startDate":"2019-04-13","endDate":"2019-05-02","community":"BLOCKED","uniqueAddressID":"BLOCKED"}'
    val Vuilophalen = sendHttpPostRequest(MY_URL, "application/json", myData)

    // Each dataList entry is posted to its own item
    val GFTs = transform("JSONPATH", "$.dataList[0].pickupDates", Vuilophalen)
    GFT.postUpdate(GFTs)
    val Papiers = transform("JSONPATH", "$.dataList[1].pickupDates", Vuilophalen)
    PapierOphalen.postUpdate(Papiers)
    val PMTOphalens = transform("JSONPATH", "$.dataList[2].pickupDates", Vuilophalen)
    PMTOphalen.postUpdate(PMTOphalens)
end

One last thing to sort out is making the date range in the POST message dynamic: the start and end dates are currently hard-coded, and simply using a broad range would return many dates that then have to be filtered. That’s something to work through tonight.
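As a rough illustration of the dynamic date range, the Python sketch below computes the start and end dates from today's date and then picks the next upcoming pickup from a returned list. The function names, the 21-day default window, and the assumption that `pickupDates` contains ISO 8601 strings are all mine and not confirmed by the API; the same logic would still have to be ported into the rule.

```python
import json
from datetime import date, datetime, timedelta


def build_payload(today, days_ahead=21, community="BLOCKED", address_id="BLOCKED"):
    """Return the JSON body with a date window starting at `today`.

    days_ahead is an arbitrary illustrative default; community and
    address_id stay redacted, as in the rule above.
    """
    end = today + timedelta(days=days_ahead)
    return json.dumps({
        "companyCode": "800bf8d7-6dd1-4490-ba9d-b419d6dc8a45",
        "startDate": today.isoformat(),
        "endDate": end.isoformat(),
        "community": community,
        "uniqueAddressID": address_id,
    })


def next_pickup(pickup_dates, today):
    """Return the earliest pickup date on or after `today`, or None.

    Assumes each entry is an ISO 8601 string such as "2019-04-16T00:00:00".
    """
    parsed = [datetime.fromisoformat(d).date() for d in pickup_dates]
    upcoming = sorted(d for d in parsed if d >= today)
    return upcoming[0] if upcoming else None


payload = build_payload(date.today())
```

Keeping the window short means fewer dates come back, and `next_pickup` covers the case where the response still contains past or unsorted entries.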

Many thanks!
