Proliphix Web Scraper Help

Posted on
Sun Sep 20, 2009 4:29 pm
HFTobeason offline
Posts: 297
Joined: Nov 07, 2003

Proliphix Web Scraper Help

I have two Proliphix thermostats that are in a remote location. Both report to the Proliphix Remote Management web site. I am trying to figure out how to scrape the data off that web site into Indigo variables. The issue I'm faced with is figuring out how to log in to the Proliphix site. The page, "https://access.proliphix.com/Login.php" requires a username/password response. I cannot figure out how to get curl to work in this case. Any help/ideas much appreciated. Thank you.

Posted on
Sun Sep 20, 2009 5:56 pm
jay (support) offline
Site Admin
Posts: 18200
Joined: Mar 19, 2008
Location: Austin, Texas

Re: Proliphix Web Scraper Help

HFTobeason wrote:
I have two Proliphix thermostats that are in a remote location. Both report to the Proliphix Remote Management web site. I am trying to figure out how to scrape the data off that web site into Indigo variables. The issue I'm faced with is figuring out how to log in to the Proliphix site. The page, "https://access.proliphix.com/Login.php" requires a username/password response. I cannot figure out how to get curl to work in this case. Any help/ideas much appreciated. Thank you.


I'm not positive this will work, but many apps that accept a URL with authentication expect the credentials embedded in the URL, in this format:

https://username:password@hostname.com/whatever

Jay (Indigo Support)

Posted on
Sun Sep 20, 2009 9:48 pm
berkinet offline
Posts: 3290
Joined: Nov 18, 2008
Location: Berkeley, CA, USA & Mougins, France

(No subject)

Check the curl man page.

Posted on
Mon Sep 21, 2009 8:26 am
HFTobeason offline
Posts: 297
Joined: Nov 07, 2003

(No subject)

Thank you both for your replies. I have tried the form
Code:
curl -k -u UserName:PassWord https://access.proliphix.com/Login.php
with no success - curl returns the HTML of the login page, and does not seem to submit the UserName:PassWord credentials to get to the next page. From the reading I've done, it seems as if I may have multiple layers of problems here - even if I figure out how to get past the login page, the Proliphix site may require cookies, and the page I ultimately want to scrape is presented in a frame. Any further guidance most appreciated.
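
A likely reason for this result: the -u option makes curl send HTTP Basic credentials in a request header, while a page such as Login.php usually expects the username and password as fields of an HTML form submitted by POST, so the server ignores the header and serves the login form again. A first step is to read the form's field names out of the page itself. The sketch below is only that, a sketch: the helper name list_form_fields is made up for illustration, and it only finds tags that sit on a single line of the HTML.

```shell
# Print the <form> and <input> tags of a page so that the form's action URL
# and the name= attributes of its fields can be read off.
# -s hides the progress meter; -k skips certificate verification, as in the
# command above.
list_form_fields() {   # usage: list_form_fields PAGE_URL
    curl -s -k "$1" | grep -i -o -E '<(form|input)[^>]*>'
}
```

Calling list_form_fields https://access.proliphix.com/Login.php should list whatever form and input tags the login page contains. What they are actually named on the Proliphix site is not known here.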

Posted on
Mon Sep 21, 2009 11:03 am
berkinet offline
Posts: 3290
Joined: Nov 18, 2008
Location: Berkeley, CA, USA & Mougins, France

(No subject)

HFTobeason wrote:
...From the reading I've done, it seems as if I may have multiple layers of problems here - even if I figure out how to get past the login page, the Proliphix site may require cookies, and the page I ultimately want to scrape is presented in a frame. Any further guidance most appreciated.

Sorry, I didn't realize the application was that complex. I believe curl can do what you want, but you probably need someone who has a lot more experience with curl than you are likely to find here (on the Indigo forums). I suggest you try the curl mailing lists. Note there are also commercial providers of curl support. They are listed under support on the curl web site.

Please do share your results.

Posted on
Mon Sep 21, 2009 11:49 am
seanadams offline
Posts: 489
Joined: Mar 19, 2008
Location: Saratoga, CA

(No subject)

This guy has a program for polling and graphing the data from a Proliphix: http://www.anders.com/projects/thermostat-graph/

I just had a brief look at it - he appears to be using a URL which polls the values by OID (in a direct way - not screen-scraping).

You should be able to modify his script, strip out all the graphing stuff, and just have it dump the values you're interested in to STDOUT. Or maybe use it to help you figure out how to craft the right query with curl.

Posted on
Mon Sep 21, 2009 11:59 am
seanadams offline
Posts: 489
Joined: Mar 19, 2008
Location: Saratoga, CA

(No subject)

Check this out - it includes curl examples: http://rtc.rubyforge.org/svn/trunk/docs ... _R1_11.pdf

Posted on
Mon Sep 21, 2009 12:01 pm
berkinet offline
Posts: 3290
Joined: Nov 18, 2008
Location: Berkeley, CA, USA & Mougins, France

(No subject)

seanadams wrote:
This guy has a program for polling and graphing the data from a Proliphix: http://www.anders.com/projects/thermostat-graph/

I just had a brief look at it - he appears to be using a URL which polls the values by OID (in a direct way - not screen-scraping)...

I think HFTobeason's problem is a little different. He is not on the same network as his thermostats and needs to get the data via a web reflector (Proliphix Remote Management web site), thus the need for curl to log into the reflector web site first.

If he could punch a hole in his firewall then he could probably read the data directly, as you suggest. But, I assume he has discarded that idea for one reason or another.

Posted on
Mon Sep 21, 2009 12:08 pm
seanadams offline
Posts: 489
Joined: Mar 19, 2008
Location: Saratoga, CA

(No subject)

berkinet wrote:
I think HFTobeason's problem is a little different. He is not on the same network as his thermostats and needs to get the data via a web reflector (Proliphix Remote Management web site), thus the need for curl to log into the reflector web site first.


Ah, ok. Well in that case it IS possible to handle cookie-based authentication using curl, with two successive commands. http://www.google.com/search?q=cookie+login+curl
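
The two successive commands can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a tested recipe for the Proliphix site: the form field names "username" and "password", the cookie jar path, and the helper names site_login and fetch_page are all hypothetical, and the real field names have to be taken from the login page's HTML.

```shell
# Where the session cookies are kept between the two commands (hypothetical path).
COOKIE_JAR="${COOKIE_JAR:-/tmp/proliphix_cookies.txt}"

# Step 1: POST the login form and save any cookies the server sets (-c).
# "username" and "password" are ASSUMED field names; replace them with the
# name= attributes used by the real login form.
site_login() {   # usage: site_login LOGIN_URL USER PASS
    curl -s -k -L -c "$COOKIE_JAR" \
         --data-urlencode "username=$2" \
         --data-urlencode "password=$3" \
         -o /dev/null "$1"
}

# Step 2: request a page, sending the saved cookies back (-b).
fetch_page() {   # usage: fetch_page PAGE_URL
    curl -s -k -b "$COOKIE_JAR" "$1"
}
```

Usage would be site_login https://access.proliphix.com/Login.php UserName PassWord, followed by fetch_page with the URL of the page to scrape. If the data sits inside a frame, the URL to fetch is the one in that frame's src attribute rather than the URL of the outer page.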

Posted on
Wed Nov 04, 2009 2:33 pm
lombrano offline
Posts: 19
Joined: Jul 09, 2008

Why not use the API

Why not get the values directly from the thermostat? I know they have a web interface and an HTTP API that you can call directly. The API is not public, but you can ask support to send the documentation to you. It is free for personal use.
Antoio

Posted on
Thu Nov 05, 2009 3:38 pm
sboutros offline
Posts: 2
Joined: Sep 24, 2008
Location: Oakland, CA

(No subject)

HFTobeason, did you ever find a way to get your data?

Sam

Sam Boutros
-----------------------
http://inthrma.com/

Posted on
Thu Nov 05, 2009 8:30 pm
HFTobeason offline
Posts: 297
Joined: Nov 07, 2003

(No subject)

No, I've totally failed to figure this out.

For the record, I can't get directly to the thermostats because they're behind a modem/router which DOES NOT have a static IP, and there is no computer at that location to run a DynDNS-type client, nor does the modem/router have built-in DDNS capability.

I'm at a loss.

Posted on
Thu Nov 05, 2009 9:39 pm
berkinet offline
Posts: 3290
Joined: Nov 18, 2008
Location: Berkeley, CA, USA & Mougins, France

(No subject)

Don't give up hope just yet... You might try taking a look at iMacros.
The iOpus web site wrote:
iMacros lets you record and replay repetitious work. iMacros programmatically interacts with any and all websites. It fills out forms and automates the download and upload of text, images, files and web pages. You can import or export data to/from using CSV & XML files, databases or any other source to and from web applications.

They claim to have a Mac version in beta (scroll to the bottom of the page). They also have a free Firefox Add-In. I gave it a quick try and it did learn to log in to the Proliphix remote access web site. I didn't try it, but I'd guess you could create a macro to log in, select your thermostat, and save the contents of the web page (or better, the source) to parse later.

If you used the Firefox Add-In you would probably need to figure out how to call the macro from an AppleScript - it could then continue to parse the saved HTML data after the macro finished.

Posted on
Thu Nov 05, 2009 10:00 pm
berkinet offline
Posts: 3290
Joined: Nov 18, 2008
Location: Berkeley, CA, USA & Mougins, France

(No subject)

BTW, if you don't already know, there is a Yahoo Group for Proliphix owners.

Since I also have an interest in this issue, I posted your query there. I'll let you know if there is any response.

Posted on
Thu Nov 05, 2009 10:46 pm
matt (support) offline
Site Admin
Posts: 21411
Joined: Jan 27, 2003
Location: Texas

(No subject)

I use the iMacros Firefox plug-in. It works pretty well. Every month I have it download various PDF statements from different Web sites. It won't be an issue with the Proliphix, but the only problem I have is Website pages changing and breaking my scripts (no fault of iMacros).
