[vox-tech] PHP / CURL

Dave Margolis dave at silogram.net
Mon Sep 4 18:27:13 PDT 2006


On Sep 1, 2006, at 10:35 AM, serendipitu serendipitu wrote:

> I need to READ some data from that page without manually logging in  
> every 24 hours.

PHP/curl makes this pretty easy (depending on how much energy the  
site developers have put into trying to prevent screen-scraping).   
Any other language that has curl bindings can do this too  
(Perl is one that comes to mind).

You need a pretty strong understanding of PHP and a basic  
understanding of how HTTP works.  You'll need a webserver that runs  
PHP or a local machine with the PHP command line interface  
installed.  Then you'll need a script.  That script will take a  
series of steps that each represent a login, a link click, a form  
submission, or some kind of user interaction with a website.

The process basically works like this:

First you call curl_init() to get things started.

You need to call curl_setopt() any number of times to define what  
type of call you're going to make (in this case a series of HTTP  
transactions).  These curl_setopt() calls are very similar to the  
command line switches you'd throw at the command line version of curl.

Then you finish up with a curl_exec() and a curl_close().
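
To make the sequence above a bit more concrete, here is a rough sketch  
of a login followed by a page fetch.  Everything site-specific in it is  
made up: the example.com URLs, the form field names, and the helper  
names step_options() and run_step() are placeholders, and it assumes  
the site tracks the login with cookies, which may not hold for every  
site.

```php
<?php
// Hypothetical URLs; the real ones depend on the site being scraped.
const LOGIN_URL = 'https://www.example.com/login';
const DATA_URL  = 'https://www.example.com/account/data';

// Build the curl options for one step (a login, a page fetch, ...).
// Passing $postFields turns the step into a form submission.
function step_options(string $url, string $cookieFile, ?array $postFields = null): array
{
    $options = [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,        // like "curl -L"
        CURLOPT_COOKIEJAR      => $cookieFile, // like "curl -c": save cookies here
        CURLOPT_COOKIEFILE     => $cookieFile, // like "curl -b": send saved cookies
    ];
    if ($postFields !== null) {
        $options[CURLOPT_POST]       = true;
        $options[CURLOPT_POSTFIELDS] = http_build_query($postFields); // like "curl -d"
    }
    return $options;
}

// Run one step: curl_init(), a series of curl_setopt() calls,
// then curl_exec() and curl_close().
function run_step(array $options): string
{
    $ch = curl_init();
    foreach ($options as $option => $value) {
        curl_setopt($ch, $option, $value);
    }
    $body  = curl_exec($ch);
    $error = curl_error($ch);
    curl_close($ch); // has no effect on PHP 8 and later, harmless to keep
    if ($body === false) {
        throw new RuntimeException("curl failed: $error");
    }
    return $body;
}

// Usage (not run here; the field names are guesses):
// $cookies = tempnam(sys_get_temp_dir(), 'cookies');
// run_step(step_options(LOGIN_URL, $cookies, ['username' => 'me', 'password' => 'secret']));
// $html = run_step(step_options(DATA_URL, $cookies));
```

Each extra click or form submission becomes one more run_step() call  
that reuses the same cookie file.  To find the real field names, look  
at the login form's HTML source.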

It took me a lot of reading and trial and error to figure this all  
out.  I'd start here: http://www.php.net/manual/en/ref.curl.php

Every site is different, and it's difficult to tell you what to do  
without having a half.com account.

Dave

