[vox-tech] PHP / CURL

Dave Margolis dave at silogram.net
Mon Sep 4 22:05:38 PDT 2006


On Sep 4, 2006, at 6:48 PM, serendipitu serendipitu wrote:

> Thanks Dave...
>
> I have a RedHat Enterprise Linux 3 machine on which I installed:
> curl 7.15.5 : with SSL
> php 4.4.4 : with curl and command line interface
>
> Unfortunately, the problem is a very basic one!  I can't even read  
> a regular html webpage from an http server, let alone the https  
> stuff.  For example:
>
> $LOGINURL = "www.yahoo.com"; //or any other http webpage
> $agent = "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4)  
> Gecko/20030624 Netscape/7.1 (ax)";
> $ch = curl_init();
> curl_setopt($ch, CURLOPT_URL,$LOGINURL);
> curl_setopt($ch, CURLOPT_USERAGENT, $agent);
> curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
> $result = curl_exec ($ch);
> curl_close ($ch);
> print   $result;
>
> I save the above lines in a test.php file and then I type this on  
> linux command line:
> > php -f test.php
> or
> > php test.php
>
> both just print the content of the .php file (the above lines!)  
> instead of the html webpage!!!!!!  and I don't know what's wrong in  
> here; that should be a small bug or a user mistake or something  
> like that....
>
> Would you give me a sample php/curl code and the necessary steps  
> for running it on a linux command line? a simple one, something  
> that does the same thing as "wget www.yahoo.com " for example.
>
> Thanks!

What you've got is perfect - except that every PHP script (web or  
CLI) needs to be wrapped in <?php ... ?> tags.  Without them, the  
interpreter treats the whole file as literal text and just echoes it  
back, which is exactly what you're seeing.

So put <?php at the beginning of your file and ?> at the end and you  
will be good to go.  The script you have dumps the HTML content to  
standard output (you'd see a close approximation of the Yahoo site if  
you ran it in a web browser).  If you want it to behave exactly like  
wget, you'd have to add a few more lines to write the contents to a  
file (fopen(), fwrite($result), fclose(), and so on).
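
For reference, here's a minimal complete test.php with the tags in  
place - just a sketch; the error check, the http:// prefix, and the  
hard-coded output file name (index.html) are only illustrations, and  
the file-writing part at the end is there purely to mimic wget:

<?php
// Fetch a page over plain HTTP and print it to standard output.
$url   = "http://www.yahoo.com";   // or any other http page
$agent = "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 Netscape/7.1 (ax)";

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);   // return the body instead of printing it directly
$result = curl_exec($ch);

if ($result === false) {
    // curl_error() tells you what went wrong (DNS, timeout, SSL, ...)
    die("curl failed: " . curl_error($ch) . "\n");
}
curl_close($ch);

print $result;

// Optional: also save a copy to disk, roughly what "wget www.yahoo.com" does.
$fp = fopen("index.html", "w");
fwrite($fp, $result);
fclose($fp);
?>

Save that as test.php, run it with "php -f test.php" (or just  
"php test.php"), and you should see Yahoo's HTML scroll by.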

To get the data you really want, you'll probably have to study the  
HTML a bit and come up with a regular expression to grab exactly what  
you want.  Once you have that, you can do whatever you want - e-mail  
it to yourself, include the scraped data on another webpage, or  
whatever else you can come up with (send yourself a text message,  
perhaps?).
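
For example, something along these lines - just a sketch that assumes  
$result still holds the HTML fetched above, and uses the page <title>  
as a stand-in for whatever you actually want to grab:

<?php
// $result is the HTML string returned by curl_exec() in the script above.
if (preg_match('/<title>(.*?)<\/title>/is', $result, $matches)) {
    $title = trim($matches[1]);
    print "Page title: $title\n";
    // From here you could mail() it to yourself, write it to a file, etc.
} else {
    print "No <title> tag found.\n";
}
?>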

Good luck,
Dave


