  1. #1
    Regular Coder
    Join Date
    Mar 2007

    Selective Web Scraping

    Hi all,

    First off, is cURL the best method for web scraping? I've heard a bit about the new DOMDocument::loadHTMLFile(); is that just as good? What about file_get_contents()?
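For comparison, here is a minimal sketch of two of the approaches mentioned above, under some assumptions: the sample markup, the `plot` class name, and the temp file are made up for illustration so the code runs offline. With `allow_url_fopen` enabled, both calls should also accept a URL in place of the local path.

```php
<?php
// Write a small sample page to a temp file so the sketch runs offline.
// (The markup and the "plot" class are hypothetical, not from a real site.)
$path = tempnam(sys_get_temp_dir(), 'page');
file_put_contents($path, '<html><head><title>Example Movie</title></head>'
    . '<body><p class="plot">A short plot summary.</p></body></html>');

// Option 1: file_get_contents() returns the raw markup as one string,
// which is then searched with a regular expression.
$raw = file_get_contents($path);
preg_match('/<title>(.*?)<\/title>/s', $raw, $m);
$titleFromRegex = $m[1];

// Option 2: DOMDocument::loadHTMLFile() loads and parses in one step,
// so elements can be selected with XPath instead of a regex.
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc = new DOMDocument();
$doc->loadHTMLFile($path);
$xpath = new DOMXPath($doc);
$plot = $xpath->query('//p[@class="plot"]')->item(0)->textContent;

unlink($path);
echo $titleFromRegex, "\n", $plot, "\n";
```

cURL is not shown here; it mainly adds control over the request itself (timeouts, headers, user agent), and its output can be handed to either approach.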

    Now to my main question: is there a way to selectively grab content from an HTML file instead of grabbing the entire contents and then extracting information with preg_match? For example, load the HTML only until the first preg_match succeeds and then stop loading. Weird, I know, but I was just wondering if it could be done.
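One way the "load until the first match" idea could look is sketched below. It is only a sketch: `readUntilMatch`, its parameters, and the sample markup are hypothetical names chosen for illustration, and the demo reads from an in-memory stream so it runs without network access. It assumes PHP 7.1 or later for the nullable return type.

```php
<?php
/**
 * Read from an open stream in small chunks and stop at the first regex match.
 * Returns the preg_match captures, or null if the stream ends (or $maxBytes
 * is reached) without a match.
 */
function readUntilMatch($stream, string $pattern, int $chunkSize = 1024, int $maxBytes = 1048576): ?array
{
    $buffer = '';
    while (!feof($stream) && strlen($buffer) < $maxBytes) {
        $chunk = fread($stream, $chunkSize);
        if ($chunk === false || $chunk === '') {
            break; // read error or nothing more to read
        }
        $buffer .= $chunk;
        // Re-test the whole buffer so a match split across two chunks is still found.
        if (preg_match($pattern, $buffer, $matches) === 1) {
            return $matches;
        }
    }
    return null;
}

// Demo on an in-memory stream; the markup is made-up sample data.
$html = '<html><head><title>Example Movie</title></head><body>'
      . str_repeat('<p>filler</p>', 1000) . '</body></html>';
$stream = fopen('php://memory', 'r+');
fwrite($stream, $html);
rewind($stream);

$matches = readUntilMatch($stream, '/<title>(.*?)<\/title>/s', 64);
$bytesRead = ftell($stream); // how far into the stream the loop got
fclose($stream);

echo $matches[1], "\n";
echo "Read $bytesRead of ", strlen($html), " bytes\n";
```

With `allow_url_fopen` enabled, the same function should work on a handle from `fopen($url, 'r')`, although the server may already have sent more data than was read before the handle is closed. With cURL, a similar early stop is possible by setting `CURLOPT_WRITEFUNCTION` to a callback that returns a byte count different from the chunk length once the match is found, which aborts the transfer.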

    Finally, I've been getting into making simple little web applications for myself. The most recent one I've been thinking of building grabs movie information (RT ratings, IMDb plot summary, Apple trailer, etc.) and displays it in one location. Would something like this be ethically sound? If so, are there better ways to scrape that information than the methods above, i.e. are those the fastest and simplest options?

    Last edited by mlmorg; 03-16-2009 at 04:33 AM.

