  1. #1
    Regular Coder
    Join Date
    Oct 2009
    Posts
    438
    Thanks
    9
    Thanked 7 Times in 7 Posts

    Get certain values from a website

    All,
    If I have a URL and the page has certain IDs or classes on a div or span, does anyone have good code to find them? For example, I want to find a span like <span class="street-address">1234 Old Road</span> and get the text inside it. In another scenario, like <div id="bizUrl"><a href="http://www.codingforums.com">Coding Forums</a></div>, I'd like to get the href attribute of the link.

    Any ideas on the best way to do something like this?

    Thanks in advance!

  • #2
    Senior Coder kbluhm's Avatar
    Join Date
    Apr 2007
    Location
    Philadelphia, PA, USA
    Posts
    1,509
    Thanks
    3
    Thanked 258 Times in 254 Posts
    http://simplehtmldom.sourceforge.net

    Never tried it, but it appears to be quite handy. I would imagine the following syntax would work:
    PHP Code:
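    include 'simple_html_dom.php'; // the library linked above

    $html = file_get_html( 'http://www.example.com' );

    // find() with an index returns a single element; without one it returns an array
    $text_from_class = $html->find( '.street-address', 0 )->innertext;
    $text_from_id    = $html->find( '#bizUrl a', 0 )->href;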
    Untested... but if the selectors work just like jQuery, as they claim, that should work.
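For comparison, here is a minimal sketch of the same two lookups using PHP's bundled DOM extension (DOMDocument and DOMXPath) instead of Simple HTML DOM. It is not the approach suggested in the post above; the markup is the sample from the original question, hard-coded as a string so the snippet runs on its own. On a real page the HTML would first be fetched from the URL.

```php
<?php
// Sample markup from the question; assumed here in place of a fetched page.
$page = '<span class="street-address">1234 Old Road</span>'
      . '<div id="bizUrl"><a href="http://www.codingforums.com">Coding Forums</a></div>';

$doc = new DOMDocument();
// Real-world HTML is rarely valid, so collect parser warnings silently.
libxml_use_internal_errors(true);
$doc->loadHTML($page);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Text of the first span whose class list contains "street-address".
$spans = $xpath->query(
    '//span[contains(concat(" ", normalize-space(@class), " "), " street-address ")]'
);
$street = $spans->length > 0 ? trim($spans->item(0)->textContent) : null;

// href of the first link inside the element with id="bizUrl".
$links = $xpath->query('//div[@id="bizUrl"]//a[@href]');
$href = $links->length > 0 ? $links->item(0)->getAttribute('href') : null;

echo $street, "\n";
echo $href, "\n";
```

XPath is more verbose than a CSS-style selector, but it needs no extra download and the length checks avoid calling methods on a missing element.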

  • #3
    Regular Coder
    Thanks, I'll try that. What about info in a meta tag, for example:

    <meta property="og:type" content="restaurant">
    <meta property="og:longitude" content="-87.630765">

    etc?

  • #4
    Senior Coder kbluhm's Avatar

  • #5
    Regular Coder
    Thank you for the help. I'll let you know if I have any problems.

  • #6
    Regular Coder
    OK, for the meta tags, it didn't find them because they use a property attribute instead of name, and get_meta_tags only looks at the name attribute. Is there any way I can get the content values for the property tags, like in my example?
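One possible way to read these values, sketched with PHP's bundled DOM extension: select every meta element that has a property attribute and collect its content. The helper name meta_properties is hypothetical (it is not a built-in), and the markup is the two tags from the example above, hard-coded so the snippet runs on its own.

```php
<?php
// The two tags from the example above, wrapped in a minimal page.
$page = '<html><head>'
      . '<meta property="og:type" content="restaurant">'
      . '<meta property="og:longitude" content="-87.630765">'
      . '</head><body></body></html>';

/**
 * Collect <meta property="..." content="..."> pairs into an array.
 * meta_properties is a hypothetical helper name, not a PHP built-in.
 */
function meta_properties($html)
{
    $doc = new DOMDocument();
    // Collect parser warnings silently; scraped HTML is often invalid.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $result = array();
    $xpath = new DOMXPath($doc);
    // Match only meta elements that carry a property attribute.
    foreach ($xpath->query('//meta[@property]') as $meta) {
        $result[$meta->getAttribute('property')] = $meta->getAttribute('content');
    }
    return $result;
}

$og = meta_properties($page);
echo $og['og:type'], "\n";
echo $og['og:longitude'], "\n";
```

If two tags share the same property, the later one overwrites the earlier one in this sketch; storing an array of values per key would be the obvious variation.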

  • #7
    Regular Coder
    Yeah, it wouldn't work with the <h1> tag so that is why I opened up a new tag. Anyway, I did have a question on that which I couldn't find an answer to. Based on what you saw, can you get text back and then do another find on the text you received back? It didn't look like that was an option, but it would be helpful.

    When I do something like the following:
    PHP Code:
    $biz_info_content = $biz_page->find('div#bizInfoContent')->outertext;
    $cat_display = $biz_info_content->find('span#cat_display')->outertext;
    I get an error saying: Fatal error: Call to a member function find() on a non-object

    The $biz_page in this example is the original URL I'm trying to crawl.
    Last edited by treeleaf20; 08-24-2011 at 01:39 PM. Reason: Update with variable descriptions.
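A likely cause of the error: outertext is a plain string, so nothing that comes after it can call find(). The nested lookup needs the element object itself. In Simple HTML DOM that would mean keeping the result of find() with an index (for example find('div#bizInfoContent', 0)) and calling find() on that. The sketch below shows the same two-step idea with PHP's bundled DOM extension so it runs without the library. The markup and the text "Restaurants" are made-up sample data; only the two ids come from the post above.

```php
<?php
// Made-up sample markup; only the ids come from the post above.
$page = '<div id="bizInfoContent">'
      . '<span id="cat_display">Restaurants</span>'
      . '</div>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($page);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

// Step 1: keep the container as a node object, not as a string.
$container = $xpath->query('//div[@id="bizInfoContent"]')->item(0);

$cat_display = null;
if ($container !== null) {
    // Step 2: search only inside the container by passing it as the context node.
    $span = $xpath->query('.//span[@id="cat_display"]', $container)->item(0);
    if ($span !== null) {
        // saveHTML($node) serialises the node with its own tag, much like outertext.
        $cat_display = $doc->saveHTML($span);
    }
}

echo $cat_display, "\n";
```

The null checks matter on real pages: if either element is missing, the code falls through with $cat_display left as null instead of raising a fatal error.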

  • #8
    Senior Coder kbluhm's Avatar
    We will need more code than that... it doesn't look like those variables hold the value returned directly by the file_get_html() function.

