  1. #1
    Regular Coder
    Join Date
    Feb 2008
    Posts
    119
    Thanks
    23
    Thanked 2 Times in 2 Posts

    user_agent listing

    I'm trying to list all of the spiders that visit my site.
    I wrote this array, but I don't know whether it works.
    Either it doesn't work, no spiders have visited my site, or I have the user_agent handles wrong.
    Looking at the code, can anyone see whether I've missed anything?

    I searched for a list of common crawlers, but haven't found a proper one yet.
    If anyone knows of a decent list, please point me towards it.

    PHP Code:
    $bot = array("Googlebot","Google Feedfetcher","Google","AdsBot-Google","Baidu","Exabot",
       "Scooter","FAST-WebCrawler","Ask Jeeves/Teoma","Slurp","HenryTheMiragorobot",
       "Lycos_Spider_(T-Rex)","Yahoo","Yahoo!","Yahoo! Slurp China","Yahoo! Slurp",
       "Majestic-12","W3C","SEO Crawler","MSNbot Media","MSN","MSNbot","Gigabot","Alexa");

    foreach ($bot as $v) {
       if (eregi("$v", $HTTP_USER_AGENT)) {
          echo "<span class='botcolor'>".$v."[BOT]</span>&nbsp;&nbsp;";
       }
    }

    Last edited by Jedi Knight; 01-24-2010 at 09:11 PM.

  • #2
    Regular Coder
    Join Date
    Feb 2008
    Posts
    119
    Thanks
    23
    Thanked 2 Times in 2 Posts
    OK, I had the user_agent handles wrong; I think they are correct now.
    I also changed my code to tidy it up a bit.
    I still don't know if it works, because there are no bots on my site at present.
    I've shortened the list for posting here, but I'll post the whole list (50 bots) if it works and anyone's interested.

    PHP Code:

    $bot = array(
       'AdsBot-Google' => 'AdsBot [Google]',
       'ia_archiver' => 'Alexa [Bot]',
       'Scooter/' => 'Alta Vista [Bot]',
       'Ask Jeeves' => 'Ask Jeeves [Bot]',
    );
    $bot = str_replace(" ", " ", $bot);
    foreach ($bot as $key => $value)
    {
       if (strcmp($key, $_SERVER['HTTP_USER_AGENT']) == 0) {
          echo "<span class='botcolor'>".$value."</span>&nbsp;&nbsp;";
       }
    }

    Last edited by Jedi Knight; 01-24-2010 at 03:26 AM.

  • #3
    Regular Coder
    Join Date
    Feb 2008
    Posts
    119
    Thanks
    23
    Thanked 2 Times in 2 Posts
    Well, it works, sort of.
    There's no use in posting the rest of the list, as most of them are not correct.
    If I get them all sorted out and anyone else wants them, I can post them then.

    As long as the code works, that's all the help I needed.

  • #4
    Regular Coder
    Join Date
    Feb 2008
    Posts
    119
    Thanks
    23
    Thanked 2 Times in 2 Posts
    OK, I still need help.
    How do I add a wildcard to the user_agent string?
    Should it be added into the array or the string I'm comparing?

    Like:
    PHP Code:
    $bot = array(
    'AdsBot-Google'+wildcard character => 'AdsBot [Google]'
    Or:
    PHP Code:
       if(strcmp($key+wildcard character,$_SERVER['HTTP_USER_AGENT']) == 0) { 

  • #5
    Senior Coder
    Join Date
    Jul 2009
    Location
    South Yorkshire, England
    Posts
    2,318
    Thanks
    6
    Thanked 304 Times in 303 Posts
    Quote, originally posted by Jedi Knight:
    OK, I still need help.
    How do I add a wildcard to the user_agent string?

    Change this line:

    Code:
    if(strcmp($key,$_SERVER['HTTP_USER_AGENT']) == 0)
    to:

    Code:
    if (strpos(strtolower($_SERVER['HTTP_USER_AGENT']), strtolower($key)) !== false)
    Last edited by MattF; 01-24-2010 at 10:44 PM.
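
The substring check above can be wrapped in a small function so it is easy to try against sample strings. This is only a sketch: `match_bots` and the sample user-agent string are made up for illustration, and `stripos()` is swapped in for the `strpos(strtolower(...))` pair because it does the same case-insensitive search in one call.

```php
<?php
// Hypothetical helper (not from the thread): returns the display label of
// every bot signature that appears anywhere in the user-agent string.
function match_bots($userAgent, array $bots)
{
    $found = array();
    foreach ($bots as $signature => $label) {
        // stripos() is a case-insensitive substring search.
        // It returns false on no match, and may return 0 on a match at the
        // start of the string, hence the strict !== comparison.
        if (stripos($userAgent, $signature) !== false) {
            $found[] = $label;
        }
    }
    return $found;
}

// Signature => label pairs taken from the shortened list in post #2.
$bots = array(
    'AdsBot-Google' => 'AdsBot [Google]',
    'ia_archiver'   => 'Alexa [Bot]',
    'Scooter/'      => 'Alta Vista [Bot]',
    'Ask Jeeves'    => 'Ask Jeeves [Bot]',
);

// Sample user-agent string, invented for this example.
$sample = 'Mozilla/5.0 (compatible; Ask Jeeves/Teoma)';
print_r(match_bots($sample, $bots));
```

With the sample string, only the 'Ask Jeeves' signature matches. Because the check is a substring search, no wildcard character is needed in either the array keys or the comparison.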

  • #6
    Regular Coder
    Join Date
    Feb 2008
    Posts
    119
    Thanks
    23
    Thanked 2 Times in 2 Posts
    I think I realize why it isn't working.
    The check only looks at the user agent of whoever is loading the page, so I would have to actually be the bot before I would see it in the list.

    So how would I list all bots without using a database?
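
One possible approach, sketched under assumptions rather than taken from the thread, is to append each detected bot to a flat text file and print that file's contents on every page load. The function name `record_bot` and the file name `bots_seen.txt` are hypothetical, and the web server would need write permission on that file.

```php
<?php
// Hypothetical helper: append $label to a flat file unless it is already
// recorded, and return the full list of labels seen so far.
function record_bot($label, $file)
{
    // file() reads the file into an array, one element per line.
    $seen = is_file($file)
        ? file($file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES)
        : array();
    if (!in_array($label, $seen, true)) {
        // LOCK_EX takes an exclusive lock for the duration of the append.
        file_put_contents($file, $label . PHP_EOL, FILE_APPEND | LOCK_EX);
        $seen[] = $label;
    }
    return $seen;
}

$bots = array(
    'AdsBot-Google' => 'AdsBot [Google]',
    'ia_archiver'   => 'Alexa [Bot]',
);
$logFile   = __DIR__ . '/bots_seen.txt'; // hypothetical location
$userAgent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';

// Record the current visitor if it matches a known signature.
foreach ($bots as $signature => $label) {
    if (stripos($userAgent, $signature) !== false) {
        record_bot($label, $logFile);
    }
}

// Show every bot recorded so far, whoever is viewing the page.
$seen = is_file($logFile)
    ? file($logFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES)
    : array();
foreach ($seen as $label) {
    echo "<span class='botcolor'>" . htmlspecialchars($label) . "</span>&nbsp;&nbsp;";
}
```

This keeps one line per distinct bot, so the file stays small; logging a timestamp per visit would need a different layout.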

