Jedi Knight
01-23-2010, 10:08 PM
I'm trying to list all of the spiders that visit my site.
I wrote this array, but don't know if it works or not.
Either it doesn't work, no spiders have visited my site, or I have the user_agent handles wrong.
Looking at the code, can anyone see if I've missed anything?

I searched for a list of common crawlers, but haven't found a proper one yet.
If anyone knows of a decent list, please point me towards it.

$bot = array("Googlebot","Google Feedfetcher","Google","AdsBot-Google","Baidu","Exabot","Scooter","FAST-WebCrawler","Ask Jeeves/Teoma","Slurp","HenryTheMiragorobot","Lycos_Spider_(T-Rex)","Yahoo","Yahoo!","Yahoo! Slurp China","Yahoo! Slurp","Majestic-12","W3C","SEO Crawler","MSNbot Media","MSN","MSNbot","Gigabot","Alexa");

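foreach ($bot as $v) {
    // Case-insensitive substring check against the visitor's user agent.
    // stripos() replaces eregi(), and $_SERVER replaces the old register_globals variable.
    if (stripos($_SERVER['HTTP_USER_AGENT'], $v) !== false) {
        echo "<span class='botcolor'>".$v."[BOT]</span>&nbsp;&nbsp;";
    }
}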

Jedi Knight
01-24-2010, 03:45 AM
OK, I had the user_agent handles wrong, I think they are correct now.
I also changed my code to pretty it up a bit.
I still don't know if it works, since there are no bots on my site at present.
I've shortened the list for posting here, but I'll post the whole list (50 bots) if it works and anyone's interested.

$bot = array(
    'AdsBot-Google' => 'AdsBot [Google]',
    'ia_archiver' => 'Alexa [Bot]',
    'Scooter/' => 'Alta Vista [Bot]',
    'Ask Jeeves' => 'Ask Jeeves [Bot]',
    // rest of the list left out for posting
);
$bot = str_replace(" ", " ", $bot);
foreach ($bot as $key => $value) {
    // Compare each key with the visitor's user agent
    if(strcmp($key,$_SERVER['HTTP_USER_AGENT']) == 0) {
        echo "<span class='botcolor'>".$value."</span>&nbsp;&nbsp;";
    }
}

Jedi Knight
01-24-2010, 04:29 AM
Well it works, sorta.
There's no use in posting the rest of the list, as most of them are not correct.
If I get them all sorted out, and anyone else wants them, I can post them at that time.

As long as the code works, that's all the help I needed.

Jedi Knight
01-24-2010, 10:10 PM
OK, I still need help.
How do I add a wildcard to the user_agent string?
Should it be added into the array or the string I'm comparing?


$bot = array(
'AdsBot-Google'+wildcard character => 'AdsBot [Google]',


if(strcmp($key+wildcard character,$_SERVER['HTTP_USER_AGENT']) == 0) {

01-24-2010, 11:42 PM
OK, I still need help.
How do I add a wildcard to the user_agent string?

Change this line:

if(strcmp($key,$_SERVER['HTTP_USER_AGENT']) == 0)

to:

if (strpos(strtolower($_SERVER['HTTP_USER_AGENT']), strtolower($key)) !== false)
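
To see why the change helps: strcmp() only returns 0 when the two strings are identical, while a real user agent usually carries extra text around the bot name. Here is a rough sketch of the substring approach wrapped in a function so it can be tried without waiting for a bot to show up. The function name findBot() and the sample user-agent string are made up for illustration, and stripos() is used as a shorthand for the strpos(strtolower(...)) combination.

```php
<?php
// Hypothetical helper (not from the original posts): returns the label of the
// first bot whose key occurs anywhere in the user-agent string, or null.
function findBot(array $bots, $userAgent)
{
    foreach ($bots as $key => $label) {
        // stripos() is a case-insensitive substring search. Compare with
        // !== false, because a match at the very start returns position 0.
        if (stripos($userAgent, $key) !== false) {
            return $label;
        }
    }
    return null;
}

$bots = array(
    'AdsBot-Google' => 'AdsBot [Google]',
    'ia_archiver'   => 'Alexa [Bot]',
    'Scooter/'      => 'Alta Vista [Bot]',
    'Ask Jeeves'    => 'Ask Jeeves [Bot]',
);

// Sample user-agent string, for illustration only.
$sample = 'AdsBot-Google (+http://www.google.com/adsbot.html)';

// Exact comparison fails because the strings are not identical.
var_dump(strcmp('AdsBot-Google', $sample) == 0);

// Substring search finds the key inside the longer string.
var_dump(findBot($bots, $sample));
```

Passing the user agent in as a parameter, rather than reading $_SERVER inside the function, is just a convenience so sample strings can be fed in by hand.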

Jedi Knight
01-25-2010, 09:56 PM
I think I realize why it isn't working.
The check only looks at the user agent of whoever is loading the page, so I would have to actually be the bot before I would see it in the list.

So how would I list all bots, without the use of a db?
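
One possible approach without a db, sketched under some assumptions: every time a page loads, check the visitor's user agent, and if it matches a known bot, append a line to a plain text file. The list shown on the page is then read back from that file. The file location and the function names recordBotVisit() and listRecordedBots() are hypothetical, and the web server would need write access to wherever the file lives.

```php
<?php
// Hypothetical log location; any path the web server can write to would do.
$logFile = sys_get_temp_dir() . '/bot_visits.log';

// If the user agent matches a known bot, append one tab-separated line
// (timestamp, label) to the log file and return the label; otherwise null.
function recordBotVisit(array $bots, $userAgent, $logFile)
{
    foreach ($bots as $key => $label) {
        if (stripos($userAgent, $key) !== false) {
            $line = date('Y-m-d H:i:s') . "\t" . $label . "\n";
            // FILE_APPEND keeps earlier entries; LOCK_EX guards against
            // two requests writing at the same moment.
            file_put_contents($logFile, $line, FILE_APPEND | LOCK_EX);
            return $label;
        }
    }
    return null;
}

// Returns the distinct bot labels recorded so far, in order of first visit.
function listRecordedBots($logFile)
{
    if (!is_file($logFile)) {
        return array();
    }
    $labels = array();
    $lines = file($logFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    foreach ($lines as $line) {
        $parts = explode("\t", $line, 2);
        if (isset($parts[1])) {
            $labels[$parts[1]] = true; // array keys de-duplicate the labels
        }
    }
    return array_keys($labels);
}

$bot = array(
    'AdsBot-Google' => 'AdsBot [Google]',
    'ia_archiver'   => 'Alexa [Bot]',
    'Scooter/'      => 'Alta Vista [Bot]',
    'Ask Jeeves'    => 'Ask Jeeves [Bot]',
);

// Record the current visitor if it is a bot (the header may be missing).
$ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
recordBotVisit($bot, $ua, $logFile);

// Show every bot seen so far, whoever is viewing the page.
foreach (listRecordedBots($logFile) as $label) {
    echo "<span class='botcolor'>" . htmlspecialchars($label) . "</span>&nbsp;&nbsp;";
}
```

The file grows by one line per bot visit, so on a busy site it would probably need trimming now and then, or the timestamp could be used to show only recent visitors.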

