need help from a mod_rewrite guru


RickG
10-31-2002, 02:16 PM
I'm trying to create a rewrite rule which accomplishes the following:

- If the page request is /siteindex.html
- And the User Agent IS NOT Googlebot
- redirect the request to the home page (i.e. /)

Here's the logic behind the request:

I created a site index file which is linked from the home page in order to "help" Google index a site. The good news is this has been effective.

However, despite my use of meta robots tags, Google has included the site index page in its search results. The visitor behavior I'm seeing in our Apache log files suggests that people arriving at this page from search results are not traversing the site as I'd expect (the page is really designed for robots and is not very interesting).

I've read through the mod_rewrite documentation at apache.org and Engelschall's tutorial, but I just can't get this right. Any suggestions would be appreciated. Thx!

Roy Sinclair
10-31-2002, 03:28 PM
Place this script in the <head> of your page.


<script type="text/javascript">
window.location = "./"
</script>

RickG
10-31-2002, 04:47 PM
Roy:

Thanks for the response. Can you explain how that snippet of JS will help?

Ökii
10-31-2002, 05:11 PM
Googlebot is NOT able to run javascript - ergo any javascript would only affect real visitors. :)

On the downside, the page would have to be parsed, delivered to the client and then the script run before the new page was requested.
document.location.replace('index.php');
would be better btw, as it replaces the current history entry, so the Back button doesn't land the visitor on the redirecting page again.
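A minimal sketch of that replace-based redirect, in case it helps. The helper name redirectHome and the stand-in object fakeLocation are made up for illustration; in a real page you would pass window.location, and "./" is just the target used earlier in the thread.

```javascript
// Hypothetical helper: `loc` is any object with a replace(url) method.
// In a browser you would pass window.location.
function redirectHome(loc, target) {
  // replace() swaps the current history entry instead of adding a new one
  loc.replace(target);
}

// Browser usage (assumed to sit in the <head> of siteindex.html):
//   redirectHome(window.location, "./");

// Stand-in for window.location so the sketch also runs outside a browser
const fakeLocation = {
  visited: [],
  replace(url) {
    this.visited.push(url);
  },
};

redirectHome(fakeLocation, "./");
console.log(fakeLocation.visited.join(",")); // prints "./"
```

Passing the location object in, rather than touching window directly, is only there so the behaviour can be checked without a browser.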

I'm sure a mod_rewriter will let you know the more professional way of doing it tho.

MCookie
10-31-2002, 06:56 PM
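RewriteEngine on
# Apply the rule only when the User-Agent does not start with "Googlebot"
RewriteCond %{HTTP_USER_AGENT} !^Googlebot
# ...and only when the requested path ends in siteindex.html
RewriteCond %{REQUEST_URI} siteindex\.html$
# Send everyone else to the home page with an external redirect
RewriteRule .* http://www.domain.com/index.html [R,L]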

If the user agent is not Googlebot and the request is for siteindex.html, this redirects to index.html.
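To sanity-check which requests the two conditions would catch, here is a rough JavaScript model of them. It is not mod_rewrite itself, only the same two regular expressions applied by hand; the function name shouldRedirect and the sample user agent strings are assumptions for illustration.

```javascript
// Hypothetical model of the two RewriteCond checks; not mod_rewrite itself.
function shouldRedirect(userAgent, requestUri) {
  // First condition, negated: the user agent must NOT start with "Googlebot"
  const isGooglebot = /^Googlebot/.test(userAgent);
  // Second condition: the requested path must end in siteindex.html
  const isSiteIndex = /siteindex\.html$/.test(requestUri);
  return !isGooglebot && isSiteIndex;
}

// An ordinary browser asking for the site index is redirected
console.log(shouldRedirect("Mozilla/4.0 (compatible; MSIE 6.0)", "/siteindex.html")); // true

// A user agent starting with "Googlebot" is left alone
console.log(shouldRedirect("Googlebot/2.1 (+http://www.googlebot.com/bot.html)", "/siteindex.html")); // false

// Other pages are never redirected
console.log(shouldRedirect("Mozilla/4.0 (compatible; MSIE 6.0)", "/index.html")); // false
```

Note that because of the ^ anchor, a user agent that only mentions Googlebot later in the string would count as "not Googlebot" in this model and be redirected too.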

But it's tricky. Search engines sometimes crawl with ordinary browser user-agent strings to detect cloaking, which is serving different pages to spiders than to browsers. SEs don't like that, so one day your site may disappear from Google's index.
Instead of using this, why not just optimize your homepage?