  1. #1
    Senior Coder
    Join Date
    Jun 2008
    Location
    New Jersey
    Posts
    2,535
    Thanks
    45
    Thanked 259 Times in 256 Posts

    Allow robots to access a page, but not visitors

    I'm wondering if there is a method I can use to allow robots/crawlers to visit a page, but forward any visitors that land there (through a search engine or by accident) back to the home page.

    I'm making a WordPress custom post type that's only meant to be viewed in a lightbox. The layout, while semantically correct for search engines, looks really bad outside of the lightbox, so while I want the site to be scanned and indexed by search crawlers, I want users to be forwarded to the home page, where they can view the items in the lightboxes as intended.

    Anyone have any idea how I can do this? I also understand if it's not a PHP thing but something I can achieve through a robots.txt file, .htaccess or something else, and I'd appreciate advice in that direction.

  • #2
    Super Moderator
    Join Date
    May 2002
    Location
    Perth Australia
    Posts
    4,040
    Thanks
    10
    Thanked 92 Times in 90 Posts
    Your question is not as straightforward as I first thought. Everything I have thought of so far requires a lot of ifs, buts and maybes. Cookies seem the ideal answer, but I don't know how all robots deal with cookies and JavaScript.

    e.g. check for a cookie value on that page and, if it does not exist, redirect to the front page. You would have to do this via JavaScript rather than PHP, though, or else the robot gets redirected as well and never reaches the page content.

    You could of course check the request against all the possible robot user-agents, though I'm not sure I'd really want to run that check on every page load.
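
The user-agent check described above could look roughly like the sketch below. It is written in Python only to show the logic, which ports directly to PHP. The token list and the names `CRAWLER_TOKENS`, `looks_like_crawler` and `action_for` are assumptions made for illustration. The list is deliberately short and incomplete, and the header it inspects is supplied by the client, so it can be faked.

```python
import re

# Illustrative, incomplete list of substrings that some well-known crawlers
# include in their User-Agent header. This list is an assumption for the
# sketch; a real site would need a maintained one.
CRAWLER_TOKENS = ("googlebot", "bingbot", "slurp", "duckduckbot",
                  "baiduspider", "yandexbot")

# One compiled, case-insensitive pattern, so each request costs one search.
CRAWLER_PATTERN = re.compile(
    "|".join(re.escape(token) for token in CRAWLER_TOKENS), re.IGNORECASE
)


def looks_like_crawler(user_agent):
    """Return True if the User-Agent string contains a known crawler token."""
    return bool(user_agent) and CRAWLER_PATTERN.search(user_agent) is not None


def action_for(user_agent):
    """Decide what to do with a request for the lightbox-only page."""
    return "serve" if looks_like_crawler(user_agent) else "redirect_home"
```

A missing or empty header is treated as an ordinary visitor here, so those requests would be redirected to the home page.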

  • #3
    Senior Coder
    Join Date
    Jun 2008
    Location
    New Jersey
    Posts
    2,535
    Thanks
    45
    Thanked 259 Times in 256 Posts
    Yeah, it's the same logic issue I've run into... how to determine what is a robot and what is a user...

  • #4
    Senior Coder
    Join Date
    Jun 2008
    Location
    New Jersey
    Posts
    2,535
    Thanks
    45
    Thanked 259 Times in 256 Posts
    Now that we're back from the weekend, I gave this another look, and still nothing. I tried to see if there were some settings I could apply for robots via a meta tag or robots.txt, but came up with nothing. I'm guessing there's no 'standard' list of robots I could check requests against or something?

  • #5
    Master Coder felgall
    Join Date
    Sep 2005
    Location
    Sydney, Australia
    Posts
    6,620
    Thanks
    0
    Thanked 645 Times in 635 Posts
    Some robots masquerade as browsers in order to bypass checks that block robots.

    Some people have their browser set so it pretends to be a robot so they can see pages the way a search engine does.

    The second of these is a lot less likely than the first, but basically there is no real way to tell the difference without applying some form of CAPTCHA (not necessarily a graphic one, but something that distinguishes between what a real person is likely to do and how a robot would behave).
    Stephen

  • #6
    Senior Coder
    Join Date
    Jun 2008
    Location
    New Jersey
    Posts
    2,535
    Thanks
    45
    Thanked 259 Times in 256 Posts
    Hm... very good points... time to tell the client we can't do it his way!

  • #7
    Regular Coder primefalcon
    Join Date
    Aug 2008
    Location
    /home/primefalcon/
    Posts
    678
    Thanks
    8
    Thanked 39 Times in 39 Posts
    Originally posted by firepages:
    you could of course check all the possible robot user-agents ... not sure I really want to check all that on every page load though

    Once checked, you could just write the result to the session.
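
A minimal sketch of that caching idea, again in Python purely for illustration: the `session` argument stands in for whatever per-visitor store the site uses (in PHP that would be `$_SESSION`), and `is_crawler_cached` and `classify` are hypothetical names. The saving only applies to clients that send the session cookie back, which a crawler may not do.

```python
def is_crawler_cached(session, user_agent, classify):
    """Run the user-agent check once per session and reuse the stored result."""
    if "is_crawler" not in session:
        # First request in this session: do the expensive check and store it.
        session["is_crawler"] = classify(user_agent)
    return session["is_crawler"]


# Usage example with a stand-in classifier that counts how often it runs.
calls = []

def classify(user_agent):
    calls.append(user_agent)
    return "bot" in user_agent.lower()

session = {}  # stand-in for a per-visitor session store
first = is_crawler_cached(session, "ExampleBot/1.0", classify)
second = is_crawler_cached(session, "ExampleBot/1.0", classify)
```

After both calls the classifier has run only once, because the second call reads the value stored in the session.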

