#1: JamesOxford (New Coder)

    WGET and the OPTIONS Indexes directive

So I just discovered wget and how powerful this tool potentially is. I would like to know how to safeguard against it, if that is at all possible. I am not really sure how it works; I just figured it out, and I am able to recursively download from a couple of my domains. I haven't tested it on my PHP code, just images, so I don't know how the server will actually send the PHP: as PHP source, or as the HTML that the PHP script outputs. If it goes over the HTTP protocol, I think it will just send the HTML markup, but I am not sure.
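For reference, the kind of recursive grab I mean looks roughly like this (example.com is just a placeholder, and the exact flags can vary):
Code:
wget --recursive --no-parent http://example.com/images/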

    Will denying Indexes with the Options directive safeguard against wget or do I have to do some more advanced configuration? Help here is appreciated.

#2: Inigoesdr (Super Moderator)
Quote Originally Posted by JamesOxford
    Will denying Indexes with the Options directive safeguard against wget or do I have to do some more advanced configuration? Help here is appreciated.
In general, unless you have an explicit need to list the files, you should disable indexing. Spiders can still crawl your pages to retrieve the images/files you use on them (wget can do this), but they can't get a list of everything in your folders and follow it recursively if you disable the indexes. They also can't see the source of your PHP files, because those are parsed by the server when they are requested. An exception would be if you named something .phps, or gave it an extension that is not handled by Apache (like .phpbak, for example).

To disable indexes for your site, put this in an .htaccess file in the document root:
    Code:
    Options -Indexes
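The .phps behavior mentioned above usually comes from a handler mapping along these lines; whether it is enabled depends on how PHP is set up on your server:
Code:
AddType application/x-httpd-php-source .phps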

#3: JamesOxford (New Coder)
Again, thanks for your help. If I disable indexes in an .htaccess file in the root directory, would I be able to override it in a sub-directory or not? There are a couple of places where indexes are convenient.

In directories where I did want indexes, would denying spiders in a robots.txt file and setting a valid-user requirement with basic authentication be sufficient to stop recursive downloads of the entire folder?

#4: Inigoesdr (Super Moderator)
Quote Originally Posted by JamesOxford
If I disable indexes in an .htaccess file in the root directory, would I be able to override it in a sub-directory or not?
    Yep.
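For example, a sub-directory's own .htaccess can turn them back on (assuming your server's AllowOverride settings permit it):
Code:
Options +Indexes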
Quote Originally Posted by JamesOxford
In directories where I did want indexes, would denying spiders in a robots.txt file and setting a valid-user requirement with basic authentication be sufficient to stop recursive downloads of the entire folder?
    No, not really. robots.txt is more of a suggestion and only well-behaved spiders will follow it. You may just end up making it easier for people to find the directories you don't want indexed... so they can index them. That is, if you are worried about bad robots to begin with.
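For what it's worth, a robots.txt entry like the one below only asks crawlers to stay out of a directory (/private/ here is a made-up example path); nothing enforces it:
Code:
User-agent: *
Disallow: /private/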

#5: JamesOxford (New Coder)
At this point it is more of a hypothetical than a true concern. The basic authentication won't stop them? Won't they get a 404 redirect instead of a 200 OK if they tried to access the directory without authenticating?

#6: Inigoesdr (Super Moderator)
Quote Originally Posted by JamesOxford
    The basic authentication won't stop them? Won't they get a 404 redirect instead of a 200 OK if they tried to access the directory without authenticating?
    Whoops, I didn't see that you were adding authentication. That should be sufficient to block recursive indexing. They will get a 403 if they can't authenticate.
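For completeness, the basic authentication setup in the protected directory's .htaccess would look something like this (the AuthName and the .htpasswd path are placeholders for your own values):
Code:
AuthType Basic
AuthName "Restricted"
AuthUserFile /full/path/to/.htpasswd
Require valid-user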


#7: JamesOxford (New Coder)
Quote Originally Posted by JamesOxford
404 redirect instead of a 200 OK if they tried to access the directory without authenticating?
I meant 403.

    Thanks again for all your help.

    BTW, how do I add the user I am quoting when I wrap text in {QUOTE}?

#8: Inigoesdr (Super Moderator)
Quote Originally Posted by JamesOxford
    BTW, how do I add the user I am quoting when I wrap text in {QUOTE}?
    The easiest way is to hit the quote button at the bottom of the post, but you can use this format too:
    [QUOTE=JamesOxford]some text[/QUOTE]

