Robots.txt (no dir) - Meta (index,follow) - *.hta (password)

05-06-2003, 01:31 PM

I have some questions about search engines and site-copying software.


Suppose I have an index.htm file, a robots.txt file and two directories (dir1 and dir2) in the root directory.


I want a search engine to index and follow all files in dir1, but not those in dir2!

Is it correct to do the following:

index.htm + <meta name="robots" content="index,follow">
robots.txt + disallow dir2

Or am I missing something?


Suppose I protect all files in dir2 using *.hta (password).

Can I include these files in index.htm without the user being asked for a password? For example:

In dir2 I have world.gif

In index.htm I have <img src="dir2/world.gif">

Or will the user be asked for a password? How does that work: is the password passed within the URL?

If the answer to the previous question is yes:

I have no *.hta file on the server. Can I make one myself and put it in the root directory?

Does the *.hta file need to be named exactly .hta (so [0.3])?
The rest of my .hta questions are answered in wsa's perfect tutorial.

I know that after someone visits a site all the files are on their hard disk anyway, so I am not really protecting anything.
However, I do not like site-copying software grabbing everything in one go. Can I prevent this? I cannot do anything server-side!

Thanks for your effort and ideas,

05-17-2003, 05:13 AM
Supposedly the robots.txt file permits or prevents robots from searching the subdirectories that you do not want them in.
Don't count on it: obeying robots.txt is voluntary, and a robot is free to ignore it.
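
For what it's worth, here is a rough way to check what a well-behaved robot would make of such a file. It is only a sketch using Python's standard urllib.robotparser module; the robots.txt text, the file names page.htm and world.gif, and the helper is_allowed are my own assumptions for illustration, not something taken from your site. It says nothing about robots that ignore the file.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt for the site root: keep all robots out of dir2.
ROBOTS_TXT = """\
User-agent: *
Disallow: /dir2/
"""


def is_allowed(path, user_agent="AnyBot"):
    """Return True if a rule-abiding robot may fetch the given path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Hypothetical paths, only there to exercise the rules above.
    for path in ("/index.htm", "/dir1/page.htm", "/dir2/world.gif"):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

Anything not matched by a Disallow line is allowed by default, so dir1 needs no entry of its own.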
However, if you want to keep robots and unauthorized people OUT of a folder, you can place an .htaccess file in that particular folder (what you called *.hta).
Calling it *.hta won't work, though.
If you are serving your site from your own computer and you are on Windows, that file name is a minor problem; however, you can load up Apache and, in the config file, rename .htaccess to whatever you want (the AccessFileName directive), and that works. If you are posting to a site hosted on the web, then you have to call it .htaccess.
Place one with restrictions in each folder where you want limited or no access,
and in each folder that is open to the public just don't put one.
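
As a rough idea of what goes into such a file, the sketch below builds the text of a password-protecting .htaccess using the standard Apache directives AuthType, AuthName, AuthUserFile and Require. The helper name build_htaccess, the realm "dir2" and the password-file path are assumptions for illustration only. The password file itself would normally be created with Apache's htpasswd tool, and the server has to allow these directives in .htaccess files (AllowOverride AuthConfig) for any of this to take effect.

```python
def build_htaccess(realm, passwd_file):
    """Return .htaccess text that asks for a user name and password."""
    lines = [
        "AuthType Basic",
        f'AuthName "{realm}"',               # text shown in the login prompt
        f'AuthUserFile "{passwd_file}"',     # quoted, since the path may contain spaces
        "Require valid-user",                # any user listed in the password file
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical realm name and password-file location.
    print(build_htaccess("dir2", "/path/to/.htpasswd"), end="")
```

Keep the password file outside the folders the web server hands out, so nobody can simply download it.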

c:\program files\apache group\apache2\htdocs\
no .htaccess
yes robots.txt
yes index.html etc. etc.
yes login.html to grant access to dir2 after password and user name are confirmed

c:\program files\apache group\apache2\htdocs\dir1
no .htaccess
no robots.txt (robots only read the one in the site root)
yes index.html etc. etc.

c:\program files\apache group\apache2\htdocs\dir2 (locked)
yes .htaccess needed to lock the folder
no robots.txt not needed
yes index.html etc. etc.