06-21-2005, 16:00   #1
Astra
Civilians

robots.txt queries

Hi All

Just wondered if you could clarify a few things for me.

a) Do the main search engines still look at the robots.txt file for indexing
purposes?

b) I once set up a robots.txt to disallow certain folders, which I wanted to
keep hidden from the rest of the world and therefore didn't want to be
spidered, and half a dozen alt.www.webmaster regulars found out about my
hidden folders straightaway because everybody looks for
www.mydomain.co.uk/robots.txt.

Somebody kindly advised me that I should put a blank robots.txt file in the
root of my hidden folders so that the search engines ignore them and
would-be funny hackers don't see any disallow list, but I feel that I may be
missing out in the search engines by not supplying them with a robots.txt
file with pages in it.

Could somebody put my mind at rest.

Many thanks.

Rgds

Robbie


 
06-21-2005, 16:00   #2
Ignoramus30369
Civilians

Re: robots.txt queries

On Tue, 21 Jun 2005 1738 +0100, Astra <No@Spam.com> wrote:
> Hi All
>
> Just wondered if you could clarify a few things for me.
>
> a) Do the main search engines still look at the robots.txt file for indexing
> purposes?


yep

> b) I once set up a robots.txt to disallow certain folders, which I wanted to
> keep hidden from the rest of the world and therefore didn't want to be
> spidered, and half a dozen alt.www.webmaster regulars found out about my
> hidden folders straightaway because everybody looks for
> www.mydomain.co.uk/robots.txt.


yep.

> Somebody kindly advised me that I should put a blank robots.txt file in the
> root of my hidden folders so that the search engines ignore them and
> would-be funny hackers don't see any disallow list, but I feel that I may be
> missing out in the search engines by not supplying them with a robots.txt
> file with pages in it.


Search engines only look for a robots.txt file in the root folder of the
site, so a robots.txt placed inside a subfolder is never read.

www.robotstxt.org
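
To make the "root folder only" point concrete, here is a minimal Python sketch of how a crawler might work out where to look. The helper name robots_url is made up for illustration, and real crawlers will differ in the details; the idea is just that the path of the page is thrown away and only the scheme and host are kept.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the one robots.txt location a crawler would check for page_url.

    Hypothetical helper for illustration: it keeps the scheme and host
    and discards the path, query and fragment.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    # Both of these resolve to the same root-level file.
    print(robots_url("http://www.mydomain.co.uk/hidden/page.html"))
    print(robots_url("http://www.mydomain.co.uk/hidden/robots.txt"))
```

Under that assumption, a blank robots.txt sitting in a hidden folder neither helps nor hurts: nothing ever asks for it.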

i

> Could somebody put my mind at rest.
>
> Many thanks.
>
> Rgds
>
> Robbie
>
>



--

 
06-21-2005, 16:00   #3
Matt Probert
Civilians

Re: robots.txt queries

Once upon a time, far far away GreyWyvern <spam@greywyvern.com>
muttered
>Yesh. All bots which comply with the de facto robots.txt standard should
>read it and obey.


No, it's a courtesy only. And one that really only applies to search
engine spiders. It certainly doesn't extend to automated surfing
systems and research robots which emulate or replicate human surfing.
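
A rough Python sketch of what "courtesy only" means in practice, using the standard library's urllib.robotparser. The function names and the "ExampleBot" user agent are invented for illustration. The check only happens if the client chooses to run it; nothing on the server side enforces it.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /hidden/
"""


def polite_may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """A well-behaved bot parses the rules and asks before fetching."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def rude_may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """A non-compliant client never consults the rules at all."""
    return True


if __name__ == "__main__":
    url = "http://www.example.com/hidden/"
    print(polite_may_fetch(ROBOTS_TXT, "ExampleBot", url))  # compliant bot stays out
    print(rude_may_fetch(ROBOTS_TXT, "ExampleBot", url))    # anything else walks in
```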

Matt

 
06-21-2005, 16:00   #4
GreyWyvern
Civilians

Re: robots.txt queries

And lo, Matt Probert didst speak in alt.www.webmaster:

> Once upon a time, far far away GreyWyvern <spam@greywyvern.com>
> muttered
>
>> Yesh. All bots which comply with the de facto robots.txt standard
>> should read it and obey.

>
> No, it's a courtesy only. And one that really only applies to search
> engine spiders. It certainly doesn't extend to automated surfing
> systems and research robots which emulate or replicate human surfing.


O_o

Did you even read what I just wrote?

Grey
 
06-21-2005, 16:00   #5
Toby Inkster
Civilians

Re: robots.txt queries

Astra wrote:

> b) I once set up a robots.txt to disallow certain folders, which I wanted to
> keep hidden from the rest of the world and therefore didn't want to be
> spidered, and half a dozen alt.www.webmaster regulars found out about my
> hidden folders straightaway because everybody looks for
> www.mydomain.co.uk/robots.txt.


Easy solution.

In robots.txt put:

User-Agent: *
Disallow: /hidden/

In "/hidden/" don't put anything important. Just include an index.html
file like this:

<html>
<title>go away</title>
<p>go away</p>
</html>

Put all your *real* hidden stuff in subdirectories whose names appear
nowhere in robots.txt, e.g.

http://www.example.com/hidden/admin-interface/
http://www.example.com/hidden/naked-pictures-of-me/
http://www.example.com/hidden/even-m...ate-than-that/
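
For anyone who wants to check the idea, here is a small Python sketch using the standard library's urllib.robotparser. The helper names and the "ExampleBot" user agent are assumptions for illustration. It shows two things: the single Disallow line still covers a subdirectory (rules match by path prefix), and the subdirectory's name never appears in the published file.

```python
from urllib.robotparser import RobotFileParser

PUBLISHED_ROBOTS_TXT = "User-Agent: *\nDisallow: /hidden/\n"
SECRET_URL = "http://www.example.com/hidden/admin-interface/"


def is_blocked(robots_txt: str, url: str, user_agent: str = "ExampleBot") -> bool:
    """True if a compliant bot would refuse to fetch url under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


def is_disclosed(robots_txt: str, url: str) -> bool:
    """True if the last path segment of url is readable in the robots.txt text."""
    name = url.rstrip("/").rsplit("/", 1)[-1]
    return name in robots_txt


if __name__ == "__main__":
    print(is_blocked(PUBLISHED_ROBOTS_TXT, SECRET_URL))    # covered by the prefix rule
    print(is_disclosed(PUBLISHED_ROBOTS_TXT, SECRET_URL))  # name is not given away
```

This only keeps the names out of robots.txt. It assumes directory listings are off and nothing links to the subdirectories; anything that really matters still wants a password.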

--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact

 
06-21-2005, 20:00   #6
Matt Probert
Civilians

Re: robots.txt queries

Once upon a time, far far away GreyWyvern <spam@greywyvern.com>
muttered

>And lo, Matt Probert didst speak in alt.www.webmaster:
>
>> Once upon a time, far far away GreyWyvern <spam@greywyvern.com>
>> muttered
>>
>>> Yesh. All bots which comply with the de facto robots.txt standard
>>> should read it and obey.

>>
>> No, it's a courtesy only. And one that really only applies to search
>> engine spiders. It certainly doesn't extend to automated surfing
>> systems and research robots which emulate or replicate human surfing.

>
>O_o
>
>Did you even read what I just wrote?


I missed the "which" qualifier. Sorry.

Matt


 
06-21-2005, 20:00   #7
GreyWyvern
Civilians

Re: robots.txt queries

And lo, Matt Probert didst speak in alt.www.webmaster:

> Once upon a time, far far away GreyWyvern <spam@greywyvern.com>
> muttered
>
>> Did you even read what I just wrote?

>
> I missed the "which" qualifier. Sorry.


No harm done, old chap

Grey
 