| Web Design Forums and discussions on webdesign |
| | #1 (permalink) |
| Civilians | Hi All

Just wondered if you could clarify a few things for me.

a) Do the main search engines still look at the robots.txt file for indexing purposes?

b) I once set up a robots.txt to disallow certain folders, which I wanted to keep hidden from the rest of the world and therefore didn't want to be spidered, and half a dozen alt.www.webmaster regulars found out about my hidden folders straightaway because everybody looks for www.mydomain.co.uk/robots.txt.

Somebody kindly advised me that I should put a blank robots.txt file in the root of my hidden folders so that the search engines ignore them and would-be funny hackers don't see any disallow list, but I feel that I may be missing out in the search engines by not supplying them with a robots.txt file with pages in it.

Could somebody put my mind at rest.

Many thanks.

Rgds

Robbie |
|
| | #2 (permalink) |
| Civilians | On Tue, 21 Jun 2005 17 38 +0100, Astra <No@Spam.com> wrote:

> Hi All
>
> Just wondered if you could clarify a few things for me.
>
> a) Do the main search engines still look at the robots.txt file for indexing
> purposes?

yep

> b) I once set up a robots.txt to disallow certain folders, which I wanted to
> keep hidden from the rest of the world and therefore didn't want to be
> spidered, and half a dozen alt.www.webmaster regulars found out about my
> hidden folders straightaway because everybody looks for
> www.mydomain.co.uk/robots.txt.

yep.

> Somebody kindly advised me that I should put a blank robots.txt file in the
> root of my hidden folders so that the search engines ignore them and
> would-be funny hackers don't see any disallow list, but I feel that I may be
> missing out in the search engines by not supplying them with a robots.txt
> file with pages in it.

Search engines only look for a robots.txt file in the root folder of the site, so a blank one placed inside a hidden folder is never read.

www.robotstxt.org

> Could somebody put my mind at rest.
>
> Many thanks.
>
> Rgds
>
> Robbie
>
> -- |
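To make the "root folder only" point concrete, here is a minimal Python sketch of how a crawler might work out where to look for robots.txt. The function name robots_url is made up for illustration, and the page address simply reuses the domain from the question; real crawlers may differ in detail.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the single robots.txt location checked for page_url.

    robots_url is a hypothetical helper, not part of any crawler's API.
    """
    parts = urlsplit(page_url)
    # Keep only the scheme and host; the page's own path, query string
    # and fragment are dropped, so folders never get their own file.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("http://www.mydomain.co.uk/hidden/page.html"))
# prints http://www.mydomain.co.uk/robots.txt
```

However deep the page sits, the lookup lands on the same file at the top of the site.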
|
| | #3 (permalink) |
| Civilians | Once upon a time, far far away GreyWyvern <spam@greywyvern.com> muttered

>Yesh. All bots which comply with the de facto robots.txt standard should
>read it and obey.

No, it's a courtesy only. And one that really only applies to search engine spiders. It certainly doesn't extend to automated surfing systems and research robots which emulate or replicate human surfing.

Matt |
|
| | #4 (permalink) |
| Civilians | And lo, Matt Probert didst speak in alt.www.webmaster:

> Once upon a time, far far away GreyWyvern <spam@greywyvern.com>
> muttered
>
>> Yesh. All bots which comply with the de facto robots.txt standard
>> should read it and obey.
>
> No, it's a courtesy only. And one that really only applies to search
> engine spiders. It certainly doesn't extend to automated surfing
> systems and research robots which emulate or replicate human surfing.

O_o

Did you even read what I just wrote?

Grey |
|
| | #5 (permalink) |
| Civilians | Astra wrote:

> b) I once set up a robots.txt to disallow certain folders, which I wanted to
> keep hidden from the rest of the world and therefore didn't want to be
> spidered, and half a dozen alt.www.webmaster regulars found out about my
> hidden folders straightaway because everybody looks for
> www.mydomain.co.uk/robots.txt.

Easy solution. In robots.txt put:

    User-Agent: *
    Disallow: /hidden/

In "/hidden/" don't put anything important. Just include an index.html file like this:

    <html>
    <title>go away</title>
    <p>go away</p>
    </html>

Put all your *real* hidden stuff in subdirectories. e.g.

    http://www.example.com/hidden/admin-interface/
    http://www.example.com/hidden/naked-pictures-of-me/
    http://www.example.com/hidden/even-m...ate-than-that/

--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact |
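As a rough check of the rule above, the following Python sketch feeds the same two lines to the standard library's urllib.robotparser and asks whether a compliant bot would fetch a few paths. The bot name "ExampleBot" and the helper name allowed are assumptions made for illustration; how any particular search engine interprets the file may differ.

```python
from urllib.robotparser import RobotFileParser

# The two-line robots.txt suggested in the post above.
ROBOTS_TXT = """\
User-Agent: *
Disallow: /hidden/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path: str) -> bool:
    """Would a compliant bot (hypothetical name ExampleBot) fetch this path?"""
    return parser.can_fetch("ExampleBot", "http://www.example.com" + path)


print(allowed("/index.html"))               # prints True
print(allowed("/hidden/"))                  # prints False
print(allowed("/hidden/admin-interface/"))  # prints False
```

The single Disallow line covers every subdirectory by prefix, so compliant spiders skip them all while the file itself names only /hidden/.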
|
| | #6 (permalink) |
| Civilians | Once upon a time, far far away GreyWyvern <spam@greywyvern.com> muttered

>And lo, Matt Probert didst speak in alt.www.webmaster:
>
>> Once upon a time, far far away GreyWyvern <spam@greywyvern.com>
>> muttered
>>
>>> Yesh. All bots which comply with the de facto robots.txt standard
>>> should read it and obey.
>>
>> No, it's a courtesy only. And one that really only applies to search
>> engine spiders. It certainly doesn't extend to automated surfing
>> systems and research robots which emulate or replicate human surfing.
>
>O_o
>
>Did you even read what I just wrote?

I missed the "which" qualifier. Sorry.

Matt |
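The "bots which comply" qualifier can be illustrated with a small Python sketch: the check happens inside the crawler, so a bot that never runs it is not held back by the file. The names may_fetch, PoliteBot and RudeBot, and the /private/ rule, are all made up for this example.

```python
from urllib.robotparser import RobotFileParser


def may_fetch(user_agent: str, url: str, robots_txt: str, polite: bool = True) -> bool:
    """Decide whether a crawler goes ahead and requests url.

    A polite crawler consults robots.txt first. An impolite one skips
    the check entirely, and robots.txt itself does nothing to stop it.
    """
    if not polite:
        return True  # robots.txt is never consulted
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules and URL, for illustration only.
RULES = "User-agent: *\nDisallow: /private/\n"
URL = "http://www.example.com/private/report.html"

print(may_fetch("PoliteBot", URL, RULES))                # prints False
print(may_fetch("RudeBot", URL, RULES, polite=False))    # prints True
```

Both bots see the same file; only the one that chooses to honour it stays out.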
|
| | #7 (permalink) |
| Civilians | And lo, Matt Probert didst speak in alt.www.webmaster:

> Once upon a time, far far away GreyWyvern <spam@greywyvern.com>
> muttered
>
>> Did you even read what I just wrote?
>
> I missed the "which" qualifier. Sorry.

No harm done, old chap

Grey |
|