
Re: [T3] bling bling


<x-flowed>--On Friday, February 13, 2004 07:25 AM -0800 Greg Merritt <gregm@vwtype3.org> wrote:

p.p.s.: If anybody knows a trick to get google to crawl
http://archive.type3.org, well, do it... (ok, putting the url in this
post is me doing a trick just for that purpose, but what the heck?)

Your robots.txt file - <http://archive.type3.org/robots.txt> shows this:
+++
# kthx # HTH HAND

User-agent: *
Disallow: /

# shoutout to my homies in emeryville
User-agent: googlebot
Disallow:
+++

The first section says "everyone keep out" and the second section says "googlebot is allowed"...

Shouldn't the two sections be reversed? The samples at <http://www.robotstxt.org/wc/exclusion-admin.html>, under "To allow a single robot", show the allow section first and the disallow section second. I would bet that the robots exclusion protocol is read top-down: googlebot sees the disallow at the top of the file, stops, and never sees the next section telling it that it's allowed to crawl the archive.
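
One way to test the ordering question instead of betting on it is to feed both orderings to a robots.txt parser and compare the answers. The sketch below uses Python's standard urllib.robotparser module. The helper name `allowed`, the robot name "otherbot", and the /index.html path are made up for illustration. This parser is only one implementation, so its answer says nothing definitive about what googlebot itself does.

```python
from urllib.robotparser import RobotFileParser

# The two records in the order quoted from the live file:
# catch-all record first, googlebot record second.
AS_POSTED = """\
User-agent: *
Disallow: /

User-agent: googlebot
Disallow:
"""

# The same two records in the order the robotstxt.org sample uses.
REVERSED = """\
User-agent: googlebot
Disallow:

User-agent: *
Disallow: /
"""


def allowed(robots_txt, agent, url="http://archive.type3.org/index.html"):
    """Return True if this parser lets `agent` fetch `url`.

    Hypothetical helper; the default URL path is an assumption.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


for label, text in (("as posted", AS_POSTED), ("reversed", REVERSED)):
    print(label,
          "googlebot:", allowed(text, "googlebot"),
          "otherbot:", allowed(text, "otherbot"))
```

With this particular parser, both orderings give the same answer: googlebot is allowed and any other robot is kept out. That hints that record order may not be the culprit for parsers that match on the User-agent name first, though a crawler that really does read top-down would still behave the way I guessed.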

- john in Albuquerque

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
List info at http://www.vwtype3.org/list | mailto:gregm@vwtype3.org

</x-flowed>