Website Design Resources - Google Robots.txt File

Web Site Design Resources: Tips on the basic principles of good website design.


Google Robots and the Robots.txt File

Robots.txt File Explained

Web site owners use the robots.txt file to communicate with search engine robots. The file resides in the root directory of your site and contains instructions for web robots that arrive to index it.

When a well-behaved robot reaches a web site, it requests the robots.txt file first and looks there for instructions on how to proceed with the site.

Creating and using a robots.txt file gives web site owners control over which directories of their site compliant robots may visit and index, and which directories are forbidden, or "disallowed."

Create a Robots.txt File

Creating a robots.txt file is as easy as it gets. Any plain text editor can be used to create the file, even Notepad. The process is as simple as opening a new ".txt" file and naming it "robots," so that the full file name is robots.txt. Care should be taken, however, to ensure that the "s" is not omitted at the end of "robots" and that all letters in the file name are lowercase. If the name is wrong, web robots won't find your robots.txt file.

The format of the file's contents is straightforward, as the following examples illustrate.

To exclude all robots from the entire server, include the following text in the robots.txt file:

User-agent: *
Disallow: /

To allow all robots access to your entire web site, include the following text in the robots.txt file:

User-agent: *
Disallow:

To exclude all robots from the /sample1/ and /sample2/ directories, include the following text in the robots.txt file:

User-agent: *
Disallow: /sample1/
Disallow: /sample2/
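
The three rule sets above can be checked before uploading. The following is a minimal sketch using Python's standard urllib.robotparser module; the helper name is_allowed, the robot name "ExampleBot", and the page paths are assumptions made for illustration only.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(rules, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())  # parse rules from text, no network access
    return parser.can_fetch(user_agent, url)


# The three rule sets from the examples above.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
ALLOW_ALL = "User-agent: *\nDisallow:\n"
BLOCK_SAMPLES = "User-agent: *\nDisallow: /sample1/\nDisallow: /sample2/\n"

# Hypothetical pages on the site, used only for this check.
home = "http://www.zizinya.com/index.html"
sample = "http://www.zizinya.com/sample1/page.html"

print(is_allowed(BLOCK_ALL, "ExampleBot", home))        # everything is disallowed
print(is_allowed(ALLOW_ALL, "ExampleBot", sample))      # an empty Disallow allows all
print(is_allowed(BLOCK_SAMPLES, "ExampleBot", sample))  # inside a disallowed directory
print(is_allowed(BLOCK_SAMPLES, "ExampleBot", home))    # outside the disallowed directories
```

Python's parser is only one interpretation of the format, so individual search engine robots may treat edge cases differently.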

To exclude the Google robot, which identifies itself as Googlebot, from the server, include the following text in the robots.txt file:

User-agent: Googlebot
Disallow: /

To allow only the Google robot access to the server and exclude all other robots, include the following text in the robots.txt file, with a blank line separating the two records:

User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
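
To see how per-robot records are matched, here is a small sketch, again assuming Python's standard urllib.robotparser module. It uses Googlebot as the user-agent name for Google's crawler; "OtherBot" and the page URL are hypothetical names chosen for illustration.

```python
from urllib.robotparser import RobotFileParser

# One record for Google's crawler, then a catch-all record for every other robot.
RULES = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())  # parse the text directly, no network access

page = "http://www.zizinya.com/index.html"  # hypothetical page on the site

googlebot_ok = parser.can_fetch("Googlebot", page)  # matched by the first record
otherbot_ok = parser.can_fetch("OtherBot", page)    # falls through to the * record

print("Googlebot allowed:", googlebot_ok)
print("OtherBot allowed:", otherbot_ok)
```

A robot follows the record that names it and falls back to the "*" record only when no named record matches.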

As you can see, the robots.txt file is a simple way to tell search engine robots which parts of your web site they may index. It works only for robots that choose to honor it.

Upload the Robots.txt File

To put your newly created robots.txt file to work, upload it to the root directory of your web server. Robots look for the file in the same top-level directory as your main page, so for "http://www.zizinya.com" the file must be reachable at "http://www.zizinya.com/robots.txt". A robots.txt file placed in a subdirectory will not be found.
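
As a rough illustration of where robots look, the sketch below derives the robots.txt address from any page address on a site, using Python's standard urllib.parse module. The function name robots_url and the sample page path are assumptions for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the root-level robots.txt URL for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host; replace the path and drop any query or fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Whatever page a robot starts from, the file it requests sits at the site root.
print(robots_url("http://www.zizinya.com/sample1/page.html"))
```

Opening the resulting address in a browser after uploading is a quick way to confirm the file is in the right place.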

Security Considerations of the Robots.txt File

The robots.txt file is publicly available, so anyone can read it and see which directories you don't want indexed. It is not intended for access control and should not be used as such: robots are free to ignore it, and listing a directory in the file only advertises that the directory exists. If there are files on your server that you don't want people to see, take appropriate measures to protect them, such as password protection.