Google sitemap.xml step 2

You have been waiting for it, so here it is.

Step 2: Making a robots.txt

A robots.txt file tells Googlebot which pages it may index and which it should leave alone. It's very important to keep bots away from some files, like images, admin sections, passwords, etc.

The reason we don't allow Google to index our images is that people can then find them, copy them and steal our bandwidth. If you have a site with a nice self-made template, then don't let Google index your images.

OK, open Notepad or any other text editor and paste the following lines into it:

User-agent: *
Disallow: /cgi-bin
Disallow: /scgi-bin
Disallow: /affiliaterevenue
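
If you want to see what these lines do before you upload anything, you can test them on your own computer. This is only a rough sketch: it assumes Python 3 and its standard urllib.robotparser module, and the bot name "AnyBot", the helper is_allowed and the test paths are made up for the example. Real spiders may read the rules slightly differently.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the file above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin
Disallow: /scgi-bin
Disallow: /affiliaterevenue
"""


def is_allowed(robots_txt, user_agent, path):
    """Return True if user_agent may fetch path under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# "AnyBot" is a hypothetical spider name; the wildcard group applies to it.
print(is_allowed(ROBOTS_TXT, "AnyBot", "/cgi-bin/form.pl"))  # False
print(is_allowed(ROBOTS_TXT, "AnyBot", "/index.html"))       # True
```

Anything that starts with one of the three disallowed paths comes back as False, everything else as True.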

User-agent: This says which crawler or bot has to follow the Disallow or Allow lines below it. I use a wildcard (*), which means that every bot should follow them. You can name any spider you want instead. The following example makes this clear:

User-agent: Googlebot
Disallow: /img/
Disallow: /img

User-agent: altavista
Disallow: /passwords

Here we address 2 spiders: Google can't view our img folder and AltaVista can't view our passwords folder. A Disallow value is matched against the start of the path. When we say /img, anything starting with /img is disallowed, so for our example that means /img, /img-1 and /img-2 are all disallowed. If we say /img/ (with a trailing slash), then only the folder img and its contents are disallowed.
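
The difference a trailing slash makes in a Disallow value is easy to try out yourself. Again a small sketch with Python 3's standard urllib.robotparser; the helper blocked_paths and the list of test paths are my own invention, and a real spider might handle edge cases differently.

```python
from urllib.robotparser import RobotFileParser


def blocked_paths(disallow_rule, paths, user_agent="Googlebot"):
    """Return the paths that one Disallow rule blocks for user_agent."""
    parser = RobotFileParser()
    parser.parse([f"User-agent: {user_agent}", f"Disallow: {disallow_rule}"])
    return [p for p in paths if not parser.can_fetch(user_agent, p)]


# Hypothetical paths, only for the demonstration.
PATHS = ["/img/logo.png", "/img-1/a.png", "/img-2/b.png", "/index.html"]

# Without trailing slash: everything starting with /img is blocked.
print(blocked_paths("/img", PATHS))
# ['/img/logo.png', '/img-1/a.png', '/img-2/b.png']

# With trailing slash: only the img folder itself is blocked.
print(blocked_paths("/img/", PATHS))
# ['/img/logo.png']
```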

robots.txt is really powerful, so think about it carefully. If you make one mistake, your site can end up at the bottom of the search results. For my site I just disallowed, for all spiders, the images folder, the cgi-bin folder and the admin folder.

When you are done, save it as robots.txt and put it in the root folder of your site, because spiders only look for it there.

By the way: you can also keep robots/spiders away with meta tags. In step 3 we talk about meta tags, but I won't go into much detail about this.
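
To show how small such a mistake can be, here is a sketch of a check you could run before uploading. It assumes Python 3's standard urllib.robotparser; the folder names /images and /admin, the list of important pages and the helper find_blocked are assumptions for the example, not my real setup.

```python
from urllib.robotparser import RobotFileParser


def find_blocked(robots_txt, must_stay_crawlable, user_agent="AnyBot"):
    """Return the important paths that the given rules would block."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in must_stay_crawlable
            if not parser.can_fetch(user_agent, p)]


# Hypothetical pages that should always stay reachable for spiders.
IMPORTANT = ["/", "/index.html", "/articles/"]

GOOD = """\
User-agent: *
Disallow: /images
Disallow: /cgi-bin
Disallow: /admin
"""

# A single bare slash disallows the whole site.
BAD = """\
User-agent: *
Disallow: /
"""

print(find_blocked(GOOD, IMPORTANT))  # []
print(find_blocked(BAD, IMPORTANT))   # ['/', '/index.html', '/articles/']
```

An empty list means none of the important pages are blocked by the rules.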

Keep an eye out for more tips…
