Did you know you can direct robots to crawl certain pages on your website or to avoid certain pages altogether? It’s true. You can talk to the search engine robots directly through a file called robots.txt, and doing so can be a real help to your search engine optimization efforts.
It’s easy to create a robots.txt file. You can do it in any text editor, such as Notepad or Apple TextEdit. One robots.txt file generally covers your entire site; keep in mind, though, that the rules apply per protocol and host, so the secure (https) version of your site is treated as a separate site and needs its own copy of the file. Upload the file to the root directory of your server, so it’s reachable at yoursite.com/robots.txt, and it will apply to your whole site.
So what should go into your robots.txt file? Instructions that tell search engine spiders which pages you don’t want crawled or, in some cases, which pages you do. You can address all robots at once or target specific search engine spiders.
Let’s say you want to stop the spiders from crawling a particular section of your website, for example, the section on Widgets. To stop all search engines from accessing and indexing that section, you would add this rule to your robots.txt file:
User-agent: *
Disallow: /widgets/
From that point forward, compliant spiders will skip every page within that section. If you just want to keep Google from crawling those pages, you’d use the following rule instead:
User-agent: Googlebot
Disallow: /widgets/
The user-agent names for some other search engines are:
- Yahoo = Slurp
- Microsoft = msnbot
- Ask = Teoma
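Putting this together, a single robots.txt file can carry rules for several crawlers at once: each group starts with a User-agent line, a blank line separates groups, and a crawler follows the group that matches its name (falling back to the * group). The paths below are hypothetical, purely to show the layout:

```
User-agent: *
Disallow: /widgets/

User-agent: Slurp
Disallow: /widgets/
Disallow: /drafts/
```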
There is a lot more to learn about robots.txt files, but this should serve as a basic primer.
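If you’d like to sanity-check your rules before uploading the file, Python’s standard library ships a robots.txt parser you can feed your draft rules to. This is a minimal sketch; the rules and URLs below are hypothetical examples, not from any real site:

```python
# Check draft robots.txt rules with Python's built-in parser.
from urllib.robotparser import RobotFileParser

# Hypothetical rules matching the example in this article.
rules = """\
User-agent: *
Disallow: /widgets/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# The /widgets/ section is blocked for every crawler...
print(parser.can_fetch("Googlebot", "https://example.com/widgets/blue"))  # False
# ...but the rest of the site stays crawlable.
print(parser.can_fetch("Googlebot", "https://example.com/about"))  # True
```

Running this prints False for the blocked section and True elsewhere, so you can confirm the rule does what you intended before it goes live.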