Monday, January 04, 2010

How to Control Website Content Crawling by Search Engines:

1. Use a "robots.txt" robots exclusion file in the root folder of your web application
2. Create a custom HTTP module
3. Use "noindex" meta tags in your content pages (.html, .aspx, .asp, .php, .jsp)
4. Add a "nofollow" meta tag
5. Use the X-Robots-Tag in your HTTP response headers
6. Use the Google Webpage Removal Request Tool
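
Items 1, 3, 4 and 5 can be sketched in a few lines of Python. This is only an illustration, not a production setup: the robots.txt rules, the example.com URLs, and the helper names add_x_robots_tag and is_allowed are assumptions made up for this sketch. The check uses Python's standard urllib.robotparser module, which reads robots.txt rules the way a well-behaved crawler would.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (item 1): keep all crawlers out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# Meta tag for the <head> of a page (items 3 and 4 combined).
NOINDEX_NOFOLLOW_META = '<meta name="robots" content="noindex, nofollow">'


def add_x_robots_tag(headers, directives=("noindex", "nofollow")):
    """Return a copy of the response headers with X-Robots-Tag set (item 5)."""
    updated = dict(headers)
    updated["X-Robots-Tag"] = ", ".join(directives)
    return updated


def is_allowed(robots_txt, user_agent, url):
    """Check whether the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed(ROBOTS_TXT, "Googlebot", "https://example.com/private/page.html"))
    print(is_allowed(ROBOTS_TXT, "Googlebot", "https://example.com/index.html"))
    print(add_x_robots_tag({"Content-Type": "text/html"}))
```

Keep in mind that robots.txt only asks crawlers not to fetch a URL, while "noindex" (as a meta tag or an X-Robots-Tag header) asks them not to list it. A crawler has to be able to fetch the page to see a "noindex" directive, so blocking the same page in robots.txt can hide that directive from it.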

A few links:
http://www.antezeta.com/blog/avoid-search-engine-indexing
http://learn.iis.net/page.aspx/637/managing-robotstxt-and-sitemaps/
