Meta tag analysis is one of the most important elements of an SEO checklist for any website. It includes analyzing the meta robots tag and the robots.txt file. Today I will talk about meta robots and robots.txt: what they are, what they do, how the two differ, and the syntax each one uses.
On-page Optimization: Robots Meta Directives & Robots.txt
What are Robots Meta Directives?
Robots meta directives are pieces of code that give crawlers instructions on how to crawl or index a particular web page’s content. They are placed in the <head> section of the web page. See example: <meta name="robots" content="noindex, nofollow">
Indexation-controlling parameters for the meta robots tag:-
- Noindex: Tells a search engine not to index a page. <meta name="robots" content="noindex">
- Index: Tells a search engine to index a page. Note that you don’t need to add this meta tag; it’s the default. <meta name="robots" content="index">
- Follow: Even if the page isn’t indexed, the crawler should follow all the links on the page and pass equity to the linked pages. <meta name="robots" content="follow">
- Nofollow: Tells a crawler not to follow any links on a page or pass along any link equity. <meta name="robots" content="nofollow">
- Noimageindex: Tells a crawler not to index any images on a page. <meta name="robots" content="noimageindex">
- None: Equivalent to using both the noindex and nofollow tags simultaneously. <meta name="robots" content="none"> or <meta name="robots" content="noindex, nofollow">
- Noarchive: Search engines should not show a cached link to this page on a SERP. <meta name="robots" content="noarchive">
- Nocache: Same as noarchive, but only used by Internet Explorer and Firefox. <meta name="robots" content="nocache">
- Nosnippet: Tells a search engine not to show a snippet (i.e. meta description) of this page on a SERP. <meta name="robots" content="nosnippet">
- Noodp/noydir [OBSOLETE]: Prevents search engines from using a page’s DMOZ description as the SERP snippet for this page. However, DMOZ was retired in early 2017, making this tag obsolete. <meta name="robots" content="noodp"> or <meta name="robots" content="noydir">
- Unavailable_after: Search engines should no longer index this page after a particular date. <meta name="robots" content="unavailable_after: 23-Jul-2007 18:00:00 EST">
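To see how a crawler might act on these parameters, here is a minimal Python sketch that reads the meta robots tag from a page and reports whether the page may be indexed and its links followed. The class name RobotsMetaParser and the function summarize are made up for this illustration, and real search engines apply more rules than this.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so normalise them.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for part in attr.get("content", "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def summarize(html):
    """Return (indexable, followable); both default to True when no tag is present."""
    parser = RobotsMetaParser()
    parser.feed(html)
    found = parser.directives
    indexable = not ({"noindex", "none"} & found)
    followable = not ({"nofollow", "none"} & found)
    return indexable, followable


if __name__ == "__main__":
    page = '<head><meta name="robots" content="noindex, nofollow"></head>'
    print(summarize(page))
```

The sketch treats index and follow as the defaults, which matches the note above that the index parameter does not need to be added explicitly.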
Standard Meta Tag Structure:-
<meta name="robots" content="[PARAMETER]">
This is the standard form. You can also give directives to a specific crawler by replacing "robots" with the name of that crawler’s user-agent. The structure then looks like this:
<meta name="googlebot" content="[DIRECTIVE]">
For example: <meta name="googlebot" content="nofollow">
Want to use more than one directive on a page? Separate them with commas inside a single content attribute. See example:- <meta name="robots" content="noimageindex, nofollow, nosnippet">
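When several directives apply to one page, they all go into one comma-separated content value. A small Python sketch of how such a tag could be generated, assuming a hypothetical helper named robots_meta (it is not part of any library):

```python
from html import escape


def robots_meta(directives, agent="robots"):
    """Build a meta tag for the given crawler name and list of directives."""
    # Drop empty entries and join the rest into a single content value.
    content = ", ".join(d.strip() for d in directives if d.strip())
    return f'<meta name="{escape(agent)}" content="{escape(content)}">'


if __name__ == "__main__":
    print(robots_meta(["noimageindex", "nofollow", "nosnippet"]))
    print(robots_meta(["nofollow"], agent="googlebot"))
```

Passing agent="googlebot" produces the crawler-specific form, while the default produces the standard tag addressed to all crawlers.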
What is a Robots.txt file?
A robots.txt file gives bots suggestions for how to crawl a website’s pages, whereas robots meta directives provide firmer instructions on how to crawl and index a page’s content. It is a separate file that is placed in the root directory of the site, not inside any sub-folder.
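Because the file lives at the root, its address can be worked out from any page URL on the same host. A minimal Python sketch, assuming a hypothetical helper named robots_txt_url and an example.com address used purely for illustration:

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Return the robots.txt URL for the host that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    print(robots_txt_url("https://www.example.com/blog/post.html?x=1#top"))
```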
The syntax used in a robots.txt file
SYNTAX | WHY USED |
User-agent: * | Addressing the rules that follow to all web crawlers |
User-agent: Googlebot | Addressing the rules that follow to one specific web crawler |
Disallow: / | Blocking the addressed crawlers from crawling any part of the site |
Disallow: /cgi-bin/ | Blocking a particular folder named cgi-bin |
Disallow: /tmp/ | Blocking a particular folder named tmp |
Disallow: /~joe/ | Blocking a particular folder named ~joe |
Disallow: /example-subfolder/blocked-page.html | Blocking the addressed crawlers from a specific web page |
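To check how rules like these are interpreted, Python’s standard urllib.robotparser module can evaluate them. The sketch below is only an illustration: the rule set combines the Disallow lines from the table under a single User-agent: * group, and the crawler name ExampleBot and the example.com URLs are assumptions.

```python
from urllib.robotparser import RobotFileParser

# Sample rules assembled from the table; every crawler is addressed.
RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /~joe/
Disallow: /example-subfolder/blocked-page.html
"""


def is_allowed(url, agent="ExampleBot", robots_txt=RULES):
    """Return True if the given crawler may fetch url under robots_txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for path in ("/about.html", "/tmp/cache.html",
                 "/example-subfolder/blocked-page.html"):
        url = "https://www.example.com" + path
        print(path, "allowed" if is_allowed(url) else "blocked")
```

Keep in mind that robots.txt is advisory: a well-behaved crawler performs a check like this before fetching a page, but nothing forces a bot to do so.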
Other relevant references:
- Google Webmaster Answers
- Robots Meta Directives by Moz.com
- Robotstxt.org
- Robots.txt explained by Moz.com
- Google Developers
- Yoast.com