Meta Robot Tags

The Googlebot, and many other web robots, can be instructed not to index specific pages (rather than entire directories), not to follow
links on a specific page, and to index, but not cache, a specific page, all via the HTML meta tag, placed inside the page's head tag.
Google maintains a cache of documents it has indexed. The Google search results provide a link to the
cached version in addition to the version on the Web. The cached version can be useful when the Web
version has changed and also because the cached version highlights the search terms (so you can
easily find them).
The meta tag used to block a robot has two attributes: name and content. The name attribute is the name of the bot you are excluding. To exclude
all robots, you'd include the attribute name="robots" in the meta tag.
To exclude a specific robot, the robot's identifier is used. The Googlebot's identifier is googlebot, and it is excluded by using the attribute
name="googlebot". You can find the entire database of excludable robots and their identifiers (currently 298 with more swinging into action all the
time) at http://www.robotstxt.org/wc/active/html/index.html.
The 298 robots in the official database are the tip of the iceberg. There are many more unidentified bots
out there searching the Web.
The possible values of the content attribute are shown in Table 3-1. You can use multiple attribute values, separated by commas, but you
should not use contradictory attribute values together (such as content="follow, nofollow").

For example, you can block Google from indexing a page, following links on a page, or caching the page using this meta tag:
<meta content="noindex, nofollow, noarchive" name="googlebot">
More generally, the following tag tells legitimate bots (including the Googlebot) not to index a page or follow any of the links on the page:
<meta content="noindex, nofollow" name="robots">
There's no syntax for generally stopping a search engine from caching a page because the noarchive
value only works with the Googlebot.
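
As a rough sketch of where such a tag goes, the following minimal page places a robots meta tag inside the head tag. The page title and body text are placeholders invented for illustration:

```html
<!DOCTYPE html>
<html>
<head>
  <title>Example page</title>
  <!-- Ask all compliant robots not to index this page or follow its links -->
  <meta name="robots" content="noindex, nofollow">
</head>
<body>
  <p>Page content goes here.</p>
</body>
</html>
```

A robot that honors the tag reads it when it fetches the page, so the page itself must remain reachable for the instruction to be seen.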




For more information about Google's page-specific tags that exclude bots, and about the Googlebot in general, see
http://www.google.com/bot.html.



Meta Information
Meta information, sometimes called meta tags for short, is a mechanism you can use to provide information about a web page.
The term derives from the Greek word meta, which means "after" or "beyond." Here, "meta" refers to
information about a page that is not immediately visible, perhaps because it is in the background, but which
is there nonetheless and has an impact.
The most common meta tags provide a description and keywords for telling a search engine what your web site and pages are all about.
Each meta tag begins with a name attribute that says what the meta tag represents. The meta tag:
<meta name="description" ... >
means that this tag will provide descriptive information. The meta tag:
<meta name="keywords" ... >
means that the tag will provide keywords.
The description and keywords go within a content attribute in the meta tag. For example, here's a meta description tag (often simply called the
meta description):
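
As an illustration only, a meta description for a hypothetical photography site might look like the following. The wording of the content value is invented for this sketch:

```html
<meta name="description" content="Tutorials and galleries for digital photography enthusiasts.">
```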

Keywords are provided in a comma-delimited list. For example:
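
A keywords tag could be sketched along these lines, with the terms separated by commas. The specific keywords are placeholders chosen for illustration:

```html
<meta name="keywords" content="cameras, lenses, tripods">
```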

More About Meta Tags
Meta tags can contain a lot more than just descriptions and keywords, including (but not limited to) a technical description
of the kind of content on a page and even the character encoding used:
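
One common form of such a tag, shown here as a sketch, uses the http-equiv attribute in place of name to declare both the content type and the character encoding. The UTF-8 character set is an assumption chosen for illustration:

```html
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
```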


Additionally, you've already seen how meta tags can instruct search engine bots on what to index in "Meta Robot Tags,"
earlier in this chapter.
It's easy for anyone to put any meta tag keywords and description they'd like in a page's HTML code. This has led to abuse when the
meta tag information does not really reflect page content. Therefore, meta tag keyword and description information is discounted by
search engine indexing software and is not relied upon as heavily by search engines as it used to be. But it is still worth getting your meta tag
keywords and descriptions right.


Google will try to pick up page descriptions from text towards the beginning of a page, but if this is not
available (for example, because the page consists of graphics and has no text), it will look at the
information provided in the content attribute of a meta description.
Meta keywords should be limited to a dozen or so terms. Don't load up the proverbial kitchen sink. Think hard about the keywords that
you'd like to lead to your site when visitors search.


For the keywords that are really significant to your site, you should include both singular and plural forms, as well as any variants. For
example, a site about photography might well want to include both "photograph" and "photography" as meta keywords.
If you want to include a phrase containing more than one term in your keyword list, quote it. For
example: "digital photography." However, there is not much point in including a compound term if the
words in the phrase ("digital" and "photography") are already included as keywords.
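
Putting this advice together, a keywords tag for the photography site might be sketched as follows. The list is hypothetical; it stays well under a dozen terms and includes the singular, plural, and variant forms of the main keyword:

```html
<!-- "digital" and "photography" appear as separate terms, so the phrase itself is omitted -->
<meta name="keywords" content="photograph, photographs, photography, digital, camera, cameras">
```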