Page Cloaking - To Cloak or Not to Cloak
Teacher: Sumantra Roy
Page cloaking can broadly be
defined as a technique used to deliver different web pages
under different circumstances. There are two primary reasons
that people use page cloaking:
i) It allows them to create a
separate optimized page for each search engine and another
page which is aesthetically pleasing and designed for their
human visitors. When a search engine spider visits a site,
the page which has been optimized for that search engine is
delivered to it. When a human visits a site, the page which
was designed for the human visitors is shown. The primary benefit
of doing this is that the human visitors don't need to be shown
the pages which have been optimized for the search engines,
because the pages which are meant for the search engines may
not be aesthetically pleasing, and may contain an over-repetition
of keywords.
ii) It allows them to hide the
source code of the optimized pages that they have created,
and hence prevents their competitors from being able to copy
the source code.
Page cloaking is implemented
by using some specialized cloaking scripts. A cloaking script
is installed on the server, which detects whether it is a search
engine or a human being that is requesting a page. If a search
engine is requesting a page, the cloaking script delivers the
page which has been optimized for that search engine. If a
human being is requesting the page, the cloaking script delivers
the page which has been designed for humans.
There are two primary ways by
which the cloaking script can detect whether a search engine
or a human being is visiting a site:
i) The first and simplest way
is by checking the User-Agent variable. Each time anyone (be
it a search engine spider or a browser being operated by a
human) requests a page from a site, it reports a User-Agent
name to the site. Generally, if a search engine spider requests
a page, the User-Agent variable contains the name of the search
engine. Hence, if the cloaking script detects that the User-Agent
variable contains a name of a search engine, it delivers the
page which has been optimized for that search engine. If the
cloaking script does not detect the name of a search engine
in the User-Agent variable, it assumes that the request has
been made by a human being and delivers the page which was
designed for human beings.
However, while this is the simplest
way to implement a cloaking script, it is also the least safe.
It is pretty easy to fake the User-Agent variable, and hence,
someone who wants to see the optimized pages that are being
delivered to different search engines can easily do so.
ii) The second and more complicated
way is to use I.P. (Internet Protocol) based cloaking. This
involves the use of an I.P. database which contains a list
of the I.P. addresses of all known search engine spiders. When
a visitor (a search engine or a human) requests a page, the
cloaking script checks the I.P. address of the visitor. If
the I.P. address is present in the I.P. database, the cloaking
script knows that the visitor is a search engine and delivers
the page optimized for that search engine. If the I.P. address
is not present in the I.P. database, the cloaking script assumes
that a human has requested the page, and delivers the page
which is meant for human visitors.
Although more complicated than
User-Agent based cloaking, I.P. based cloaking is more reliable
and safe because it is very difficult to fake I.P. addresses.
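The two detection methods described above can be sketched in a few lines of Python. This is only an illustration, not the code of any real cloaking script: the spider names, the function names and the I.P. ranges (taken from the address blocks reserved for documentation) are all assumptions made for this example.

```python
import ipaddress

# Hypothetical spider names to look for in the User-Agent variable.
SPIDER_UA_TOKENS = ("googlebot", "bingbot", "slurp")

# Hypothetical I.P. database. These are documentation-only address
# ranges, not the addresses of any real search engine spider.
SPIDER_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.7/32"),
]


def is_spider_by_user_agent(user_agent):
    """Return True if the User-Agent contains a known spider name."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in SPIDER_UA_TOKENS)


def is_spider_by_ip(remote_addr):
    """Return True if the visitor's address is in the I.P. database."""
    addr = ipaddress.ip_address(remote_addr)
    return any(addr in network for network in SPIDER_NETWORKS)


# A visitor can type any User-Agent it likes, so the first check is
# fooled by a browser that merely claims to be a spider. The second
# check looks at the address the request actually came from.
claimed_agent = "Mozilla/5.0 (compatible; Googlebot/2.1)"
visitor_ip = "203.0.113.9"  # not in the hypothetical database

print(is_spider_by_user_agent(claimed_agent))
print(is_spider_by_ip(visitor_ip))
```

The last few lines show why the User-Agent method is the least safe of the two: the visitor passes the User-Agent check simply by claiming a spider's name, but fails the I.P. check because its address is not in the database.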
Now that you have an idea of
what cloaking is all about and how it is implemented, the question
arises as to whether you should use page cloaking. The one-word
answer is "NO". The reason is simple: the search engines
don't like it, and will probably ban your site from their index
if they find out that your site uses cloaking. The reason that
the search engines don't like page cloaking is that it prevents
them from being able to spider the same page that their visitors
are going to see. And if the search engines are prevented from
doing so, they cannot be confident of delivering relevant results
to their users. In the past, many people have created optimized
pages for some highly popular keywords and then used page cloaking
to take people to their real sites which had nothing to do
with those keywords. If the search engines allowed this to
happen, they would suffer because their users would abandon
them and go to another search engine which produced more relevant
results.
Of course, a question arises
as to how a search engine can detect whether or not a site
uses page cloaking. There are three ways by which it can do
so:
i) If the site uses User-Agent
cloaking, the search engine can simply send the site a spider
which does not report the name of the search engine in
the User-Agent variable. If the search engine sees that the
page delivered to this spider is different from the page which
is delivered to a spider which reports the name of the search
engine in the User-Agent variable, it knows that the site has
used page cloaking.
ii) If the site uses I.P. based
cloaking, the search engines can send a spider from a different
I.P. address than any I.P. address which it has used previously.
Since this is a new I.P. address, the I.P. database that is
used for cloaking will not contain this address. If the search
engine detects that the page delivered to the spider with the
new I.P. address is different from the page that is delivered
to a spider with a known I.P. address, it knows that the site
has used page cloaking.
iii) A human representative from
a search engine may visit a site to see whether it uses cloaking.
If she sees that the page which is delivered to her is different
from the one being delivered to the search engine spider, she
knows that the site uses cloaking.
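All three checks above come down to the same comparison: request the same page under two different identities and see whether the two copies match. The following Python sketch shows that comparison under some loud assumptions. The two "sites" are toy functions standing in for real web servers, "ExampleBot" is a made-up spider name, and no search engine is known to use this exact code. A real check would also have to allow for pages that change on every request (dates, advertisements and so on), which this sketch ignores.

```python
import hashlib


def fingerprint(html):
    """Hash a page after collapsing runs of whitespace, so that
    differences in spacing alone do not count as different pages."""
    normalized = " ".join(html.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def looks_cloaked(fetch_page, declared_agent, undeclared_agent):
    """Fetch the page under two identities and compare the results.

    fetch_page is any callable that takes a User-Agent string and
    returns the HTML served to that visitor (hypothetical interface).
    """
    declared = fingerprint(fetch_page(declared_agent))
    undeclared = fingerprint(fetch_page(undeclared_agent))
    return declared != undeclared


def toy_cloaking_site(user_agent):
    """Toy site that serves a keyword-stuffed page to 'ExampleBot'."""
    if "examplebot" in user_agent.lower():
        return "<html><body>keyword keyword keyword</body></html>"
    return "<html><body>Welcome to our site!</body></html>"


def toy_honest_site(user_agent):
    """Toy site that serves the same page to every visitor."""
    return "<html><body>Welcome to our site!</body></html>"


print(looks_cloaked(toy_cloaking_site, "ExampleBot/1.0", "Mozilla/5.0"))
print(looks_cloaked(toy_honest_site, "ExampleBot/1.0", "Mozilla/5.0"))
```

The same comparison applies to I.P. based cloaking: instead of changing the User-Agent, the second request would be sent from an address the site has never seen before.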
Hence, when it comes to page
cloaking, my advice is simple: don't even think about using
it.
About the teacher:
Sumantra is one
of the most respected search engine positioning specialists on
the Internet. To have Sumantra's company place your site at the
top of the search engines, go to http://www.1stSearchRanking.com/. For
more advice on how you can take your web site to the top of the
search engines, subscribe to his FREE newsletter by going to http://www.1stSearchRanking.com/newsletter.htm