WHAT IS "CLOAKING"? - SEO Agency

After a search on Google, you click on the title of a site and land on a page that does not contain any of the words you used in your search. Has Google lost its mind? More likely, you have just run into cloaking!

Cloaking: definition

For the same URL, two Internet users can see completely different pages. This is the phenomenon of cloaking.

There are various ‘honest’ reasons for cloaking:

Present visitors with a web page written in their native language
Adapt the format of the page to the browser in use (e.g. Chrome or Firefox)
Display targeted ads
Protect a page’s keywords from competitors
Block email-harvesting bots from accessing a page

The most common “dishonest” reason is to lure search engine crawlers with a page whose content does not match the content seen by human visitors.

Although this black hat SEO technique has often helped sites achieve better rankings, we strongly advise against using it. Search engines are developing increasingly effective tools to detect sites that practise “dishonest” cloaking, and penalties are already being handed out. It is now a sure way to end up in Google’s sandbox.

How does cloaking work?

When you visit a web page, your browser (Google Chrome, Mozilla Firefox, Brave, etc.) starts by sending the site's server an HTTP request, which the server sees roughly as follows:

GET: /indexfr.php
HOST: www.seo.fr
USER_AGENT=Mozilla/4.0 (compatible; MSIE 6.0; Windows XP; FREE)
REFERRER=http://www.google.com/search?hl=fr&q=analyse+d%27audience&lr=
REMOTE_ADDR=255.64.12.01

GET contains the name of the web page you want to see (in this example indexfr.php)
HOST contains the name of the site where this page is located
USER_AGENT is the “signature” of your browser (in this example Internet Explorer 6, on Windows XP, connected via free.fr)
REFERRER is the page from which you came (in this example, you ran a search on Google before reaching the seo.fr site); the corresponding HTTP header is actually spelled "Referer"
REMOTE_ADDR is your IP address, which the server reads from the network connection rather than from the request itself

As soon as it receives this HTTP request, the site’s server loads the page you requested (for static HTML sites) or generates it (for dynamic sites, e.g. in PHP), then sends it back to your browser, which displays it: you can then read it.
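To make these fields concrete, here is a minimal sketch in Python that extracts them from a raw HTTP request. It assumes the standard wire format (a request line followed by "Name: value" headers) rather than the simplified notation used above. The function name parse_request and the sample request are illustrative only. The IP address does not appear in it, because the server gets that from the connection.

```python
def parse_request(raw: str) -> dict:
    """Split a raw HTTP request into its method, path and headers."""
    lines = raw.strip().splitlines()
    # Request line, e.g. "GET /indexfr.php HTTP/1.1"
    method, path, _version = lines[0].split()
    headers = {}
    for line in lines[1:]:
        if not line.strip():
            break  # a blank line marks the end of the headers
        # Split on the first colon only, since values may contain colons (URLs)
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {"method": method, "path": path, "headers": headers}


# Sample request modelled on the example in the text (standard header spelling)
SAMPLE = (
    "GET /indexfr.php HTTP/1.1\r\n"
    "Host: www.seo.fr\r\n"
    "User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows XP; FREE)\r\n"
    "Referer: http://www.google.com/search?hl=fr&q=analyse+d%27audience&lr=\r\n"
    "\r\n"
)

request = parse_request(SAMPLE)
print(request["path"])
print(request["headers"]["host"])
```

Everything a cloaking script bases its decision on is either in this dictionary (User-Agent, Referer) or in the connection's IP address.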

Just as the IP address makes it possible to determine which country the visitor is in, it also makes it possible to recognise an indexing robot (some sites maintain an up-to-date list of the IP addresses of the main indexing robots, which makes them easy for cloakers to detect). All that remains is to generate keyword-stuffed pages to serve to the robots, making them believe that the site's content is extraordinarily rich.
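The recognition step described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any real tool: the network ranges are reserved documentation addresses standing in for a crawler IP list, and the names is_crawler, KNOWN_CRAWLER_NETWORKS and CRAWLER_UA_TOKENS are hypothetical.

```python
import ipaddress

# Hypothetical crawler ranges: these are reserved documentation networks,
# used here only as placeholders for a real, maintained list.
KNOWN_CRAWLER_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]

# Illustrative substrings that crawlers commonly put in their User-Agent
CRAWLER_UA_TOKENS = ("googlebot", "bingbot")


def is_crawler(remote_addr: str, user_agent: str) -> bool:
    """Guess whether a request comes from an indexing robot."""
    address = ipaddress.ip_address(remote_addr)
    # First check: does the IP fall inside one of the listed networks?
    if any(address in network for network in KNOWN_CRAWLER_NETWORKS):
        return True
    # Second check: does the User-Agent announce itself as a crawler?
    ua = user_agent.lower()
    return any(token in ua for token in CRAWLER_UA_TOKENS)


print(is_crawler("192.0.2.15", "Mozilla/4.0 (compatible; MSIE 6.0)"))
print(is_crawler("203.0.113.7", "Mozilla/4.0 (compatible; MSIE 6.0)"))
```

The same test is used for the "honest" purposes listed earlier, such as keeping email bots out. What makes it cloaking is serving the robot different content afterwards.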

It goes without saying that this technique is extremely penalising for visitors: you search for “cars” and you come across an adult site that has nothing to do with cars. It is therefore logical that search engines are increasingly trying to protect themselves from this.

JavaScript/Flash/DHTML variants

Another way to practise cloaking is to exploit one of the current limitations of the engines: they seem to have difficulty understanding JavaScript, Flash or DHTML code.

All you have to do is make a page full of keywords that includes a redirect in one of these languages to a “cleaner” page. Browsers will follow the redirect and show users the “clean” page, while search engines will index the keyword-filled page and show it in their results.
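A page built this way leaves a visible trace in its source code. The following Python sketch shows one heuristic way a reviewer might flag it: look inside script blocks for an assignment to location or a call to location.replace / location.assign. It is an assumption-laden illustration (the name has_script_redirect and the regular expressions are ours), it does not execute any JavaScript, and it will miss obfuscated redirects.

```python
import re

# Contents of inline <script> blocks
SCRIPT_BLOCK = re.compile(r"<script\b[^>]*>(.*?)</script>", re.IGNORECASE | re.DOTALL)

# "location = ...", "location.href = ..." or "location.replace(...)" / "location.assign(...)"
# The [^=] avoids matching comparisons such as "location == x".
REDIRECT = re.compile(
    r"location(?:\.href)?\s*=[^=]|location\.(?:replace|assign)\s*\(",
    re.IGNORECASE,
)


def has_script_redirect(html: str) -> bool:
    """Return True if any inline script appears to redirect the browser."""
    return any(REDIRECT.search(body) for body in SCRIPT_BLOCK.findall(html))


doorway = (
    "<html><body>cars cars cheap cars"
    '<script>window.location.replace("http://example.com/clean.html");</script>'
    "</body></html>"
)
plain = '<html><body>Hello<script>console.log("hi");</script></body></html>'

print(has_script_redirect(doorway))
print(has_script_redirect(plain))
```

A positive result is only a hint: many legitimate pages redirect with JavaScript, so the keyword-filled content around the redirect is what actually gives the trick away.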

This technique, which belongs to what are known as “satellite pages”, is simple to implement, but it is pure cheating and cannot claim any objective other than deceiving search engines (and therefore their users).

BMW's website was blacklisted at the beginning of 2006 because of a practice of this type. It is possible that Google now takes the time to interpret all or part of the JavaScript code it finds in the pages it examines.

The other existing variants

White text on a white background, or black text on a black background, will be invisible to your visitors, but will still be taken into account by certain engines. This text can be a list of keywords intended to increase the relevance of your pages on certain terms. This technique is one of the oldest of its kind and seems to be easily identified (and punished) by most engines today. It also has the advantage (or disadvantage) of being easy to detect by a clever Internet user: you just have to select with the mouse the zones that seem suspicious to see the texts hidden by this means appear. Suspicious white zone? Try selecting the paragraph you are reading!
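The simplest form of this trick, where the text colour and the background colour are set to the same value on the same element, can be spotted mechanically. The Python sketch below is a deliberately naive illustration: it only looks at inline style attributes written with double quotes and only compares the colour values as literal strings. The function names are ours, and nothing here reflects how any search engine actually does it.

```python
import re

# Inline style attributes written with double quotes
STYLE_ATTR = re.compile(r'style\s*=\s*"([^"]*)"', re.IGNORECASE)


def parse_style(style: str) -> dict:
    """Turn 'color: #fff; background-color: #fff' into a dictionary."""
    declarations = {}
    for part in style.split(";"):
        name, sep, value = part.partition(":")
        if sep:  # skip empty or malformed fragments
            declarations[name.strip().lower()] = value.strip().lower()
    return declarations


def find_same_colour_styles(html: str) -> list:
    """Return the inline styles whose text and background colours are identical."""
    hits = []
    for style in STYLE_ATTR.findall(html):
        rules = parse_style(style)
        colour = rules.get("color")
        if colour is not None and colour == rules.get("background-color"):
            hits.append(style)
    return hits


page = (
    '<p style="color: #FFF; background-color: #fff">cars cheap cars</p>'
    '<p style="color: #000; background-color: #fff">Welcome</p>'
)
print(find_same_colour_styles(page))
```

Colours defined in external stylesheets, or written in different but equivalent notations (for example "white" and "#fff"), would escape this check.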

Another technique is the use of frames. In a page of this type, it is indeed possible to define a frame with microscopic dimensions or located “off screen” which will be invisible to Internet users.

CSS also allows you to position one block on top of another and thus gives you the possibility to hide part of the page content from human visitors. Again, this invisible content can be a list of keywords designed to make your page more “interesting” to the engines.

Are “cloakers” vigilantes?

Insofar as search engines are “stupid” bots that do not always see pages in the same way as humans, it is easy to accuse them of “unfairly” ranking the sites they list. Cloaking would therefore only aim to put things in their place and repair some of these injustices. This pretext, hypocritical to say the least, is invoked by some cloakers to justify their action.

Admittedly, Google and the other engines do not yet have the status of “official judges of the Web”, and no law (apart from their own rules) prohibits anyone from tricking them. But to claim to be carrying out a more or less “moral” action by playing this little game is, in our opinion, nothing more than a way of making a fool of Google.

Why and how do you get caught?

Google has set up a whistleblower service for webmasters who feel they have been wronged by the practices of some of their competitors or for users who feel they have been misled by the content of certain sites.

It is not known exactly how the complaints registered by this service are dealt with, but in some cases they have led to the penalisation of sites that practise cloaking.

There is also no doubt that Google has set up “cloaked bots”, which crawl the web pretending to be human (there are no real technological barriers to this practice). It is enough to cross-check the results of this type of exploration with those of “classic” bots to detect any form of cloaking. However, this form of crawling is slower than the usual one, so it will probably take some time to find cloaked sites – hidden among the 8 billion pages in Google’s index.
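The cross-check described above amounts to comparing the page served to a declared robot with the page served to an apparently human visitor. A minimal Python sketch of that comparison, using word overlap (Jaccard similarity) as a stand-in for whatever measure a search engine really uses, could look like this. The function names and the 0.5 threshold are arbitrary assumptions chosen for illustration.

```python
import re


def word_set(text: str) -> set:
    """Lower-cased set of the words found in a page's text."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def jaccard_similarity(a: str, b: str) -> float:
    """Share of words common to both texts (1.0 = identical vocabulary)."""
    words_a, words_b = word_set(a), word_set(b)
    if not words_a and not words_b:
        return 1.0  # two empty pages are trivially identical
    return len(words_a & words_b) / len(words_a | words_b)


def looks_cloaked(page_for_bot: str, page_for_human: str, threshold: float = 0.5) -> bool:
    """Flag a URL when the two versions share too little vocabulary."""
    return jaccard_similarity(page_for_bot, page_for_human) < threshold


bot_version = "cheap cars used cars buy cars best car deals"
human_version = "Welcome to our adult entertainment site"

print(looks_cloaked(bot_version, human_version))
print(looks_cloaked(human_version, human_version))
```

In practice the threshold would have to tolerate legitimate differences such as rotating ads or localised text, which is one reason this kind of detection cannot be instantaneous.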
