Duplicate content

From Wikipedia, the free encyclopedia

Duplicate content is a term used in the field of search engine optimization to describe content that appears on more than one web page. The duplicated material can be a substantial part of the content within or across domains, and can be either an exact duplicate or closely similar.[1] When multiple pages within a web site contain essentially the same content, search engines such as Google can penalize that site or cease displaying it in any relevant search results.


Non-malicious duplicate content may include variations of the same page, such as versions optimized for normal HTML, mobile devices, or printer-friendliness, or store items that can be shown via multiple distinct URLs.[2] Duplicate content issues can also arise when a site is accessible under multiple subdomains, such as with or without the "www." prefix, or when a site fails to handle the trailing slash of URLs correctly.[3]
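
As an illustration of the "www." and trailing-slash cases, the following minimal Python sketch maps URL variants to a single form. The function name normalize_url and the policy it applies (prefer the bare domain, drop the trailing slash except on the root path) are assumptions made for this example, not a rule taken from the cited sources; a site may equally choose the opposite convention, as long as it applies one consistently.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Map URL variants to one preferred form (hypothetical policy)."""
    parts = urlsplit(url)
    # hostname is already lowercased by urlsplit; guard against a missing host.
    host = parts.hostname or ""
    # Assumed policy: prefer the bare domain over the "www." subdomain.
    if host.startswith("www."):
        host = host[len("www."):]
    # Keep an explicit port if the URL carried one.
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    # Assumed policy: no trailing slash, except for the root path itself.
    path = parts.path.rstrip("/") or "/"
    # Drop the fragment; it never reaches the server.
    return urlunsplit((parts.scheme.lower(), host, path, parts.query, ""))


if __name__ == "__main__":
    for variant in (
        "http://www.example.com/shoes/",
        "http://example.com/shoes",
        "http://WWW.Example.com",
    ):
        print(variant, "->", normalize_url(variant))
```

Two URLs that normalize to the same string under such a policy are candidates for being treated as one page.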

Malicious duplicate content refers to content that is intentionally duplicated in an effort to manipulate search results and gain more traffic. This is known as search spam. There are a number of tools[4] available to verify the uniqueness of content.
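
How such tools work internally is not described here. One common way to estimate whether two texts are "closely similar" is to compare their sets of overlapping word sequences (shingles) using the Jaccard index. The sketch below assumes that approach; the function names and the shingle length of three words are choices made for this example only.

```python
import re


def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word sequences that occur in the text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        # Texts shorter than k words form a single shingle (or none).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard_similarity(a: str, b: str, k: int = 3) -> float:
    """Jaccard index of the two shingle sets: 1.0 identical, 0.0 disjoint."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0  # two empty texts are treated as identical
    return len(sa & sb) / len(sa | sb)


if __name__ == "__main__":
    page_a = "the quick brown fox jumps"
    page_b = "the quick brown fox leaps"
    print(jaccard_similarity(page_a, page_b))
```

A score near 1.0 suggests near-duplicate text. The threshold above which two pages count as duplicates is a judgment call and is not specified by the sources cited in this article.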

A 301 Moved Permanently response, also known as a "301 redirect", is one method of dealing with duplicate content: it redirects users and search engine crawlers to the single pertinent version of the content.[2]
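
A minimal sketch of such a redirect, using Python's standard http.server module, is shown below. The host name example.com, the choice of the bare domain as the preferred host, and the names CANONICAL_HOST, redirect_location and CanonicalHostHandler are all assumptions made for illustration. Production sites usually configure redirects in the web server itself rather than in application code.

```python
from http.server import BaseHTTPRequestHandler
from typing import Optional

CANONICAL_HOST = "example.com"  # hypothetical preferred host


def redirect_location(host: str, path: str) -> Optional[str]:
    """Return the URL to redirect to, or None if the host is already canonical.

    Simplification: any port in the Host header is ignored, and the redirect
    always targets plain http on the default port.
    """
    hostname = host.split(":", 1)[0].lower()
    if hostname == CANONICAL_HOST:
        return None
    return f"http://{CANONICAL_HOST}{path}"


class CanonicalHostHandler(BaseHTTPRequestHandler):
    """Answers GET requests; non-canonical hosts receive a 301 redirect."""

    def do_GET(self):
        location = redirect_location(self.headers.get("Host", ""), self.path)
        if location is not None:
            # 301 Moved Permanently: clients and crawlers should use Location.
            self.send_response(301)
            self.send_header("Location", location)
            self.end_headers()
            return
        body = b"canonical page\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

The handler can be passed to http.server.HTTPServer to try it locally. With this sketch, a request for /shoes sent to www.example.com is answered with status 301 and a Location header pointing at the same path on example.com.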


Ideally, duplicate content is prevented altogether. Where that is not possible, canonical URLs can help. A typical example is an online shop that lists the same item in different colors (for example, a red shoe and a black shoe) under separate URLs. A canonical URL tells search engines not to index every variant URL separately, but to treat a single base URL as the preferred version and index that one. Server-side configuration and hard-coded tweaks[5] can also be used to deal with duplicate content.
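
A canonical URL is commonly declared with a link element using rel="canonical" in the head of each variant page. The following sketch derives a base URL for the shoe example and renders that element. The query parameter name color, the set VARIANT_PARAMS and both function names are hypothetical; a real shop would substitute whichever parameters distinguish its own variants.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical: query parameters that only select a variant of the same item.
VARIANT_PARAMS = {"color"}


def canonical_url(url: str) -> str:
    """Strip variant-only query parameters to obtain the base URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in VARIANT_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url: str) -> str:
    """Render the link element to place in the head of the variant page."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


if __name__ == "__main__":
    for variant in (
        "http://example.com/shoe?color=red",
        "http://example.com/shoe?color=black",
    ):
        print(canonical_link_tag(variant))
```

Both variant pages then point at the same base URL, which signals to search engines that the base URL is the version to index.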

See also