Is Your Site Being Penalized By the Search Engines For Duplicate Content?
Posted on 14th June, 2006
Recently, I noticed that one of the most heavily trafficked blogs in Malaysia suffered a massive loss of traffic within a single week. Later, I found out that another blog with identical content, post by post, might have something to do with the traffic loss suffered by the first blog.
I am not sure who is copying whom, but as a result the first site was penalized by Google, and some of its pages no longer show up in Google search results.
It has also been reported that the original blog uses hidden text to trick the search engines. If this is true, it might also have contributed to the penalty.
A post over at SEO by the SEA looks at the conditions that may cause a search engine not to list pages.
Some duplication of content may mean that pages are filtered at the time of serving of results by search engines, and there is no guarantee as to which version of a page will show in results and which versions won’t. Duplication of content may also mean that some sites and some pages aren’t indexed by search engines at all, or that a search engine crawling program will stop indexing all of the pages of a site because it finds too many copies of the same pages under different URLs.
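To get a rough feel for how two pages might be judged "the same", here is a minimal sketch in Python that compares two pieces of text by their overlapping word sequences (shingles) and the Jaccard ratio. This is only an illustration of a common textbook technique; the function names and the shingle size of 3 are my own assumptions, and nothing here claims to be how Google or any other search engine actually detects duplicates.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word sequences of length `size`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: use the whole text as a single unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a, b, size=3):
    """Jaccard similarity of two texts: shared shingles / all shingles."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        # Assumption: two empty texts are treated as identical.
        return 1.0
    return len(sa & sb) / len(sa | sb)
```

A score close to 1.0 means the two texts are nearly word-for-word copies, while a score close to 0.0 means they share almost no phrasing. Where to draw the line between "similar" and "duplicate" is a judgment call that this sketch does not answer.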
According to the post, search engines see duplicate content when you have any of the following:
- Product descriptions from manufacturers, publishers, and producers reproduced by a number of different distributors in large ecommerce sites
- Alternative print pages
- Pages that reproduce syndicated RSS feeds through a server side script
- Canonicalization issues, where a search engine may see the same page as different pages with different URLs
- Pages that serve session IDs to search engines, so that they try to crawl and index the same page under different URLs
- Pages that serve multiple data variables through URLs, so that they crawl and index the same page under different URLs
- Pages that share too many common elements, or where those are very similar from one page to another, including title, meta descriptions, headings, navigation, and text that is shared globally.
- Copyright infringement
- Use of the same or very similar pages on different subdomains or different country top level domains (TLDs)
- Article syndication
- Mirrored sites
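Several items in the list above (canonicalization issues, session IDs, and multiple data variables in URLs) come down to one page being reachable under many URLs. As a rough sketch, the Python below reduces URL variants to a single form so you can spot which of your own URLs collapse to the same page. The parameter names in `SESSION_PARAMS`, the rule that strips a leading `www.`, and both function names are assumptions made for illustration; your site may need different rules.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical session parameter names; replace with the ones your site uses.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid"}


def canonical_url(url):
    """Collapse URL variants of one page into a single form."""
    parts = urlsplit(url)
    # Lowercase the host; ports and credentials are ignored in this sketch.
    host = (parts.hostname or "").lower()
    # Assumption: www.example.com and example.com serve the same pages.
    if host.startswith("www."):
        host = host[4:]
    # Drop session IDs and sort the remaining parameters by name.
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    )
    path = parts.path or "/"
    # The fragment (#...) is discarded because it never reaches the server.
    return urlunsplit((parts.scheme.lower(), host, path, urlencode(query), ""))


def group_by_canonical(urls):
    """Map each canonical URL to the list of original URLs that reduce to it."""
    groups = {}
    for url in urls:
        groups.setdefault(canonical_url(url), []).append(url)
    return groups
```

Running `group_by_canonical` over the URLs in your server logs or sitemap would show any group with more than one entry, which is a candidate for a redirect to one preferred URL.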
The post goes into much more detail, explaining each condition and how search engines come to see it as duplicate content.
If you are having difficulty getting the pages of your site to show up in search engines, or are just curious about how duplicate content can affect you, check out the full story here:
Duplicate Content Issues and Search Engines