Is Your Site Being Penalized By the Search Engines For Duplicate Content?

Posted on 14th June, 2006

Recently, I noticed that one of the most heavily trafficked blogs in Malaysia suffered a massive loss in traffic within a single week. Later, I found out that another blog with identical content, post for post, might have something to do with the traffic loss suffered by the first blog.

I am not sure who is copying whom, but as a result, the first site was penalized by Google and some of its pages do not show up in Google search results.

It has also been reported that the original blog uses hidden text to trick the search engines. If this is true, it might also have contributed to the penalty.

A post over at SEO by the SEA looks at the conditions that may cause a search engine not to list pages.

Some duplication of content may mean that pages are filtered at the time of serving of results by search engines, and there is no guarantee as to which version of a page will show in results and which versions won’t. Duplication of content may also mean that some sites and some pages aren’t indexed by search engines at all, or that a search engine crawling program will stop indexing all of the pages of a site because it finds too many copies of the same pages under different URLs.

According to the post, search engines may see duplicate content in the following cases:

  1. Product descriptions from manufacturers, publishers, and producers reproduced by a number of different distributors in large ecommerce sites
  2. Alternative print pages
  3. Pages that reproduce syndicated RSS feeds through a server side script
  4. Canonicalization issues, where a search engine may see the same page as different pages with different URLs
  5. Pages that serve session IDs to search engines, so that they try to crawl and index the same page under different URLs
  6. Pages that serve multiple data variables through URLs, so that they crawl and index the same page under different URLs
  7. Pages that share too many common elements, or where those are very similar from one page to another, including title, meta descriptions, headings, navigation, and text that is shared globally.
  8. Copyright infringement
  9. Use of the same or very similar pages on different subdomains or different country top level domains (TLDs)
  10. Article syndication
  11. Mirrored sites
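
Items 4 to 6 in the list are easier to picture with a small example. The following is only a minimal Python sketch of how a site owner might collapse URL variants of the same page into one form when auditing a site. It is not how any search engine actually works. The host www.example.com, the function name canonicalize and the list of ignored parameter names are all assumptions made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical parameter names treated as session or tracking noise.
IGNORED_PARAMS = {"sid", "sessionid", "phpsessid", "utm_source", "utm_medium"}

# Hypothetical default documents that usually duplicate the directory URL.
INDEX_DOCUMENTS = ("index.html", "index.php")


def canonicalize(url, preferred_host="www.example.com"):
    """Collapse common URL variants of one page into a single form.

    Ports, user info and fragments are dropped in this sketch.
    """
    parts = urlsplit(url)
    scheme = (parts.scheme or "http").lower()
    host = (parts.hostname or "").lower()

    # Treat the bare domain and the www host as the same site.
    if preferred_host.startswith("www."):
        bare = preferred_host[4:]
    else:
        bare = preferred_host
    if host in (bare, "www." + bare):
        host = preferred_host

    # "/index.php" and "/" normally serve the same page.
    path = parts.path or "/"
    for index in INDEX_DOCUMENTS:
        if path.endswith("/" + index):
            path = path[: -len(index)]
            break

    # Drop session-style parameters and sort the rest, so that
    # "?a=1&b=2" and "?b=2&a=1" end up as the same URL.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    query = urlencode(sorted(params))

    return urlunsplit((scheme, host, path, query, ""))


if __name__ == "__main__":
    for variant in (
        "http://Example.com/index.php?sid=abc123&page=2",
        "http://www.example.com/?page=2&PHPSESSID=xyz",
    ):
        print(canonicalize(variant))
```

Both sample URLs print the same address, which is the point: a crawler that treats them as two pages sees duplicate content, while a site that consistently links (or redirects) to one form avoids the problem.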

The post goes into much more detail, explaining each condition and how search engines see it as content duplication.

If you are having difficulties getting the pages of your site to show up in search engines, or are just curious how duplicate content can affect you, check out the full story here:
Duplicate Content Issues and Search Engines



An engineer by training, Victor has been working full-time online as an Internet marketer, a programmer and an app developer since 2001. He has been blogging at Sabahan.com since 2006, sharing his experience and teaching people how to make money online.

  • Gaman says:

    BTW ShaolinTiger, I doubt Kahsoon is using the subdomain trick. As far as I can see, most of his traffic comes from Google and goes to the domain kahsoon.com, as I have seen from his sitemeter stats all this while.

    Check this out to see what I mean:


    As you can see, no subdomain appears in the search results; that was always the case before Kahsoon.com started losing traffic.

  • Gaman says:

    CypherHackz: I used to believe that too, but apparently the search engines are getting more clever these days.

  • CypherHackz says:

    But I’ve read on a site that unless we put up the source link, it will not be marked as duplicate.

  • Gaman says:

    Katana: Check this out http://www.sabahan.com/2006/06/07/kahsooncom-is-losing-traffic-like-crazy/#comment-1218

    ShaolinTiger: That could be one of the reasons, but I haven’t come across such a subdomain. Can you post a link to the subdomain he is using?

    I believe Google may have started catching up with the duplicate content as well. If you see how extensive the copycat site is, you’ll know what I mean.

    In addition, he uses hidden text, and automatically extracts the search keywords from search engine query strings and puts those keywords on the pages.

  • ShaolinTiger says:

    He’s not down from dupe content, I think. He’s down because he was using some blackhat subdomain linking schemes and some other dodgy stuff; that’s how he got so much traffic in the first place.

  • Katana says:

    If you can post the link of the copycat site, that would be great. 😛
