Earlier this month, as part of our 2011 SEO predictions, I wrote that search engines were likely to take an even more aggressive stance towards duplicate and low-value content this year. On Friday Matt Cutts announced an algorithm update that does just that. The update, which somewhat unusually for a Google algorithm change doesn't yet appear to have been christened with a catchy moniker, tightens duplicate content filters and tries to do a better job of identifying the original source of duplicate material:
My [previous] post mentioned that "we're evaluating multiple changes that should help drive spam levels even lower, including one change that primarily affects sites that copy others' content and sites with low levels of original content." That change was approved at our weekly quality launch meeting last Thursday and launched earlier this week.
This will be welcome news to many Google users but, as ever when Google makes a relatively big change to its algorithm, there have been reports of "collateral damage", where the change has affected sites whose owners feel Google has mistakenly identified their site as having duplicate or low-value content.
How are your rankings?
If you've noticed any significant changes in your rankings since Monday that haven't recovered within a few days (suggesting they aren't normal temporary ranking fluctuations), it's possible that you've been affected by the update. Either your content isn't entirely original, in which case you'll need to take steps to correct this, or someone else has copied your content and Google has mistakenly assumed their site is the original source, in which case your appeal options are limited and it's often easier to change your content anyway.
To check, identify your affected pages and then search Google for some short snippets of text from those pages, enclosed in quotes. If your page doesn't rank first, or doesn't appear at all, that's a strong signal that the problem is related to duplicate content.
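If you have a lot of pages to check, building the quoted searches by hand gets tedious. The following is a minimal Python sketch of one way to prepare them, assuming you already have the visible text of a page as a string. The function names, the eight-word snippet length and the three-snippet limit are arbitrary choices for illustration, not anything Google prescribes. It only builds the search URLs for you to open in a browser; it does not query Google itself.

```python
import re
from urllib.parse import quote_plus


def quoted_snippet_queries(page_text, words_per_snippet=8, max_snippets=3):
    """Return a few short phrases from page_text, each wrapped in quotes.

    Takes the first `words_per_snippet` words of each sentence that is
    long enough, and stops once `max_snippets` phrases are collected.
    """
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", page_text.strip())
    queries = []
    for sentence in sentences:
        words = sentence.split()
        if len(words) < words_per_snippet:
            continue  # too short to be a distinctive snippet
        snippet = " ".join(words[:words_per_snippet])
        # Drop any double quotes inside the snippet so they don't
        # break the exact-match phrase.
        snippet = snippet.replace('"', "")
        queries.append(f'"{snippet}"')
        if len(queries) == max_snippets:
            break
    return queries


def search_url(query):
    """Build a Google search URL for an already-quoted query string."""
    return "https://www.google.com/search?q=" + quote_plus(query)


if __name__ == "__main__":
    # Hypothetical page text, for illustration only.
    text = (
        "Our handmade oak tables are built to order in our Devon workshop. "
        "Call us. "
        "Every table is finished with three coats of natural hard wax oil."
    )
    for q in quoted_snippet_queries(text):
        print(q, "->", search_url(q))
```

Open each URL and note where your own page appears. Snippets taken from the middle of your body copy tend to be more telling than headings or boilerplate, which are likely to be shared across many of your own pages.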
This problem (low-value and duplicate content) is so persistent that the search engines are sure to revisit it in the future. The best way to insulate your site against any future steps the engines may take against duplicate content is to ensure your pages:
- ...have a sufficient amount of original text content, supported by images, videos and other multimedia as appropriate.
- ...are rapidly indexed by the search engines. To achieve this your site should be regularly linked to, necessitating some kind of link acquisition strategy, and new pages should be submitted to the engines via XML sitemaps and featured on the homepage or another highly authoritative hub page in your site (such as a category homepage) until they've been indexed. If you have a blog, make sure it pings the search engines when a new post is published (most do), and then use the blog to publish or link to new content on your site.
- ...are linked to and/or cited directly by third-party sites. Since it's rarely practical or economical to actively build links for every page in a site, you often need to consider why someone would naturally link to your pages or share them on Twitter, for example. If you can't think of a good reason, you may need to go back to the drawing board.
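On the XML sitemap point above, most content management systems will generate a sitemap for you, but if yours doesn't, the format is simple enough to produce yourself. Here is a minimal Python sketch using only the standard library. The `build_sitemap` function name and the example URLs are made up for illustration; the `urlset`, `url`, `loc` and `lastmod` elements and the namespace come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML as a string.

    `pages` is an iterable of (url, lastmod) pairs, where lastmod is a
    'YYYY-MM-DD' string or None if the date is unknown.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, lastmod in pages:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as & in the URL for us.
        ET.SubElement(entry, "loc").text = url
        if lastmod:
            ET.SubElement(entry, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical pages, for illustration only.
    new_pages = [
        ("https://www.example.com/new-product", "2011-02-01"),
        ("https://www.example.com/blog/launch-post", None),
    ]
    with open("sitemap.xml", "w", encoding="utf-8") as f:
        f.write(build_sitemap(new_pages))
```

Upload the resulting file to your site and submit it through each engine's webmaster tools. Regenerating it whenever you publish means new pages are listed as soon as they go live.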