Deal with Blog Scrapers getting indexed quicker than the original site

This is not something you should worry about too much: it happens often, and it is actually a sign that your site is growing. I would suggest you simply keep publishing quality content on your site and not worry about them scraping your articles. Google does a pretty good job of killing spam blogs. They generally gain traction for a month or so and then disappear completely.

But sometimes the spam blog gets indexed quicker than the original site, especially when your site is still fairly new, and that can temporarily hold back your organic traffic growth. In that case, you can deal with the scrapers by delaying your feeds for a certain amount of time, since these scrapers work by pulling articles from your feeds and republishing them on their own sites.

Delay publishing of WordPress Feeds:

Here is a snippet with which you can delay your feeds (by, let's say, 15 minutes):

/**
 * Publish the content in the feed 15 minutes later.
 * $where is the default variable passed by WordPress (wp-includes/query.php).
 * This function appends a condition in SQL syntax to it.
 */
function publish_later_on_feed($where)
{
	global $wpdb;
	if ( is_feed() )
	{
		// current GMT timestamp in WP format
		$now = gmdate('Y-m-d H:i:s');
		// how long to wait, counted in $unit
		$wait = 15; // integer
		// http://dev.mysql.com/doc/refman/5.0/en/date-and-time-functions.html#function_timestampdiff
		$unit = 'MINUTE'; // MINUTE, HOUR, DAY, WEEK, MONTH, YEAR
		// add SQL syntax to the default $where
		$where .= " AND TIMESTAMPDIFF($unit, $wpdb->posts.post_date_gmt, '$now') > $wait ";
	}
	return $where;
}
add_filter('posts_where', 'publish_later_on_feed');

This will delay the feeds by 15 minutes (the $wait value on line 14 of the code) before any new article appears in them. This is a very good way to kill those automated blogs. But sometimes they are not automated: humans are manually copy-pasting the articles from various sources. In that case, what you can do is make your blog ping the crawl bots so that your chances of getting indexed first are maximised.
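
If you want to tune the delay without editing the SQL string by hand, the same idea can be wrapped in a small helper. The following is only a sketch, meant as an alternative to the snippet above rather than an addition to it: the function name feed_delay_clause and its parameters are made up for illustration and are not part of WordPress. It whitelists the time unit and casts the delay to an integer before building the same TIMESTAMPDIFF condition.

```php
<?php
/**
 * Build the extra WHERE condition that hides posts newer than $wait units.
 * Hypothetical helper for illustration; not a WordPress function.
 */
function feed_delay_clause($posts_table, $now_gmt, $wait, $unit)
{
	// Only allow the units mentioned in the original snippet.
	$allowed = array('MINUTE', 'HOUR', 'DAY', 'WEEK', 'MONTH', 'YEAR');
	$unit = strtoupper($unit);
	if (!in_array($unit, $allowed, true)) {
		$unit = 'MINUTE'; // fall back to minutes for unknown units
	}
	// Force the delay to a non-negative integer so it is safe to inline.
	$wait = max(0, (int) $wait);

	return " AND TIMESTAMPDIFF($unit, $posts_table.post_date_gmt, '$now_gmt') > $wait ";
}

// Register the filter only when running inside WordPress.
if (function_exists('add_filter')) {
	add_filter('posts_where', function ($where) {
		global $wpdb;
		if (is_feed()) {
			$where .= feed_delay_clause($wpdb->posts, gmdate('Y-m-d H:i:s'), 15, 'MINUTE');
		}
		return $where;
	});
}
```

With this in place, changing the two last arguments (for example to 30 and 'HOUR') would keep posts out of the feed until they are more than 30 hours old.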

Checklist for fast indexing:

  • Submit a sitemap to Google Webmaster Tools.
  • Use the PuSHPress and RSS Cloud WordPress plugins.
  • Use the WordPress Update Services option to notify ping services, and add several ping services there (less effective now, but doing it won’t harm).
  • Delay your feeds for a few minutes (scrapers won’t be manually monitoring your site every minute).
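
For the ping item in the checklist, it can help to see what a ping actually is. As far as I know, update services accept an XML-RPC call named weblogUpdates.ping that carries the site name and the site URL. The sketch below only builds that request body; the function name build_ping_payload is made up for illustration, and WordPress sends these pings for you once a service is listed, so this is for understanding rather than something you need to install.

```php
<?php
/**
 * Build the XML-RPC request body for a weblogUpdates.ping call.
 * Hypothetical helper for illustration; it does not send anything.
 */
function build_ping_payload($site_name, $site_url)
{
	// Escape both values so characters like & and < stay valid XML.
	$name = htmlspecialchars($site_name, ENT_XML1 | ENT_QUOTES, 'UTF-8');
	$url  = htmlspecialchars($site_url, ENT_XML1 | ENT_QUOTES, 'UTF-8');

	return '<?xml version="1.0"?>'
		. '<methodCall><methodName>weblogUpdates.ping</methodName><params>'
		. "<param><value><string>$name</string></value></param>"
		. "<param><value><string>$url</string></value></param>"
		. '</params></methodCall>';
}

// Example: the body a blog called "My Blog" would post to a ping service.
echo build_ping_payload('My Blog', 'https://example.com/'), "\n";
```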

Hope that helps you defeat those blood-sucking scrapers. If you have any questions or tips, feel free to leave them in the comments below.


Comments

5 responses to “Deal with Blog Scrapers getting indexed quicker than the original site”

  1. Is there any way to stop a particular post from showing up in feeds? With some custom field value or category? Help would be appreciated. 🙂

    1. Firstly, your comment and trackback were both caught as spam by Akismet.
      As for your question, yes, it is possible. You can easily find how to limit feeds by post category on Google.
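
      A minimal sketch of the category approach, assuming the pre_get_posts hook and a placeholder category ID of 5 (replace it with the ID of the category you want to keep out of the feeds; the function name is made up for illustration):

```php
<?php
/**
 * Keep posts from one category out of the feeds.
 * The category ID 5 is a placeholder, not a real value from this site.
 */
function exclude_category_from_feed($query)
{
	// Only touch the main query, and only when a feed is being built.
	if ($query->is_feed() && $query->is_main_query()) {
		// A negative ID asks WP_Query to exclude that category.
		$query->set('cat', '-5');
	}
}

// Register the hook only when running inside WordPress.
if (function_exists('add_action')) {
	add_action('pre_get_posts', 'exclude_category_from_feed');
}
```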

  2. Hi, I was actually pleased when I saw I was getting spam with my articles in, I thought it was OK. Clearly not! Maybe this is why I had a LOT of links and lost 30% for no apparent reason? I’ve still got plenty according to Google, but not so many according to Alexa. Thanks for this post, it covers a topic rarely mentioned and one of those little things that’s just annoying, but not life or death. Thanks!

    1. Hi Peter,
      Happy to help! It was for a friend dealing with the issue, so I thought I would do a quick blog post on it.

  3. hi, your site is very nice and informative.. i really like your work.. could u pls tell me that how can i enhance my visitors ? or how can i come first on google search..

    Thanks