Scraping, Duplicate Content and Rankings



chromate
10-10-2010, 05:40 AM
Hi All,

I haven't been around for agggggeees. Hope everyone's good.

So... A site I've owned for years and years, www.carb-counter.org, dropped out of the search engine rankings a couple of weeks back, having been #1 for several years for the term "carb counter". It didn't really earn that much, so I wasn't really too bothered, but obviously earning something's better than nothing.

Today I did another check and find that at #1 is a site called adout.org. It's an ad blocker site that basically dynamically scrapes content and republishes it under their own URLs. They scraped my site.

As they scrape the content dynamically instead of storing it and republishing from a database, I've banned their IP from carb-counter.org. It now says they can't process any of the URLs from my site (good).
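For anyone wanting to do the same, here is a minimal sketch of an application-level IP ban, assuming the site runs behind a Python WSGI server (the thread does not say what carb-counter.org actually runs on, and a server-level deny rule would work just as well). The names `block_scrapers`, `site` and `BLOCKED_IPS` are hypothetical; the address is the one confirmed as the scraper IP further down the thread.

```python
# Hypothetical sketch: refuse requests from a known scraper IP.
# Assumes a WSGI stack; the real site's stack is not stated in the thread.
BLOCKED_IPS = {"173.236.27.154"}  # scraper IP confirmed later in this thread


def block_scrapers(app, blocked_ips=BLOCKED_IPS):
    """Wrap a WSGI app so requests from blocked IPs receive a 403."""
    def middleware(environ, start_response):
        if environ.get("REMOTE_ADDR", "") in blocked_ips:
            body = b"Forbidden"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Everyone else gets the normal site.
        return app(environ, start_response)
    return middleware


def site(environ, start_response):
    """Stand-in for the real site, used only to make the sketch runnable."""
    body = b"carb counter page"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


application = block_scrapers(site)
```

Because the scraper fetches pages on demand rather than storing them, a ban like this should cut off the copies immediately. Note that `REMOTE_ADDR` will be the proxy's address if the site sits behind a reverse proxy, so the check would need adjusting in that case.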

I'm guessing I got some kind of duplicate content penalty, due to the scraped content? Question is, why was it my site that was banned and not adout.org? Second question - how on earth did adout.org shoot straight to #1 for the term "carb counter" simply by using my content?!

It's not just me, right? Adout is at #1 for "carb counter" at the moment?

I'm hoping now I've effectively removed their duplicate content, my site will eventually return to its former position.

Is there anything else I need to do?

Cheers, Rich

Blue Cat Buxton
10-11-2010, 09:01 AM
Quoting chromate: It's not just me, right? Adout is at #1 for "carb counter" at the moment?

No, I see that too. Google should be smart enough not to rank sites like these. I hope you get your position back soon.

Chris
10-12-2010, 04:56 AM
They've also effectively broken copyright law. I would have sent legal notices prior to banning their IP. You could easily have had the entire site taken down (assuming it is hosted in a civilized country).

Share their IP so we can all ban it too?

And it might not be too late to send Google a DMCA notice to get them removed from the serps.

Chris
10-12-2010, 05:00 AM
Here you go:

http://www.singlehop.com/legal/legal-complete.php#dmca

The site's IP is owned by this company: 173.236.27.154. This is the site IP, though; it might not be the scraper IP.

chromate
10-13-2010, 04:41 AM
Thanks guys. I'll look into the DMCA notices. I can always lift the ban I've put on their IP so they're publishing my content again, and then file the DMCA with Google... Maybe that would be a quicker way to achieve the desired result - getting my site back where it should be, if they then get banned?

I did think about changing the content that gets served to Adout to simply a link to my carb counter site, thereby gaining some backlinks? But I thought it might not be a good idea to start playing games like this, as far as Google is concerned.

The main thing I really don't get is how they managed to shoot to #1 anyway.

Their page doesn't seem to have any backlinks. I think my site has around 1,000 backlinks and is about 7 years old. Yet they republish my content - my site gets banned and suddenly they're at #1! It's not exactly an uncompetitive keyword phrase. It's just weird, and I don't know how Google have figured this out?

The scraper IP is the same as you posted, Chris.

Chris
10-13-2010, 01:00 PM
Your idea of removing the ban then firing off the DMCA should work to get that page removed from Google.

After it's removed from Google, you could also send it to the host to get the guy to shut down this very illegal service. He is a US citizen too. I don't know if he realizes it (I doubt it), but if he republishes any content that is actually registered with the US Copyright Office he can be sued for statutory damages, which run from $750 to $30,000 per work and up to $150,000 per work if the infringement is willful.

It's no joke. Eventually he will be caught doing it to a major publisher and lose his business, his house, etc. The website is registered to him personally, not even to a business, so all his personal assets (if any) would be fair game.

cooluks
11-01-2010, 11:00 PM
You can easily tell whether content is scraped or not. Try using the whois tool. It will show when each domain was registered, which gives a good idea of who had the content first. If you really are the rightful owner, you have the right to sue them. :D
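A rough sketch of that comparison, assuming you have already saved the plain-text whois output for both domains. Whois reports when a domain was registered, not when its content was written, so this is supporting evidence at best. The `Creation Date:` field name is an assumption (registries label it differently), and `creation_date` and `registered_first` are hypothetical helper names.

```python
import re
from datetime import datetime

# Assumption: the whois text contains a line like "Creation Date: 2004-02-10T...".
# Other registries use different labels, so adjust the pattern as needed.
_CREATION = re.compile(
    r"^\s*Creation Date:\s*(\d{4}-\d{2}-\d{2})",
    re.IGNORECASE | re.MULTILINE,
)


def creation_date(whois_text):
    """Return the registration date found in whois output, or None."""
    match = _CREATION.search(whois_text)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y-%m-%d").date()


def registered_first(whois_a, whois_b):
    """Return 'a' or 'b' for the older domain, or None if unknown or tied."""
    a, b = creation_date(whois_a), creation_date(whois_b)
    if a is None or b is None or a == b:
        return None
    return "a" if a < b else "b"
```

Passing the whois text for the original site as `whois_a` and the suspected scraper as `whois_b` should return `'a'` when the original domain is the older one.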

chromate
12-14-2010, 03:16 AM
cooluks, thanks, but I know the content has been scraped (It's my entire site! lol ;))

UPDATE: Unbelievably AdOut.org are still ranked at position 2 for 'carb counter'!

I'm just absolutely amazed Google can be duped so easily in this day and age. It would almost make an interesting case study for an SEO somewhere as to how this can happen. AdOut.org was only registered in Feb 2010... My site's been around since Feb 2004 and has had pretty much the same content since it was born. And then, even when the scraped content has been removed (which it has for over a month now!) the offending site is still ranked well, and my site is still in penalty land. :mad:

I can only imagine that they're still ranked well, because they have the text "Unable [to scrape!] carb-counter.org" and so have the keyword in there. But even so, they shouldn't be ranked there. I just do not understand it.

I looked at filing a DMCA with Google. But they don't remove the offending site as such, they just replace the page with a notice that a DMCA has been filed against the site.

What I really need is for Google to realise how ridiculous this ranking is, remove my penalty, and set things straight. Of course, I don't expect this to happen.

I wonder if I should return to my original idea of serving AdOut.org pages from my site again - but they would just be a site description and a plain link to carb-counter.org. Do you think Google would then get the message?... "Hmm... carb-counter.org got a penalty for scraping AdOut.org, but hold on, AdOut.org are doing nothing but linking to carb-counter.org?! Hmmm"

I don't really want to be playing games like this, but it's simple, and might just work? Could it really make things any worse if AdOut link to my site? I suppose this would be a form of cloaking, but only to AdOut, so I don't see that as being a big deal, and Google wouldn't be able to detect that anyway?
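To make the idea concrete, here is a minimal sketch of serving the scraper a short description and a plain link instead of the real page. It is only an illustration of the approach described above, not a recommendation: as noted, this is IP-based cloaking. `page_for`, `SCRAPER_IPS` and `SITE_URL` are hypothetical names, and the `http://` scheme on the URL is an assumption.

```python
from html import escape

SCRAPER_IPS = {"173.236.27.154"}          # scraper IP from earlier in the thread
SITE_URL = "http://www.carb-counter.org/"  # scheme is an assumption


def page_for(remote_addr, normal_html, description="Carb counter"):
    """Return the HTML to serve: a description plus link for the scraper,
    the normal page for everyone else."""
    if remote_addr in SCRAPER_IPS:
        return '<p>{}</p><p><a href="{}">{}</a></p>'.format(
            escape(description),
            escape(SITE_URL, quote=True),
            escape(SITE_URL),
        )
    return normal_html
```

Ordinary visitors and search engine crawlers would still get `normal_html` unchanged; only requests from the listed IP see the link-only page.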

Perhaps it's worth a shot?

chromate
12-14-2010, 11:10 AM
hmm, AdOut.org seems to have disappeared from the SERPs now.

Chris
12-14-2010, 01:03 PM
Coincidence? Or someone read your post here? Probably coincidence. Congratulations though.