Copyright violations - search engine ignoring META ROBOTS NOARCHIVE

03-13-2007, 09:17 AM
I discovered yesterday that the "search engine" http://www.hereuare.com is linking to and serving up cached copies of webpages in its search results, even when those pages use the META ROBOTS NOARCHIVE instruction. What is particularly interesting is that they appear to be using DMOZ as the root source for their "index/directory" without crediting DMOZ. I say this because I see an uncanny correlation between my pages that are in the HereUare archive and my pages that are linked to by DMOZ. I've also discovered that Google is indexing the archived copies of other people's webpages (which is how I found out this was happening).

If you are like me and prohibit search engines from serving up cached versions of your webpages, you should do a search to find out how many pages from your site this search engine is caching. You should also send HereUare and their upstream web hosting firm a DMCA cease and desist notice.
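For anyone who wants to confirm that a page really carries the instruction before complaining, here is a minimal sketch in Python. It assumes the usual form of the tag, <meta name="robots" content="noarchive">, possibly combined with other directives in a comma-separated list. The names RobotsMetaParser and has_noarchive are made up for this example, and it only looks at the HTML you hand it; it does not fetch anything or check HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        for part in attr.get("content", "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def has_noarchive(html: str) -> bool:
    """Return True if the page's robots meta tag includes noarchive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noarchive" in parser.directives
```

Run has_noarchive() over the saved source of each of your own pages; any page that returns True and still shows up as a cached copy belongs on your list.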

HereUare.com's web hosting provider is Above.net and the copyright agent for Above.net is:

Robert Sokota, Esq.
General Counsel
AboveNet Communications, Inc.
360 Hamilton Avenue
White Plains, New York 10601

HereUare should not be allowed to get away with flouting copyright laws and ignoring the META ROBOTS NOARCHIVE instruction. Maybe if enough of us complain to Above.net we can get them shut down until they clean up their act. :flare:

03-13-2007, 01:40 PM
Why don't you want search engines caching your website?

03-13-2007, 02:56 PM

03-13-2007, 04:21 PM
Why would they scrape the cache? To avoid detection?

03-13-2007, 05:00 PM
Simply put, I don't allow SEs to cache my site, and I don't need a reason for this. The site in question totally ignores the META ROBOTS NOARCHIVE instruction, which in my view removes any "fair use" copyright defense they might have. In addition, I'm very certain that they grabbed the entire DMOZ directory database and do not provide credit as is required.

The "SE" in question is nothing more than a large-scale site scraper, disguised as a search engine, that does not respect copyrights. They need to be brought to their knees, or at least made to comply with the norms of SE behavior with regard to caching webpages and using the DMOZ directory.

03-14-2007, 12:16 PM
So yesterday I went through the pages the SE in question had cached from my site and compiled a list of over 700 URLs that were cached copies of my pages. I believe they are actually caching my entire site (~20,000 pages), but I think the list I compiled is enough to prove willful infringement. I am now sending the list, along with a notarized DMCA takedown notice, to the copyright agent for their web hosting provider.

To think all they needed to do was simply respect the META ROBOTS NOARCHIVE instruction and there would have been no problems. It just floors me that someone who apparently wants to become a "legitimate" SE didn't implement such a simple detail. It really makes you wonder. :confused:

03-16-2007, 10:09 PM
Guys, what Ken is saying is that their scraped cached pages are actually being indexed, and they are getting traffic from copying his pages, if I understand right.

This is an old trick that some directories used in the old days: get suckers to submit their meta data, and then the directory often ends up ranking for those terms ahead of the website itself, or in place of it.

Ken, I think you should file a major lawsuit. The DMCA is strong, and you could end up in the news and get some publicity for this claim. With the right law firm, maybe you could win a financial settlement too!

03-17-2007, 08:57 AM
Anthony, you understand everything correctly.

I sent the perp an email cease and desist notice, and when I didn't hear from them after a couple of days I mailed a DMCA takedown notice to their network provider (the owner of their IP address). I included a 16-page table of URLs for the infringing copies and the corresponding pages on my site. I listed around 700 offending URLs, but I suspect the real number is closer to 20,000 - 40,000 offending pages.
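If you need to put together a similar table, a short script saves a lot of copying and pasting. This is only a sketch, not the method used for the notice above: it assumes you have already collected (infringing URL, original URL) pairs by hand or from search results, and the function name build_takedown_table and the column names are made up for the example.

```python
import csv
import io


def build_takedown_table(pairs):
    """Return CSV text listing each infringing URL next to its original.

    pairs is an iterable of (infringing_url, original_url) tuples.
    Duplicates are dropped and rows are sorted so the table is easy
    to check against the live pages.
    """
    unique = sorted({(copy.strip(), orig.strip()) for copy, orig in pairs})
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["infringing_url", "original_url"])
    writer.writerows(unique)
    return buf.getvalue()
```

The CSV text can be written to a file and opened in a spreadsheet to print as the attachment for the notice.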

I now have to wait and see what happens.

03-17-2007, 09:25 AM
Ken, I would find an IP law firm and file a lawsuit against the company and the host. You need to do this ASAP.

If you need a law firm that handles this sort of case let me know!

03-17-2007, 10:25 AM
If I need to go as far as bringing in an IP law firm, I will, but normally I follow the practice of issuing a cease and desist notice to the offender and then, if I get no response, a DMCA takedown notice to their provider. I've never had to proceed beyond a DMCA takedown notice.