Dealing with Google and old content



mobilebadboy
05-29-2004, 05:07 PM
OK, I've waited more than a month and a half, but Google still lists old content: URLs and directories that haven't existed since the middle of March. I've got an ErrorDocument set up, and my 404 page sends a 404 header and then redirects to my front page.

Apparently Google is ignoring the 404 header and treating the live page it lands on (my front page) as proof that the URL still exists. I've tried the automatic removal section on their site, and the message I get is "this page still exists on the web".

Here's what I have in my 404:



<?php
header("HTTP/1.0 404 Not Found");
header('Location: http://visualintensity.com');
?>

What can I do to get Google to realize that the old URLs it keeps listing aren't there anymore? I thought Google would recognize the first line, while the redirect would still get people to a live page on my site. I don't want people to end up on a plain 404 page and get stuck.

Not only that, Google continues to ignore my robots.txt and index directories and files that I Disallow in it.
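
On the 404 script above: as far as I know, PHP's header() switches the response status to 302 whenever it sends a Location header, unless a 201 or 3xx status has already been set. That would replace the 404 from the first line, so a crawler following the redirect sees a live front page. A minimal sketch of one alternative, assuming it is acceptable to show a short 404 page with a link home instead of redirecting (the function name render_not_found_page is made up for illustration):

```php
<?php
// Front page URL taken from the script above.
$homeUrl = 'http://visualintensity.com';

// Hypothetical helper: builds the page shown to visitors who hit a dead URL.
function render_not_found_page($homeUrl)
{
    // Escape the URL before placing it in an HTML attribute.
    $safeUrl = htmlspecialchars($homeUrl, ENT_QUOTES);
    return "<html><head><title>Page not found</title></head><body>"
         . "<h1>Page not found</h1>"
         . "<p>That page no longer exists. "
         . "<a href=\"$safeUrl\">Continue to the front page</a>.</p>"
         . "</body></html>";
}

// Send only the 404 status. With no Location header, nothing replaces it.
header('HTTP/1.0 404 Not Found');
echo render_not_found_page($homeUrl);
```

Visitors still get a way forward through the link, and the response keeps its 404 status, which is what the removal tool appears to check for.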

mobilebadboy
05-29-2004, 10:23 PM
Well, it seems Google's automatic removal system pays attention when you give it a link to your robots.txt file. I'm not sure if it'll "stick", but within hours it had un-indexed all the URLs I wanted removed. I hope it stays that way. I just resubmitted to clean up a few last URLs.

I'm having a hard time getting a few SWFs removed, though. Googlebot allows wildcards in robots.txt, but for some reason the automatic removal system rejects them. That seems odd when they promote robots.txt so much. You'd think they'd write their system to accept any Disallow line in the file.

Chris
05-30-2004, 06:33 AM
Just move the SWFs to a directory you can ban in robots.txt.
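
A minimal robots.txt sketch of that suggestion, assuming the SWFs are moved into a directory named /flash/ (a made-up name; use whatever directory you pick). A plain directory rule uses no wildcards, so the removal system should have nothing to reject:

```
# /flash/ is a hypothetical directory name; substitute your own
User-agent: *
Disallow: /flash/
```

Any pages that embed the SWFs would need their paths updated to the new directory.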