Looks like Google might have finally found my new Indy Racing Collectibles site. I have a whopping 42 pages indexed as of right now.
However, when I do the following Google search:
indy site:www.indy-racing-collectibles.com
I'm seeing pages listed that I specifically told it not to index.
The interesting thing is that the pages I did not want indexed show up without any title, description, URL, or cached info. Does this just mean that Google knows about them and adds the URL to the index, but would never display them to someone in search results?
I'm concerned about this because this is an AWS site, and I've taken pains to keep it somewhat "on topic" to start with, until I can better gauge the impact of letting search engines run wild over millions of possible URLs. If Google ignores the robots.txt file and crawls them anyway, then I might as well just open it up and let them have their way with me.
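For anyone wanting to double-check that a robots.txt file really blocks the URLs it is supposed to, here is a rough sketch using Python's standard urllib.robotparser module. The rules and paths below are made up for illustration (my actual robots.txt is not shown in this post), and the helper name is hypothetical. It only checks whether a rule-following crawler is permitted to fetch a URL.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; not the site's real robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /shop/
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules let user_agent fetch url."""
    parser = RobotFileParser()
    # Parse the rules from a string so no network request is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    base = "http://www.indy-racing-collectibles.com"
    # The paths below are assumed examples, not real pages on the site.
    for path in ("/index.html", "/shop/item"):
        allowed = is_crawl_allowed(ROBOTS_TXT, "Googlebot", base + path)
        print(path, "allowed" if allowed else "blocked")
```

Swapping in the real contents of robots.txt and a few of the URLs from the site: search would show whether the file says what I think it says.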