View Full Version : Should I use datafeed?

11-07-2003, 06:21 AM
Hi Guys,

Just wondering if it's a good idea to use datafeeds. I have a shopping mall with many categories: books, video games, etc.

Now I have feeds for some of those categories. Say, for example, I have the feed for video games. On my video-games-themed page, should I link it to the datafeed, which can produce an additional 20,000 pages of games?

Let's say I do it for some of the other categories as well. What would having 100,000 pages mean for my site?



11-07-2003, 06:42 AM

That many pages will get you more search engine traffic and raise your PageRank (so long as you at least have some unique text on each page).

Large sites definitely have an advantage over smaller ones.

11-07-2003, 06:48 AM

Thanks Chris, appreciate it.


11-07-2003, 07:56 AM
I was interested to find this thread, as I came here to ask some questions related to sites powered by dynamic data feeds.

I'm building a site based on Amazon Web Services, focused on a particular niche of interest to me. After figuring out how to return results dynamically and going through the process of making my links as search-engine friendly as possible (using some functionality similar in concept to mod_rewrite), I realized that I could be creating a monster and it has me a bit concerned.

With all of the cross-referencing and similar product possibilities, search engines could potentially find millions of pages, costing me a lot of money in bandwidth. The conundrum here is that I would like search engines to crawl and find all the pages related to my topic, but not find a whole bunch of unrelated stuff (I wouldn't mind getting commissions on other stuff but I have no interest in making my site a big multi-purpose shopping mall).

Anyone else have this issue? Do you have any strategies to effectively limit the scope of search engine crawls? I'm not sure robots.txt is going to do it in this case.

11-07-2003, 08:02 AM
Well the biggest bandwidth issue is going to be images -- if you can keep images off your server (just hotlink the ones on Amazon) that should help there.

Then you could implement mod_gzip to cut down on overall bandwidth usage.
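For anyone on Apache who wants to try this, a minimal mod_gzip setup (assuming the Apache 1.3-era mod_gzip module is installed; the file/MIME patterns here are just illustrative) might look roughly like:

```apache
# Turn on mod_gzip and compress HTML and other text responses.
mod_gzip_on Yes
mod_gzip_dechunk Yes
mod_gzip_item_include file \.html$
mod_gzip_item_include mime ^text/.*
```

Check the mod_gzip documentation for the exact directives your version supports.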

Still, millions of pages is a bit much. I would limit the site to your topic. I'm sure you can edit whatever script you are using so that you only show cell phones or whatever it is you want to sell, rather than the millions of books.

11-07-2003, 08:15 AM
Thanks for the quick reply.

By the way, I'm using ASP on a shared server running IIS, so I'm not sure there's a mod_gzip equivalent I can use, but maybe.

Part of the problem is that I'm not selling just one type of product. It's really a variety of products related to a particular topic. With this topic, there are relevant books, videos, etc.

My script uses generic codes in the URL to indicate what type of search is required (title, author, publisher, etc.), what product type, what keywords, etc. So as it stands right now, I don't see any easy way to restrict the scope of automated crawlers, unless I tell them not to crawl similar items, related categories, etc. But the problem is that I want them to do some of that, as long as it's related to my topic.

My pages are already pretty lightweight, but perhaps I could look into making them lighter, and perhaps I could serve the few non-Amazon images from another account somewhere. I may just have to go for it and see what happens.

11-07-2003, 08:49 AM
Well, maybe you could set all products up for moderation. Your script finds the products, but then you need to go through and approve each one. At first it'd be time-consuming, but it would be one sure-fire way to cut the size down.

11-07-2003, 08:55 AM
Hi flyingpylon.

I was working on AWS last night. This might work:

Let's say you have 6 featured products spanning 2 categories, and numerous "related" products spanning 12 categories.

For every category that a "related" product falls under which is not a "main" category, you could simply not display those related products.

That way, you would not be displaying every single product for every single category. Instead, you would be displaying every single product only in categories where your "featured" products reside. Let me know if that works.
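A sketch of that filtering idea in Python (the data structures and product names here are invented for illustration; in practice the categories would come from the AWS responses):

```python
# Only display "related" products whose category is also a category
# in which some "featured" product resides.

featured = [
    {"name": "Product A", "category": "DVDs"},
    {"name": "Product B", "category": "Books"},
]

related = [
    {"name": "Product C", "category": "DVDs"},      # kept: DVDs has a featured product
    {"name": "Product D", "category": "Software"},  # dropped: no featured product there
]

# Categories where at least one featured product resides.
main_categories = {p["category"] for p in featured}

# Display only related products that fall inside a main category.
visible_related = [p for p in related if p["category"] in main_categories]

for p in visible_related:
    print(p["name"])
```

The effect is that crawlers only ever see related products inside the categories you actually feature, instead of every category Amazon cross-references.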

Here's what I was wrangling with on Amazon Web Services, in the DVDs category. Take a look (no design yet)-


11-07-2003, 09:42 AM
Well, after some more thought I've decided that I probably shouldn't just "see what happens". There are other people using a mod_rewrite solution that are running into the same issues, getting kicked out by their hosts, etc. and I really have no interest in that. The difference is that they seem to have implemented their script with different intentions (they wanted to get creamed by search engines). I just want my topic covered.

I think that manual categorization (or moderation as Chris called it) is the only true solution. It would be a lot of work up front, but then you'd only have to add new products as they came along. I can envision a script that would display the products with checkboxes or something so that I could check them off to be added to a database.

The other thing, though, is that when I display an item, I also hyperlink the author and publisher fields to searches for those terms, as well as the similar items and categories. This is a useful user tool that I don't want to lose, so perhaps I could just put those links in some JavaScript so that they'd be invisible to search engines. Or maybe I could add a false folder name to the beginning of the URL for those searches and then exclude it in robots.txt.
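The robots.txt part of that idea would be simple enough. Assuming a made-up dummy folder name like /s/ is prefixed onto every author/publisher/similar-items search URL, the exclusion would look like:

```text
# Hypothetical robots.txt: block crawlers from the search URLs only,
# leaving the rest of the site crawlable.
User-agent: *
Disallow: /s/
```

The links would still work for human visitors, since the catch-all rewriting script can just strip the dummy folder before parsing the rest of the URL.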

GCT, I think I understand your concept but I'm not sure exactly how I would implement it. I'll have to do some more thinking about that.

At any rate, it's a perplexing problem - too much of a good thing I guess. And it was almost too easy to write the scripts that created the situation. I'm just glad I realized it before I launched the site.

05-19-2004, 02:32 PM
...links as search-engine friendly as possible (using some functionality similar in concept to mod_rewrite)...


How did you do the search-engine friendly URLs on IIS?

05-19-2004, 02:51 PM
I redirect all "file not found" requests to an ASP page that parses the requested URL:

1. It first parses the domain name and looks it up in an array stored in an application variable to determine which site the request is for.
2. Then it parses the string between the / after the domain name and the next /, and again references an array in an application variable to determine which section of the site the request is for.
3. Then, if there is a string at the end of the request that contains .htm, it looks in a database table to see if there is a match for the string just prior to the .htm in that section, in that site. If so, it executes the appropriate stored procedure to deliver the content.
4. Alternatively, if it's determined that the section in question needs more dynamic results, it does a Server.Transfer to an ASP page which gets processed and delivers the appropriate content. If necessary, I do additional URL parsing in that ASP page using a function I wrote.
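To make the flow concrete, here is a rough sketch of that custom-404 dispatch logic, translated from classic ASP into Python. Everything here is invented for illustration (the lookup tables, site/section names, and return values stand in for the application variables, database table, and stored procedures described above):

```python
# Hypothetical lookup tables standing in for the real app's
# application-variable arrays and database content table.
SITES = {"www.example.com": "example-site"}          # domain -> site id
SECTIONS = {"books": "books-section"}                # first path segment -> section
CONTENT = {("example-site", "books-section", "my-title"): "stored-proc-result"}

def handle_request(host, path):
    """Parse a 'file not found' request and route it to the right content."""
    site = SITES.get(host)
    if site is None:
        return "404"

    segments = [s for s in path.split("/") if s]
    if not segments:
        return "404"

    section = SECTIONS.get(segments[0])
    if section is None:
        return "404"

    # If the request ends in .htm, look up the slug in the content table
    # (the real version runs a stored procedure on a match).
    last = segments[-1]
    if last.endswith(".htm"):
        slug = last[: -len(".htm")]
        content = CONTENT.get((site, section, slug))
        if content is not None:
            return content

    # Otherwise hand off to a dynamic page (Server.Transfer in the original).
    return f"dynamic:{section}"

print(handle_request("www.example.com", "/books/my-title.htm"))
```

The point is that the 404 handler, not the filesystem, decides what every URL means, which is what makes the search-engine-friendly URLs possible on IIS without mod_rewrite.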

That's it in a nutshell. It's actually pretty complicated with all the other stuff I have it doing.