I've always wanted to set up my own search engine. I don't mean a site search, I mean a full-blown search engine with its own index of the web. With Nutch out there, it shouldn't be too hard to get started.
I have a bunch of ideas for interesting ways of ranking websites and displaying the results that I'd love to play with. I love complex software algorithms -- there's a patent pending for one I developed for a past employer.
I have access to Drexel U's network, which is largely unpoliced and unrestricted, and I've got about 25 Mbps of bandwidth with no limits.
I can't spare the hard drive space or CPU time to do this on the three servers I have now, but I have some spare parts in my big box o' junk:
- Athlon XP 2500+ processor and motherboard
- 80GB hard drive
- SB Live! sound card
- GeForce FX5200 graphics card
- 10/100 network adapter
- 500W power supply (picked it up free-after-rebate from RadioShack)
If I order a cheap case and some memory, that'll be a complete system.
So here's a question... how much can I realistically accomplish with this? Will 512MB of memory be enough to handle a small search engine, or will it die when a couple of people are using it at once? How much of the internet can I expect to spider with only 80GB of storage available? Will this index even be large enough to tell whether any algorithms I develop produce interesting or useful results?
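To put rough numbers on the storage question, here's a back-of-envelope sketch. The per-page figures (20 KB stored and 20 KB downloaded per page) and the 80% usable-disk share are pure guesses on my part, not measurements, and the function names are just made up for illustration. Only the 80GB and 25 Mbps come from my actual setup.

```python
def pages_that_fit(disk_gb, stored_kb_per_page, usable_percent=80):
    """Estimate how many pages fit on disk.

    stored_kb_per_page is an ASSUMED average of page data plus index
    overhead per page; usable_percent is an ASSUMED share of the disk
    left over after the OS and software.
    """
    usable_bytes = disk_gb * 10**9 * usable_percent // 100
    return usable_bytes // (stored_kb_per_page * 1000)


def crawl_hours(pages, download_kb_per_page, bandwidth_mbps):
    """Estimate hours needed to fetch the pages at full line rate.

    Ignores politeness delays, DNS lookups, and slow servers, so a
    real crawl would take considerably longer.
    """
    bytes_per_second = bandwidth_mbps * 1_000_000 / 8
    total_bytes = pages * download_kb_per_page * 1000
    return total_bytes / bytes_per_second / 3600


if __name__ == "__main__":
    pages = pages_that_fit(disk_gb=80, stored_kb_per_page=20)
    hours = crawl_hours(pages, download_kb_per_page=20, bandwidth_mbps=25)
    print(f"~{pages:,} pages, ~{hours:.1f} hours at full line rate")
```

If those guesses are anywhere near right, disk space rather than bandwidth is the limit, but I'd happily be corrected by anyone with real per-page numbers.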
What do you think?