
Wikipedia to complement or standalone - questions



Dan Morgan
09-06-2004, 03:41 AM
Hi guys,

I have been looking around at the whole Wikipedia thing and I have some questions. Obviously Wikipedia content is a good way to promote an existing site or a standalone one, but is there any way to automate the content, like some script which can grab a node, as in AWS, and branch out from there?

Thanks,

Dan

incka
09-06-2004, 05:27 AM
I'm confused about what you are talking about, but to get Wikipedia you'll really need to SSH into your server and wget the files.

Dan Morgan
09-29-2004, 07:53 AM
Hmm, it appears I totally missed this - sorry.

Basically, I have seen lots of sites with wiki content. However they are in the same style as wikipedia, in terms of layout, which suggests to me they are using the wiki CMS.

However, these pages are not all the same as the live wikipedia.org pages, suggesting they are not feeds but come from a db.

So how is this generally done, especially for sites which concentrate on a niche area and only want a small portion of the overall wiki content? What would be the best way of adding, say, 500 pages of wiki content to an existing site or a standalone site?

Ta,

Dan

Dan Morgan
09-29-2004, 01:58 PM
Well I found - http://download.wikimedia.org/ - not sure how I did not come across that before.

incka
09-29-2004, 11:24 PM
They are too big to download to your hard drive before they update.

Dan Morgan
09-30-2004, 02:24 AM
How often do they update? According to that page, the English current version is 370 odd megabytes, which would take about 50 minutes for me to download.

[edit]
Okay, a circa 400 MB file is sat on my desktop, but I cannot get an executable from http://sources.redhat.com/bzip2/ to work on XP. Can anyone else?
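
For anyone stuck at the same step, here is a minimal sketch of unpacking a .bz2 file without the standalone bzip2 executable, using Python's standard bz2 module. The file names are placeholders, not the real dump names, and it assumes Python 3 is installed on the machine.

```python
import bz2
import os
import shutil


def decompress_bz2(src_path, dst_path, chunk_size=1024 * 1024):
    """Stream-decompress src_path (.bz2) into dst_path and return the bytes written."""
    # Copy in chunks so the whole file never has to fit in memory.
    with bz2.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst, chunk_size)
    return os.path.getsize(dst_path)


# Example call (placeholder names):
# decompress_bz2("dump.sql.bz2", "dump.sql")
```

The chunked copy matters here because the unpacked file will be a good deal larger than the download.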

tomek
09-30-2004, 04:31 AM
Okay, a circa 400 MB file is sat on my desktop, but I cannot get an executable from http://sources.redhat.com/bzip2/ to work on XP. Can anyone else?

well it says: This executable was built on a Windows 2000 SP 2 machine; I have no idea if it actually works on 95/98/ME/NT/XP.

I run linux so I can't answer your question...

Dan Morgan
09-30-2004, 04:59 AM
well it says: This executable was built on a Windows 2000 SP 2 machine; I have no idea if it actually works on 95/98/ME/NT/XP.

I run linux so I can't answer your question...

Yeah, I set the exe to Win 2k compatibility mode and still no joy.

[edit]

Zipzag saves the day. Extracting now...

tomek
09-30-2004, 05:47 AM
On which page do you plan to use the Wikipedia data, and what do you want to do with it?

Dan Morgan
09-30-2004, 06:32 AM
On which page do you plan to use the Wikipedia data, and what do you want to do with it?

To be honest I have no real plans, just wanted to have a play around and see what was possible.

MySQL balked at the file after about 3 minutes of processing, so I am just trying to circumvent that. Plus, I do not think it will be possible to extract just the small relevant part of the database.

While I would like a dynamic feed, I am not sure it would be the best port of call at this point.

Dan
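
One rough way to pull out just a niche slice is to stream the unpacked dump line by line and keep only the lines that mention chosen keywords, so nothing large is loaded at once. This is only a sketch: the keywords and file names are made up, the encoding is a guess, and whether one line corresponds to one article depends on the dump format, which nobody in the thread has confirmed.

```python
def filter_lines(src_path, dst_path, keywords, encoding="utf-8"):
    """Copy lines mentioning any keyword (case-insensitive); return how many were kept."""
    wanted = [k.lower() for k in keywords]
    kept = 0
    # errors="replace" keeps the scan going if the encoding guess is wrong.
    with open(src_path, "r", encoding=encoding, errors="replace") as src, \
         open(dst_path, "w", encoding=encoding) as dst:
        for line in src:
            lowered = line.lower()
            if any(k in lowered for k in wanted):
                dst.write(line)
                kept += 1
    return kept


# Example call (hypothetical names and keywords):
# filter_lines("dump.sql", "niche.sql", ["lighthouse", "buoy"])
```

The output would still need checking by hand before being fed to MySQL, since a plain text match knows nothing about SQL statement boundaries.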

incka
09-30-2004, 08:36 AM
You are not getting this - CUR means UPDATES. OLD means the thing before the updates.

Dan Morgan
09-30-2004, 08:58 AM
CUR means UPDATES. OLD means the thing before the updates.

You think? :rolleyes:

I don't think you are getting what I am getting ;)

tomek
09-30-2004, 09:11 AM
Doesn't OLD contain the article histories, i.e. the old versions of the articles?

incka
09-30-2004, 09:57 AM
Wikipedia has over 1 million articles.

1 MILLION ARTICLES.

400MB = 1000000 ARTICLES?

400MB = 419430400 LETTERS

1000000 articles at 1000 words an article is 1000000000 words

419430400 letters
1000000000 words

Maybe CUR is, but it's only about 400 letters per article, which seems small to me...

Westech
09-30-2004, 11:01 AM
There are 1 million articles (http://en.wikipedia.org/wikistats/EN/ChartsWikipediaZZ.htm#2) if you count all languages available in wikipedia. Dan said that the English current version is 370-odd MB. The English version has "only" 363,559 articles (http://en.wikipedia.org/wikistats/EN/ChartsWikipediaEN.htm#2).
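
Redoing the arithmetic with Westech's figures, as a quick sketch. It uses the same 1 MB = 1048576 bytes convention as the post above and treats 370 MB as exact, though Dan only said "370 odd". That figure looks like the compressed download size (it had to be unpacked with bzip2), so the unpacked text per article would presumably be larger.

```python
MB = 1024 * 1024  # same convention as "400MB = 419430400" earlier in the thread


def bytes_per_article(size_mb, articles):
    """Average bytes of dump file per article."""
    return size_mb * MB / articles


# incka's estimate: 400 MB spread over 1 million articles
print(round(bytes_per_article(400, 1000000)))  # 419

# Westech's correction: about 370 MB over 363,559 English articles
print(round(bytes_per_article(370, 363559)))   # 1067
```

So with the English-only article count the average comes out at roughly two and a half times the earlier estimate, even before decompression.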

Kyle
09-30-2004, 11:19 AM
I would recommend using wikipedia to complement an existing site.

If you have a site that sells sunscreen, for example, you could use wikipedia articles on ultraviolet rays and sunburn.

If you have a site that sells books, you could use wikipedia biographies of famous authors.

incka
09-30-2004, 10:15 PM
Cool.

Still a problem though - phpMyAdmin's maximum allowed size is 50 MB.
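
A possible workaround sketch for an upload cap like that: cut the dump into numbered pieces under the limit, breaking only at line ends, and import them in order. It assumes every SQL statement sits on a single line, which has not been checked for these dumps; the function name and file names are hypothetical.

```python
def split_by_lines(src_path, out_prefix, max_bytes=50 * 1024 * 1024):
    """Split src_path into numbered parts of at most max_bytes each.

    Parts break only at line ends. A single line longer than max_bytes
    gets a part to itself and will exceed the limit. Returns the part paths.
    """
    parts = []
    out = None
    written = 0
    try:
        with open(src_path, "rb") as src:
            for line in src:
                # Start a new part at the beginning, or when this line would overflow.
                if out is None or (written > 0 and written + len(line) > max_bytes):
                    if out is not None:
                        out.close()
                    path = f"{out_prefix}.{len(parts) + 1:03d}.sql"
                    out = open(path, "wb")
                    parts.append(path)
                    written = 0
                out.write(line)
                written += len(line)
    finally:
        if out is not None:
            out.close()
    return parts


# Example call (placeholder names):
# split_by_lines("dump.sql", "dump_part")
```

If a statement does span several lines, a piece could end mid-statement and fail on import, so it is worth looking at the piece boundaries before uploading.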