Wikipedi to compliment or standalone - questions

**Dan Morgan** · 09-06-2004, 03:41 AM

Hi guys,

I have been looking around at the whole Wikipedia thing and there are some questions. Obviously Wikipedia is a good idea to promote existing or a standalone site, but is there any way to automate the content, like some script which can grab an node like in AWS and branch out from there?

Thanks,

Dan

**incka** · 09-06-2004, 05:27 AM

I'm confused about what you are talking about, but to get wikipedia really you'll need to SSH your server and WGET the files.

**Dan Morgan** · 09-29-2004, 07:53 AM

Hmm, it appears I totally missed this - sorry.

Basically, I have seen lots of sites with wiki content. However they are in the same style as wikipedia, in terms of layout, which suggests to me they are using the wiki CMS.

However, these pages are not all the same as the live wikipedia.org pages suggesting they are not feeds but from a db.

Therefore how is this generally done, especially considering sites which concentrate on a niche area so only want a small portion of the overall wiki content. What would be the best way of adding say, 500 pages of wiki content to an existing site, or a standalone site.

Ta,

Dan

**Dan Morgan** · 09-29-2004, 01:58 PM

Well I found - http://download.wikimedia.org/ - not sure how I did not come across that before.

**incka** · 09-29-2004, 11:24 PM

They are too big to download to your hard drive before they update.

**Dan Morgan** · 09-30-2004, 02:24 AM

How often do they update? According to that page, the English current version is 370 odd megabytes, which would take about 50 minutes for me to download.

[edit]
Okay, circa 400 mb file sat on my desktop, but I cannot get an executable from http://sources.redhat.com/bzip2/ to work on XP. Can anyone else?

**tomek** · 09-30-2004, 04:31 AM

Okay, circa 400 mb file sat on my desktop, but I cannot get an executable from http://sources.redhat.com/bzip2/ to work on XP. Can anyone else?

well it says: This executable was built on a Windows 2000 SP 2 machine; I have no idea if it actually works on 95/98/ME/NT/XP.

I run linux so I can't answer your question...

**Dan Morgan** · 09-30-2004, 04:59 AM

Originally Posted by tomek

well it says: This executable was built on a Windows 2000 SP 2 machine; I have no idea if it actually works on 95/98/ME/NT/XP.

I run linux so I can't answer your question...

Yeah, I set the exe to Win 2k compatibility mode and still no joy.

[edit]

Zipzag saves the day. Extracting now...

**tomek** · 09-30-2004, 05:47 AM

on which page to you plan to use the wikipedia data and what do you want to do with it?

**Dan Morgan** · 09-30-2004, 06:32 AM

Originally Posted by tomek

on which page to you plan to use the wikipedia data and what do you want to do with it?

To be honest I have no real plans, just wanted to have a play around and see what was possible.

MySQL balked at the file after about 3 minutes of processing, so I am just trying to circumvent that, plus it do not think it will be possible to just extract the tiny part in relation, of the database.

While I would like a dynamic feed, I am not sure it would be the best port of call at this point.

Dan

**incka** · 09-30-2004, 08:36 AM

You are not gettign this - CUR means UPDATES. OLD means the thing before the updates.

**Dan Morgan** · 09-30-2004, 08:58 AM

Originally Posted by incka

CUR means UPDATES. OLD means the thing before the updates.

You think?

I don't think you are getting what I am getting

**tomek** · 09-30-2004, 09:11 AM

doesn't OLD contain the articles histories i.e. the old versions of it?

**incka** · 09-30-2004, 09:57 AM

Wikipedia has over 1 million articles.

1 MILLION ARTICLES.

400MB = 1000000 ARTICLES?

400MB = 419430400 LETTERS

1000000 articles at 1000 words an article is 1000000000

419430400
1000000000

Maybe CUR is, but its only about 400 letters per article, which seems small to me...

**Westech** · 09-30-2004, 11:01 AM

There are 1 million articles if you count all languages available in wikipedia. Dan said that the English current version is 370-odd MB. The English version has "only" 363,559 articles.

Thread: Wikipedi to compliment or standalone - questions

Thread Tools

Rate This Thread

Wikipedi to compliment or standalone - questions

Bookmarks

Bookmarks

Posting Permissions