Using Wiki content



r2d2
10-29-2004, 05:00 AM
When you guys are using Wikipedia content, do you copy each piece manually? Or is there some ready-made method of automating the copying process?

The New Guy
10-29-2004, 05:12 AM
I would like to know this also.

incka
10-29-2004, 05:16 AM
download.wikimedia.org

But the data needs a lot of editing...

Nintendo has a lot on www.public-domain-content.com too...

The New Guy
10-29-2004, 05:19 AM
Wouldn't it be easier to just copy and paste?

r2d2
10-29-2004, 05:22 AM
Hehe, that's almost so perfect it's funny :) Cheers, Incka.

Copy and paste would get a bit boring after five pages or so. Plus you would have to add it to a database, or create separate pages for each article. I was planning on a few hundred pages or so.

incka
10-29-2004, 06:24 AM
If you get that Wikipedia download (the cur version is what you want) working on your site, I will pay you to install it on mine.

r2d2
10-29-2004, 08:14 AM
Hmmm. So am I right in thinking you can only download the entire database for each language? The English version is downloading at the moment - 2% of 404 MB!

So I take it everyone else has just copied the stuff they want manually before then?

r2d2
10-29-2004, 10:34 AM
Just unzipping it now, and it's up to 1.2 GB and counting...

Not sure my little local server is gonna like me soon :)

r2d2
10-29-2004, 11:18 AM
Came out at about 1.6 GB, and MySQL didn't want any SQL files bigger than 2 MB :)

I think the XML Special:Export way is the way I am going to go.
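
For anyone going the same route, here is a rough sketch of pulling articles through Special:Export and reading the wikitext out of the XML. The base URL, the User-Agent string and the function names are my own assumptions for illustration, and the export schema has changed between MediaWiki versions, so the parser simply ignores the XML namespace rather than relying on one.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Assumed base URL; swap the host for another language edition.
EXPORT_BASE = "https://en.wikipedia.org/wiki/Special:Export/"


def export_url(title):
    """Build the Special:Export URL for one article title."""
    return EXPORT_BASE + urllib.parse.quote(title.replace(" ", "_"))


def fetch_export(title):
    """Download the export XML for one title (needs network access)."""
    # Hypothetical User-Agent; use something that identifies your own site.
    req = urllib.request.Request(
        export_url(title), headers={"User-Agent": "wiki-export-sketch/0.1"}
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


def local_name(tag):
    """Drop the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def parse_export(xml_text):
    """Return a list of (title, wikitext) pairs from an export document."""
    pages = []
    for page in ET.fromstring(xml_text).iter():
        if local_name(page.tag) != "page":
            continue
        title, text = None, ""
        for node in page.iter():
            name = local_name(node.tag)
            if name == "title":
                title = node.text
            elif name == "text":
                # If several revisions are present, the last one wins.
                text = node.text or ""
        pages.append((title, text))
    return pages


# Tiny hand-written sample so the parser can be tried offline.
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>Example</title>
    <revision>
      <text>Some '''wiki''' markup.</text>
    </revision>
  </page>
</mediawiki>"""

if __name__ == "__main__":
    print(export_url("Main Page"))
    print(parse_export(SAMPLE))
```

What you get back is raw wiki markup, so it still needs converting to HTML before it can go on a page.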

incka
10-29-2004, 11:27 AM
That XML thing, will it get the whole of the encyclopedia? Because if it can, I will pay you to get it onto my server and working in PHP...

Westech
10-29-2004, 11:40 AM
It's a text file of SQL statements, right? If so, you could write a script on your local machine to go through the file statement by statement and send them one at a time to MySQL.
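
Something along these lines is what I have in mind. It is only a sketch: the function names and the demo table are made up, and the demo sends the statements to an in-memory SQLite database as a stand-in so it runs without a MySQL server. For the real thing you would pass in the `execute` method of a cursor from whichever MySQL driver you have installed.

```python
import io
import sqlite3


def iter_statements(lines):
    """Yield complete SQL statements from an iterable of text lines.

    A statement ends at a ';' that sits outside a quoted string.
    Backslash escapes inside quotes are skipped over. SQL comments are
    not handled, so a quote character inside a comment would confuse it.
    """
    buf = []
    quote = None     # the open quote character, or None outside a string
    escaped = False  # True right after a backslash inside a string
    for line in lines:
        for ch in line:
            buf.append(ch)
            if quote:
                if escaped:
                    escaped = False
                elif ch == "\\":
                    escaped = True
                elif ch == quote:
                    quote = None
            elif ch in ("'", '"', "`"):
                quote = ch
            elif ch == ";":
                stmt = "".join(buf).strip()
                buf = []
                if stmt.rstrip(";").strip():  # skip empty statements
                    yield stmt
    tail = "".join(buf).strip()
    if tail:
        yield tail  # last statement had no closing ';'


def load_dump(lines, execute):
    """Send each statement to `execute` one at a time; return the count."""
    count = 0
    for stmt in iter_statements(lines):
        execute(stmt)
        count += 1
    return count


# Demo data: a made-up three-statement dump held in memory.
DEMO = io.StringIO(
    "CREATE TABLE cur (title TEXT, body TEXT);\n"
    "INSERT INTO cur VALUES ('A', 'one; two');\n"
    "INSERT INTO cur VALUES ('B',\n 'spans lines');\n"
)

conn = sqlite3.connect(":memory:")  # stand-in for the real MySQL connection
sent = load_dump(DEMO, conn.execute)
rows = conn.execute("SELECT title, body FROM cur ORDER BY title").fetchall()

if __name__ == "__main__":
    print(sent, rows)
```

Because it reads the file line by line, the whole dump never has to fit in memory, though going character by character in Python will be slow on a file that size.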

incka
10-29-2004, 11:49 AM
I'm not a programmer. Period. I pay people to do that for me. Do you want to be paid to get Wikipedia onto my server?

Westech
10-29-2004, 11:54 AM
Sorry, I don't have time to take on any additional projects right now. I was just trying to give r2 an idea of how he might get around the 2 MB limit he mentioned above.

Nintendo
10-29-2004, 07:27 PM
There are a few sites where you can download the whole database as a file of a few hundred megs, then unzip it and edit it all as static files, not a MySQL database. (It's a little outdated, and without the images.)

http://www.hut.fi/~tkarvine/tero-dump/
Just English, 164 megs zipped, but when unzipped it's 1.295 GB.

http://www.tommasoconforti.com/wiki/
English: (390 MB, from 02 Aug 2004) Though when I try to download this, it stops before the download is complete.

As for the content on public-domain-content.com, I downloaded each file one by one as a text file and then edited them all and downloaded the main sections as HTML files. Zipped it's just 21.8 megs, and unzipped it's 49.3 megs.

s2kinteg916
10-29-2004, 08:41 PM
Copy and paste is a lot of work.

LuckyShima
10-30-2004, 12:12 AM
Where do you download the entire database from?

Is it from http://download.wikimedia.org ?

I wonder if it is this link:

- Database download via IPv6 -

but I have been getting a "page cannot be displayed" error all day trying this link at http://download.n6-svc.wikipedia.org/

Thanks

Xander
10-30-2004, 02:27 AM
I got the same error trying both. You do want all the languages, right, and not just the English section?

r2d2
10-30-2004, 05:59 AM
I downloaded the 'en.wikipedia.org' one, part way down the page.

The entire database? I think that's about 40 GB, isn't it? I think you would be brave attempting that :)

Think I will just use the XML version. I may have a look at Nintendo's zipped versions if they unzip into separate sections.

moonshield
10-30-2004, 07:08 AM
I do copy and paste. It gives me my daily workout.

LuckyShima
10-31-2004, 01:19 AM
I have uploaded the compressed version to my account, which took 8 hours, and then untarred it. It comes out at 1.5 GB.

I am going to try to read it into the database.

Does anyone know how big the MySQL database will be when it is loaded?

I am fast running out of space and I don't have enough to load the db so I will need to upgrade. How much space will I need?

Thanks.

Nintendo
11-05-2004, 07:12 PM
:::I may have a look at Nintendo's zipped versions if they unzip into separate sections.

You can unzip one section at a time, or all of them in one file. Or crawl the site and then edit it.