
Thread: Scraping info from pages

  1. #1
    ZigE (Registered)
    Join Date
    Dec 2006
    Posts
    122

    Scraping info from pages

    I have to pull the content out of about 400 unique pages, all built on the same page template (headings etc.), and load it into a CSV file or a database.

    I have 3 options:

    1) Create my own PHP/cURL script to do it (I don't know if I'm experienced enough for this).
    2) Use some software (like iMacros; is this possible?).
    3) Copy and paste by hand.

    Obviously I want to avoid number 3. What I'm asking is: has anyone had any experience doing this, or does anyone have alternative ideas, before I jump headfirst into it?
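For a sense of what option 1 involves, here is a minimal sketch. It is written in Python with only the standard library rather than PHP/cURL, and it assumes each page holds its data in an h1, an h2 and a p tag. Those tag names, the function names and the URLs in the closing comment are all hypothetical; the real template would dictate the actual fields.

```python
import csv
from html.parser import HTMLParser
from urllib.request import urlopen

# Assumed template fields (hypothetical): one CSV column per tag.
FIELDS = ("h1", "h2", "p")


class TemplateParser(HTMLParser):
    """Keeps the text of the first occurrence of each tag in FIELDS."""

    def __init__(self):
        super().__init__()
        self.row = {}
        self._current = None  # field currently being read, if any

    def handle_starttag(self, tag, attrs):
        if self._current is None and tag in FIELDS and tag not in self.row:
            self._current = tag
            self.row[tag] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.row[self._current] += data

    def handle_endtag(self, tag):
        if tag == self._current:
            self.row[tag] = self.row[tag].strip()
            self._current = None


def page_to_row(html):
    """Turn one page's HTML into a dict keyed by field name."""
    parser = TemplateParser()
    parser.feed(html)
    parser.close()
    return parser.row


def fetch(url):
    """Download a page as text (no retries or rate limiting in this sketch)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def pages_to_csv(pages, out):
    """Write one CSV row per page; fields missing from a page stay empty."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, restval="")
    writer.writeheader()
    for html in pages:
        writer.writerow(page_to_row(html))


# Usage with hypothetical URLs:
#   urls = ["http://example.com/page1.html", "http://example.com/page2.html"]
#   with open("pages.csv", "w", newline="") as f:
#       pages_to_csv((fetch(u) for u in urls), f)
```

A real template will probably need matching on class or id attributes rather than bare tag names, which is where most of the extra work tends to come from.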

  2. #2
    Chris (Administrator)
    Join Date
    Feb 2003
    Location
    East Lansing, MI USA
    Posts
    7,055
    Hire someone to copy & paste by hand. Basic data entry is an easy thing to outsource.
    Chris Beasley - My Guide to Building a Successful Website

  3. #3
    Registered
    Join Date
    Aug 2006
    Location
    Sacramento, CA
    Posts
    208
    You would probably be better off paying someone to write a quick script to scrape all the data than paying someone to copy and paste.

    Depending on the complexity of the template and the amount of data you need to pull, it should not take someone experienced with this more than an hour or so to write.
    Last edited by rpanella; 03-17-2011 at 11:06 AM.

  4. #4
    Chris (Administrator)
    Join Date
    Feb 2003
    Location
    East Lansing, MI USA
    Posts
    7,055
    There is actually a php function set/library that makes it really easy to scrape sites... I can't think of the name right now though. It allows you to feed in a URL and then access page elements by tag name.
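The "feed in a URL, then access elements by tag name" idea can be sketched as follows. This is not the PHP library referred to above (its name is not given in the thread); it is a Python stand-in built on the standard library's html.parser, and the names TagIndex, index_html and index_url are made up for illustration.

```python
from collections import defaultdict
from html.parser import HTMLParser
from urllib.request import urlopen

# Tags that never have a closing tag, so they are not tracked as open elements.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class TagIndex(HTMLParser):
    """Builds a mapping from tag name to the text of each closed element."""

    def __init__(self):
        super().__init__()
        self.by_tag = defaultdict(list)
        self._open = []  # stack of (tag, text_parts) for elements still open

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self._open.append((tag, []))

    def handle_data(self, data):
        # Text belongs to every element that is currently open.
        for _, parts in self._open:
            parts.append(data)

    def handle_endtag(self, tag):
        # Close the nearest matching open element. Unclosed elements nested
        # inside it are dropped without being recorded (a simplification).
        for i in range(len(self._open) - 1, -1, -1):
            if self._open[i][0] == tag:
                self.by_tag[tag].append("".join(self._open[i][1]).strip())
                del self._open[i:]
                return


def index_html(html):
    """Parse an HTML string and return the tag -> texts mapping."""
    parser = TagIndex()
    parser.feed(html)
    parser.close()
    return parser.by_tag


def index_url(url):
    """Fetch a URL and index its elements by tag name."""
    with urlopen(url) as response:
        return index_html(response.read().decode("utf-8", errors="replace"))
```

With this, index_url(some_url)["h2"] would give the text of every h2 on the page, in the order the elements close. Nested tags are flattened to their text.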

  5. #5
    ZigE (Registered)
    Join Date
    Dec 2006
    Posts
    122
    Just to follow up on this: I ended up hiring someone from getafreelancer. The script came to about 400 lines of code, so it was a bit more complex than first thought. It cost me about $150.

    Definitely worth outsourcing stuff like this.
    Last edited by ZigE; 02-23-2008 at 03:43 PM.

  6. #6
    Registered
    Join Date
    Feb 2005
    Posts
    13
    Originally posted by Chris:
    There is actually a php function set/library that makes it really easy to scrape sites... I can't think of the name right now though. It allows you to feed in a URL and then access page elements by tag name.
    Chris, were you referring to Tidy?

    Anyway, I did find this: http://htmlpurifier.org/

