Project Detail

Internet Archive (Wayback) Site Extractor  

Internet Archive (Wayback) Site Extractor is project number 199559,
posted at Freelancer.com.

 

 

Status:

Selected Providers: fastnet

Budget: $300-1500

Created: 11/28/2007 at 7:06 EST

Bid Count: 6

Average Bid: N/A

12/05/2007 at 7:06 EST

Project Creator: BarneyThorn
Employer Rating: (No Feedback Yet)

 

Description

Internet Archive (Wayback) Site Extractor

We run a group of sites that make money from advertising.
We are constantly in need of non-duplicate content.
We therefore came up with the idea of extracting old versions of sites, from a number of years back, whose content has totally changed since then.

Here is what we need:

A tool that will extract archived sites from www.archive.org.

If you search for, let's say, cars.com,
it will bring up many different versions of the site from over the years.
We will manually pick the one we want.
Let's say we pick Sep 20, 2003.
We will then enter its URL into our extractor tool.
In this case that is http://web.archive.org/web/20030920161737/http://cars.com/
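
To illustrate the input the tool would receive, here is a minimal Python sketch that splits a snapshot URL of the form shown above into its timestamp and the original site URL. The function name `parse_snapshot_url` and the `Snapshot` type are hypothetical, and the sketch assumes the timestamp is always 14 digits, as in the example.

```python
import re
from typing import NamedTuple

# Assumed layout: http://web.archive.org/web/<14-digit timestamp>/<original URL>
# An optional two-letter modifier ending in "_" may follow the timestamp.
SNAPSHOT_RE = re.compile(
    r"^https?://web\.archive\.org/web/"
    r"(?P<timestamp>\d{14})(?:[a-z]{2}_)?/"
    r"(?P<original>.+)$"
)


class Snapshot(NamedTuple):
    """Hypothetical container for the two parts of a snapshot URL."""
    timestamp: str
    original_url: str


def parse_snapshot_url(url: str) -> Snapshot:
    """Split a snapshot URL into (timestamp, original URL)."""
    match = SNAPSHOT_RE.match(url.strip())
    if match is None:
        raise ValueError(f"not a recognised snapshot URL: {url!r}")
    return Snapshot(match.group("timestamp"), match.group("original"))


snap = parse_snapshot_url(
    "http://web.archive.org/web/20030920161737/http://cars.com/"
)
print(snap.timestamp)     # 20030920161737
print(snap.original_url)  # http://cars.com/
```

The timestamp reads as year, month, day, hour, minute, second, so 20030920161737 corresponds to the Sep 20, 2003 version picked above.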

The extractor tool will then extract the entire site including all images and files and save it to a folder.
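
One possible building block for extracting "the entire site" is a link collector that finds every page, image and file a page refers to, so each can be fetched in turn. The sketch below uses only the Python standard library; the names `LinkCollector` and `same_site_links` are hypothetical, and it assumes that only `href` and `src` attributes matter and that only links on the same host should be followed.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collects absolute URLs from href and src attributes."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def same_site_links(html, base_url):
    """Return de-duplicated links that stay on base_url's host."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    host = urlsplit(base_url).netloc.lower()
    result = []
    for link in collector.links:
        link = link.split("#", 1)[0]  # drop fragments
        if urlsplit(link).netloc.lower() == host and link not in result:
            result.append(link)
    return result


page = (
    '<a href="/used/">Used</a>'
    '<img src="images/logo.gif">'
    '<a href="http://example.org/">Elsewhere</a>'
)
print(same_site_links(page, "http://cars.com/"))
```

A full extractor would repeat this for every page it downloads, keeping a set of URLs already visited, and save each response under the output folder.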

That is it.

However, archive.org adds a little bit of code to every page that makes this hard to do.
You will have to find a way around the code.
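
One workaround that may be worth testing: the Wayback Machine is generally understood to return the page as originally captured, without its added code, when the modifier `id_` is placed directly after the timestamp in the snapshot URL. This is an assumption to verify against the live service, not something stated in this project description. The helper name `raw_snapshot_url` is hypothetical.

```python
import re


def raw_snapshot_url(snapshot_url: str) -> str:
    """Insert the id_ modifier after the 14-digit timestamp.

    Assumption: the archive serves unmodified content for id_ URLs.
    Any existing two-letter modifier is replaced.
    """
    return re.sub(
        r"(/web/\d{14})(?:[a-z]{2}_)?/",
        r"\g<1>id_/",
        snapshot_url,
        count=1,
    )


print(raw_snapshot_url(
    "http://web.archive.org/web/20030920161737/http://cars.com/"
))
# http://web.archive.org/web/20030920161737id_/http://cars.com/
```

If this behaves as assumed, the downloaded pages would need much less cleaning afterwards; if not, the added code has to be stripped from each file after download.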

Additionally, in the final extracted files there must be no mention of the archive.org site in any of the code.
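
A rough sketch of how this requirement could be checked and enforced: strip the archive prefix from every rewritten link, then confirm that no reference to archive.org is left. The function names are hypothetical, and the pattern assumes links were rewritten to the form `/web/<14-digit timestamp>/<original URL>`, with or without the `http://web.archive.org` host and an optional two-letter modifier such as `im_`.

```python
import re

# Assumed shape of rewritten links; real pages may contain other variants.
ARCHIVE_PREFIX_RE = re.compile(
    r"(?:https?://web\.archive\.org)?/web/\d{14}(?:[a-z]{2}_)?/",
    re.IGNORECASE,
)


def strip_archive_prefixes(html: str) -> str:
    """Remove the archive prefix so links point at the original URLs."""
    return ARCHIVE_PREFIX_RE.sub("", html)


def mentions_archive(html: str) -> bool:
    """Final check: is there any remaining reference to archive.org?"""
    return "archive.org" in html.lower()


page = (
    '<a href="http://web.archive.org/web/20030920161737/'
    'http://cars.com/used/">Used</a>'
    '<img src="/web/20030920161737im_/http://cars.com/logo.gif">'
)
cleaned = strip_archive_prefixes(page)
print(cleaned)
print(mentions_archive(cleaned))
```

Running `mentions_archive` over every saved file gives a simple acceptance test for the delivered tool: it should return False for all of them.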

We will pay using GAF escrow.

Messages Posted: 0
