Project Detail

Words in Wikipedia  

Words in Wikipedia is project number 386597
posted at Freelancer.com.

 


Status: Cancelled

Selected Providers: -

Budget: $30-250

Created: 02/15/2009 at 5:44 EST

Bid Count: 7

Average Bid: N/A

02/18/2009 at 5:44 EST

Project Creator: cmbant
Employer Rating: 9.9444/10 (36 reviews)


Description

Using the latest full English Wikipedia database (freely available to download), generate a frequency-ranked, case-sensitive list of the words used in the main pages. The list should include single words and groups of up to four words (hyphen- or space-separated), drawn only from text (not wiki tags) and taken from the middle of sentences (not the first word of each sentence, so that all are correctly capitalized).
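As a rough illustration of the counting step, here is a minimal Python sketch. It assumes the wiki markup has already been stripped to plain text, splits sentences with a naive regular expression (which will mis-split on abbreviations such as "e.g."), and treats only letters and internal apostrophes as word characters. The names `count_ngrams` and `ngrams_in_sentence` are hypothetical and not part of the project specification.

```python
import re
from collections import Counter

# Naive sentence boundary: whitespace that follows ., ! or ? (an assumption).
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")
# A word is a run of letters, optionally with internal apostrophes.
WORD = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)*")

def ngrams_in_sentence(sentence, max_n=4):
    """Yield case-sensitive groups of 1..max_n words, skipping the first word."""
    matches = list(WORD.finditer(sentence))[1:]  # drop the sentence-initial word
    for i, first in enumerate(matches):
        yield first.group()
        last = first
        for nxt in matches[i + 1:i + max_n]:
            gap = sentence[last.end():nxt.start()]
            # Only a single hyphen or plain whitespace may join two words.
            if gap != "-" and not gap.isspace():
                break
            last = nxt
            # Keep hyphens as written; collapse runs of whitespace to one space.
            yield re.sub(r"\s+", " ", sentence[first.start():last.end()])

def count_ngrams(text, max_n=4):
    """Count every word and word group in the text."""
    counts = Counter()
    for sentence in SENTENCE_END.split(text):
        counts.update(ngrams_in_sentence(sentence, max_n))
    return counts
```

Calling `count_ngrams(text).most_common()` would then give the frequency-ranked list. Groups are not allowed to span commas, digits or other punctuation here, which is a design choice of this sketch rather than something the description requires.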

Provide a list of all words and word groups that appear at least 10 times, and provide a file containing ten complete sentences in which each word appears, together with the name of the wiki page on which each sentence appears, e.g.

hypothesized
[page: Prion]
Prions are hypothesized to infect and propagate by refolding abnormally into a structure which is able to convert normal molecules of the protein into the abnormally structured form.
[page: Mars_Ocean_Hypothesis]
The blue region of low topography in the Martian northern hemisphere is hypothesized to be the site of a primordial ocean of liquid water.
...

I'm flexible about the exact format in which the data is provided, and you can skip groups starting and ending with common stop words (a, the, etc.).
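The thresholding, example collection and stop-word skipping could be sketched in Python as follows. This is only one possible reading: the stop-word set is a tiny illustrative sample, "starting and ending with" is taken literally to mean both ends (change `and` to `or` in `is_skippable` for the broader reading), and the input is assumed to be an iterable of hypothetical `(page, sentence, terms)` tuples produced by an earlier extraction step. All function names are made up for this example.

```python
from collections import Counter, defaultdict

# Illustrative sample only; a real run would use a fuller stop-word list.
STOP_WORDS = {"a", "an", "the", "of", "and", "in", "to"}

def is_skippable(term):
    """True for a multi-word group that both starts and ends with a stop word."""
    words = term.replace("-", " ").split()
    return (len(words) > 1
            and words[0].lower() in STOP_WORDS
            and words[-1].lower() in STOP_WORDS)

def build_report(records, min_count=10, max_examples=10):
    """Return (term, count, examples) rows, most frequent first.

    records: iterable of (page, sentence, terms) tuples, where terms lists
    the words and word groups found in that sentence.
    """
    counts = Counter()
    examples = defaultdict(list)
    for page, sentence, terms in records:
        counts.update(terms)  # count every occurrence
        for term in set(terms):  # but store each sentence once per term
            if len(examples[term]) < max_examples:
                examples[term].append((page, sentence))
    return [(term, n, examples[term])
            for term, n in counts.most_common()
            if n >= min_count and not is_skippable(term)]

def format_entry(term, term_examples):
    """Render one term in the layout shown in the example above."""
    lines = [term]
    for page, sentence in term_examples:
        lines.append(f"[page: {page}]")
        lines.append(sentence)
    return "\n".join(lines)
```

Keeping examples for every term in memory is unlikely to scale to the full database without changes; a two-pass approach (count first, then collect sentences only for terms that reach the threshold) would be one way around that.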

Messages Posted: 3
