Data Mining From Digg (by script)
Data Mining From Digg (by script) is project number 324462 posted at GetAFreelancer.com.
| Status: |
Closed
(Selected Service Provider)
|
| Selected Providers: |
andreiandrei
|
| Budget: |
$30-250
|
| Created: |
10/04/2008 at 17:37 EDT |
| Bidding Ends: |
10/07/2008 at 17:37 EDT
|
| Project Creator: |
yoavf
Buyer Rating:           (4 reviews)
|
| Description: |
Data Mining Task from Digg
You'll be supplied with a list of movie titles. For each title, your task is to gather the following data:
- A list of Digg submissions related to the movie, based on the following search terms:
  a. search for ||movie_name movie||
  b. search for ||movie_name film||
  c. search for ||movie_name trailer||
  d. search for ||movie_name watch||
  e. search for ||movie_name see||
For example, for the movie "The Eye" you will run the following separate searches: "The Eye movie", "The Eye film", "The Eye Trailer", "The Eye watch", "The Eye see". All *without* the double quotes!
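The five searches per title could be generated along these lines. This is only a minimal sketch: the function name `build_queries` and the constant `SUFFIXES` are illustrative and not part of the project brief, and the code only builds the query strings without contacting Digg.

```python
# The five search suffixes listed in the description (a. to e.).
SUFFIXES = ("movie", "film", "trailer", "watch", "see")


def build_queries(movie_name):
    """Return the five search strings for one title, without double quotes."""
    # Strip surrounding whitespace and any double quotes from the input title.
    name = movie_name.strip().strip('"').strip()
    return [f"{name} {suffix}" for suffix in SUFFIXES]


if __name__ == "__main__":
    for query in build_queries("The Eye"):
        print(query)
```

Keeping the suffixes in one tuple makes it easy to add or drop a search term later without touching the rest of the script.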
All searches should be combined and duplicates deleted. Delete only exact duplicates, that is, results that lead to the same Digg submission, not results that merely link to the same external URL!
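One way to read the de-duplication rule is to key each result on the URL of the Digg item itself rather than on the external link. The sketch below assumes each result is a dict with a `digg_url` field; that field name and the function `merge_results` are hypothetical.

```python
def merge_results(result_lists):
    """Combine several result lists, keeping the first row per Digg item URL."""
    seen = set()
    merged = []
    for results in result_lists:
        for row in results:
            # Key on the Digg submission URL only; two different submissions
            # that point to the same external URL are both kept.
            key = row["digg_url"]
            if key not in seen:
                seen.add(key)
                merged.append(row)
    return merged
```

Rows keep the order in which they were first seen, so the output stays stable across runs as long as the searches are run in the same order.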
Digg search settings: "Title, Description, and URL", "All Stories", "Including buried: NO".
The results should be saved in a table (preferably Excel; CSV is also possible) with the following columns:
- ID (auto-increment serial number)
- Date Submitted (dd/mm/yyyy)
- Title
- Full URL of the Digg item
- Full URL the item is linked to
- Number of diggs
- Number of comments
- Made Popular (YES/NO)
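For the CSV option, the table could be written with Python's standard `csv` module, roughly as follows. The column headings, the dict keys (`date`, `title`, `digg_url`, `linked_url`, `diggs`, `comments`, `popular`) and the function name `write_rows` are assumptions made for this sketch, not names given in the brief.

```python
import csv

# Column headings, in the order requested in the description.
COLUMNS = [
    "ID", "Date Submitted", "Title", "Digg URL",
    "Linked URL", "Diggs", "Comments", "Made Popular",
]


def write_rows(rows, out):
    """Write the header and one line per row to the file-like object `out`."""
    writer = csv.writer(out)
    writer.writerow(COLUMNS)
    # The ID column is a serial number starting at 1.
    for serial, row in enumerate(rows, start=1):
        writer.writerow([
            serial,
            row["date"].strftime("%d/%m/%Y"),  # dd/mm/yyyy
            row["title"],
            row["digg_url"],
            row["linked_url"],
            row["diggs"],
            row["comments"],
            "YES" if row["popular"] else "NO",
        ])
```

When writing to a real file, open it with `newline=""` as the `csv` module documentation recommends, for example `open("digg.csv", "w", newline="", encoding="utf-8")`.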
Please note that the date appears on Digg as a relative date (e.g. 2 years 34 days ago). This should of course be converted to the exact date.
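The relative-to-exact conversion might be sketched as below. Two assumptions are baked in and should be checked against the real pages: that Digg only uses the units year, day, hour and minute in these strings, and that one "year" means 365 days. The function name `relative_to_date` is illustrative. Because the relative text is coarse, the result can be off by a day or so.

```python
import re
from datetime import datetime, timedelta

# Assumed length of each unit; treating a year as 365 days is an approximation.
UNITS = {
    "year": timedelta(days=365),
    "day": timedelta(days=1),
    "hour": timedelta(hours=1),
    "minute": timedelta(minutes=1),
}
PATTERN = re.compile(r"(\d+)\s+(year|day|hour|minute)s?")


def relative_to_date(text, now):
    """Convert text such as '2 years 34 days ago' to a dd/mm/yyyy string."""
    offset = timedelta()
    for amount, unit in PATTERN.findall(text.lower()):
        offset += int(amount) * UNITS[unit]
    return (now - offset).strftime("%d/%m/%Y")


if __name__ == "__main__":
    # `now` should be the moment the page was scraped.
    print(relative_to_date("2 years 34 days ago", datetime.now()))
```

Passing `now` explicitly, rather than calling `datetime.now()` inside the function, lets every row from one scraping run share the same reference time.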
Made Popular: Regular (not popular) items show the following text in the search results: "username" submitted "342 days ago". Popular items show the following text instead: "username" made popular "342 days ago".
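The Made Popular column could then be derived from that line of text, for example as follows. This sketch assumes the byline has already been extracted as plain text; the function name `made_popular` is hypothetical, and a username that itself contains the word "submitted" would need extra care.

```python
import re

# Matches the status phrase that follows the username in the byline.
STATUS = re.compile(r"\b(submitted|made popular)\b", re.IGNORECASE)


def made_popular(byline):
    """Return 'YES' for 'made popular' bylines and 'NO' for 'submitted' ones."""
    match = STATUS.search(byline)
    if match is None:
        # Fail loudly instead of guessing when the text looks unfamiliar.
        raise ValueError(f"unrecognised byline: {byline!r}")
    return "YES" if match.group(1).lower() == "made popular" else "NO"
```

Raising an error on unrecognised text makes layout changes on the site visible immediately instead of silently producing wrong values.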
Sample data attached. Please make sure you understand the requirements before posting your bid.
I expect this to be done by script (automatically), as accurately as possible, and within 2-3 days.
|
| Job Type: |
- Data Processing
- Market Research
- Python
- Ruby/Ruby on Rails
|
| Database: |
(None)
|
| Operating system: |
(None)
|
| Bid count: |
17
|
| Average bid: |
N/A
|