Project Detail

output tab delimited and titles  

output tab delimited and titles is project number 240018
posted at Freelancer.com.

 

 

Status:

Selected Providers: ke888le

Budget: $30-250

Created: 03/20/2008 at 22:20 EDT

Bid Count: 1

Average Bid:
N/A

03/21/2008 at 22:20 EDT

Project Creator: willie108
Employer Rating: 10/10 (5 reviews)

 

Description

The main file is called searchengineWithTitleTable.py

I am trying to get it to create a new table named titles with two fields: urlid and title. I then want to output the frequencies for each URL (this is the part that you worked on before). However, instead of the URL being in the first column of the output, I want the title to be the first column. Also, I want the output to be a tab-delimited file (before, you created a CSV file).
I added lines 528 and 424 to try to get the table written, but it is not working properly.
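
A minimal sketch of the request above, not the code from searchengineWithTitleTable.py. It assumes an SQLite database and Python 3's standard library; the function names, the shape of the frequency data (a dict of word counts per urlid), and the sample titles are all hypothetical.

```python
import csv
import io
import sqlite3


def create_titles_table(con):
    # Assumed schema: one row per crawled page, keyed by urlid.
    con.execute(
        "CREATE TABLE IF NOT EXISTS titles (urlid INTEGER PRIMARY KEY, title TEXT)"
    )


def add_title(con, urlid, title):
    # INSERT OR REPLACE keeps a re-crawl from failing on a duplicate urlid.
    con.execute(
        "INSERT OR REPLACE INTO titles (urlid, title) VALUES (?, ?)",
        (urlid, title),
    )


def write_frequencies(con, freqs, words, out):
    """Write one tab-delimited row per page, with the title in column one.

    freqs maps urlid -> {word: count}; words fixes the column order.
    """
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["title"] + list(words))
    query = "SELECT urlid, title FROM titles ORDER BY urlid"
    for urlid, title in con.execute(query):
        counts = freqs.get(urlid, {})
        writer.writerow([title] + [counts.get(w, 0) for w in words])


# Hypothetical sample data, for illustration only.
con = sqlite3.connect(":memory:")
create_titles_table(con)
add_title(con, 1, "Back pain blog")
add_title(con, 2, "Backache remedies")
con.commit()

buf = io.StringIO()
sample_freqs = {1: {"backache": 3}, 2: {"backache": 1, "backpain": 2}}
write_frequencies(con, sample_freqs, ["backache", "backpain"], buf)
print(buf.getvalue())
```

If the table is created but stays empty, one thing worth checking is whether commit() is called after the inserts.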

I am running the program with these lines:

#!/usr/bin/python

from searchengineWithTitleTable import *


pagelist=['http://blogsearch.google.com/blogsearch?as_q=&num=5&hl=en&ctz=-540&c2coff=1&as_epq=&as_eq=&as_drrb=q&as_qdr=a&as_mind=1&as_minm=1&as_miny=2000&as_maxd=17&as_maxm=3&as_maxy=2008&lr=lang_en&safe=active&ie=UTF-8&q=backache+OR+backpain&ui=blg&sa=N&start=800' ]


webcrawler = crawler("test.db")
webcrawler.createindextables()
webcrawler.crawl(pagelist)


Can you get this done for $25? I am sending the file called searchengineWithTitleTable with this mail.
Thanks.
PS. I think there is a dependency on something called nn. You can remove references to it because that part has nothing to do with what I am doing. I think it is a neural network used for analysing the results.

Two small additions?

One is to add a "switch" for the output file that controls whether to
include the field containing the title and whether to include the field
containing the URL. Both of these fields should have their field names
at the top of their respective columns: the column with the URLs will
be called webaddress and the column with the titles will be called
Titles. The switch should allow me to write one, both, or neither of
these to the output file.

Secondly, add a switch that allows me to choose between outputting a
CSV file and a tab-delimited file.
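
The two switches could be sketched as keyword arguments on the function that writes the file. This is an illustration under assumptions, not the project's code: write_output, its parameter names, and the (url, title, counts) row layout are hypothetical; only the column names Titles and webaddress come from the request.

```python
import csv
import io


def write_output(rows, words, out,
                 include_title=True, include_url=True, tab_delimited=True):
    """Write word frequencies with optional Titles and webaddress columns.

    rows is a list of (url, title, {word: count}) tuples; words fixes the
    order of the frequency columns.
    """
    delimiter = "\t" if tab_delimited else ","
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")

    # Header row: the optional identifying columns come first.
    header = []
    if include_title:
        header.append("Titles")
    if include_url:
        header.append("webaddress")
    writer.writerow(header + list(words))

    for url, title, counts in rows:
        record = []
        if include_title:
            record.append(title)
        if include_url:
            record.append(url)
        writer.writerow(record + [counts.get(w, 0) for w in words])


# Hypothetical sample row, for illustration only.
sample_rows = [("http://example.com/a", "Back pain blog", {"backache": 3})]
demo = io.StringIO()
write_output(sample_rows, ["backache"], demo, include_url=False)
print(demo.getvalue())
```

Passing tab_delimited=False switches to CSV, and setting both include flags to False writes only the frequency columns.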
