
View Full Version : Custom Spider/Scraper - Help!



IvanSEO
12-05-2008, 05:18 PM
Hey!

I'm new here, so a short introduction: I'm from Toronto, Canada, and I run a small finance-focused website.

The problem: many financial institutions publish their data online and update it daily. There are over 60 institutions, and following each one is very challenging. I want to create a summary page with financial data from those institutions: release a spider once a day, get their updates, and then post them all together on the website.

Obviously copy and paste is off the table, since it takes at least 1.5 hours to go through all the lenders and get their data. The only possible solution, it seems, is to set up a custom spider that will crawl specific fields (div tags, table cells), extract the data and compile it into one file. The question is: do you know of any software that is capable of doing this? I know there are plenty of scrapers out there, but the requirement is that the spider be able to extract data from specified table cells and, in some cases, div tags.

I can't go to a data extraction company since they charge too much (do they?). Please let me know if you're aware of any applications that can match those requirements.

Any help is appreciated, guys! Thanks!

Erni
12-05-2008, 08:32 PM
You are right, the problem can't be solved easily. Collecting data from each source is not a problem, but bringing it all into one format is a unique job. I'm looking for such a service myself.
I imagine it could be an online service that is told which pages to grab and automatically cuts the required info from them, sending the results via email, for example.
Maybe someone has already found one?

itsallkizza
12-05-2008, 09:16 PM
I could build one for your specific needs without too much trouble. All this project really needs is a one-time effort of visiting each of your pages and locating exactly where the data is displayed on each page.
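
To give a rough idea of what "locating the data" could look like, here is a minimal sketch in Python using only the standard library. It assumes each value sits in a table cell or div with a known id; the tag names, ids and sample HTML are made up for illustration and do not come from any real lender's site. Downloading the pages (for example with urllib.request) is left out.

```python
from html.parser import HTMLParser


class ElementText(HTMLParser):
    """Collect the text inside the first element matching a tag name and id."""

    def __init__(self, tag, element_id):
        super().__init__()
        self.tag = tag
        self.element_id = element_id
        self.depth = 0      # > 0 while we are inside the target element
        self.done = False   # True once the target element has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth:
            # Track nested elements of the same tag so we close at the right one.
            if tag == self.tag:
                self.depth += 1
        elif tag == self.tag and dict(attrs).get("id") == self.element_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag == self.tag:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def extract(html, tag, element_id):
    """Return the whitespace-normalized text of the element, or None if absent."""
    parser = ElementText(tag, element_id)
    parser.feed(html)
    parser.close()
    text = " ".join("".join(parser.chunks).split())
    return text or None


# Hypothetical page fragment standing in for one lender's site.
SAMPLE_HTML = (
    '<table><tr><td id="rate5">5.49%</td><td id="other">x</td></tr></table>'
    '<div id="prime"> Prime: <b>4.00%</b> </div>'
)

# Hypothetical per-field locations found during the one-time site visit.
FIELDS = {"5yr fixed": ("td", "rate5"), "prime": ("div", "prime")}

# One row per field, ready to be written to a single summary file.
summary = {name: extract(SAMPLE_HTML, tag, eid) for name, (tag, eid) in FIELDS.items()}
```

With 60 sites you would keep one FIELDS-style table per site, so a layout change on one site only means updating that site's entry.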

After that, the only maintenance required would be to check your results every so often to make sure the source website(s) haven't changed the location/method of output.
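
That check could be partly automated. A small sketch in Python, assuming the scraped values are percentage rates such as "5.49%" (that format, the pattern and the lender names below are my assumptions, not something from your sites): anything missing or not shaped like a rate gets flagged for a manual look.

```python
import re

# Assumed shape of a valid value: one or two digits, a dot, one to three digits, "%".
RATE_PATTERN = re.compile(r"^\d{1,2}\.\d{1,3}%$")


def find_suspect(results):
    """Return lender names whose value is missing or does not look like a rate."""
    return sorted(
        name
        for name, value in results.items()
        if value is None or not RATE_PATTERN.match(value.strip())
    )


# Hypothetical output of one daily run.
todays_results = {
    "Bank A": "5.49%",
    "Bank B": None,                 # element not found, layout probably changed
    "Bank C": "Rates have moved",   # element found, but content is not a rate
}
suspects = find_suspect(todays_results)
```

Here suspects would contain Bank B and Bank C, which are the pages worth re-checking by hand.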

Feel free to email me if you want a faster response.

oesxyl
12-05-2008, 11:29 PM
Can you post a link to one of the sites, or something closer to what you need to extract?

regards


