view outside results in own template


boeing747fp
12-14-2003, 04:05 AM
i want to set up a research center that searches about 5 outside research centers and returns their results in my own design and format.... is this possible?

dswimboy
12-15-2003, 04:33 PM
depends on what you mean by "searches", but most likely yes. do the other centers store data in a specific format? what format? are the formats different for each center? this will help people on the forum give you some ideas on how to go about fetching the info.

boeing747fp
12-18-2003, 12:39 AM
they aren't all the same... some use CGI, some use PHP, some use JSP, and some use ASP, but some don't disclose what they run and just show a URL like /search?q=blahblah
-------
encarta.com
xreferplus.com
ebsco
infotrac
worldbook.com
askjeeves
google
---and some other ones in the future, but those are the basic ones i want to start out with

dswimboy
12-18-2003, 12:50 AM
oh...i understand now. that sounds like quite a project in RegEx. here's how i'd go about doing it:
1. find out how each research page searches. most should use the GET method. what is the url for the get method? what arguments need to be passed for the search url to work?
2. save a results page for each of the research centers
3. start thinking of some clever way to pull the results apart. you will need to find something in the HTML that uniquely marks each result. maybe each result starts with
<div>1.<a href
just make sure it is unique, or you'll get erroneous info.
4. next you'll need to use regular expressions to parse the results. you will probably need a different pattern for each results page.
5. you can output the data on the fly, or store the data in variables, and output at the end.
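
the steps above could be sketched roughly like this. it is written in Python only for illustration (PHP's preg_match_all would play the same role as the regex call here). the site name, search URL, parameter name, and result pattern are all made-up placeholders, not the real values for any of the sites listed earlier, and the saved page is a tiny stand-in for a real results page.

```python
import re
from urllib.parse import urlencode

# Hypothetical per-site settings. The URL, parameter name, and pattern are
# placeholders and must be worked out separately for every real site.
SITES = {
    "example": {
        "search_url": "http://search.example.com/search",
        "query_param": "q",
        # Assumes each result looks like: <div>1.<a href="URL">TITLE</a>
        "result_pattern": re.compile(
            r'<div>\s*\d+\.\s*<a href="([^"]+)"[^>]*>(.*?)</a>',
            re.IGNORECASE | re.DOTALL,
        ),
    },
}


def build_search_url(site, terms):
    """Step 1: build the GET URL for one site from its settings."""
    cfg = SITES[site]
    return cfg["search_url"] + "?" + urlencode({cfg["query_param"]: terms})


def extract_results(site, html):
    """Steps 3 and 4: pull (url, title) pairs out of a results page."""
    pattern = SITES[site]["result_pattern"]
    return [(url, title.strip()) for url, title in pattern.findall(html)]


# Step 2 stand-in: a very small "saved" results page.
SAMPLE_PAGE = """
<div>1.<a href="http://a.example/boeing">Boeing 747</a></div>
<div>2.<a href="http://b.example/airbus">Airbus A380</a></div>
"""

# Step 5: store the data in a variable, then output at the end.
results = extract_results("example", SAMPLE_PAGE)
for url, title in results:
    print(title, "->", url)
```

keeping one entry per site in a table like SITES means a new research center only needs a new URL and pattern, not new parsing code.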

this sounds like a pretty serious script. the hardest part will be extracting the key information from each results page. just hope that each site makes it clear, in the HTML, where the results begin, and where they end.
you may be able to start parsing after the text "RESULTS" appears in the HTML source. you might also be able to find out how many results the page returned from a line like "found 217, displaying 15 per page". you could then say: start parsing after "RESULTS", stop parsing after 15 results. good luck!
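
a minimal sketch of that idea, again in Python for illustration. it assumes the page really contains the literal marker "RESULTS" and a count line worded like "found 217, displaying 15 per page". both are assumptions taken from the example wording above, so a real site will need its own marker and count pattern. the function names and the sample page are hypothetical.

```python
import re

# Assumed wording of the count line; real sites will phrase this differently.
COUNT_RE = re.compile(
    r"found\s+(\d+),\s*displaying\s+(\d+)\s+per\s+page", re.IGNORECASE
)


def slice_results(html, marker="RESULTS"):
    """Return the part of the page after the first marker, or '' if absent."""
    _, found, rest = html.partition(marker)
    return rest if found else ""


def read_counts(html):
    """Return (total, per_page) from the count line, or None if not found."""
    match = COUNT_RE.search(html)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def parse_page(html, item_re):
    """Collect items after the marker, stopping at the per-page count."""
    counts = read_counts(html)
    items = item_re.findall(slice_results(html))
    if counts is not None:
        total, per_page = counts
        items = items[:min(total, per_page)]
    return counts, items


# Stand-in page: one navigation link before the marker, one footer link after.
SAMPLE = (
    '<a href="/home">Home</a> found 3, displaying 2 per page '
    'RESULTS <a href="/r1">One</a> <a href="/r2">Two</a> '
    '<a href="/footer">About</a>'
)
LINK_RE = re.compile(r'<a href="([^"]+)">')

counts, items = parse_page(SAMPLE, LINK_RE)
print(counts, items)
```

cutting at the marker keeps navigation links out of the results, and the per-page count keeps footer links out.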

boeing747fp
12-18-2003, 01:55 AM
is there any specific language that this would be easier to do in? i'm more familiar with PHP than CGI/Perl

dswimboy
12-18-2003, 02:56 AM
i'm just learning PHP, and i haven't looked into its RegEx functions. i know Perl is wonderful with strings, but it is very complex as well. if you know PHP better, i would recommend that. if it doesn't end up working out, a lot of the code will be reusable in Perl with minor syntax fixes.