Thread: Aggregator Project
09-10-2010, 03:36 PM #1
- Join Date
- Sep 2010
- Thanked 0 Times in 0 Posts
This may be a project well beyond my current skills, but I have about a full month to spend on it, so I think I can do it. What I want to build is an aggregator that gathers news about a specific subject from various sources. Easy, right? Just get the RSS feeds and display them on a page. Well, I want something more advanced: duplicates removed and a customizable presentation (that is, being able to define and change the format in which the news headlines are displayed).
I've played a bit with Yahoo Pipes and some other tools and I am facing two big problems:
1. Some sources don't provide RSS feeds. How do I create one for them?
2. What's the best method to find and remove duplicates? I thought about comparing the headlines and treating two items as duplicates if they match by more than, say, 50%. Is that good practice, though?
Please add any other things (problems, suggestions, whatever) I might not have considered.
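To make the 50% headline-match idea from point 2 concrete, here is a minimal sketch in Python using the standard library's difflib. The function names (normalize, headline_similarity, is_duplicate) and the choice to compare whole words rather than characters are my own assumptions, not something any particular tool prescribes.

```python
import re
from difflib import SequenceMatcher


def normalize(headline):
    """Lowercase the headline, drop punctuation, and split it into words."""
    cleaned = re.sub(r"[^a-z0-9\s]", "", headline.lower())
    return cleaned.split()


def headline_similarity(a, b):
    """Return a score between 0.0 and 1.0 based on runs of shared words."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def is_duplicate(a, b, threshold=0.5):
    """Treat two headlines as duplicates when they match by more than threshold."""
    return headline_similarity(a, b) > threshold


if __name__ == "__main__":
    first = "Apple unveils new iPhone"
    second = "Apple releases new iPhone"
    print(headline_similarity(first, second), is_duplicate(first, second))
```

The threshold is a tuning knob rather than a known-good value. Short headlines about the same subject tend to share several words even when they describe different stories, so 50% may turn out to be too permissive and is worth checking against real feeds.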
09-10-2010, 09:39 PM #2
- Join Date
- Oct 2006
- Visible light spectrum
- Thanked 6 Times in 6 Posts
No RSS - http://www.masternewmedia.org/news/2...eate_a_rss.htm
Compare headlines AND authors
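A rough sketch of what comparing headlines AND authors could look like in Python. The item layout (dicts with "title" and "author" keys), the function names, and the 0.5 threshold are assumptions for illustration only.

```python
from difflib import SequenceMatcher


def _words(text):
    """Split text into lowercase words for comparison."""
    return text.lower().split()


def same_story(a, b, threshold=0.5):
    """Two items count as the same story only if the authors match
    and the headlines are similar enough."""
    if a["author"].strip().lower() != b["author"].strip().lower():
        return False
    ratio = SequenceMatcher(None, _words(a["title"]), _words(b["title"])).ratio()
    return ratio > threshold


def dedupe(items, threshold=0.5):
    """Keep the first occurrence of each story, preserving input order."""
    kept = []
    for item in items:
        if not any(same_story(item, seen, threshold) for seen in kept):
            kept.append(item)
    return kept
```

This compares every new item against every item already kept, which should be fine for a few hundred headlines but will get slow for very large feeds.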
Other suggestions -
Use a search engine, mate
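On the first problem (sources without a feed), once the headlines and links have been collected from a page by some other means, writing them out as a feed is the easy part. A minimal sketch with Python's standard xml.etree.ElementTree, assuming the items have already been scraped into dicts with "title" and "link" keys; build_rss and that item layout are hypothetical names, and the scraping step itself is not shown.

```python
import xml.etree.ElementTree as ET


def build_rss(channel_title, channel_link, channel_description, items):
    """Return a minimal RSS 2.0 document as a string.

    items is a list of dicts with "title" and "link" keys (assumed layout).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires title, link, and description on the channel.
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = channel_description
    for entry in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = entry["title"]
        ET.SubElement(item, "link").text = entry["link"]
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    sample = [
        {"title": "First headline", "link": "http://example.com/1"},
        {"title": "Second headline", "link": "http://example.com/2"},
    ]
    print(build_rss("My topic", "http://example.com/", "Scraped headlines", sample))
```

The output can then be fed into the same pipeline as the real RSS sources, so the duplicate check and the presentation layer only have to deal with one format.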