  1. #1
    New to the CF scene
    Join Date
    Sep 2010
    Thanked 0 Times in 0 Posts

    Question Aggregator Project

    This may be a project well beyond my current skills, but I have about a full month to spend on it, so I think I can do it. What I want to build is this: gather news about a specific subject from various sources. Easy, right? Just get the RSS feeds and display them on a page. Well, I want something more advanced: duplicates removed and a customizable presentation (that is, being able to define and change the format in which the news headlines are displayed).

    I've played a bit with Yahoo Pipes and some other tools and I am facing two big problems:

    1. Some sources don't provide RSS feeds. How do I create one?
    2. What's the best method to find and remove duplicates? I thought about comparing the headlines and treating two items as duplicates if they match by more than, say, 50%. Is that good practice, though?

    Please add any other things (problems, suggestions, whatever) I might not have considered.

  2. #2
    Regular Coder grumpy
    Join Date
    Oct 2006
    Visible light spectrum
    Thanked 6 Times in 6 Posts
    No RSS - http://www.masternewmedia.org/news/2...eate_a_rss.htm
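
For a source with no feed, one possible approach is to scrape its headline links and write them out as RSS 2.0 yourself. The following is only a rough sketch using the Python standard library. It assumes the page puts each headline in an `<a>` inside an `<h2>`, which will not be true for every site, and the names `HeadlineLinkParser` and `build_rss` are made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
import xml.etree.ElementTree as ET


class HeadlineLinkParser(HTMLParser):
    """Collect (title, href) pairs from <a> tags nested inside <h2> tags.

    The <h2><a> layout is an assumption; adjust it per source site.
    """

    def __init__(self):
        super().__init__()
        self.items = []
        self._in_h2 = False
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._in_h2 = True
        elif tag == "a" and self._in_h2:
            # Links without an href leave _href as None and are ignored.
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            if title:
                self.items.append((title, self._href))
            self._href = None
        elif tag == "h2":
            self._in_h2 = False


def build_rss(page_html, page_url, feed_title):
    """Return an RSS 2.0 document (as a string) for the headlines found."""
    parser = HeadlineLinkParser()
    parser.feed(page_html)

    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = page_url
    ET.SubElement(channel, "description").text = "Generated from " + page_url

    for title, href in parser.items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        # Resolve relative links against the page they came from.
        ET.SubElement(item, "link").text = urljoin(page_url, href)

    return ET.tostring(rss, encoding="unicode")
```

You would fetch the page yourself (for example with `urllib.request`) and pass the HTML text to `build_rss`. Each site will probably need its own parsing rule.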

    duplicates -
    Compare headlines AND authors
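
A minimal sketch of that idea, combining the 50% headline match from the question with an author check. It uses `difflib.SequenceMatcher` from the Python standard library as the similarity measure, which is just one reasonable choice. The item layout (dicts with `headline` and `author` keys) and the function names are assumptions for the example.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences don't count."""
    return " ".join(text.lower().split())


def is_duplicate(a, b, threshold=0.5):
    """Treat two items as duplicates when the authors match exactly
    (after normalizing) and the headlines are more than `threshold` similar."""
    same_author = normalize(a["author"]) == normalize(b["author"])
    ratio = SequenceMatcher(
        None, normalize(a["headline"]), normalize(b["headline"])
    ).ratio()
    return same_author and ratio > threshold


def remove_duplicates(items, threshold=0.5):
    """Keep the first occurrence of each story, in the original order."""
    kept = []
    for item in items:
        if not any(is_duplicate(item, k, threshold) for k in kept):
            kept.append(item)
    return kept
```

This compares every new item against everything already kept, so it is quadratic in the number of items. That is probably fine for a handful of feeds. Requiring the same author may miss the same story republished under a different byline, and 0.5 is only a starting point to tune against real headlines.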

    Other suggestions -
    Use a search engine, mate

