  1. #1
    New to the CF scene
    Join Date
    Sep 2010
    Posts
    1
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Question Aggregator Project

    This may be a project well beyond my current skills, but I have about a full month to spend on it, so I think I can do it. What I want to build is this: a page that gathers news about a specific subject from various sources. Easy, right? Just fetch the RSS feeds and display them on a page. Well, I want something more advanced: duplicates removed and a customizable presentation (that is, being able to define and change the format in which the news headlines are displayed).

    I've played a bit with Yahoo Pipes and some other tools and I am facing two big problems:

    1. Some sources don't provide RSS feeds. How do I create one for them?
    2. What's the best method to find and remove duplicates? I thought about comparing the headlines and checking whether they match by more than, say, 50%. Is that good practice, though?
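
    To make the 50% idea in point 2 concrete, here is a rough sketch using Python's standard difflib module. The function names and the normalization step (lowercasing, dropping punctuation) are my own assumptions, not an established recipe, and the threshold would need tuning against real headlines.

```python
import re
from difflib import SequenceMatcher


def normalize(headline: str) -> str:
    """Lowercase the headline and keep only letters and digits, space-separated."""
    return " ".join(re.findall(r"[a-z0-9]+", headline.lower()))


def similarity(a: str, b: str) -> float:
    """Return a ratio in [0, 1]; 1.0 means identical after normalizing."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def is_duplicate(a: str, b: str, threshold: float = 0.5) -> bool:
    """Treat two headlines as duplicates when their similarity exceeds the threshold.

    The 0.5 default mirrors the "say, 50%" figure above; it is an assumption.
    """
    return similarity(a, b) > threshold
```

    Because the ratio is character-based, two different stories on the same subject can share a lot of text, so a 50% cutoff may turn out to be too permissive.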

    Please add any other things (problems, suggestions, whatever) I might not have considered.

  • #2
    Regular Coder grumpy's Avatar
    Join Date
    Oct 2006
    Location
    Visible light spectrum
    Posts
    121
    Thanks
    5
    Thanked 6 Times in 6 Posts
    No RSS - http://www.masternewmedia.org/news/2...eate_a_rss.htm
    http://profy.com/2007/09/30/7-tools-...f-any-website/
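
    If you would rather roll your own than use one of those tools, here is a minimal sketch of writing an RSS 2.0 document once the headlines and links have already been extracted from the page. The scraping step is left out, and build_rss, the item dictionary keys, and the example.com addresses are all hypothetical.

```python
import xml.etree.ElementTree as ET


def build_rss(title, link, description, items):
    """Build an RSS 2.0 document as a string.

    `items` is a list of dicts with "title" and "link" keys (an assumed shape).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires these three channel elements.
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["link"]
    return ET.tostring(rss, encoding="unicode")


# Placeholder data standing in for scraped headlines.
feed_xml = build_rss(
    "Example news",
    "http://example.com/",
    "Headlines collected from a site without its own feed",
    [
        {"title": "First headline", "link": "http://example.com/1"},
        {"title": "Second headline", "link": "http://example.com/2"},
    ],
)
```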

    duplicates -
    Compare headlines AND authors
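
    A rough sketch of that, assuming each article is a dict with "headline" and "author" keys (an invented shape) and that an exact match after normalizing is good enough:

```python
import re


def _normalize(text: str) -> str:
    """Lowercase and keep only letters and digits, space-separated."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def dedupe_key(headline: str, author: str) -> tuple:
    """Key on both fields, so the same headline by another author is kept."""
    return (_normalize(headline), _normalize(author))


def remove_duplicates(articles):
    """Return articles in their original order, keeping the first of each key."""
    seen = set()
    unique = []
    for article in articles:
        key = dedupe_key(article["headline"], article["author"])
        if key not in seen:
            seen.add(key)
            unique.append(article)
    return unique


# Made-up sample data for illustration.
sample = [
    {"headline": "Markets rally", "author": "J. Smith"},
    {"headline": "MARKETS RALLY!", "author": "j. smith"},
    {"headline": "Markets rally", "author": "A. Jones"},
]
unique_articles = remove_duplicates(sample)
```

    Here the second entry is dropped as a repeat of the first, while the third stays because the author differs.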

    Other suggestions -
    Use a search engine, mate

