  1. #1
    Regular Coder
    Join Date
    Dec 2006
    Posts
    417
    Thanks
    168
    Thanked 1 Time in 1 Post

    Advice Needed: Best Way To Compare Strings

    I am trying to compare strings.

    I have a news item with a Headline and a ~20 word description of the news item.

    I pull key words out of the headline and the description and put them in an array.

    I then compare the list of key words to a large database of other news items.

    I then post the related news items under the initial news article as "Related News Item" links.

    ---

    What I want to do is have an AJAX slider that slides the relevance to increase/decrease the signal to noise ratio.

    So if you slide the bar to the right, only a few, more accurate related news items are posted; if you slide it all the way to the left, many related news items are posted, but they aren't as accurate. (I already have the slider made; I just need a way to code the server-side PHP.)

    So I need a way to compare strings with accuracy in mind.


    The only way I can think of doing this is:

    Code:
    similar_text($headlineKeywords, $relatedHeadline, $percent); // both arguments must be strings; $percent is set by reference
    $isRelated = ($percent >= $percentageThreshold);
    where $percentageThreshold changes based on the slider value.

    Does anyone else know of another way to do this?
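
    As a rough sketch of the thresholding idea above: PHP's similar_text() returns the number of matching characters and writes the similarity percentage into its third argument by reference, so the slider value has to be compared against that percentage afterwards. The function name filterRelated() and the sample headlines are hypothetical, and the keywords are assumed to be joined into a single string first.

```php
<?php
// Hypothetical helper: keep only the candidates whose similarity to the
// keyword string meets the slider-driven threshold (0 to 100).
function filterRelated(string $keywords, array $candidates, float $threshold): array
{
    $related = [];
    foreach ($candidates as $headline) {
        // similar_text() writes the similarity percentage into $percent.
        similar_text(strtolower($keywords), strtolower($headline), $percent);
        if ($percent >= $threshold) {
            $related[$headline] = $percent;
        }
    }
    arsort($related); // best matches first
    return $related;
}

// Made-up sample data; in practice these would come from the database.
$candidates = ['Iraq election results announced', 'Local bakery wins award'];
$strict = filterRelated('iraq election results', $candidates, 60.0);
```

    Sliding the bar to the right would pass a higher $threshold and return fewer, closer matches; sliding it left would pass a lower one and return more.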

  • #2
    New Coder
    Join Date
    Feb 2007
    Location
    NM. USA
    Posts
    10
    Thanks
    1
    Thanked 0 Times in 0 Posts
    A very interesting concept. I think you have the right idea. I have used a threshold approach in the past: I tried to find what percentage of the words are the same.
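
    One possible way to compute "what percentage of the words are the same" is sketched here. The function name keywordOverlapPercent() and the sample data are made up for illustration, and splitting on non-alphanumeric characters is an assumption about how words are tokenized.

```php
<?php
// Hypothetical helper: percentage of the source keywords that also
// appear in the candidate text (0.0 to 100.0).
function keywordOverlapPercent(array $keywords, string $candidate): float
{
    $keywords = array_unique(array_map('strtolower', $keywords));
    if (count($keywords) === 0) {
        return 0.0;
    }
    // Split the candidate into lowercase words on anything that is not a letter or digit.
    $words = preg_split('/[^a-z0-9]+/', strtolower($candidate), -1, PREG_SPLIT_NO_EMPTY);
    $shared = array_intersect($keywords, $words);
    return 100.0 * count($shared) / count($keywords);
}

// Three of the four keywords (iraq, results, vote) appear in the candidate.
$score = keywordOverlapPercent(
    ['iraq', 'election', 'results', 'vote'],
    'Iraq vote: final results due'
);
// $score is 75.0
```

    The resulting percentage can be compared directly against the slider value.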

    In addition you should look at weighting words so that Iraq carries more weight than news/story/the/etc.

    Finally, I would stem the words so that you can pick up related stories that spell things a little differently.
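
    To illustrate what stemming does, here is a deliberately naive suffix stripper. It is not the Porter algorithm or any library's implementation; the function name naiveStem() and the suffix list are assumptions for the sketch, and a real stemmer handles many more cases.

```php
<?php
// Naive, illustrative stemmer: strips a few common English suffixes.
// Not the Porter algorithm; a real stemmer handles far more cases.
function naiveStem(string $word): string
{
    $word = strtolower($word);
    foreach (['ing', 'ed', 'es', 's'] as $suffix) {
        $stemLength = strlen($word) - strlen($suffix);
        // Only strip when at least three characters would remain.
        if ($stemLength >= 3 && substr($word, -strlen($suffix)) === $suffix) {
            return substr($word, 0, $stemLength);
        }
    }
    return $word;
}

$stems = array_map('naiveStem', ['Elections', 'election', 'voted', 'voting', 'votes']);
// $stems is ['election', 'election', 'vot', 'vot', 'vot']
```

    Stemming both the extracted keywords and the stored headlines before comparing them lets "voted", "voting" and "votes" count as the same word.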

    I am also interested in others' responses, as this relates very well to my image search engine.


    _________________________
    "Insanity is hereditary - you get it from your children." Sam Levenson
    Web Development Company Projects (Stock Photo Search Engine Learn how to sell your photos)

  • #3
    Regular Coder
    Join Date
    Dec 2006
    Posts
    417
    Thanks
    168
    Thanked 1 Time in 1 Post
    [QUOTE=mwookie;595925]look at weighting words so that Iraq carries more weight than news/story/the/etc.[/QUOTE]

    How does one "weight" words ... is there a PHP library that does this ?

  • #4
    New Coder
    Join Date
    Feb 2007
    Location
    NM. USA
    Posts
    10
    Thanks
    1
    Thanked 0 Times in 0 Posts
    [QUOTE=Bobafart;596198]
    [QUOTE=mwookie;595925]look at weighting words so that Iraq carries more weight than news/story/the/etc.[/QUOTE]

    How does one "weight" words ... is there a PHP library that does this ?
    [/QUOTE]

    Not that I know of. I weight things based on two things:

    1) Based on their popularity (words that show up in most headlines don't mean anything)
    2) A table that assigns weights. This is much harder because you have to look at all the possible words, but I believe it's more accurate. If a word is not in the list, just leave it at a default weight.
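
    The two approaches above could be combined roughly as follows. This is only a sketch under assumptions: the popularity weight uses a log(total headlines / headlines containing the word) formula, which is one common choice and not necessarily what was used here, and the names buildWeights(), weightedScore() and the sample table are hypothetical.

```php
<?php
// Hypothetical sketch: a manual table entry wins if present; otherwise the
// weight comes from how many headlines contain the word (rarer = heavier).
function buildWeights(array $headlines, array $manualWeights): array
{
    $docCount = count($headlines);
    $docFrequency = [];
    foreach ($headlines as $headline) {
        $words = preg_split('/[^a-z0-9]+/', strtolower($headline), -1, PREG_SPLIT_NO_EMPTY);
        // Count each word once per headline.
        foreach (array_unique($words) as $word) {
            $docFrequency[$word] = ($docFrequency[$word] ?? 0) + 1;
        }
    }
    $weights = [];
    foreach ($docFrequency as $word => $count) {
        // A word found in every headline gets log(1) = 0.
        $weights[$word] = log($docCount / $count);
    }
    return $manualWeights + $weights; // array union: manual entries take priority
}

// Sum the weights of the given keywords; unknown words get the default weight.
function weightedScore(array $keywords, array $weights, float $defaultWeight = 1.0): float
{
    $score = 0.0;
    foreach ($keywords as $keyword) {
        $score += $weights[strtolower($keyword)] ?? $defaultWeight;
    }
    return $score;
}

// Made-up sample data.
$headlines = ['News: Iraq election', 'News: local bakery award', 'News: election recount'];
$weights = buildWeights($headlines, ['iraq' => 5.0]);
$score = weightedScore(['Iraq', 'unlistedword'], $weights); // 5.0 + 1.0 default
```

    Passing only the keywords that two articles share to weightedScore() gives a relevance score that can be compared against the slider threshold, so a shared "Iraq" counts for much more than a shared "news".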

    Hope this helps



