  1. #1
    H3X
    New Coder
    Join Date
    Nov 2008
    Posts
    90
    Thanks
    30
    Thanked 0 Times in 0 Posts

    Extracting sentences that make sense

    I'm making a PHP script that goes to Wikipedia, parses out the basic paragraphs that contain the actual information, then extracts the sentences from those paragraphs that are valid.

    By "valid" I mean sentences that make sense on their own. For example, the sentence "Trees are usually brown and green" makes sense. Many sentences don't, though: "However, in the fall trees can be red and brown" doesn't make sense if you see it alone.

    So, I need to make PHP process a sentence to see if it is valid. So far, the script extracts the title of the Wikipedia page, then goes through the valid paragraphs and finds the sentences that contain that title. I do this using preg_match_all with a pattern of the form [anything][title of page][anything].

    I have that working, but now I need to make it find only the sentences that make sense by themselves. Any ideas on how I should do this?
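
    The title-matching step described above could be sketched roughly as follows. This is only an illustration, not the original script: the function name sentencesMentioning, the sample paragraph, and the assumption that a sentence ends at the first ".", "!" or "?" are all made up for the example.

```php
<?php
// Hypothetical helper: returns every sentence in $paragraph that contains $title.
function sentencesMentioning(string $paragraph, string $title): array
{
    // [anything][title][anything], where "anything" may not cross a sentence end.
    // preg_quote() escapes regex metacharacters that might appear in the title.
    $pattern = '/[^.!?]*' . preg_quote($title, '/') . '[^.!?]*[.!?]/i';

    preg_match_all($pattern, $paragraph, $matches);

    // $matches[0] holds the full matches; trim the leading space left by the split.
    return array_map('trim', $matches[0]);
}

// Assumed sample text, already stripped of HTML.
$paragraph = 'Trees are usually brown and green. However, in the fall they can be red. Many trees lose their leaves.';
$found = sentencesMentioning($paragraph, 'trees');
print_r($found);
```

    Treating every period as a sentence end is naive; abbreviations such as "e.g." would be split in the wrong place.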

  • #2
    funnymoney
    Regular Coder
    Join Date
    Aug 2007
    Posts
    364
    Thanks
    17
    Thanked 24 Times in 24 Posts
    This looks like something for an A.I. program.
    There is no way to make a computer program understand what makes sense to you.

    A computer program is, as Wikipedia says:
    Quote Originally Posted by wiki
    Computer programs (also software programs, or just programs) are instructions for a computer.[1] A computer requires programs to function. Moreover, a computer program does not run unless its instructions are executed by a central processor;[2] however, a program may communicate an algorithm to people without running. Computer programs are usually executable programs or the source code from which executable programs are derived (e.g., compiled).
    So...
    Basically, it is a set of instructions that are executed in the order the programmer wrote them.

    If you want your program to have A.I., you need to give it a large set of good, smart instructions that, when executed, make the program itself look smart.

    Logically, a program you write is only as smart as you make it.

    It is very good that you are playing with preg_match(), because regular expressions are maybe one of the most powerful tools in programming so far, but don't ask your computer to answer questions. Computers are not made to answer questions; they are made to follow the instructions you give them.

    Think about those instructions when you are trying to match sentences that make sense to you.

    What is it that makes sense to you in those sentences?

    Can you show the computer what makes sense to you?

    If you can show the computer what makes sense to you, then it should be able to return those sentences easily.

  • Users who have thanked funnymoney for this post:

    H3X (03-15-2009)

  • #3
    H3X
    New Coder
    Join Date
    Nov 2008
    Posts
    90
    Thanks
    30
    Thanked 0 Times in 0 Posts
    Yeah, I am thinking about going through the sentences with preg_match and just removing any sentences that for example start with "however" or "also" or anything of the sort. Does anyone know if there is a PHP function or script to do this or something similar?
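
    A minimal sketch of that idea, assuming the sentences are already in an array. The helper name startsWithConnective and the word list are hypothetical and would need to be extended for real text.

```php
<?php
// Hypothetical helper: true when the sentence opens with a word that
// usually refers back to an earlier sentence ("However, ...", "Also, ...").
function startsWithConnective(string $sentence, array $connectives): bool
{
    // Escape each word, then join them into one alternation: however|also|...
    $alternatives = implode('|', array_map(function ($word) {
        return preg_quote($word, '/');
    }, $connectives));

    // ^\s* anchors the match at the start; \b stops "also" matching "alsofoo".
    return preg_match('/^\s*(?:' . $alternatives . ')\b/i', $sentence) === 1;
}

// Assumed word list and sample sentences.
$connectives = ['however', 'also', 'moreover', 'therefore', 'furthermore'];
$sentences = [
    'Trees are usually brown and green.',
    'However, in the fall trees can be red and brown.',
    'Also, some trees keep their leaves all year.',
];

// Keep only the sentences that do not start with a connective.
$standalone = array_values(array_filter($sentences, function ($sentence) use ($connectives) {
    return !startsWithConnective($sentence, $connectives);
}));
print_r($standalone);
```

    This only catches sentences that open with a listed word. A sentence that depends on earlier text in another way, for example one starting with "They" or "This", would still get through.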

  • #4
    funnymoney
    Regular Coder
    Join Date
    Aug 2007
    Posts
    364
    Thanks
    17
    Thanked 24 Times in 24 Posts
    I think the best thing you could do to start with is to get all the sentences on that page.

  • #5
    H3X
    New Coder
    Join Date
    Nov 2008
    Posts
    90
    Thanks
    30
    Thanked 0 Times in 0 Posts
    I have already done that, and found ones that contain the correct topic. Now, I just need to find the ones that make sense as a standalone sentence.

  • #6
    funnymoney
    Regular Coder
    Join Date
    Aug 2007
    Posts
    364
    Thanks
    17
    Thanked 24 Times in 24 Posts
    Well, what's the problem?

    You can probably write some kind of array of "nonsense" patterns and check whether your sentences match any of them; if they do, just drop them...

    PHP Code:
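    $nosensesentences = array("My lucky sentence", "My unlucky sentence");
    $nonsense = array("/lucky/", "/poker/");

    $sensesentences = array();

    foreach ($nosensesentences as $key => $sentence) {
        $matched = false;
        // preg_match() takes a single pattern string, so test each pattern in turn
        foreach ($nonsense as $pattern) {
            if (preg_match($pattern, $sentence)) {
                $matched = true;
                break;
            }
        }
        // Keep only the sentences that matched none of the "nonsense" patterns
        if (!$matched) {
            $sensesentences[$key] = $sentence;
        }
    }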

    Or something like that

  • #7
    H3X
    New Coder
    Join Date
    Nov 2008
    Posts
    90
    Thanks
    30
    Thanked 0 Times in 0 Posts
    Yeah, I was going to do something like that, thanks.
    But does anyone know if there is a function or something similar that already does this?
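
    As far as I know, PHP has no built-in function that judges whether a sentence stands alone, but preg_grep() with the PREG_GREP_INVERT flag can at least replace the filtering loop with one call. The sketch below reuses the word list from the code above; the third sample sentence and the variable names are assumptions added for illustration.

```php
<?php
// Sample sentences; the third one is an added example that matches no pattern.
$sentences = ['My lucky sentence', 'My unlucky sentence', 'My plain sentence'];

// preg_grep() accepts a single pattern, so the words are joined with "|".
$words = ['lucky', 'poker'];
$pattern = '/' . implode('|', array_map(function ($word) {
    return preg_quote($word, '/');
}, $words)) . '/i';

// PREG_GREP_INVERT returns the entries that do NOT match the pattern.
// The original array keys are preserved.
$sensible = preg_grep($pattern, $sentences, PREG_GREP_INVERT);
print_r($sensible);
```

    Because the keys are preserved, wrap the result in array_values() if a zero-based list is needed.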

