
View Full Version : Extracting sentences that make sense



H3X
03-15-2009, 09:48 PM
I'm making a PHP script that goes to Wikipedia, parses out all the basic paragraphs that contain the actual information, then extracts the sentences from those paragraphs that are valid.

By "valid" I mean ones that make sense on their own. For example, the sentence "Trees are usually brown and green" makes sense. Often, though, sentences don't: "However, in the fall trees can be red and brown" doesn't make sense if you see it alone.

So, I need PHP to process a sentence to see if it is valid. So far, it extracts the title of the Wikipedia page, then goes through the valid paragraphs finding sentences that contain the title of the page, and I do this using preg_match_all. For example [anything][title of page][anything].

I have that working, but now I need it to find only sentences that make sense by themselves. Any ideas on how I should do this?
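For reference, the [anything][title of page][anything] match described above might look roughly like the sketch below. The function name, the sample paragraph and the choice of ".", "!" and "?" as sentence terminators are all assumptions for illustration, not the script from this thread.

```php
<?php
// Hypothetical helper: return the sentences of $paragraph that mention $title.
function sentences_mentioning(string $title, string $paragraph): array
{
    // preg_quote() escapes regex metacharacters in the title (and the "/" delimiter).
    $quoted = preg_quote($title, '/');
    // A "sentence" here is a run of non-terminator characters that contains the
    // title as a whole word and ends in ., ! or ?. The "i" flag ignores case.
    $pattern = '/[^.!?]*\b' . $quoted . '\b[^.!?]*[.!?]/i';
    preg_match_all($pattern, $paragraph, $matches);
    // $matches[0] holds the full matches; trim the leading spaces.
    return array_map('trim', $matches[0]);
}

// Made-up sample paragraph, built from the sentences used in the question.
$paragraph = 'Trees are usually brown and green. They grow slowly. '
           . 'However, in the fall trees can be red and brown.';
$found = sentences_mentioning('trees', $paragraph);
print_r($found);
```

This keeps the first and third sentences and skips "They grow slowly." because it never mentions the title. Abbreviations with periods would cut sentences short, so treat it as a starting point only.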

funnymoney
03-15-2009, 10:29 PM
Looks something like an A.I. program.
There is no possible way to make a computer program understand what makes sense to you.

A computer program is, as Wikipedia says:

Computer programs (also software programs, or just programs) are instructions for a computer.[1] A computer requires programs to function. Moreover, a computer program does not run unless its instructions are executed by a central processor;[2] however, a program may communicate an algorithm to people without running. Computer programs are usually executable programs or the source code from which executable programs are derived (e.g., compiled).
so..
Basically a set of instructions that are executed in the order the programmer wrote them.

If you want your program to have A.I., you need to give it a massive amount of good, smart instructions that, when executed, will make it look as if the program itself is smart.

Logically, a program you write is only as smart as you make it.

It is very good that you are playing with preg_match(), because it is maybe one of the most powerful things in computer programming so far, but don't ask your computer to answer questions: computers are not made to answer questions, they are made to follow the instructions you give them.

Think about those instructions when you are trying to match sentences that make sense to you.

What is it that makes sense to you in those sentences?

Can you show the computer what makes sense to you?

If you can show the computer what makes sense to you, then the computer should return that sense easily..

H3X
03-15-2009, 10:34 PM
Yeah, I am thinking about going through the sentences with preg_match and just removing any that start with, for example, "however" or "also" or anything of the sort. Does anyone know if there is a PHP function or script that does this or something similar?
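A minimal sketch of that idea, assuming a hand-written list of leading words. The list, the function name and the sample sentences are made up for illustration, and a sentence can still depend on context without starting with one of these words.

```php
<?php
// Assumed, incomplete word list; extend it to suit the text being processed.
$leading_words = array('however', 'also', 'moreover', 'therefore', 'furthermore', 'but', 'and');

// Hypothetical helper: true if the sentence does NOT start with one of the words.
function starts_clean(string $sentence, array $words): bool
{
    // Escape each word, then build /^\s*(?:however|also|...)\b/i from the list.
    $escaped = array_map(function ($word) {
        return preg_quote($word, '/');
    }, $words);
    $pattern = '/^\s*(?:' . implode('|', $escaped) . ')\b/i';
    // preg_match() returns 1 when the sentence starts with a listed word.
    return preg_match($pattern, $sentence) !== 1;
}

$sentences = array(
    'Trees are usually brown and green.',
    'However, in the fall trees can be red and brown.',
);

$standalone = array();
foreach ($sentences as $sentence) {
    if (starts_clean($sentence, $leading_words)) {
        $standalone[] = $sentence;
    }
}
print_r($standalone);
```

The \b after the word group means "Andorra is small." is not rejected just because it begins with the letters "and".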

funnymoney
03-15-2009, 10:38 PM
I think maybe the best thing you could do for a start is to get all the sentences on that page.
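One rough way to get the sentences out of a paragraph is to split it with preg_split(). This is only a sketch under assumptions: the function name is made up, and the rule "terminator, whitespace, capital letter" is a heuristic that will split wrongly on abbreviations and will not split where a citation marker such as [1] follows the period.

```php
<?php
// Hypothetical helper: split a paragraph into sentences.
function split_sentences(string $paragraph): array
{
    // Split at whitespace that follows ., ! or ? and precedes a capital letter.
    // The lookbehind and lookahead keep the punctuation and the capital in place.
    $parts = preg_split('/(?<=[.!?])\s+(?=[A-Z])/', trim($paragraph), -1, PREG_SPLIT_NO_EMPTY);
    // preg_split() returns false on error; fall back to an empty list.
    return $parts === false ? array() : $parts;
}

$sentences = split_sentences(
    'Trees are usually brown and green. However, in the fall trees can be red and brown.'
);
print_r($sentences);
```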

H3X
03-15-2009, 10:40 PM
I have already done that, and found ones that contain the correct topic. Now, I just need to find the ones that make sense as a standalone sentence.

funnymoney
03-15-2009, 10:47 PM
Well, what's the problem..

You can probably write some kind of array of "nonsense" values and check whether your sentences contain any of those values, and if they do, just drop them... :)


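$nosensesentences = array("My lucky sentence", "My unlucky sentence");
$nonsense = array("/lucky/", "/poker/");

$sensesentences = array();

foreach ($nosensesentences as $key => $sentence) {
    // preg_match() takes a single pattern string, so test each pattern in turn
    $matched = false;
    foreach ($nonsense as $pattern) {
        if (preg_match($pattern, $sentence)) {
            $matched = true;
            break;
        }
    }
    // keep only the sentences that matched none of the "nonsense" patterns
    if (!$matched) {
        $sensesentences[$key] = $sentence;
    }
}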

Or something like that :)

H3X
03-15-2009, 11:21 PM
Yeah, I was going to do something like that, thanks.
But does anyone know if there is a function or something similar that already does this?
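As far as I know, PHP has no built-in that judges whether a sentence stands alone, but preg_grep() with the PREG_GREP_INVERT flag does the "drop everything that matches" part in a single call. A small sketch, where the sample sentences and the word list in the pattern are assumptions:

```php
<?php
$sentences = array(
    'Trees are usually brown and green.',
    'However, in the fall trees can be red and brown.',
    'Also, some trees keep their leaves.',
);

// PREG_GREP_INVERT returns the elements that do NOT match the pattern.
// The word list in the pattern is an assumption, not a complete list.
$standalone = preg_grep('/^\s*(?:however|also|moreover)\b/i', $sentences, PREG_GREP_INVERT);

// preg_grep() preserves the original keys; array_values() reindexes from 0.
$standalone = array_values($standalone);
print_r($standalone);
```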


