...

View Full Version : Fulltext - MATCH...AGAINST will match some words but not others...



juzzbott
05-06-2011, 12:06 AM
Hey everyone,

I am having a little trouble with full text search within MySQL, whereby some words will return results, however others will not.

Lets say I have the following table:



Row | Title | Content
---------------------------------------------------------------------------------
1 | Help topic 1 | Blah blah blahdity blah
---------------------------------------------------------------------------------
2 | Another help topic | Some other data here


My query essentially looks like this:

SELECT * FROM `table_name` WHERE (MATCH (Title) AGAINST ('search_text*' IN BOOLEAN MODE))

The issue that I am having is that if enter 'topic' as my "search_text", both rows are returned, which is to be expected. If I use 'another' then an empty result is returned.

After some reading, I found some information regarding full text that specifies words in more than 50% of the results are dropped from the set. This is not relevant in my case, as I am using boolean mode, however the table has more than 5 results, and the results are the same.

I don't understand why this is, as it meets the min query length, and stop words should be ignored as I am using the IN BOOLEAN MODE...

Could anyone shed any light as to why this is??

Cheers

Old Pedant
05-06-2011, 12:25 AM
Ummm...I think you misread the documentation.

http://dev.mysql.com/doc/refman/5.5/en/fulltext-boolean.html


Boolean full-text searches have these characteristics:
* They do not use the 50% threshold.
* They do not automatically sort rows in order of decreasing relevance. You can see this from the preceding query result: The row with the highest relevance is the one that contains “MySQL” twice, but it is listed last, not first.
* They can work even without a FULLTEXT index, although a search executed in this fashion would be quite slow.
* The minimum and maximum word length full-text parameters apply.
* The stopword list applies


Yes, "another" is in the stopword list.

juzzbott
05-06-2011, 12:40 AM
Ummm...I think you misread the documentation.

http://dev.mysql.com/doc/refman/5.5/en/fulltext-boolean.html


Yes, "another" is in the stopword list.

Hi Old Pedant,

Thanks for clarifying that for me, that would certainly explain it.

Do you perhaps know of anyway to ignore the stopword list when executing a list. After some further reading of that page, it seems that appending the wildcard character (*), should ignore stopwords anyway:

If a word is specified with the truncation operator, it is not stripped from a boolean query, even if it is too short (as determined from the ft_min_word_len setting) or a stopword. This occurs because the word is not seen as too short or a stopword, but as a prefix that must be present in the document in the form of a word that begins with the prefix.

Unless it is matching 'another' as a whole word before applying the * wildcard??

Cheers

Old Pedant
05-06-2011, 12:54 AM
Now that I don't know. But why not just try it?

juzzbott
05-06-2011, 01:06 AM
Now that I don't know. But why not just try it?

I have, which I why am having trouble understanding it.

Searching for topi* or topic* returns results, however hel*, help*, anoth* or another* does not. So it seems very flaky in the interpretation of the stopword list and the wildcard operator...

Cheers

juzzbott
05-06-2011, 01:18 AM
I found this page which lets me change the stopword file, which resolved my issue.
http://w3guru.blogspot.com/2007/05/edit-mysql-stopword-list.html

The only problem is that it's a server wide configuration option, so anything on a shared environment cannot be fixed.

While I agree with the idea behind this stopword list, and can definitely see it's uses, I think it's pretty bad that there is not a query option to ignore it.

Old Pedant
05-06-2011, 01:45 AM
While I agree with the idea behind this stopword list, and can definitely see it's uses, I think it's pretty bad that there is not a query option to ignore it.

Well, for the reason that things are as are they are, look here:
http://www.codingforums.com/showpost.php?p=1086591&postcount=10

Aren't you glad you aren't paying $20,000 per year to use MySQL? <grin/>

mickey543
07-27-2011, 11:15 AM
MySQL's full-text search capability has few user-tunable parameters. Take them in order.
But may be you just need to reinstall it .



EZ Archive Ads Plugin for vBulletin Copyright 2006 Computer Help Forum