...

View Full Version : SQL Full Text search



candycrate
09-04-2011, 06:59 AM
Hello. I have this table:



name | tags
------------+-----------------------
960 | css framework
Blueprint | css framework
jquery | javascript framework

(P.S: The table has 3 more rows with the word framework on the tags field. I didn't insert them because i was too lazy to sort the table above :p)

and execute this query:



SELECT name,tags,(MATCH (name,tags) AGAINST ('*css*' IN BOOLEAN MODE) + MATCH (name,tags) AGAINST ('*framework*' IN BOOLEAN MODE)) AS score FROM weds_scripts WHERE MATCH (name,tags) AGAINST ('*css* *framework*' IN BOOLEAN MODE) ORDER BY score


The full text will ignore the css keyword and search only the framework keyword (css is NOT in Stopwords and NO its not too short because i tried to search the keyword j and it worked) so it gave me those results:



name | tags | score
------------+------------------------+-------
960 | css framework | 1
Blueprint | css framework | 1
jquery | javascript framework | 1


Does anyone know why it ignores the keyword: css? :confused:
Thx :)

guelphdad
09-04-2011, 07:04 AM
It will ignore words below four characters in length unless you have specifically changed your my.ini/my.cnf settings to allow otherwise.

The other reason words would be ignored is if they would be found in 50% or more of the rows returned.

candycrate
09-04-2011, 07:36 AM
1)
NO its not too short because i tried to search the keyword j and it worked

2)
P.S: The table has 3 more rows with the word framework on the tags field

So every row has the word framework on the tags field (if u check the table and the p.s) that means it is found on 100% of the rows returned... But it WON'T be ignored (unlike the css keyword which is present on LESS than 50% of the rows returned)!

I already know those things...

guelphdad
09-04-2011, 07:41 AM
I double checked and the 50% matching rows criteria is not used in BOOLEAN mode, but the short words criteria is. Do you know if that value has been changed in the config file?

candycrate
09-04-2011, 07:53 AM
CREATE TABLE scripts(
id int unsigned not null primary key auto_increment,
name text,
tags text
)
ENGINE = MYISAM
CHARSET utf8;

ALTER TABLE scripts ADD FULLTEXT(name, tags);



provide actual working example that a FULLTEXT index is applied and is finding terms one character in length and with words in more than 50% of the search terms.



SELECT name,tags, (MATCH(name,tags) AGAINST ('*j*' IN BOOLEAN MODE)) AS score FROM scripts WHERE MATCH(name,tags) AGAINST ('*j*' IN BOOLEAN MODE)


j = 1 character long -> It finds terms (score = 1)

Also if u use the table data i gave u above and instead of j u use framework (which is in EVERY row) it will still give u results

SELECT name,tags, (MATCH(name,tags) AGAINST ('*framework*' IN BOOLEAN MODE)) AS score FROM scripts WHERE MATCH(name,tags) AGAINST ('*framework*' IN BOOLEAN MODE)

But if u use css as keyword it won't return anything!!!

SELECT name,tags, (MATCH(name,tags) AGAINST ('*css*' IN BOOLEAN MODE)) AS score FROM scripts WHERE MATCH(name,tags) AGAINST ('*css*' IN BOOLEAN MODE)

candycrate
09-04-2011, 07:57 AM
About the short word criteria:
Does it apply on the keyword's length (f.e: c)
or
the term's length found (f.e css)?

candycrate
09-04-2011, 08:03 AM
Found the problem :)
The min length is applied on the TERM FOUND not the keyword (in this case css which is less than 4 characters).... I changed a record tag from css framework to csss framework and then executed


SELECT name,tags, (MATCH(name,tags) AGAINST ('*css*' IN BOOLEAN MODE)) AS score FROM scripts WHERE MATCH(name,tags) AGAINST ('*css*' IN BOOLEAN MODE)

and it found it :)

guelphdad
09-04-2011, 09:31 PM
In a BOOLEAN search * is a special character (a wild card if you will).

using a BOOLEAN search with 'j' nothing will be found. Using it with 'j*' means any word beginning with 'j'. using it with '*j*' as you have means any word containing 'j'.

That is why it will return rows for you. Because 'javascript' and 'jquery' both contain j followed by any number of characters (which is what the * is matching).

however searching on '*css*' means any word containing css in it. So 'fixcss' or 'cssjbrd' would be returned because 'css' is contained in those made up words, but since css is a whole word in your case, and that whole word is less than four characters it won't be returned since it is smaller than the four characters required by the fulltext search.



EZ Archive Ads Plugin for vBulletin Copyright 2006 Computer Help Forum