Filtering words in search

01-03-2007, 04:53 AM
Okay, so I have a search script that searches my database using MySQL's "LIKE". I want to filter out words such as "a, the, is, as, I". I can't think of them all off the top of my head and was wondering if anyone has a list lying around somewhere?

01-03-2007, 08:11 PM
// str_replace() returns the modified string, so assign the result back
$string = str_replace('badword', 'XXXXXXX', $string);


ralph l mayo
01-03-2007, 09:36 PM
Not directly an answer to the question at hand, but searching with LIKE does not scale, and you should at least take a look at MySQL's FULLTEXT indexing. Not only is it faster, it also returns a relevance score for each match, and it handles the type of stuff you're talking about (stopwords) automatically: it ships with a built-in stopword list and, in natural-language mode, ignores words that appear in too many rows.
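
For anyone who wants to try the FULLTEXT route, here is a rough sketch of what the query could look like from PHP. The table `articles`, its columns `title` and `body`, and the function name `searchArticles` are made up for illustration; it assumes a FULLTEXT index already exists on (title, body) and that you connect through PDO.

```php
<?php
// Hypothetical schema, shown only for context:
//   ALTER TABLE articles ADD FULLTEXT INDEX ft_title_body (title, body);

/**
 * Run a natural-language FULLTEXT search and return rows ordered by relevance.
 * Table and column names are assumptions, not taken from the original post.
 */
function searchArticles(PDO $pdo, string $terms): array
{
    // The same search string is bound twice under different names, because
    // a named placeholder cannot be reused when emulated prepares are off.
    $sql = 'SELECT id, title,
                   MATCH(title, body) AGAINST (:terms_score) AS relevance
            FROM articles
            WHERE MATCH(title, body) AGAINST (:terms_filter)
            ORDER BY relevance DESC
            LIMIT 20';

    $stmt = $pdo->prepare($sql);
    $stmt->execute([
        ':terms_score'  => $terms,
        ':terms_filter' => $terms,
    ]);

    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}
```

The columns named in MATCH() have to match the columns of the FULLTEXT index, and the stopword handling happens inside MySQL, so the search string can be passed through unfiltered.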


Edit: here is the stopword list from Swish-e:

a above according across actually adj after
afterwards again against all almost alone along
already also although always among amongst an and
another any anyhow anyone anything anywhere are aren
aren't around as at be became because become becomes
becoming been before beforehand begin beginning behind
being below beside besides between beyond billion both
but by can can't cannot caption co could couldn
couldn't did didn didn't do does doesn doesn't don
don't down during each eg eight eighty either else
elsewhere end ending enough etc even ever every
everyone everything everywhere except few fifty first
five for former formerly forty found four from
further had has hasn hasn't have haven haven't
he hence her here hereafter hereby herein hereupon
hers herself him himself his how however hundred
ie i.e. if in inc inc. indeed instead into is
isn isn't it its itself last later latter latterly
least less let like likely ll ltd made make
makes many maybe me meantime meanwhile might million
miss more moreover most mostly mr mrs much must
my myself namely neither never nevertheless next nine
ninety no nobody none nonetheless noone nor not
nothing now nowhere of off often on once one
only onto or others otherwise our ours
ourselves out over overall own per perhaps rather
re recent recently same seem seemed seeming seems
seven seventy several she should shouldn shouldn't
since six sixty so some somehow someone something
sometime sometimes somewhere still stop such taking
ten than that the their them themselves then
thence there thereafter thereby therefore therein
thereupon these they thirty this those though
thousand three through throughout thru thus to
together too toward towards trillion twenty two under
unless unlike unlikely until up upon us used using
ve very via was wasn we well were weren
weren't what whatever when whence whenever where
whereafter whereas whereby wherein whereupon wherever
whether which while whither who whoever whole whom
whomever whose why will with within without won
would wouldn wouldn't yes yet you your yours
yourself yourselves
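
If you stay with LIKE, one way to use a list like this is to strip the stopwords from the search string before building the query. This is only a minimal sketch: the function name `removeStopwords` is made up, and the `$stopwords` array is cut down to a handful of entries for brevity (you would load the full list instead).

```php
<?php
// Abbreviated stopword set; array_flip() turns the words into keys so that
// membership can be checked with isset() instead of scanning the array.
$stopwords = array_flip(['a', 'the', 'is', 'as', 'i', 'and', 'of', 'to', 'in']);

/**
 * Return the words of $query that are not in $stopwords, lowercased,
 * in their original order.
 */
function removeStopwords(string $query, array $stopwords): array
{
    // Lowercase, then split on runs of whitespace.
    $words = preg_split('/\s+/', strtolower(trim($query)), -1, PREG_SPLIT_NO_EMPTY);

    $kept = [];
    foreach ($words as $word) {
        // Strip punctuation at the edges only, so "don't" keeps its apostrophe.
        $word = trim($word, ".,;:!?\"()");
        if ($word !== '' && !isset($stopwords[$word])) {
            $kept[] = $word;
        }
    }
    return $kept;
}

print_r(removeStopwords('Is the red guitar as good as I think?', $stopwords));
// keeps: red, guitar, good, think
```

Matching whole words this way avoids the problem with calling str_replace() on the raw string, which would also replace "a" or "is" inside longer words. Each remaining word can then be bound to its own LIKE placeholder.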