...

View Full Version : Making a php search engine to find posts on my blog



Andy92
06-03-2007, 11:13 PM
Hi there,

I have been thinking for a while now, i really need a search engine for my blog so that users can find posts.

I have been looking in, and basically is this a good way to do it...

Assign tags to different blog posts, and store the tags in a mysql database, then put a search box and users can put in words and then it will show the blog posts with the tags of the words they type in.

So, if one blog post containd tags such as video, funny, laugh, hillarious

And someone typed in laugh into the search it would come up with that article.

Now, how would i go about storing multiple tags into 1 mysql field, and then seperate them with commas or something, and then retrive them as indervidual words and takeaway the commas?

And also, how would i make it so that if a user types a few words, to make it so that it makes seperate words when they put a space, so 'funny vidoes' would be funny and videos.

And then that last thing would be to put something in php like SELECT * FROM blog WHERE $list_search_words_here == $article_tag_words_here.

Please could someone explain this to me and give me some help?

:confused:

ralph l mayo
06-04-2007, 12:41 AM
http://en.wikipedia.org/wiki/Database_normalization

What you're going to want to search tags is to have a table, say "tags," with either two columns: the primary key, an auto-incrementing integer, and the tag_name, a unique varchar of whatever length you feel is appropriate.

Next, create a table tag_content_rel with two columns: an integer id referencing the content table with your posts' id, primary key, and an integer id referencing the tags table's id. Your tables need to use the InnoDB storage engine to enforce foreign keys.

http://en.wikipedia.org/wiki/Foreign_key

Say for the sake of simplicity you have three posts (tags are not stored in this table):
Post id 1: title: "boatbikecar", tags are 'boating', 'bicycling', 'driving'
Post id 2: title: "boat", tags are 'boating'
post id 3: title: "boatcar", tags are 'boating', 'driving'

Therefore your tags table should look like this (the order of the id doesn't matter, but there should be no duplicates):
id 1: driving
id 2: boating
id 3: bicycling

The tag_content_rel ties them together like such:
content_id, tag_id
1, 1
1, 2
1, 3
2, 2
3, 2
3, 1

So, to get titles of posts tagged with a particular word:


SELECT content.title FROM tags
INNER JOIN tag_content_rel ON tag_id = tags.id
INNER JOIN content ON content.id = content_id
WHERE tags.name="driving";


Which follows the chain from tag name->tag id->content ids tied that that tag id->content title
and hopefully produces "boatcar" and "boatbikecar"

Tags are kind of inflexible to do real searching. Arbitrary keywords are matched with a system called FULLTEXT in MySQL that you may want to look into. Unfortunately it requires you to use the MyISAM engine, which otherwise is awful. C'est la MySQL.

Andy92
06-04-2007, 06:27 PM
Ok,

Thanks for your help and your code. I am begining to code the search engine now, but where it says...

WHERE tags.name="driving";




How can i get it to say, WHERE tags.name="keyword1"&&"keyword2";

Or even 4 keywords ?

printf
06-04-2007, 08:43 PM
Just split the keywords and add the building logic to include them in the query! This is long winded, but I wrote it that way so you can see a few different approaches to take while safely handling different input logic that you could use to allow for multiple key words and two different formats (split with a comma or a space, or a combination of both)


<?php

/* we handle key words split by a comma or a space or both */

$_POST['key_words'] = 'driving, boating, bicycling, motor boating';

if ( ! empty ( $_POST['key_words'] ) )
{
$words = array ();

/* if key words are split by a comma, do this */

if ( strpos ( $_POST['key_words'], ',' ) !== false )
{
$words = array_diff ( array_map ( 'trim', explode ( ',', $_POST['key_words'] ) ), array ( '' ) );

/* check if we have muliple key words in a single key word */

foreach ( $words AS $key => $value )
{
/* if we find one, do this */

if ( strpos ( $value, ' ' ) !== false )
{
/* remove the multi key word from the array of words */

unset ( $words[$key] );

/* remove any key words that already exist */

$words = array_unique ( array_merge ( $words, array_diff ( array_map ( 'trim', explode ( ' ', $value ) ) , array ( '' ) ) ) );
}
}
}
else if ( strpos ( $_POST['key_words'], ' ' ) !== false )
{
/* simple, just split and remove empty key words */

$words = array_diff ( array_map ( 'trim', explode ( ' ', $_POST['key_words'] ) ), array ( '' ) );
}

/* if we have words to do the search */

if ( ! empty ( $words ) )
{
/* database connect, select stuff here */

/* make the variables database safe */

$words = array_map ( 'mysql_real_escape_string', $words );

/* do the query */

$res = mysql_query ( "SELECT field1, field2 FROM table WHERE tag_name IN ( '" . join ( "', '", $words ) . "' );" );

if ( mysql_num_rows ( $res ) > 0 )
{
/* we have some results */

// do stuff
}
}
else
{
/* no valid words, error */

// do stuff
}
}

?>

But I still wouldn't do what your doing, because your database already has the content that can be searched using FULL TEXT mode with fine tuned relevance sorting, so the need to add more key words to a new database table just to add searching makes no sense at all.

Andy92
06-04-2007, 11:02 PM
Ok, this is what i have so far...


<?
$_POST['key_words'] = '1, 2, 3, 4, 5';



if ( ! empty ( $_POST['key_words'] ) )
{
$words = array ();

/* if key words are split by a comma, do this */

if ( strpos ( $_POST['key_words'], ',' ) !== false )
{
$words = array_diff ( array_map ( 'trim', explode ( ',', $_POST['key_words'] ) ), array ( '' ) );

/* check if we have muliple key words in a single key word */

foreach ( $words AS $key => $value )
{
/* if we find one, do this */

if ( strpos ( $value, ' ' ) !== false )
{
/* remove the multi key word from the array of words */

unset ( $words[$key] );

/* remove any key words that already exist */

$words = array_unique ( array_merge ( $words, array_diff ( array_map ( 'trim', explode ( ' ', $value ) ) , array ( '' ) ) ) );
}
}
}
else if ( strpos ( $_POST['key_words'], ' ' ) !== false )
{


$words = array_diff ( array_map ( 'trim', explode ( ' ', $_POST['key_words'] ) ), array ( '' ) );
}




}



mysql_connect("localhost", "username", "password") or die(mysql_error());
mysql_select_db("database") or die(mysql_error());
$result7 = mysql_query("SELECT * FROM blog INNER JOIN blog_tags_link ON tag_id = tags.id
INNER JOIN content ON content.id = content_id WHERE id IN ( '" . join ( "', '", $words ) . "' ) ORDER BY id DESC LIMIT 0 , 5");



if ($row7 = mysql_fetch_array($result7)) {
do {
printf("- %s %s <a href='http://www.kedoa.com'>%s</a><br />", $row7[month], $row7[day], $row7[title]);
}while ($row7 = mysql_fetch_array($result7));
}else{
echo "Could not retrive posts!";
}
?>

It says...

Warning: mysql_fetch_array(): supplied argument is not a valid MySQL result resource in /home/kedoacom/public_html/demo.php on line 94
Could not retrive posts!

Its basically this line thats confusing...

$result7 = mysql_query("SELECT * FROM blog INNER JOIN blog_tags_link ON tag_id = tags.id
INNER JOIN content ON content.id = content_id WHERE id IN ( '" . join ( "', '", $words ) . "' ) ORDER BY id DESC LIMIT 0 , 5");

I have renamed tag_content_rel to blog_tags_link, and the tags table to blog_tags. I dont know what is going wrong?

:confused:

Fumigator
06-04-2007, 11:17 PM
I beg you to check your query for errors.



if (!$result7) {
die("SQL ERROR! ".mysql_error());
}


Plus-- you may be interested in using a FULLTEXT index (google it for more info).

Andy92
06-05-2007, 01:52 AM
Ok,

Thanks for all of your help. I have got most of it working now.

But, say if i have this as a table layout...

id | random
1 | trainsrule
2 | carsrule
3 | planesrule

How can i get it so that if someone types in trains, it will search inside the text in the random field, and display the id 1.

And if someone searches 'planes cars' it will come up with id 2 and 3.

And if they type in rule it will come up with id 1, 2 and 3?

So it gets the words they type in, puts them like 'word1, word2, word3' then searches for those words inside the random field and displays the ids which contain them words in the random field? :confused:

Andy92
06-05-2007, 06:07 PM
Ok, got it working with...


$result8 = mysql_query("SELECT * FROM blog WHERE tags LIKE '%za%'||'%moto%' ORDER BY id DESC");

But how can i get it to say WHERE tags and title LIKE...

So it will search the tags and the title fields?

aedrin
06-05-2007, 06:13 PM
I suggest looking into FULLTEXT again.



tags LIKE '%za%'||'%moto%'

This is bad query design.

Any LIKE expression that starts with %, means it has to go through every single row in the table. For every LIKE expression. You can imagine how hard that becomes for the database.

Andy92
06-05-2007, 10:44 PM
Ok, do you know what the alternative is?

aedrin
06-05-2007, 11:05 PM
Yes, that which has been suggested twice now.


I suggest looking into FULLTEXT again.


Plus-- you may be interested in using a FULLTEXT index (google it for more info).

Andy92
06-06-2007, 12:40 AM
Ok,

I have looked into fulltext and put in...

WHERE MATCH(tags) AGAINST('each,from')

But it comes up with the error...

Warning: mysql_fetch_array(): supplied argument is not a valid MySQL result resource in /home/kedoacom/public_html/demo.php on line 59
There were no blog posts that matched your query.

??

aedrin
06-06-2007, 01:03 AM
Your problem is with your query.

Check mysql_error():



$result = mysql_query($sql) or die(mysql_error());

Andy92
06-06-2007, 04:54 PM
Wow, thanks it doesnt come up with the error message now.

But it doesnt retrive the items? I have put...


$result8 = mysql_query("SELECT * FROM blog WHERE MATCH(tags) AGAINST('zam,from') ORDER BY id DESC") or die(mysql_error());

And it comes up with...


Can't find FULLTEXT index matching the column list

I have been hasving a look at this...

http://www.databasejournal.com/features/mysql/article.php/1578331

And there is a stopword list. Why?

:thumbsup:

Pennimus
06-06-2007, 05:16 PM
Seems like you haven't added a fulltext index to that column in your database.

Either go into PHP MyAdmin and add it or construct a query to do so from your front end.

Andy92
06-07-2007, 10:58 PM
Seems like you haven't added a fulltext index to that column in your database.

I have now done this and it comes up with...

There were no blog posts that matched your query. (Which i set it to say if it couldnt find any blog posts.

But, my 2 words are zam and moto, and in the tags colmn, there are tags like zam and moto but its not displaying them?

http://www.kedoa.com/demo.php

There where i am testing it. Here is my code...



mysql_connect("localhost", "username", "password") or die(mysql_error());
mysql_select_db("database") or die(mysql_error());

$result8 = mysql_query("SELECT * FROM blog WHERE MATCH(tags) AGAINST('zam,moto') ORDER BY id DESC") or die(mysql_error());
if ($row8 = mysql_fetch_array($result8)) {
do {

printf("Tag: %s | Post Number: %s <a href='http://www.kedoa.com'>%s</a><br />", $row8[tags], $row8[id], $row8[title]);

}while ($row8 = mysql_fetch_array($result8));
}else{
echo "There were no blog posts that matched your query.";
}


Do you know what is wrong?

Andy92
06-27-2007, 08:02 PM
I really need some help with fulltext. Is there any tutorials that show you wheat you need to do step by step?

StupidRalph
06-28-2007, 12:47 AM
How are you communicating with MySQL? You have a set up a FULLTEXT index on that column. If you have a fairly lot of information in that database it may take some time for that FULLTEXT index to be created. Also, please note that your table must be MyISAM to create a FULLTEXT index. I personally do not use FULLTEXT indexes anymore.

I just googled and found this (http://www.databasejournal.com/features/mysql/article.php/1578331) article about FULLTEXT searching.

Regarding one of your earlier suggestions about storing multiple keywords in one column with some type of delimiter (e.g. comma or space). I think you should give Ralph L Mayo's post another look. Especially the part about normalization.

You also will find the article written by CodingForums' MySQL administrator Guelphdad interesting. I can't seem to praise his articles enough. He has the links to his articles in his signature. I think you will benefit the most from this (http://guelphdad.wefixtech.co.uk/sqlhelp/lists.shtml) one.

Andy92
06-28-2007, 05:16 PM
Thank you so much! :)

I have read through the tutorials, and i will have a play around with the code tonight and post back.

:)



EZ Archive Ads Plugin for vBulletin Copyright 2006 Computer Help Forum