View Full Version : I need help with forming a mysql join query (with grouping).
jarrodcf
05-05-2006, 09:45 PM
Here goes it:
Lets say I have two tables, 1 named Genres and 1 named Movies.
Genres >>> genre_id, genre
Movies >>> movie_id, movie_name, genre_id, mpaa_rating
When doing a search on movies, I want users to be able to filter out the results by genre. To be more specific, I want all the genres to be listed (as a link) on the left side of the page, with their corresponding counts next to them.
Example:
Action/Adventure [15]
Comedy[7]
Documentary [3]
....etc, where the number is equal to the number of movies in the database that have a matching genre.
I've tried doing this...
$query = "SELECT genre, count(movie_name) as numrows FROM Genres LEFT OUTER JOIN Movies ON Genres.genre_id LIKE CONCAT(Movies.genre_id, "%") GROUP BY genre HAVING numrows > 0 ORDER BY genre";
$result = mysql_query($query) or flush();
but having the "LIKE" commend doesn't work in this query.
I also tried this...
<?
$query = "SELECT genres, count(movie_name) as numrows FROM Genres LEFT OUTER JOIN Movies ON Genres.genre_id = Movies.genre_id GROUP BY genre HAVING numrows > 0 ORDER BY genre";
$result = mysql_query($query) or flush();
while($row = mysql_fetch_array($result))
{
$Genre = $row['genre'];
$Numrows = $row['numrows'];
?>
<div>
<b><?=$Genre;?></b> [<?=$Numrows;?>]
</div>
<?
}
?>
this kindof gives me what I'm looking for except for the fact that some, or most, movies fit into more than 1 genre (which is why I was trying to use the "LIKE" command at first). Also, how do I add into the equation that I only want to countup the movies where the mpaa_rating is whatever (R, PG, etc)? - do i also add that to the "Having" clause?
Thanks in advance for your help, and let me know if I'm ignorantly using the LEFT OUTER JOIN (as opposed to just a "LEFT JOIN" or "OUTER JOIN" - i'm quite new to this.)
guelphdad
05-06-2006, 02:41 AM
a left join, outer join and left outer join are exactly the same thing.
outer is an optional word. outer join would be performed as a left join IIRC if you name your first (or left) table as the first item in the join condition.
You don't need a left join though because you are testing for number of rows > 0. therefore you can use an inner join.
Your genre_id should be removed from your movies table. you should have a link table containing movie_id and genre_id with a row for each variation. If for example you file a movie under horror and suspense you would have two rows in that link table.
jarrodcf
05-06-2006, 03:37 AM
ahhh..... i see. and to count only the movies that match a certain mpaa rating as well? do i put that in the HAVING clause or should there be a WHERE clause somewhere?
and thanks, guelphdad.
guelphdad
05-06-2006, 04:05 AM
when you give a column an alias, you can't refer to it in a where clause, so it must go in the having clause, otherwise you can refer to it in the where clause.
in fact sometimes it doesn't belong in the where clause but in the on clause of your join.
OH and a question for you. Say a movie falls under three genres, in your count do you want it to show up three times once each for action, suspense and horror for instance?
jarrodcf
05-06-2006, 04:13 AM
yes. that is exactly how i want it to work. I actually have another couple of tables with a similar setup for Restaurants and Cuisines. (I'm building a local cityguide for a small town).
i also tried this code to include a WHERE clause in order to count only the Movies that match a certain mpaa rating (eg. the user had already filtered the list by mpaa_rating and then wanted to further filter that list of results by Genre.
$query = "SELECT genres, count(movie_name) as numrows FROM Genres LEFT OUTER JOIN Movies ON Genres.genre_id = Movies.genre_id WHERE Movies.mpaa_rating = 'PG-13' GROUP BY genre HAVING numrows > 0 ORDER BY genre";
but..if there is a better (or faster, although your solution is very fast) way to do any of this (especially as the list of Movies grows over time), please let me know.
thanks
If movies can appear in more than one genre, then you really want another table containing movie_id and genre_id, having multiple ids in one column defeats the point of using a database...
guelphdad
05-06-2006, 03:20 PM
A couple of points for you. First, when you write a query, it is easier to break it up after each item below one another as I have done below. Much easier to see and read that way from people you are asking for help.
$query = "SELECT
genres,
count(movie_name) as numrows
FROM Genres
LEFT OUTER JOIN
Movies
ON Genres.genre_id = Movies.genre_id
WHERE Movies.mpaa_rating = 'PG-13'
GROUP BY genre
HAVING numrows > 0
ORDER BY genre";
Secondly, as I mentioned to you above. You don't need a left join. A left join is used when you are searching for unmatched rows between two tables. you aren't doing this. all of your movies have a genre represented in the genre table.
Thus change your LEFT OUTER JOIN to INNER JOIN. You can then remove your HAVING clause because it would always be true.
So you want this:
$query = "SELECT
genres,
count(movie_name) as numrows
FROM Genres
INNER JOIN
Movies
ON Genres.genre_id = Movies.genre_id
AND Movies.mpaa_rating = 'PG-13'
GROUP BY genre
ORDER BY genre";
also to speed up the query it might benefit you to have an index on genre_id in both tables.
But to speed your your query up the best, you should remove your redundant information from your movies table as has been suggested to you twice. the genre information should be removed entirely from your movies table and create a third table that would list the genre(s) for each movie by genreid and the movie id with it.
You would benefit for doing the same on any other table you have redundant information in.
Take the time to read and most importantly, understand, this (http://dev.mysql.com/tech-resources/articles/intro-to-normalization.html) article on database normalization. It will benefit you now and in the future.
jarrodcf
05-06-2006, 07:00 PM
awesome!!!! that tutorial on normalization is exactly what i needed. the 3rd link under your reply is also very helpful, guelphdad.
and thanks to both of you, I finally understand what you (guelphdad) stated in your original reply - about adding another table with movie_id & genre_id.
thanks guys, i really appreciate it.
also, since i'm new to codingforums, if i have another similar question at some point, is it standard procedure to continue this thread, add a new thread, or add a new thread and send you guys a private message since you guys seem to know a lot?
- jarrod
Just add a new thread if it's a new question, and (speaking for myself at least) I'll find it and reply if I can- I tend not to notice PMs :)
guelphdad
05-06-2006, 08:34 PM
In general practice it isn't good protocol to send a private message to get something answered. Posting in the forum, and getting good replies, builds up a good repository of information for others looking for help on the same sort of thing.
On limited occassion I might PM someone saying that I have posted a thread that I need help with. Some folks still frown on that though.
As for questions and new threads. If your question is related to something you have already started then it is still best to continue in the same thread. That way you can take advantage of information already in the thread, and again, it becomes useful for someone else that may come across it with a similar problem.
you may though start a new thread when you are asking about the same database or tables but are asking a different type of question.
Perhaps you want to know about foreign keys and indexing. You could start a new thread (if you can't find a similar thread first of course) but still refer to your movies database.
Good luck with your learning. Remember at some point, you will be the one with knowledge able to share it with others just getting started out. :thumbsup:
jarrodcf
05-07-2006, 04:58 AM
thanks again guys.
vBulletin® v3.8.2, Copyright ©2000-2009, Jelsoft Enterprises Ltd.