PDA

View Full Version : Trouble with a Join


Keleth
09-21-2009, 05:08 PM
I was hoping I could get some help with something...

Basically, in my ever ongoing battle to learn mySQL, I'm trying to make a forum of sorts. What I've got is a table of threads, which has 3 columns (threadID, parentForumID, locked), and a table of posts (postID, threadID, title, authorID, message, datePosted, lastEdited, timesEdited). The current table of threads has 2 rows, the table of posts has 3 rows at the moment. In terms of relationships, every row in posts will have one corresponding row in threads, and every row in threads will have one or more corresponding rows in posts.

What I'd like to do is pull all the threads with the latest post information, so I can both sort by that latest post (newest threads at top), as well as displaying the information from the latest post. I came up with the following query:
SELECT threads.threadID, threads.locked, posts.title
FROM threads
LEFT JOIN (SELECT * FROM posts ORDER BY datePosted DESC) posts
ON threads.threadID = posts.threadID
WHERE threads.forumID = 1
ORDER BY posts.datePosted DESC

When I do this, though, I get all 3 post rows with the thread info attached, instead of 2 thread rows with the latest posts. I can't figure out if I'm doing the JOIN wrong or if I'm doin something else wrong.

I know I can use GROUP BY, and came up with this:
SELECT threads.threadID, threads.locked, posts.title
FROM threads
LEFT JOIN (SELECT * FROM posts ORDER BY datePosted DESC) posts
ON threads.threadID = posts.threadID
WHERE threads.forumID = 1
GROUP BY threads.threadID
ORDER BY posts.datePosted DESC

It works, but I'd like to figure out if this is the right way to do it and if my JOIN is right/wrong.

Also, I was wondering if there is a way to count the number of posts for each thread from the posts table. Right now, I'm doing all the processing by having a separate query for each thread, which means if there happen to be 20 threads, thats 20 extra queries going on, which I should obviously avoid.

Old Pedant
09-21-2009, 08:28 PM
Not sure why you think you need LEFT JOIN. You only use LEFT JOIN when you want to get records from table on left side even if there are no matching records on the right side. But in a forum, that should *NEVER* happen. Every thread, as you yourself stated, has at least one message matching it.

Anyway...


SELECT ...
FROM threads AS T, posts AS P,
( SELECT threadID, MAX(postid) AS latestPostID
FROM posts GROUP BY threadID ) AS PLast
WHERE T.threadID = P.threadID
AND P.threadID = PLast.threadID
AND P.postID = PLast.postID
ORDER BY ...whatever you want...

That assumes the postID is an autonumber field. And assumes you want the last post, not necessarily the last one that was *edited*.

If you want the last one that was edited, and *IF* your "lastEdited" field is a DATETIME field (*not* date alone!) then you could use:

SELECT ...
FROM threads AS T, posts AS P,
( SELECT threadID, MAX(lastEdited) AS latestEdit
FROM posts GROUP BY threadID ) AS PLast
WHERE T.threadID = P.threadID
AND P.threadID = PLast.threadID
AND P.lastEdited = PLast.latestEdit
ORDER BY ...whatever you want...

Old Pedant
09-21-2009, 08:32 PM
if there is a way to count the number of posts for each thread from the posts table

Yes, trivial:

SELECT threadid, COUNT(*) AS postInThisThread
FROM posts
GROUP BY threadid
ORDER BY threadid

You *could even join *that* to the other query to get the number of posts and the latest post all in a single record.


SELECT ...
FROM threads AS T, posts AS P,
( SELECT threadID, MAX(postid) AS latestPostID
FROM posts GROUP BY threadID ) AS PLast,
( SELECT threadid, COUNT(*) AS postInThisThread
FROM posts GROUP BY threadID ) AS PCount
WHERE T.threadID = P.threadID
AND P.threadID = PLast.threadID
AND P.postID = PLast.postID
AND T.threadID = PCount.threadID
ORDER BY ...whatever you want...

Keleth
09-21-2009, 08:38 PM
Ah, ok, I see what you're doing.

Yah, I used a JOIN because I thought the mentality behind a JOIN is when you have a set of records you want, and needs extra details from a related table. If that's not the case, I probably have a ton JOINs that shouldn't be JOINs.

So, and this is just to understand the JOIN issue, it is behaving properly... its picking up all the rows on the left (thread), and finding all the matching rows on the right (posts).

If you don't mind me extending the question, lets say I wanted to get the username associated with the authorID from the latest post (I don't want the latest edit). In this case, it again wouldn't be appropriate to JOIN, but in the subselect, select from posts and users where authorID = userID?

Old Pedant
09-21-2009, 08:54 PM
No, no! I *AM* using JOINs!

I just used the "shorthand" form of an INNER JOIN (I call it an "implicit join").

The following query is 100% IDENTICAL to the last one I showed:

SELECT ...
FROM threads AS T
INNER JOIN posts AS P ON T.threadID = P.threadID
INNER JOIN ( SELECT threadID, MAX(postid) AS latestPostID
FROM posts GROUP BY threadID ) AS PLast
ON ( P.threadID = PLast.threadID AND P.postID = PLast.latestPostID )
INNER JOIN ( SELECT threadid, COUNT(*) AS postInThisThread
FROM posts GROUP BY threadID ) AS PCount
ON T.threadID = PCount.threadID
ORDER BY ...whatever you want...

But don't you agree that the implicit join form is easier to read and understand??

So, to extend it to include authors is trivial:

SELECT T.threadID, T.locked, P.postID, P.title, A.authorName, P.datePosted, P.lastEdited
FROM threads AS T, posts AS P, authors AS A
( SELECT threadID, MAX(postid) AS latestPostID
FROM posts GROUP BY threadID ) AS PLast,
( SELECT threadid, COUNT(*) AS postInThisThread
FROM posts GROUP BY threadID ) AS PCount
WHERE T.threadID = P.threadID
AND A.authorID = P.authorID
AND P.threadID = PLast.threadID
AND P.postID = PLast.postID
AND T.threadID = PCount.threadID
ORDER BY T.threadID


Here, I put in some of the possible fields in the SELECT. You could of course use any others. For example, if your authors have atavars, you SELECT the IMG filename of the atavar in there.

Keleth
09-21-2009, 08:59 PM
I see... ok, so my mistake wasn't a JOIN, but a LEFT JOIN... I should have used an INNER JOIN... but using a select from multiple tables using the were the same as the JOIN is a shorthand... I get it now, thanks!