SELECT t.forumID, rdt.userID, r.lastPostID, rdt.lastRead, rdf.cLastRead, IF(r.lastPostID > IFNULL(rdt.lastRead, 0) AND r.lastPostID > rdf.cLastRead, 1, 0) newPosts
FROM threads t
INNER JOIN threads_relPosts r USING (threadID)
LEFT JOIN forums_readData_threads rdt ON t.threadID = rdt.threadID AND rdt.userID = 2
LEFT JOIN forums_readData_forums_c rdf on t.forumID = rdf.forumID AND rdf.userID = 2
As you can see, I do a test in the SELECT to get back if there is a new post in the thread. What I want to do is to get back if the forum has a new post. My first thought is GROUP BY, but that won't work, because it won't give me back the correct results in the IF. My only other thought is to make it a subquery with a group on the main query... the issue there is this is already going to be a subquery within a larger, relatively complex query, and I am under the impression having queries within queries within queries etc is a bad thing.
Is there a way for me to group or aggregate the data? Should I just just take the rows where newPosts = 1 and sort out the forum parts in PHP?
I also think I've figured out how I can make this a view without the userID references, but it came up in a thread I made just a little bit ago that making a view wouldn't help efficiency.
Well, this query is structured to pick only for one user. The PHP code provides the userID in both the ON statements. I just mentioned the view because I could make a view not restricted to a single user, then when I join that view onto the main query, I could limit by user. However, you mentioned in another thread that making a query a view doesn't improve efficiency, so it doesn't resolve my main concern.
The way this query is structured, it gives back all the threads in all the forums (given its based on the `threads` table, with everything left joined), and the IF gives a 1 if both the last post in a thread is greater then the last post a user read in a thread (the rdt table tracks this) and if the last post is greater then the last time a user marked an entire forum read (rdf tracks this). So when the lastest post made is new, gives back a 1. The table rdt MAY contain a matching value (which is why I check IFNULL) while rdf will always give a matching value. In fact, I switched the order of those two joins to make sure rdf takes precedence.
So since I get back all the threads, its doesn't directly tell me if there is a new post in a forum. If there are 3 threads in a given forum and 1 of them has a new post, I'd have 2 rows with a 0 and 1 with a 1, however, the forum as a whole has a new post within it. So if possible, I'd like to group/do an action to get back that the forum has a 1 somewhere. Because a forum could potentially have hundreds of threads, it doesn't make sense to pull everything. Now that I type this, I'm thinking of adding HAVING newPosts = 1 and was always going to add a WHERE clause to only get back values for the forums of interest to me. However, I'd still be getting back multiple rows per forum, which means I'd have to process it in PHP. That's not a big deal, I was just hoping to see if there was another method to do it. And as I mentioned, I could make it a subquery, but as this is going into yet another query, I worry about making too many subqueries within subqueries and am hoping to improve efficiency.
Actually, thinking a bit more, I realize I'm going to need this query in multiple locations, so it does make sense to make it a view. I've updated the code slightly to make it work independent of user (which can then be searched on as mentioned above).
SELECT rdf.userID, t.forumID, t.threadID, r.lastPostID, rdt.lastRead, rdf.cLastRead, IF(r.lastPostID > IFNULL(rdt.lastRead, 0) AND r.lastPostID > rdf.cLastRead, 1, 0) newPosts
FROM threads t
INNER JOIN threads_relPosts r USING (threadID)
LEFT JOIN forums_readData_forums_c rdf on t.forumID = rdf.forumID
LEFT JOIN forums_readData_threads rdt ON t.threadID = rdt.threadID AND rdf.userID = rdt.userID
So then maybe I should just do a group on the resulting view, taking the max value of newPosts. However, I'd be open to a more efficient suggestion :-/
Hm... I figured out that I should/how to run tests between the query and the view of the query. Adding in values to return about 2400 rows, I found that the query is averaging .0035 seconds, while the view is averaging .0085 seconds to run. At about 12000 returned rows, its .01 vs .03. Its worth noting, I put HAVING newPost = 1 into the query/view. All tests were run on localhost, so a web server I'm certain would be longer.
Obviously that's a REALLY big difference in run time, and it worries me that its that bad. The more I search, the less useful views seem. Tripping query time is a REALLY big deal.
I'm wondering if I have any alternate other then typing the whole query into another query? As far as I can tell, subqueries don't add a horrendous amount of time to a query, but I'm just doing simple tests right now... if I stored the views I wrote into PHP variables, then added them to w/e queries I write, does that seem like a reasonable way to save time? Would doing that globally hurt more then help? Could definitely use advice on this...
I was under the impression that stored procedures could only return single rows? Maybe I just don't understand them enough, but I'm not quite sure how to make use of them here? Is this a typical use of stored procedures?
Stored procedures can even return multiple recordsets, if you are working with a language (e.g., Java or ASP or ASP.NET) that can understand them. Dunno where you got the impression they could only return a single row. Totally false.
Stored procedures have two huge advantages:
(1) You can create a MySQL user account where the user *ONLY* has access to stored procedures. So effectively the user can only do what you specifically pre-program for him/her.
(2) Stored procedures really do support a programming language. WHILE loops, IFs, assignment to variables, reading records one at a time, and so forth. Pretty much anything you can do in PHP code you can do in a stored procedure (though I'll readily admit that the record-at-a-time stuff is harder to code and understand than PHP or ASP or JSP, etc.).
An optimist sees the glass as half full.
A pessimist sees the glass as half empty.
A realist drinks it no matter how much there is.
It's clear that you have met some trouble in this problem, why not try esProc to solve it?
esProc is a tool for mass data computation, especially fit for the complex data computation like yours.
Check below for more details A Query Language Over-perform SQL