Go Back   CodingForums.com > :: Server side development > MySQL

Before you post, read our: Rules & Posting Guidelines

Reply
 
Thread Tools Rate Thread
Enjoy an ad free experience by logging in. Not a member yet? Register.
Old 04-28-2009, 07:33 PM   PM User | #1
sferson
New to the CF scene

 
Join Date: Apr 2009
Posts: 6
Thanks: 0
Thanked 0 Times in 0 Posts
sferson is an unknown quantity at this point
Finding two most recent records for each ID

I have a table with test scores in it. Let's say it's designed something like this:
SubID - VisDate - Score1 - Score2 - Score3 - Score 4

There are often multiple records for each SubID. I want to know how to pull out the most recent two records (based on VisDate) for each unique SubID.

Can someone help?
sferson is offline   Reply With Quote
Old 04-28-2009, 08:47 PM   PM User | #2
Old Pedant
Supreme Master coder!

 
Old Pedant's Avatar
 
Join Date: Feb 2009
Posts: 23,227
Thanks: 59
Thanked 3,996 Times in 3,965 Posts
Old Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to all
Well, one way to do it:
Code:
SELECT MAIN.*
FROM table AS MAIN,
    ( SELECT subid, MAX(visdate) AS oneDate
      FROM table 
      GROUP BY subid
            UNION
      SELECT T.subid, MAX(T.visdate) AS oneDate
      FROM table AS T
           ( SELECT subid, MAX(visdate) AS topDate
             FROM table
             GROUP BY subid ) AS X
      WHERE T.subid = X.subid AND T.visdate < X.topDate
    ) AS U
WHERE MAIN.subid = U.subid AND MAIN.visdate = U.oneDate
ORDER BY MAIN.subid, MAIN.visdate DESC
You see it? The UNION gets the latest date for each subid unioned with the next latest date for each subid. So then we JOIN all that back to the main table to get the rest of the info.
Old Pedant is online now   Reply With Quote
Old 04-28-2009, 10:35 PM   PM User | #3
Fumigator
UE Antagonizer


 
Fumigator's Avatar
 
Join Date: Dec 2005
Location: Utah, USA, Northwestern hemisphere, Earth, Solar System, Milky Way Galaxy, Alpha Quadrant
Posts: 7,686
Thanks: 42
Thanked 637 Times in 625 Posts
Fumigator is a glorious beacon of lightFumigator is a glorious beacon of lightFumigator is a glorious beacon of lightFumigator is a glorious beacon of lightFumigator is a glorious beacon of light
You can GROUP BY SubID and order by VisDate Desc. You'll want to do this in a subquery so you can select everything from the table for each row.

Code:
SELECT *
FROM your_table
WHERE SubID IN (SELECT
    FROM your_table as subQ
    GROUP BY subQ.SubID
    ORDER BY subQ.VisDate desc
    LIMIT 2)
Keep in mind you shouldn't select anything
__________________
Fumigator is offline   Reply With Quote
Old 04-28-2009, 10:45 PM   PM User | #4
Old Pedant
Supreme Master coder!

 
Old Pedant's Avatar
 
Join Date: Feb 2009
Posts: 23,227
Thanks: 59
Thanked 3,996 Times in 3,965 Posts
Old Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to all
Ummmm...I don't think that works.

If you just look at the sub-SELECT:
Code:
SELECT subQ.SubID
    FROM your_table as subQ
    GROUP BY subQ.SubID
    ORDER BY subQ.VisDate desc
    LIMIT 2
(Fumigator missed the needed field, shown there in red)
that will ONLY return *TWO* records *TOTAL*. It does NOT return two per subid.

I have seen people pass in the outer value to the inner SELECT:
Code:
SELECT T.*
FROM your_table AS T
WHERE SubID IN (SELECT subQ.SubID
    FROM your_table as subQ
    WHERE subQ.SubID = T.SubID
    GROUP BY subQ.SubID
    ORDER BY subQ.VisDate desc
    LIMIT 2)
Dunno if that works in MYSQL or not. As I read the specs, it doesn't have to work. But it might.
Old Pedant is online now   Reply With Quote
Old 04-28-2009, 10:59 PM   PM User | #5
sferson
New to the CF scene

 
Join Date: Apr 2009
Posts: 6
Thanks: 0
Thanked 0 Times in 0 Posts
sferson is an unknown quantity at this point
Thank you so much for your help!

I think I'm understanding the theory here (for both suggestions that have been posted) but I'm getting errors.

For Old Pendant's code I get:

Msg 156, Level 15, State 1, Line 11
Incorrect syntax near the keyword 'SELECT'.
Msg 102, Level 15, State 1, Line 13
Incorrect syntax near ')'.

And for Fumigator's code I get:

Msg 156, Level 15, State 1, Line 6
Incorrect syntax near the keyword 'FROM'.


(If you actually want to use the line numbers, you should subtract two. It's just 'Use my_database' and then a blank line before the code.)
sferson is offline   Reply With Quote
Old 04-29-2009, 05:49 AM   PM User | #6
Old Pedant
Supreme Master coder!

 
Old Pedant's Avatar
 
Join Date: Feb 2009
Posts: 23,227
Thanks: 59
Thanked 3,996 Times in 3,965 Posts
Old Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to all
You should try to run the SQL code in a MySQL utility, even from the command line if need be, and see if you can get a better diagnostic. Too often, server-side languages such as PHP obscure the real problems.

And I believe I showed you why you get the syntax error in Fumigator's code. He was missing a field in the SELECT.

I do see one typo in my code: A missing comma in the "FROM table AS T" line. But again, if you still get an error after changing it, you should use a MySQL tool to get better diagnostics. No idea if that's the only error. I'm just typing it all in off the top of my head, with no way to debug it.
Code:
SELECT MAIN.*
FROM table AS MAIN,
    ( SELECT subid, MAX(visdate) AS oneDate
      FROM table 
      GROUP BY subid
            UNION
      SELECT T.subid, MAX(T.visdate) AS oneDate
      FROM table AS T,
           ( SELECT subid, MAX(visdate) AS topDate
             FROM table
             GROUP BY subid ) AS X
      WHERE T.subid = X.subid AND T.visdate < X.topDate
    ) AS U
WHERE MAIN.subid = U.subid AND MAIN.visdate = U.oneDate
ORDER BY MAIN.subid, MAIN.visdate DESC
Old Pedant is online now   Reply With Quote
Old 04-29-2009, 03:32 PM   PM User | #7
Fumigator
UE Antagonizer


 
Fumigator's Avatar
 
Join Date: Dec 2005
Location: Utah, USA, Northwestern hemisphere, Earth, Solar System, Milky Way Galaxy, Alpha Quadrant
Posts: 7,686
Thanks: 42
Thanked 637 Times in 625 Posts
Fumigator is a glorious beacon of lightFumigator is a glorious beacon of lightFumigator is a glorious beacon of lightFumigator is a glorious beacon of lightFumigator is a glorious beacon of light
Yeap sorry I was in the middle of my post/solution and got sidetracked and had to leave really quick so I just posted the reply thinking you could work out any minor errors and fill in the blanks. I hope the concept came across though, that's the imporant thing.
__________________
Fumigator is offline   Reply With Quote
Old 04-30-2009, 03:25 PM   PM User | #8
guelphdad
Super Moderator


 
guelphdad's Avatar
 
Join Date: Mar 2006
Location: St. Catharines, Ontario Canada
Posts: 2,629
Thanks: 4
Thanked 147 Times in 138 Posts
guelphdad will become famous soon enoughguelphdad will become famous soon enough
Code:
SELECT   yt.subid,
         yt.visdate,
         score1,
         score2,
         score3,
         score4
FROM     
         yourtable yt
INNER JOIN 
         yourtable yt2
ON 
         yt.subid = yt2.subid
AND 
         yt.visdate <= yt2.visdate
GROUP BY 
         yt.subid,
         yt.visdate
HAVING   
        Count(*) <= 2
ORDER BY 
         yt.subid,
         yt.visdate DESC
by the way you need to redesign your table if it really looks similar to that. any time you find yourself with repeating columns like your scores columns, your data isn't normalized.
guelphdad is offline   Reply With Quote
Old 04-30-2009, 07:53 PM   PM User | #9
Old Pedant
Supreme Master coder!

 
Old Pedant's Avatar
 
Join Date: Feb 2009
Posts: 23,227
Thanks: 59
Thanked 3,996 Times in 3,965 Posts
Old Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to all
My head hurts.

HOW does that query work, Guelph???

AHHHH...Never mind. Figured it out.

WOW! Very clever.

If others want to figure it out, try this variation on it:
Code:
SELECT   yt.subid,
         yt.visdate,
         count(*) as what
FROM     
         yourtable yt
INNER JOIN 
         yourtable yt2
ON 
         yt.subid = yt2.subid
AND 
         yt.visdate <= yt2.visdate
GROUP BY 
         yt.subid,
         yt.visdate

ORDER BY 
         yt.subid,
         yt.visdate DESC;
Note that I purposely left out the HAVING to make it clearer. And added in a display of COUNT(*) so you can see why it all works.

Now I'm curious: Is this more efficient than the method I proposed??? Does it depend on which fields are indexed?? The HAVING clause there can't be cheap, because it means that the SELECT has to create results for *ALL* possible subid/visdate combinations and then eliminate all those where the COUNT(*) is too big. I'd guess this one depends a *LOT* on how smart the query engine is.

I found another error in my query. A missing GROUP BY:
Code:
SELECT MAIN.*
FROM yourtable AS MAIN,
    ( SELECT subid, MAX(visdate) AS oneDate
      FROM yourtable 
      GROUP BY subid
            UNION
      SELECT T.subid, MAX(T.visdate) AS oneDate
      FROM yourtable AS T,
           ( SELECT subid, MAX(visdate) AS topDate
             FROM yourtable
             GROUP BY subid ) AS X
      WHERE T.subid = X.subid AND T.visdate < X.topDate
      GROUP BY T.subid
    ) AS U
WHERE MAIN.subid = U.subid AND MAIN.visdate = U.oneDate
ORDER BY MAIN.subid, MAIN.visdate DESC
But yes, my query gives the same results as Guelph's.
Old Pedant is online now   Reply With Quote
Old 05-05-2009, 07:26 PM   PM User | #10
sferson
New to the CF scene

 
Join Date: Apr 2009
Posts: 6
Thanks: 0
Thanked 0 Times in 0 Posts
sferson is an unknown quantity at this point
Thank you all!
I got pulled away from the project I need this for for a few days. Now that I am working on it again, I'll try these suggestions.

(BTW, guelphdad, thanks for looking out for my data. But the Score 1 -- Score 2 -- etc. was just to replace the actual name of the variables. It's a battery of different tests, so every score would be from a unique test.)
sferson is offline   Reply With Quote
Old 05-05-2009, 07:39 PM   PM User | #11
Old Pedant
Supreme Master coder!

 
Old Pedant's Avatar
 
Join Date: Feb 2009
Posts: 23,227
Thanks: 59
Thanked 3,996 Times in 3,965 Posts
Old Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to all
I'm not convinced that GuelphDad isn't still right about the table design.

What happens next week/month/year/decade when somebody decides to drop one of those tests from the battery and add three more???

With your DB design, you have to go in an modify the database. UGH! Ugly!

If, instead, you had a separate SCORES table then adding/removing scores requires no change at all in the DB schema.

Yes, yes, I know you can say with 100% certainty that the set of tests will never change. That's what all those people who had only two slots in their tables for SAT scores said for what, 30 years?

It's all about NORMALIZATION and you won't be the last person to think "that doesn't apply to me."
Old Pedant is online now   Reply With Quote
Old 05-05-2009, 08:01 PM   PM User | #12
sferson
New to the CF scene

 
Join Date: Apr 2009
Posts: 6
Thanks: 0
Thanked 0 Times in 0 Posts
sferson is an unknown quantity at this point
Well, I guess I don't belong here because a) None of your suggestions worked for me and b) I don't see a problem with the table. What do you mean a separate Scores table? This IS the Scores table.
sferson is offline   Reply With Quote
Old 05-05-2009, 08:09 PM   PM User | #13
sferson
New to the CF scene

 
Join Date: Apr 2009
Posts: 6
Thanks: 0
Thanked 0 Times in 0 Posts
sferson is an unknown quantity at this point
Nevermind, while I was off doing something else it dawned on my what you were talking about. Well, I didn't build this table. Someone else did. When we redesign the system, I'll suggest a change.
sferson is offline   Reply With Quote
Old 05-05-2009, 08:27 PM   PM User | #14
sferson
New to the CF scene

 
Join Date: Apr 2009
Posts: 6
Thanks: 0
Thanked 0 Times in 0 Posts
sferson is an unknown quantity at this point
Quote:
Originally Posted by Old Pedant View Post
I'm not convinced that GuelphDad isn't still right about the table design.

What happens next week/month/year/decade when somebody decides to drop one of those tests from the battery and add three more???

With your DB design, you have to go in an modify the database. UGH! Ugly!

If, instead, you had a separate SCORES table then adding/removing scores requires no change at all in the DB schema.

Yes, yes, I know you can say with 100% certainty that the set of tests will never change. That's what all those people who had only two slots in their tables for SAT scores said for what, 30 years?

It's all about NORMALIZATION and you won't be the last person to think "that doesn't apply to me."
By the way, I think you could have found a nicer way to present this. I got pretty offended at the idea that I am actually thinking "that doesn't apply to me" or 'Gee. The tests we do will never change so they're wrong.' If what someone says makes sense, I am willing to listen. I just wasn't thinking about it. I was focusing on trying to get some SQL code to work for me, not focusing on redesigning the table (and we are currently working on a system redesign anyway, so please don't tell me we need to do it NOW, because we are doing it NOW). So before going off on a rant about how someone is ignoring your advice and just being defiant, perhaps you should make your argument without the accusations.
sferson is offline   Reply With Quote
Old 05-05-2009, 08:44 PM   PM User | #15
guelphdad
Super Moderator


 
guelphdad's Avatar
 
Join Date: Mar 2006
Location: St. Catharines, Ontario Canada
Posts: 2,629
Thanks: 4
Thanked 147 Times in 138 Posts
guelphdad will become famous soon enoughguelphdad will become famous soon enough
Quote:
Originally Posted by Old Pedant View Post
Now I'm curious: Is this more efficient than the method I proposed??? Does it depend on which fields are indexed?? The HAVING clause there can't be cheap, because it means that the SELECT has to create results for *ALL* possible subid/visdate combinations and then eliminate all those where the COUNT(*) is too big. I'd guess this one depends a *LOT* on how smart the query engine is.
Yours is fine, but the one with the HAVING CLAUSE will scale better. What if the OP had asked for top 5 scores for each player for instance. Only difference in mine is changing the having clause.
guelphdad is offline   Reply With Quote
Reply

Bookmarks

Jump To Top of Thread


Thread Tools
Rate This Thread
Rate This Thread:

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off

Forum Jump


All times are GMT +1. The time now is 11:57 PM.


Advertisement
Log in to turn off these ads.