Hello and welcome to our community! Is this your first visit?
Register
Enjoy an ad free experience by logging in. Not a member yet? Register.
Page 1 of 2 12 LastLast
Results 1 to 15 of 25
  1. #1
    New to the CF scene
    Join Date
    Apr 2009
    Posts
    6
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Finding two most recent records for each ID

    I have a table with test scores in it. Let's say it's designed something like this:
    SubID - VisDate - Score1 - Score2 - Score3 - Score 4

    There are often multiple records for each SubID. I want to know how to pull out the most recent two records (based on VisDate) for each unique SubID.

    Can someone help?

  • #2
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    25,618
    Thanks
    78
    Thanked 4,388 Times in 4,353 Posts
    Well, one way to do it:
    Code:
    SELECT MAIN.*
    FROM table AS MAIN,
        ( SELECT subid, MAX(visdate) AS oneDate
          FROM table 
          GROUP BY subid
                UNION
          SELECT T.subid, MAX(T.visdate) AS oneDate
          FROM table AS T
               ( SELECT subid, MAX(visdate) AS topDate
                 FROM table
                 GROUP BY subid ) AS X
          WHERE T.subid = X.subid AND T.visdate < X.topDate
        ) AS U
    WHERE MAIN.subid = U.subid AND MAIN.visdate = U.oneDate
    ORDER BY MAIN.subid, MAIN.visdate DESC
    You see it? The UNION gets the latest date for each subid unioned with the next latest date for each subid. So then we JOIN all that back to the main table to get the rest of the info.

  • #3
    UE Antagonizer Fumigator's Avatar
    Join Date
    Dec 2005
    Location
    Utah, USA, Northwestern hemisphere, Earth, Solar System, Milky Way Galaxy, Alpha Quadrant
    Posts
    7,691
    Thanks
    42
    Thanked 637 Times in 625 Posts
    You can GROUP BY SubID and order by VisDate Desc. You'll want to do this in a subquery so you can select everything from the table for each row.

    Code:
    SELECT *
    FROM your_table
    WHERE SubID IN (SELECT
        FROM your_table as subQ
        GROUP BY subQ.SubID
        ORDER BY subQ.VisDate desc
        LIMIT 2)
    Keep in mind you shouldn't select anything

  • #4
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    25,618
    Thanks
    78
    Thanked 4,388 Times in 4,353 Posts
    Ummmm...I don't think that works.

    If you just look at the sub-SELECT:
    Code:
    SELECT subQ.SubID
        FROM your_table as subQ
        GROUP BY subQ.SubID
        ORDER BY subQ.VisDate desc
        LIMIT 2
    (Fumigator missed the needed field, shown there in red)
    that will ONLY return *TWO* records *TOTAL*. It does NOT return two per subid.

    I have seen people pass in the outer value to the inner SELECT:
    Code:
    SELECT T.*
    FROM your_table AS T
    WHERE SubID IN (SELECT subQ.SubID
        FROM your_table as subQ
        WHERE subQ.SubID = T.SubID
        GROUP BY subQ.SubID
        ORDER BY subQ.VisDate desc
        LIMIT 2)
    Dunno if that works in MYSQL or not. As I read the specs, it doesn't have to work. But it might.

  • #5
    New to the CF scene
    Join Date
    Apr 2009
    Posts
    6
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Thank you so much for your help!

    I think I'm understanding the theory here (for both suggestions that have been posted) but I'm getting errors.

    For Old Pendant's code I get:

    Msg 156, Level 15, State 1, Line 11
    Incorrect syntax near the keyword 'SELECT'.
    Msg 102, Level 15, State 1, Line 13
    Incorrect syntax near ')'.

    And for Fumigator's code I get:

    Msg 156, Level 15, State 1, Line 6
    Incorrect syntax near the keyword 'FROM'.


    (If you actually want to use the line numbers, you should subtract two. It's just 'Use my_database' and then a blank line before the code.)

  • #6
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    25,618
    Thanks
    78
    Thanked 4,388 Times in 4,353 Posts
    You should try to run the SQL code in a MySQL utility, even from the command line if need be, and see if you can get a better diagnostic. Too often, server-side languages such as PHP obscure the real problems.

    And I believe I showed you why you get the syntax error in Fumigator's code. He was missing a field in the SELECT.

    I do see one typo in my code: A missing comma in the "FROM table AS T" line. But again, if you still get an error after changing it, you should use a MySQL tool to get better diagnostics. No idea if that's the only error. I'm just typing it all in off the top of my head, with no way to debug it.
    Code:
    SELECT MAIN.*
    FROM table AS MAIN,
        ( SELECT subid, MAX(visdate) AS oneDate
          FROM table 
          GROUP BY subid
                UNION
          SELECT T.subid, MAX(T.visdate) AS oneDate
          FROM table AS T,
               ( SELECT subid, MAX(visdate) AS topDate
                 FROM table
                 GROUP BY subid ) AS X
          WHERE T.subid = X.subid AND T.visdate < X.topDate
        ) AS U
    WHERE MAIN.subid = U.subid AND MAIN.visdate = U.oneDate
    ORDER BY MAIN.subid, MAIN.visdate DESC

  • #7
    UE Antagonizer Fumigator's Avatar
    Join Date
    Dec 2005
    Location
    Utah, USA, Northwestern hemisphere, Earth, Solar System, Milky Way Galaxy, Alpha Quadrant
    Posts
    7,691
    Thanks
    42
    Thanked 637 Times in 625 Posts
    Yeap sorry I was in the middle of my post/solution and got sidetracked and had to leave really quick so I just posted the reply thinking you could work out any minor errors and fill in the blanks. I hope the concept came across though, that's the imporant thing.

  • #8
    Super Moderator guelphdad's Avatar
    Join Date
    Mar 2006
    Location
    St. Catharines, Ontario Canada
    Posts
    2,634
    Thanks
    4
    Thanked 148 Times in 139 Posts
    Code:
    SELECT   yt.subid,
             yt.visdate,
             score1,
             score2,
             score3,
             score4
    FROM     
             yourtable yt
    INNER JOIN 
             yourtable yt2
    ON 
             yt.subid = yt2.subid
    AND 
             yt.visdate <= yt2.visdate
    GROUP BY 
             yt.subid,
             yt.visdate
    HAVING   
            Count(*) <= 2
    ORDER BY 
             yt.subid,
             yt.visdate DESC
    by the way you need to redesign your table if it really looks similar to that. any time you find yourself with repeating columns like your scores columns, your data isn't normalized.

  • #9
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    25,618
    Thanks
    78
    Thanked 4,388 Times in 4,353 Posts
    My head hurts.

    HOW does that query work, Guelph???

    AHHHH...Never mind. Figured it out.

    WOW! Very clever.

    If others want to figure it out, try this variation on it:
    Code:
    SELECT   yt.subid,
             yt.visdate,
             count(*) as what
    FROM     
             yourtable yt
    INNER JOIN 
             yourtable yt2
    ON 
             yt.subid = yt2.subid
    AND 
             yt.visdate <= yt2.visdate
    GROUP BY 
             yt.subid,
             yt.visdate
    
    ORDER BY 
             yt.subid,
             yt.visdate DESC;
    Note that I purposely left out the HAVING to make it clearer. And added in a display of COUNT(*) so you can see why it all works.

    Now I'm curious: Is this more efficient than the method I proposed??? Does it depend on which fields are indexed?? The HAVING clause there can't be cheap, because it means that the SELECT has to create results for *ALL* possible subid/visdate combinations and then eliminate all those where the COUNT(*) is too big. I'd guess this one depends a *LOT* on how smart the query engine is.

    I found another error in my query. A missing GROUP BY:
    Code:
    SELECT MAIN.*
    FROM yourtable AS MAIN,
        ( SELECT subid, MAX(visdate) AS oneDate
          FROM yourtable 
          GROUP BY subid
                UNION
          SELECT T.subid, MAX(T.visdate) AS oneDate
          FROM yourtable AS T,
               ( SELECT subid, MAX(visdate) AS topDate
                 FROM yourtable
                 GROUP BY subid ) AS X
          WHERE T.subid = X.subid AND T.visdate < X.topDate
          GROUP BY T.subid
        ) AS U
    WHERE MAIN.subid = U.subid AND MAIN.visdate = U.oneDate
    ORDER BY MAIN.subid, MAIN.visdate DESC
    But yes, my query gives the same results as Guelph's.

  • #10
    New to the CF scene
    Join Date
    Apr 2009
    Posts
    6
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Thank you all!
    I got pulled away from the project I need this for for a few days. Now that I am working on it again, I'll try these suggestions.

    (BTW, guelphdad, thanks for looking out for my data. But the Score 1 -- Score 2 -- etc. was just to replace the actual name of the variables. It's a battery of different tests, so every score would be from a unique test.)

  • #11
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    25,618
    Thanks
    78
    Thanked 4,388 Times in 4,353 Posts
    I'm not convinced that GuelphDad isn't still right about the table design.

    What happens next week/month/year/decade when somebody decides to drop one of those tests from the battery and add three more???

    With your DB design, you have to go in an modify the database. UGH! Ugly!

    If, instead, you had a separate SCORES table then adding/removing scores requires no change at all in the DB schema.

    Yes, yes, I know you can say with 100% certainty that the set of tests will never change. That's what all those people who had only two slots in their tables for SAT scores said for what, 30 years?

    It's all about NORMALIZATION and you won't be the last person to think "that doesn't apply to me."

  • #12
    New to the CF scene
    Join Date
    Apr 2009
    Posts
    6
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Well, I guess I don't belong here because a) None of your suggestions worked for me and b) I don't see a problem with the table. What do you mean a separate Scores table? This IS the Scores table.

  • #13
    New to the CF scene
    Join Date
    Apr 2009
    Posts
    6
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Nevermind, while I was off doing something else it dawned on my what you were talking about. Well, I didn't build this table. Someone else did. When we redesign the system, I'll suggest a change.

  • #14
    New to the CF scene
    Join Date
    Apr 2009
    Posts
    6
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by Old Pedant View Post
    I'm not convinced that GuelphDad isn't still right about the table design.

    What happens next week/month/year/decade when somebody decides to drop one of those tests from the battery and add three more???

    With your DB design, you have to go in an modify the database. UGH! Ugly!

    If, instead, you had a separate SCORES table then adding/removing scores requires no change at all in the DB schema.

    Yes, yes, I know you can say with 100% certainty that the set of tests will never change. That's what all those people who had only two slots in their tables for SAT scores said for what, 30 years?

    It's all about NORMALIZATION and you won't be the last person to think "that doesn't apply to me."
    By the way, I think you could have found a nicer way to present this. I got pretty offended at the idea that I am actually thinking "that doesn't apply to me" or 'Gee. The tests we do will never change so they're wrong.' If what someone says makes sense, I am willing to listen. I just wasn't thinking about it. I was focusing on trying to get some SQL code to work for me, not focusing on redesigning the table (and we are currently working on a system redesign anyway, so please don't tell me we need to do it NOW, because we are doing it NOW). So before going off on a rant about how someone is ignoring your advice and just being defiant, perhaps you should make your argument without the accusations.

  • #15
    Super Moderator guelphdad's Avatar
    Join Date
    Mar 2006
    Location
    St. Catharines, Ontario Canada
    Posts
    2,634
    Thanks
    4
    Thanked 148 Times in 139 Posts
    Quote Originally Posted by Old Pedant View Post
    Now I'm curious: Is this more efficient than the method I proposed??? Does it depend on which fields are indexed?? The HAVING clause there can't be cheap, because it means that the SELECT has to create results for *ALL* possible subid/visdate combinations and then eliminate all those where the COUNT(*) is too big. I'd guess this one depends a *LOT* on how smart the query engine is.
    Yours is fine, but the one with the HAVING CLAUSE will scale better. What if the OP had asked for top 5 scores for each player for instance. Only difference in mine is changing the having clause.


  •  
    Page 1 of 2 12 LastLast

    Posting Permissions

    • You may not post new threads
    • You may not post replies
    • You may not post attachments
    • You may not edit your posts
    •