Go Back   CodingForums.com > :: Server side development > MySQL

Before you post, read our: Rules & Posting Guidelines

Reply
 
Thread Tools Rate Thread
Enjoy an ad free experience by logging in. Not a member yet? Register.
Old 02-08-2012, 04:48 AM   PM User | #1
christopherc
New to the CF scene

 
Join Date: Jul 2010
Location: Philippines
Posts: 5
Thanks: 0
Thanked 0 Times in 0 Posts
christopherc is an unknown quantity at this point
How to select rows with latest date from 2 tables

I have 2 tables:
1. users - user_id(PK), firstname, middlename, lastname
2. user_shift_schedule - shift_code(PK), user_id(FK), effectivity_date

I want to get all rows with the latest effectivity_date for each user_id.

This is what I've got so far:

SELECT
users.user_id
,users.firstname
,users.middlename
,users.lastname
,user_shift_schedule.shift_id
,MAX(user_shift_schedule.effectivity_date)

FROM users
JOIN user_shift_schedule

ON users.user_id=user_shift_schedule.user_id

GROUP BY user_shift_schedule.user_id


This would give me a somewhat desired result only the shift_id is incorrect, seems like it's selecting the most recently saved shift_id and not the shift_id that is of the same row as the latest effectivity_date of each user_id.


Any suggestion is greatly appreciated.

Thank you!

Last edited by christopherc; 02-08-2012 at 05:40 AM..
christopherc is offline   Reply With Quote
Old 02-08-2012, 05:51 AM   PM User | #2
Old Pedant
Supreme Master coder!

 
Old Pedant's Avatar
 
Join Date: Feb 2009
Posts: 23,188
Thanks: 59
Thanked 3,995 Times in 3,964 Posts
Old Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to all
The only reason that query works at all is because MySQL is mildly brain-dead.

Any other DB would insist that your GROUP BY clause read
Code:
GROUP BY users.user_id,users.firstname,users.middlename,users.lastname,user_shift_schedule.shift_id
But MySQL allows you to omit one or more of the normally required fields.

The problem is, when it does so, it RANDOMLY PICKS the value for the fields that you did *NOT* specify.

To create a simpler example, let's just show with 3 fields in the select.

First, let's show the records you would have without the GROUP BY:
Code:
user_id  | shift_id | effectivity_date
     17 |        2 | Jan 1
     17 |        2 | Feb 1
     17 |        3 | Mar 1
     17 |        3 | Apr 1
Now, when you do
Code:
SELECT user_id, shift_id, MAX(effectivity_date) ... GROUP BY user_id
MySQL produces this:
Code:
user_id  | shift_id | effectivity_date
     17 |    2 or 3 | Apr 1
You see? You have *NOT SPECIFIED* which shift_id you want, so MySQL feels free to pick ANY ONE THAT IS AVAILABLE. Usually, it will simply pick the first one it finds (but that's not universal), so that's why you get back
Code:
user_id  | shift_id | effectivity_date
     17 |        2 | Apr 1
*IF* you had properly coded
Code:
SELECT user_id, shift_id, MAX(effectivity_date) 
... GROUP BY user_id, shift_id
*THEN* MySQL would give you the correct results (as would any other DB):
Code:
user_id  | shift_id | effectivity_date
     17 |        2 | Feb 1
     17 |        3 | Apr 1
In other words, now you get TOO MANY results. But that's what the DB *should* be giving you.

The right answer is to ALWAYS use the proper GROUP BY (that is, include *all* fields that are not part of an aggregate function--MAX, MIN, AVG, COUNT, etc.) but then fix the query in other ways.
Code:
SELECT U.user_id, U.firstname, U.middlename, U.lastname, S.shift_id, S.effectivity_date
FROM users AS U, 
     user_shift_schedule AS S,
     ( SELECT user_id, MAX(effectivity_date) AS maxDate
       FROM user_shift_schedule
       GROUP BY user_id ) AS M
WHERE U.user_id = S.user_id
AND S.user_id = M.user_id
AND S.effectivity_date = M.maxDate
Now, this is still not perfect. If there happen to be *TWO* effectivity_date values that are the same as MAX(effectivity_date) for the given user_id, then you *will* get two records.

How you break a "tie" like that depends on other data in the tables, and you haven't given me enough to go on to know (a) whether you will need to break ties and/or (b) how to do so if needed.
__________________
An optimist sees the glass as half full.
A pessimist sees the glass as half empty.
A realist drinks it no matter how much there is.
Old Pedant is online now   Reply With Quote
Old 02-08-2012, 06:24 AM   PM User | #3
christopherc
New to the CF scene

 
Join Date: Jul 2010
Location: Philippines
Posts: 5
Thanks: 0
Thanked 0 Times in 0 Posts
christopherc is an unknown quantity at this point
Excellent explanation sir, thank you very much!

I tried your code and it worked perfectly.

Regarding having the same effectivity date, I think I'll just change column type to timestamp so that there'll never be dates of the same value, what do you think?

I have a question about your last code, within the FROM portion, third item:
( SELECT user_id, MAX(effectivity_date) AS maxDate
FROM user_shift_schedule
GROUP BY user_id ) AS M

What does this do exactly? Is it making a new temporary table?
christopherc is offline   Reply With Quote
Old 02-08-2012, 08:10 AM   PM User | #4
Old Pedant
Supreme Master coder!

 
Old Pedant's Avatar
 
Join Date: Feb 2009
Posts: 23,188
Thanks: 59
Thanked 3,995 Times in 3,964 Posts
Old Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to all
Quote:
Regarding having the same effectivity date, I think I'll just change column type to timestamp so that there'll never be dates of the same value, what do you think?
Or to DATETIME and record the value to the second, yes. That's certainly viable if you can ensure that those values won't ever get duplicates.

Quote:
Code:
( SELECT user_id, MAX(effectivity_date) AS maxDate
       FROM user_shift_schedule
       GROUP BY user_id ) AS M
What does this do exactly? Is it making a new temporary table?
In essence, yes. It doesn't really create a temp table, but it creates a set of records in memory that satisfy the SELECT. So yes, it's easier to think of it as creating a temp table or, perhaps better, a pseudo-table. (Temp tables in MySQL are also possible, but they last so long as the same connection is open--meaning you could use the same temp table in multiple SELECTs or UPDATEs; in this above code, the pseudo-table lasts only so long as the query is executing.)
__________________
An optimist sees the glass as half full.
A pessimist sees the glass as half empty.
A realist drinks it no matter how much there is.
Old Pedant is online now   Reply With Quote
Reply

Bookmarks

Tags
latest date, multiple entry each user, multiple tables

Jump To Top of Thread


Thread Tools
Rate This Thread
Rate This Thread:

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off

Forum Jump


All times are GMT +1. The time now is 07:25 AM.


Advertisement
Log in to turn off these ads.