...

View Full Version : My query isn't working but it would in MSSQL



craigwg
01-04-2012, 11:00 PM
I wrote a query to roll up some information. I normally work with MSSQL and this is exactly how I'd do it there. But Something is off with the group by statement or maybe the join. Here is the query:



select month(InteractionDate), sum(Charged+Tip+PrintPurchase) as Collected, month(PurchaseDate), sum(Cost) as Spent
from admin_Interactions i
LEFT JOIN admin_Expenses e on month(i.Interactiondate) = month(e.PurchaseDate)
where i.Deleted=0 and e.Deleted=0
group by month(InteractionDate), month(PurchaseDate)


The problem is that the join is producing multiple rows of data so the sum is adding it all up and all the numbers are too big. So I understand the issue, and I understand date fields are not ideally the best to join on. But in this case it is the only link between the two tables. It should work. I want to know all the income for January compared to all the expenses for January. January is common. Is there something happening in the month() function that I'm not understanding? How can I get what I'm after?

Thanks for the help.
Craig

Old Pedant
01-05-2012, 12:01 AM
First of all, you don't HAVE a LEFT JOIN, given that query.

Look here:
http://www.codingforums.com/showthread.php?p=818192#post818192

That post was originally posted (by me) in a SQL Server forum, as a matter of fact.

When you put e.Deleted=0 into the WHERE part of that query, you effectively converted your LEFT JOIN to an INNER JOIN.

But even aside from that, I completely disagree that this would work in SQL Server any differently than it does in MySQL.

The results you are getting are 100% expected, in *ANY* relational database.

Consider these records in the two tables (showing only Charged for simplicity):


InteractionDate Charged
1/1/2011 100
1/2/2011 200

purchaseDate Cost
1/10/2011 50
1/11/2011 75

So use MONTH(...) instead and you have:


Month(iD) Charged
1 100
1 200

Month(pD) Cost
1 50
1 75

Join them on Month(...) and you get these complete records:

Month(iD) Charged Month(pD) Cost
1 100 1 50
1 200 1 50
1 100 1 75
1 200 1 75

And so your sums will indeed be double what you expected.

And the more records there are in the given month, the more this is compounded.

This is called the CARTESIAN PRODUCT of all matching records.

You can *NOT* code this query this way and make it work in any relational DB.

craigwg
01-05-2012, 12:04 AM
That's actually quite helpful to see it spelled out like that. Thanks. Ok...so...what recommendations exist. Does a cross reference table make sense in this case?

Old Pedant
01-05-2012, 12:09 AM
I'm pretty sure this is what you are after:


SELECT * FROM
( SELECT MONTH(InteractionDate) AS collectionMonth, SUM(Charged+Tip+PrintPurchase) as collectionTotal
FROM admin_Interactions
WHERE deleted = 0
GROUP BY MONTH(InteractionDate)
) AS collections,
( SELECT MONTH(purchaseDate) AS purchaseMonth, SUM(cost) AS purchaseTotal
FROM admin_Expenses
WHERE deleted = 0
GROUP BY MONTH(purchaseDate)
) AS purchases
WHERE collections.collectionMonth = purchases.purchaseMonth
ORDER BY collections.collectionMonth


You *could* change that to a LEFT JOIN if you really think there might be some month where you have collections without having any purchases. (Or a right join if vice versa.)

craigwg
01-05-2012, 12:25 AM
Ah yes. Derived tables. I always have a hard time recognizing when is right to use them. I'll try this out and report back tomorrow. We actually do have some months with no expenses and sadly some months without income. Just starting our little business and trying to get the peices in place for the future. Thanks for the help.

Old Pedant
01-05-2012, 01:11 AM
Unfortunately, MySQL doesn't suppor FULL OUTER JOIN, which would be what you would need if you have some months with no income, some with not expenses.

So there's no easy way for you to make sure each month shows up in your result with just what you have there.

There are two possible ways to fix this: (1) Have a table of months and make it the left most table and join the two inner SELECTs to it, (2) create a pseudo-table of months.

The separate table would probably be a good idea, but we can hack at it with another pseudo table.

But as long as we are improving thints...What happens when you want to get the YEARs separately??? As written, that prior code will lump January 2012 with January 2011 and January 2013.

So:


SELECT allMonths.theYear, allMonths.theMonth, collections.collectionTotal, purchases.purchaseTotal
FROM
( SELECT YEAR(InteractionDate) AS theYear, MONTH(InteractionDate) AS theMonth FROM admin_Interactions
UNION
SELECT YEAR(purchaseDate), MONTH(purchaseDate) FROM admin_Expenses
) AS allMonths
LEFT JOIN
( SELECT YEAR(interactionDate) AS Y, MONTH(InteractionDate) AS M, SUM(Charged+Tip+PrintPurchase) as collectionTotal
FROM admin_Interactions
WHERE deleted = 0
GROUP BY Year(interactionDate), MONTH(InteractionDate)
) AS collections ON collection.M = allMonths.theMonth AND collections.Y = allMonths.theYear
LEFT JOIN
( SELECT YEAR(purchaseDate) AS Y, MONTH(purchaseDate) AS M, SUM(cost) AS purchaseTotal
FROM admin_Expenses
WHERE deleted = 0
GROUP BY YEAR(purchaseDate), MONTH(purchaseDate)
) AS purchases ON purchases.M = allMonths.theMonth AND purchases.Y = allMonths.theYear
ORDER BY allMonths.theYear, allMonths.theMonth

When you UNION, you automatically get DISTINCT values (unless you use UNION ALL) so the first subquery there will indeed get you all months--one time--that are in either of the two tables.

By this point, the query is getting ugly enough that I would probably rewrite it to join 3 VIEWs together. Each view simply being one of the inner queries.

craigwg
01-05-2012, 04:12 PM
You're a rockstar. This stuff is awesome for the level that I am at. I agree, the query is getting ugly and clunky. I like the idea of making each join it's own view. I haven't had a chance to do these ideas but I totally see what you're doing. Thanks again.



EZ Archive Ads Plugin for vBulletin Copyright 2006 Computer Help Forum