View Full Version : How many rows in a query have this condition?
johnnyb 08282006, 05:01 AM Hi,
I have a query which works great, unfortunately I now have another tricky requirement do do with pagination of search results.
What I need to do is like this, (pseudocode):
SELECT some stuff, (number of rows in this query that have condition=condition) AS TYP_Limit FROM tables WHERE stuff LIMIT 0,10
I'm running a search on a directory database. I need to know how many of the listings, (rows), in the search result, (the whole thing, not just the 10 rows I'm getting at the moment), are basic listings and how many have higher ranks, (it'll be something like Listing_type > 1). I could just run the whole thing as a subquery but it's an extremely long and complex query involving circular mathematics and the such so I want to keep the server overhead down.
Ideally I'd like to be able to have the query run a subquery on itself. Is this possible?
John
Beagle 08282006, 04:34 PM it seems like you want to run that query once. But, if you're selecting 10 rows with the main query, your subquery will run 10 times. If it's a complex subquery, you don't want to do that.
Basically, you want two completely unrelated pieces of information, run two queries.
johnnyb 08282006, 07:03 PM Yeah, I don't want to run the subquery 10 times if I can avoid it unless I can make it really simple. Unfortunately, the two pieces of information are related  I need to know how many or the rows in the result set from the main query have a LST_Typ > 1. I remember back in the day of using ASP with Access I could run a query on a result, but I don't think this is possible with php/mysql. Do you know a better way?
guelphdad 08282006, 07:28 PM SELECT some stuff,
count(case when columnname_meets_condition then 1 else null end) AS TYP_Limit
FROM tables
WHERE stuff
LIMIT 0,10
Beagle 08282006, 07:33 PM well, hold on. If you NEED all that data, then you just iterate in your application and count as you go.
But if you ONLY need the count, then you use a group by statement. So here's my question: do you need this:

id  column1  column2  LST_Typ
1  asddas  adasd  1
2  asdaSD  dfdsffff  5
3  an3nfe  32rrfs  0

AND a count of how many records have LST_Typ > 0
OR, do you need this:

10

as in there are 10 rows that meet all your criteria.
If the first thing is what you want, there's no reason to do that calculation on the DB side. It's just inefficient, and you end up sending more data over than you need. You should keep track of what you want as you process your result set.
If you need the second one, you can just count(*) with all the necessary conditions.
johnnyb 08282006, 08:00 PM It's that easy is it guelphdad? I'll try it out. Beagle, I need your first example only for the first 10, (or whatever my limit is), rows, then beyond that I need to know that there are 42 more rows, (or 52 in total), that have LST_Typ > 0.
I'll give guelphdad's code a run and let you all know how it turns out.
guelphdad 08282006, 08:38 PM You can also use mysql_affected_rows to get how many rows would have been returned without the specific limit being there. that would tell you 52 rows in total would match your criteria. in this case you don't even need the count in the select statement.
johnnyb 08312006, 09:49 PM So, I've tried both solutions and here are the problems I am having:
the COUNT(CASE.....) from guelphdad  for some reason it returns a different value on every row. I'm not sure why, but none of them are right.
Running a separate query using COUNT(*) FROM tables WHERE all_of_my_conditions is slightly closer but still isn't returning correct numbers. I think that this is because I use a GROUP BY in the main query, so I tried putting a group by on the count query and I always get the value 2. hmmm. perhaps if I select the IDs as well then group by ID it'll work, but it will be slower.
I am using SQL_CALC_FOUND_ROWS / FOUND_ROWS() already to return the total length of the result, however I now need to figure out how many rows in the result, (total result  disregarding the limit clause), have a condition as I described above.
I'm going to keep working on the angle with a separate COUNT query and will post here if I figure anything out. If anyone thinks of anything else please let me know.
johnnyb 08312006, 11:58 PM Ok, I'm having some problems with COUNT().
Should:
SELECT table1.id, COUNT(*) FROM tables WHERE conditions GROUP BY table1.id
not have the same value in the count field of every row? And shoudn't that value be the same as the number of rows returned by the SELECT?
I must be wrong here but need some help figuring out what I am supposed to be doing.
Thanks in advance.
John
guelphdad 09012006, 12:16 AM show us some sample rows and your exact query.
johnnyb 09012006, 12:41 AM Ok,
Here is the main query, (it's huge  remember the circular mathematics?):
SELECT SQL_CALC_FOUND_ROWS
DISTINCT LISTING.idLST,
LISTING.*,
LISTING_TYPE.*,
CATEGORY.CAT_Name,
CATEGORY.idCAT,
STATE.STATE_Name,
STATE.idSTATE,
ACOS(
COS(RADIANS((SELECT DISTINCT Latitude FROM ZIPCODE WHERE ZIPCode=90210)))
*COS(RADIANS((SELECT DISTINCT Longitude FROM ZIPCODE WHERE ZIPCode=90210)))
*COS(RADIANS(IF(LISTING.LST_Map_Accuracy > 5 AND LISTING.LST_Lat IS NOT NULL, LISTING.LST_Lat, ZIPCODE.Latitude)))
*COS(RADIANS(IF(LISTING.LST_MAP_Accuracy > 5 AND LISTING.LST_Lon IS NOT NULL, LISTING.LST_Lon, ZIPCODE.Longitude)))
+
COS(RADIANS((SELECT DISTINCT Latitude FROM ZIPCODE WHERE ZIPCode=90210)))
*SIN(RADIANS((SELECT DISTINCT Longitude FROM ZIPCODE WHERE ZIPCode=90210)))
*COS(RADIANS(IF(LISTING.LST_Map_Accuracy > 5 AND LISTING.LST_Lat IS NOT NULL, LISTING.LST_Lat, ZIPCODE.Latitude)))
*SIN(RADIANS(IF(LISTING.LST_MAP_Accuracy > 5 AND LISTING.LST_Lon IS NOT NULL, LISTING.LST_Lon, ZIPCODE.Longitude)))
+
SIN(RADIANS((SELECT DISTINCT Latitude FROM ZIPCODE WHERE ZIPCode=90210)))
*SIN(RADIANS(IF(LISTING.LST_Map_Accuracy > 5 AND LISTING.LST_Lat IS NOT NULL, LISTING.LST_Lat, ZIPCODE.Latitude)))
)
* 3963 AS dist
FROM LISTING
JOIN LISTING_TYPE ON LISTING_TYPE.idTYPE = LISTING.LISTING_TYPE_idTYPE
JOIN LISTING_TO_CAT ON LISTING_TO_CAT.LISTING_idLST = LISTING.idLST
JOIN CATEGORY ON CATEGORY.idCAT = LISTING_TO_CAT.CAT_idCAT
JOIN STATE ON STATE.idSTATE = LISTING.STATE_idSTATE
JOIN ZIPCODE ON ZIPCODE.ZIPCode = IF((LISTING.LST_Geo_Zip < 99 OR ISNULL(LISTING.LST_Geo_Zip)), LISTING.LST_Zip, LISTING.LST_Geo_Zip) WHERE ACOS(
COS(RADIANS((SELECT DISTINCT Latitude FROM ZIPCODE WHERE ZIPCode=90210)))
*COS(RADIANS((SELECT DISTINCT Longitude FROM ZIPCODE WHERE ZIPCode=90210)))
*COS(RADIANS(IF(LISTING.LST_Map_Accuracy > 5 AND LISTING.LST_Lat IS NOT NULL, LISTING.LST_Lat, ZIPCODE.Latitude)))
*COS(RADIANS(IF(LISTING.LST_MAP_Accuracy > 5 AND LISTING.LST_Lon IS NOT NULL, LISTING.LST_Lon, ZIPCODE.Longitude)))
+
COS(RADIANS((SELECT DISTINCT Latitude FROM ZIPCODE WHERE ZIPCode=90210)))
*SIN(RADIANS((SELECT DISTINCT Longitude FROM ZIPCODE WHERE ZIPCode=90210)))
*COS(RADIANS(IF(LISTING.LST_Map_Accuracy > 5 AND LISTING.LST_Lat IS NOT NULL, LISTING.LST_Lat, ZIPCODE.Latitude)))
*SIN(RADIANS(IF(LISTING.LST_MAP_Accuracy > 5 AND LISTING.LST_Lon IS NOT NULL, LISTING.LST_Lon, ZIPCODE.Longitude)))
+
SIN(RADIANS((SELECT DISTINCT Latitude FROM ZIPCODE WHERE ZIPCode=90210)))
*SIN(RADIANS(IF(LISTING.LST_Map_Accuracy > 5 AND LISTING.LST_Lat IS NOT NULL, LISTING.LST_Lat, ZIPCODE.Latitude)))
)
* 3963
< 5 GROUP BY LISTING.idLST ORDER BY LISTING_TYPE.idTYPE Desc, dist, LISTING.LST_Name LIMIT 0,10
And this is my last attempt at a query to find out how many listings have LISTING_TYPE_idTYPE > 1 :
SELECT COUNT(*) AS TYP_Limit FROM LISTING JOIN ZIPCODE ON ZIPCODE.ZIPCode = IF((LISTING.LST_Geo_Zip < 99 OR ISNULL(LISTING.LST_Geo_Zip)), LISTING.LST_Zip, LISTING.LST_Geo_Zip)
WHERE ACOS(
COS(RADIANS((SELECT DISTINCT Latitude FROM ZIPCODE WHERE ZIPCode=90210)))
*COS(RADIANS((SELECT DISTINCT Longitude FROM ZIPCODE WHERE ZIPCode=90210)))
*COS(RADIANS(IF(LISTING.LST_Map_Accuracy > 5 AND LISTING.LST_Lat IS NOT NULL, LISTING.LST_Lat, ZIPCODE.Latitude)))
*COS(RADIANS(IF(LISTING.LST_MAP_Accuracy > 5 AND LISTING.LST_Lon IS NOT NULL, LISTING.LST_Lon, ZIPCODE.Longitude)))
+
COS(RADIANS((SELECT DISTINCT Latitude FROM ZIPCODE WHERE ZIPCode=90210)))
*SIN(RADIANS((SELECT DISTINCT Longitude FROM ZIPCODE WHERE ZIPCode=90210)))
*COS(RADIANS(IF(LISTING.LST_Map_Accuracy > 5 AND LISTING.LST_Lat IS NOT NULL, LISTING.LST_Lat, ZIPCODE.Latitude)))
*SIN(RADIANS(IF(LISTING.LST_MAP_Accuracy > 5 AND LISTING.LST_Lon IS NOT NULL, LISTING.LST_Lon, ZIPCODE.Longitude)))
+
SIN(RADIANS((SELECT DISTINCT Latitude FROM ZIPCODE WHERE ZIPCode=90210)))
*SIN(RADIANS(IF(LISTING.LST_Map_Accuracy > 5 AND LISTING.LST_Lat IS NOT NULL, LISTING.LST_Lat, ZIPCODE.Latitude)))
)
* 3963
< 5
AND LISTING.LISTING_TYPE_idTYPE > 1 GROUP BY LISTING.idLST"
Sample rows from the second query:
TYP_Limit
=======
2
2
1
2
2
2
2
3
Do you need sample rows from the tables, they're pretty big. Essentially there's a table called LISTING that has idLST, (the ID), a bunch of other address stuff, latitude & longitude numbers, a column indicating the accuracy of the lat & lon numbers, a ZIP code, a second ZIP code, (called LST_Geo_Zip), and LISTING_TYPE_idTYPE, which is the level, (basic, premium, featured). There's also a table called ZIPCODE with ZIPCode, and lat & lon for that ZIP code. The query takes a ZIP code and a distance, finds the lat & lon of that ZIP code from the ZIPCODE table, then searches for listings within the given distance of that lat & lon using first the lat & lon numbers in the LISTING table, then if those aren't there or the accuracy number isn't high enough it'll pull a lat & lon set from the ZIPCODE table based on the row's LST_Geo_Zip. It returns the results ordered by LISTING_TYPE_idTYPE descending, then distance ascending.
Phew... quite the explanation, hopefully it makes sense. What I need to know, (if you haven't figured it out already), is the cutoff point in the result of the main query where LISTING_TYPE_idTYPE is no longer greater than 1.
guelphdad 09012006, 02:41 PM One of the things that may be throwing your query off is the second line:
DISTINCT LISTING.idLST,
you do realize that distinct will not work on that single column and will work across every single column in your select statement and may thus return more rows than you are expecting.
Your group by clause can also cause problems. for all other databases every nonaggregate column from your select must be in your group by clause.
mysql allows you to leave columns out but warns of unpredictable results in the manual under GROUP BY HIDDEN COLUMNS.
johnnyb 09012006, 05:36 PM Hmmm. The reason the GROUP BY was on the first query was because I was getting too many results ... perhaps if I remove both the DISTINCT and GROUP BY I'll be fine, (and have less processing to do).
What I've done for now with the second query is simply run a query retrieving the IDs and done a mysql_count_rows() in PHP. It works but I'll keep looking for something more elegant, hopefully that doesn't require me to run a separate query.
johnnyb 09012006, 09:40 PM Wow. I've done some more reading and I never knew that about GROUP BY clauses. Turns out I'm using GROUP BY in a way I shouldn't really be doing so that's why the COUNT() isn't working, but I think that there's no other way to achieve the desired end result so I'll leave it. I did remove the DISTINCT keyword from the beginning of the SELECT though, it was useless and probably slowing things down.
Also, my twoquery method that I'm using means that I don't always have to return a bunch of extra data in my queries, I only get the rows I need.
I have another problem that I am trying to figure out but I'll start a new thread since it's entirely unrelated to this one. Thanks a lot for your help, you've made me learn a few things today.
John

