Having problems with UNION and GROUP, help greatly appreciated
Good morning all!
I'm having an issue, and I think I know what my problem is, but don't know quite enough to solve it.
Basically, I have three tables:
pages (contains all the information to build a webpage, including ID, pageName which is the link text, webName which is the html address, and sortBy which lets them choose where page should fall, numerically)
pagesCats (contains a category ID, value which is the category name and subSetOf which specifies the parent category ID)
pagesIndex (contains a unique ID, a pageID (matches with pages.ID) and a catID (matches with pagesCats.ID))
I wrote this crazy query:
(
SELECT webName, pageName AS titleSort, pageName AS title, pages.ID, catID, pageName, sortBy
FROM pages
JOIN pagesIndex ON pages.ID = pagesIndex.pageID
WHERE catID = '2'
)
UNION (
SELECT webName, value AS titleSort, value AS title, pages.ID, catID, pageName, sortBy
FROM pagesCats
JOIN pagesIndex ON pagesCats.ID = pagesIndex.catID
JOIN pages ON pagesIndex.pageID = pages.ID
WHERE subsetOf = '2'
GROUP BY catID
)
ORDER BY pageName='Parent Category Name' DESC, sortBy, titleSort
which is supposed to:
- pull out the names of each of the categories, as well as get a link to the appropriate page
- pull out the name and web link to all the pages at the same level in the hierarchy
- sort the pages by anything which matches the category name, then by sortBy, then alphabetical
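For anyone wondering how the first ORDER BY term works: MySQL evaluates a comparison like pageName='Parent Category Name' to 1 or 0, so sorting it DESC puts the matching rows first. Here is a minimal sketch of just that ordering. It runs on Python's built-in sqlite3 module as a stand-in for MySQL (SQLite treats comparisons the same way), and the nav table, its rows and the 'Events' name are all made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nav (pageName TEXT, sortBy INTEGER, titleSort TEXT)")
conn.executemany(
    "INSERT INTO nav VALUES (?, ?, ?)",
    [
        ("Calendar", 1, "Calendar"),
        ("Events", 5, "Events"),   # the row we want pinned to the top
        ("Archive", 0, "Zebra"),
        ("Archive", 0, "Alpha"),
    ],
)

# The comparison yields 1 for the matching row and 0 for the rest,
# so DESC lists the match first; sortBy and titleSort order the remainder.
rows = conn.execute(
    "SELECT pageName, sortBy, titleSort FROM nav "
    "ORDER BY pageName = 'Events' DESC, sortBy, titleSort"
).fetchall()

titles = [row[2] for row in rows]
print(titles)  # ['Events', 'Alpha', 'Zebra', 'Calendar']
```

Note that 'Events' comes first even though its sortBy of 5 is the highest in the table.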
My problem is that when I GROUP the pages which all belong to the same category, it just squishes them down into one row, and I want the category to link to the first page which matches (the one with the lowest sortBy). I don't think you can sort within a GROUP BY clause, and I'm probably using it wrong, but I can't figure out how to just get the top result for each category. I've been pulling my hair out over this for days, and just can't come up with anything better.
I'm using the call in a looping php function, so this one mySQL call can basically build the entire navigation of the site (about 500-odd pages). Other than the category names not linking to the right page, it's working brilliantly.
Does anyone know how I can solve this one? Any help would be massively appreciated. If you have any questions, or if there is anything I'm not explaining well, please let me know.
I'm so sorry, I'm clearly having one of those days.
I've attached a new zip with all 3 tables included.
I'll try to explain my problem a bit better. The problem part of the query is this bit:
SELECT webName, value AS titleSort, value AS title, pages.ID, catID, pageName, sortBy
FROM pagesCats
JOIN pagesIndex ON pagesCats.ID = pagesIndex.catID
JOIN pages ON pagesIndex.pageID = pages.ID
WHERE subsetOf = '2'
GROUP BY catID
If I remove the GROUP BY (and add an ORDER, for clarity):
SELECT webName, value AS titleSort, value AS title, pages.ID, catID, pageName, sortBy
FROM pagesCats
JOIN pagesIndex ON pagesCats.ID = pagesIndex.catID
JOIN pages ON pagesIndex.pageID = pages.ID
WHERE subsetOf = '2'
ORDER BY sortBy, catID
I can see that the webName for 'What's Happening' should be 'whats-happening' with sortBy=0, but when I add the GROUP BY back in, it returns 'ScugogSportsHallofFame' with sortBy=3.
Well, first of all, you should *NEVER* GROUP BY only one field unless you really really really understand what MySQL does to you when you do so.
The rule for all other databases is "Always GROUP BY all fields except those used in an aggregate function" (where aggregate functions are ones such as COUNT(), MIN(), SUM(), that act on multiple rows).
MOST databases won't allow you to group any other way. MySQL does, but then it has some funky rules about what happens. Specifically, for any field that you do *NOT* group by, MySQL is free to decide WHICH row it should pull that field's value from.
Example:
Code:
name | price
ford | 10000
ford | 20000
audi | 20000
audi | 30000
If you then do (calling that table cars)
Code:
SELECT name, price FROM cars GROUP BY name
MySQL could perfectly reasonably return
Code:
name | price
ford | 20000
audi | 20000
That is, the high price for one and the low price for the other.
SO... You are playing with fire when you use GROUP BY in the way you are doing.
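The safe version of that example is to say explicitly which price you want with an aggregate function, so nothing is left for the database to pick. A minimal sketch, run through Python's built-in sqlite3 module as a stand-in for MySQL; the table name cars is an assumption, since the example above does not name its table.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (name TEXT, price INTEGER)")  # hypothetical table name
conn.executemany(
    "INSERT INTO cars VALUES (?, ?)",
    [("ford", 10000), ("ford", 20000), ("audi", 20000), ("audi", 30000)],
)

# Every selected column is either in the GROUP BY or inside an aggregate,
# so each output value is fully determined by the data.
price_range = conn.execute(
    "SELECT name, MIN(price), MAX(price) FROM cars GROUP BY name ORDER BY name"
).fetchall()

print(price_range)  # [('audi', 20000, 30000), ('ford', 10000, 20000)]
```

This gives the low and high price per name, but it cannot by itself tell you which other columns belong to the cheapest row; that needs a join back to the table or a subquery.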
NOW... let me see if I can figure out what you are after... back later.
__________________
An optimist sees the glass as half full.
A pessimist sees the glass as half empty.
A realist drinks it no matter how much there is.
SELECT webName, value AS titleSort, value AS title, pageID, X.catID, pageName, sortBy
FROM pages, pagesCats, pagesIndex,
( SELECT catID, MIN(sortBy) AS minsort
FROM pagesIndex,pages
WHERE pages.ID=pagesIndex.pageID GROUP BY catID ) AS X
WHERE pagesIndex.catID = X.catID
AND pagesIndex.pageID = pages.ID
AND pages.sortBy = X.minsort
AND pagesCats.ID = X.catID
AND subsetof = '2'
ORDER BY X.minsort, pageID
But it does get duplicate catID/sortBy values. I'm not sure if you want that or not.
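If the duplicates are not wanted, one possible way around them is to pick exactly one page per category with a correlated subquery that orders by sortBy, breaks ties on the page ID, and takes LIMIT 1. This is only a sketch, not a tested replacement for the query above: it runs on Python's built-in sqlite3 module as a stand-in for MySQL, the rows are invented (only the 'What's Happening' names come from this thread), and the tie-break on the lowest page ID is an assumption about what "first page" should mean.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pages (ID INTEGER, pageName TEXT, webName TEXT, sortBy INTEGER);
CREATE TABLE pagesCats (ID INTEGER, value TEXT, subSetOf INTEGER);
CREATE TABLE pagesIndex (ID INTEGER, pageID INTEGER, catID INTEGER);
""")
# Invented sample rows; pages 11 and 12 tie on sortBy = 0 in category 5.
conn.executemany("INSERT INTO pages VALUES (?, ?, ?, ?)", [
    (10, "Hall of Fame", "ScugogSportsHallofFame", 3),
    (11, "What's Happening", "whats-happening", 0),
    (12, "Newsletter", "newsletter", 0),
    (13, "Sports", "sports", 1),
    (14, "Elsewhere", "elsewhere", 0),
])
conn.executemany("INSERT INTO pagesCats VALUES (?, ?, ?)", [
    (5, "What's Happening", 2),
    (6, "Sports", 2),
    (7, "Other", 3),  # different parent, must not appear in the result
])
conn.executemany("INSERT INTO pagesIndex VALUES (?, ?, ?)", [
    (1, 10, 5), (2, 11, 5), (3, 12, 5), (4, 13, 6), (5, 14, 7),
])

# For each child category of parent 2, join to the single page that sorts
# first by (sortBy, ID); the LIMIT 1 guarantees one row per category.
first_pages = conn.execute("""
    SELECT c.value AS title, p.webName, p.sortBy
    FROM pagesCats AS c
    JOIN pages AS p ON p.ID = (
        SELECT p2.ID
        FROM pagesIndex AS i2
        JOIN pages AS p2 ON p2.ID = i2.pageID
        WHERE i2.catID = c.ID
        ORDER BY p2.sortBy, p2.ID
        LIMIT 1
    )
    WHERE c.subSetOf = 2
    ORDER BY c.ID
""").fetchall()

for row in first_pages:
    print(row)
```

A category with no pages at all drops out of this result, because the subquery returns NULL for it and the join finds no match.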
Thanks Old Pendant! That is close, but it shouldn't be showing duplicate catIDs. The subSetOf column establishes the hierarchy of the items in the pagesCats table, so it ties to pagesCats.ID.
I've been doing some more testing, and there is actually one row missing from this result, vs. my original query, where pageName='Area Information'. I think it is now just getting the section names, vs. merging those names with the results from the pages database. I'll keep working here, and see if I can solve this one.
Thanks again for all the help, I think I'm getting there.
Okay, with Old Pendant's help, I've finally got something which seems to be pretty close to what I was looking for? I've been looking at it too long to get into all the nit-picking details, but this is certainly much better than what I started with.
For anyone who is curious (or perhaps this will somehow help someone else), here is the query I've ended up with:
(
SELECT webName, pageName AS titleSort, pageName AS title, pages.ID, catID, pageName, sortBy
FROM pages
JOIN pagesIndex ON pages.ID = pagesIndex.pageID
WHERE catID = '2'
)
UNION (
SELECT webName, value AS titleSort, value AS title, pageID, X.catID, pageName, '0'
FROM pages, pagesCats, pagesIndex, (
SELECT catID, MIN( sortBy ) AS minsort
FROM pagesIndex, pages
WHERE pages.ID = pagesIndex.pageID
GROUP BY catID
) AS X
WHERE pagesIndex.catID = X.catID
AND pagesIndex.pageID = pages.ID
AND pages.sortBy = X.minsort
AND pagesCats.ID = X.catID
AND subsetof = '2'
GROUP BY catID
)
I'm sure it's a nightmare, but it seems to be doing the trick.
Thanks again for all your help.