CodingForums.com

CodingForums.com (http://www.codingforums.com/index.php)
-   MySQL (http://www.codingforums.com/forumdisplay.php?f=7)
-   -   Having problems with UNION and GROUP, help greatly appreciated (http://www.codingforums.com/showthread.php?t=257768)

katyD 04-18-2012 03:20 PM

Having problems with UNION and GROUP, help greatly appreciated
 
Good morning all!
I'm having an issue, and I think I know what my problem is, but don't know quite enough to solve it.

Basically, I have three tables:
pages (contains all the information to build a webpage, including ID, pageName which is the link text, webName which is the html address, and sortBy which lets them choose where page should fall, numerically)
pagesCats (contains a category ID, value which is the category name and subSetOf which specifies the parent category ID)
pagesIndex (contains a unique ID, a pageID (matches with pages.ID) and a catID (matches with pagesCats.ID)

I wrote this crazy query:

(SELECT webName, pageName AS titleSort, pageName AS title, pages.ID, catID, pageName, sortBy FROM pages JOIN pagesIndex ON pages.ID = pagesIndex.pageID WHERE catID = '2' ) UNION ( SELECT webName, value AS titleSort, value AS title, pages.ID, catID, pageName, sortBy FROM pagesCats JOIN pagesIndex ON pagesCats.ID = pagesIndex.catID JOIN pages ON pagesIndex.pageID = pages.ID WHERE subsetOf = '2' GROUP BY catID ) ORDER BY pageName='Parent Category Name' DESC, sortBy, titleSort

which is supposed to:
- pull out the names of each of the categories, as well as get a link to the appropriate page
- pull out the name and web link to all the pages at the same level in the hierarchy
- sort the pages by anything which matches the category name, then by sortBy, then alphabetical

My problem is, that when I GROUP the pages which all belong to the same category, it just squishes them down, and I want the category to link to the first page which matches. I don't think you can sort a GROUP BY clause, and I'm probably using it wrong, but I can't figure out how to just get the top result from the query. I've been pulling my hair out over this for days, and just can't come up with anything better.

I'm using the call in a looping php function, so this one mySQL call can basically build the entire navigation of the site (about 500-odd pages). Other than the category names not linking to the right page, it's working brilliantly.

Does anyone know how I can solve this one? Any help would be massively appreciated. If you have any questions, or if there is anything I'm not explaining well, please let me know.

Thanks in advance,

Old Pedant 04-18-2012 08:18 PM

My head hurts.

Can you *START* by showing us your DB schema for the three tables? I don't even know which fields are coming from which tables.

katyD 04-26-2012 03:02 PM

1 Attachment(s)
Sorry for the sore head

The database structure (simplified), is:

TABLE: pages
ID
pageName
webName
sortBy

TABLE: pagesCats
ID
value
subsetOf

TABLE: pagesIndex
ID
pageID
catID

I've attached a zip of the database, thought this might be of use as well.

Old Pedant 04-26-2012 11:29 PM

That zip file only had the "pages" table in it.

But maybe if you could give an example of some sample data and then the results you like to see from that sample it would help even more.

katyD 04-27-2012 01:00 AM

1 Attachment(s)
I'm so sorry, I'm clearly having one of those days.
I've attached a new zip with all 3 tables included.

I'll try to explain my problem a bit better. The problem part of the query is this bit:
SELECT webName, value AS titleSort, value AS title, pages.ID, catID, pageName, sortBy FROM pagesCats JOIN pagesIndex ON pagesCats.ID = pagesIndex.catID JOIN pages ON pagesIndex.pageID = pages.ID WHERE subsetOf = '2' GROUP BY catID

If I remove the GROUP BY (and add an ORDER, for clarity):

SELECT webName, value AS titleSort, value AS title, pages.ID, catID, pageName, sortBy FROM pagesCats JOIN pagesIndex ON pagesCats.ID = pagesIndex.catID JOIN pages ON pagesIndex.pageID = pages.ID WHERE subsetOf = '2' ORDER BY sortBy, catID

I can see that the webName for 'What's Happening' should be 'whats-happening' with a sortBy=0, but when I add the group back in, it is returning 'ScugogSportsHallofFame' with a sortBy=3

Does that help?

Old Pedant 04-27-2012 01:57 AM

Well, first of all, you should *NEVER* GROUP BY only one field unless you really really really understand what MySQL does to you when you do so.

The rule for all other database is "Always GROUP BY all field except those used in an aggregate function" (where aggregate functions are such as COUNT(), MIN(), SUM(), that act on multiple rows).

MOST database won't allow you to group any other way. MySQL does, but then it has some funky rules about what happens. Specifically, any field that you do *NOT* group by is left open to MySQL to decide WHICH row it should pull that fields value from.

Example:
Code:

name | price
ford | 10000
ford | 20000
audi | 20000
audi | 30000

If you then do
Code:

SELECT name, price GROUP BY name
MySQL could perfectly reasonably return
Code:

name | price
ford | 20000
audi | 20000

That is, the high price for one and the low price for the other.

SO... You are playing with fire when you use GROUP BY in the way you are doing.

NOW... let me see if I can figure out what you are after... back later.

Old Pedant 04-27-2012 02:10 AM

Okay, uncle. I don't understand what your subsetOf field is for.

It's clearly not referring to pages.ID and it doesn't *SEEM* to be referring to pagescats.ID.

And WHY is it declared as VARCHAR(255) when none of the values seem to be anything but numbers??

Old Pedant 04-27-2012 02:30 AM

I think this is *close* to what you want:
Code:

SELECT webName, value AS titleSort, value AS title, pageID, X.catID, pageName, sortBy
FROM pages, pagesCats, pagesIndex,
    ( SELECT catID, MIN(sortBy) AS minsort
      FROM pagesIndex,pages
      WHERE pages.ID=pagesIndex.pageID GROUP BY catID ) AS X
WHERE pagesIndex.catID = X.catID
  AND pagesindex.pageID = pages.ID
  AND pages.sortBy = X.minsort
  AND pagesCats.ID = X.catID
  AND subsetof = '2'
ORDER BY X.minsort, pageID

But it does get duplicate catID/sortBy values. I'm not sure if you want that or not.

katyD 04-27-2012 04:06 PM

Thanks Old Pendant! That is close, but it shouldn't be showing duplicate catIDs. The subsetOf establishing the hierarchy of the items in the pagesCats table, so it ties to the pagesCats.ID

katyD 04-27-2012 05:27 PM

I've been doing some more testing, and there is actually one row missing from this result, vs. my original query, where pageName='Area Information'. I think it is now just getting the section names, vs. merging those names with the results from the pages database. I'll keep working here, and see if I can solve this one.
Thanks again for all the help, I think I'm getting there.

katyD 04-27-2012 08:16 PM

Okay, with Old Pendant's help, I've finally got something which seems to be pretty close to what I was looking for? I've been looking at it too long to get into all the nit-picking details, but this is certainly much better than what I started with.

For anyone who is curious (or perhaps this will somehow help someone else), here is the query I've ended up with:

(
SELECT webName, pageName AS titleSort, pageName AS title, pages.ID, catID, pageName, sortBy
FROM pages
JOIN pagesIndex ON pages.ID = pagesIndex.pageID
WHERE catID = '2'
)
UNION (

SELECT webName, value AS titleSort, value AS title, pageID, X.catID, pageName, '0'
FROM pages, pagesCats, pagesIndex, (

SELECT catID, MIN( sortBy ) AS minsort
FROM pagesIndex, pages
WHERE pages.ID = pagesIndex.pageID
GROUP BY catID
) AS X
WHERE pagesIndex.catID = X.catID
AND pagesIndex.pageID = pages.ID
AND pages.sortBy = X.minsort
AND pagesCats.ID = X.catID
AND subsetof = '2'
GROUP BY catID
)

I'm sure it's a nightmare, but it seems to be doing the trick.
Thanks again for all your help.


All times are GMT +1. The time now is 09:59 AM.

Powered by vBulletin®
Copyright ©2000 - 2013, Jelsoft Enterprises Ltd.