...

View Full Version : Running a crosstab query on an Access database, without the JET-specific SQL



Grant Palin
07-22-2004, 10:41 PM
I've got an Access database for municipality data...See the included screenshot of the relationship windows.

The Municipalities table contains information about each municipality...each has a unique ID, a name, a type ID which refers to the MunTypes table (a municipality can be a city, town, village, etc).

The MunData table has 3 columns: one for the municipality ID, one for year, and one for total. The total represents the population total for that municipality in that year.

In the mundata table, there are over 4000 rows...There are abut 150 municipalities, with data for the last 28 years. So each municipality in the MunID column is repeated for each year, and that year's total is given for that municipality.

What I want to do is run a query that has the years going across the top, and the municipalities down the left side, with the total for a municipality in a certain year in a cell met by the respective row and column. Apprently this calls for a crosstab query. I know I could just loop through each region in ASP, and display the data for each year for the current municipality, but that would get messy. I'd like to try to get this to work in a single query if possible.

The result I'm looking for would be something like this...


Mun Name Mun Type 1976 1977 1978 ...
Mun 1 C ... ... ...
Mun 2 VL ... ... ...
Mun 3 DM ... ... ...
...

So I display the municipality name (only need the ID for column-matching purposes), the type of municapality (through the foreign key to the MunTypes table), and the total value for each year.

I've searched the web, and have found a number of results for creating crosstab queries in Acess, using Access' non-standard SQL. I'd like to avoid using that. I've also found results for SQL Server stored procedures, but that's out too. So it seems like I'm down to using standard SQL.

I've also found some results for crosstab on these forums, but it was mentioned mostly in passing, and I couldn't understand the information given in the other results. I'm not even sure how a crosstab query works...

I have no problem with retrieving the mun type data, or associating a mun name with it's ID...it's just the crosstab part I need help with.

I had thought of using a join to combine the data table with the years from the same table, but then realized that that just joins the rows together, it wouldn't create more columns.

So, I'm hoping someone here has experience with crosstab queries, and can explain them better to me, and perhaps help me with developing a solution.

raf
07-22-2004, 11:47 PM
i suppose this continues
http://www.codingforums.com/showthread.php?t=42018

now, the only problem is getting all number from these 28 years onto one row. There are a few ways to do this (like creating an associative array from th recordset, where you use the year and minicipali as keys, and then using a loop and dynami, but with 150 resulting rows, this might take to long.

sibce tis data will be static (i suppose that the number for the pst years don't change), it might be best to run your crosstab query once (access had a query wizard for crosstabs) and store the result in a new table. Then you can just select this agregated table.

Is (and how is) the current year added?

Grant Palin
07-22-2004, 11:58 PM
i suppose this continues
http://www.codingforums.com/showthread.php?t=42018
I got my question answered in that thread...I thought it would make more sense to start a new thread for this subject...


now, the only problem is getting all number from these 28 years onto one row. There are a few ways to do this (like creating an associative array from th recordset, where you use the year and minicipali as keys, and then using a loop and dynami, but with 150 resulting rows, this might take to long.
I should mention that the end user of this system will be able to select what years they are interested in, which may be one, or a dozen, or all of them. Same for regions. The way I had this planned was to retrieve the selected years and regions submitted from the criteria form, and drop them into a SQL statement. Then execute the query and print the results.


sibce tis data will be static (i suppose that the number for the pst years don't change), it might be best to run your crosstab query once (access had a query wizard for crosstabs) and store the result in a new table. Then you can just select this agregated table.
That idea had occurred to me, but it doesn't seem as elegant as automating the entire process.


Is (and how is) the current year added?
New data is added on a yearly basis...This would be a matter of adding a new year entry for each municipality, along with the appropriate value. Data from previous years is also modified on a yearly basis.

raf
07-23-2004, 12:08 AM
kinda like a small OLAP feature...

is every 'Mun Name - Mun Type - year' combinations filled in in your db?
if yes, then it's rather easy.
in your select, have an "order by name asc, type asc, year asc' (change varialenames of course)
You then just loop through the recordset and if 'Mun Name - Mun Type' are the same as the previous record, then you just add a new cell and put the number in it.
if they are different, then you close the row, open a new one ad print the name and type and the first number.

Grant Palin
07-23-2004, 01:23 AM
Doh!!!

After all my reading and confusuion, that was all I needed to do?!?! :eek:

Yes, I had the data all in place. I did like you suggested, and set up a little logic inside my printing loop. I was able to get just one row for each municipality, and an extra column for each year.

And I didn't even need to change my query at all! Just the printing logic!

Thanks for the suggestion, raf! :thumbsup:

raf
07-23-2004, 08:27 AM
you're welcome :thumbsup:

happy coding!

Grant Palin
07-23-2004, 08:00 PM
I've got a related question now.

The query I had works fine when I want to order results by municipality name (which is how I had it before). But I also want to be able to sort by municipality size; more specifically, the size of the municipality in the last selected year (the user can select the years they are interested in). So if the user selected the years 2000, 2001, 2002, and 2003, I only want to sort the totals column where the current year is the last year chosen (in this case, 2003). I know you can sort on the column itself (eg: "ORDER BY total"), but I have no idea how to sort on that column based on a certain condition.

Any suggestions?

EDIT: Part of the problem is, ironically, to do with the solution you suggested. When I run my query, I just select the data like ina normal query, and do a little manipulation in ASP to print the results the way I want. This is fine when i order the results by municapality name.

But when I order by size, the results are no longer ordered by name, and so the results table is taller than it needs to be. This is because when I order by size, the municipalities of the same name may no longer be next to each other in the results (as they were when ordering by name), so a row for the same municipality is printed several times.

raf
07-23-2004, 10:00 PM
But I also want to be able to sort by municipality size; more specifically, the size of the municipality in the last selected year (the user can select the years they are interested in). So if the user selected the years 2000, 2001, 2002, and 2003, I only want to sort the totals column where the current year is the last year chosen (in this case, 2003). I know you can sort on the column itself (eg: "ORDER BY total"), but I have no idea how to sort on that column based on a certain condition.
I don't think, well, i'm quite sure, that it can NOT be done with pure SQL in one query.

There is no way that you would be able to order on a part of the records (the records with numbers for the last year, and then have all related records somehow have next or so.)

The only sollution i see is :
- create a temporarely table with the records of the last year, ordered desc on the total. include the name inthere and add an autonum to that table.
- have a join between this table and your original table (on the name --> n-1 relation) and order on the autonum (doenst' need to be returned)
- the ASP processing is then the same as you currently have

Grant Palin
07-23-2004, 10:46 PM
I don't think, well, i'm quite sure, that it can NOT be done with pure SQL in one query.

There is no way that you would be able to order on a part of the records (the records with numbers for the last year, and then have all related records somehow have next or so.)
Darn it! :(


-create a temporarely table with the records of the last year, ordered desc on the total. include the name inthere and add an autonum to that table.
Would that entail creating the table beforehand in Access? That's what it sounds like...

One problem I can see with that, if the table was to be created in Access, is that the last year is not fixed...Remember, this is for a querying system, and the user can select whichever years they want. The last year could be anywhere between 1976 and 2003, inclusive. If I understand what you are suggesting correctly (maybe not?), creating thast table beforehand may not work so well if it's intended for a specific last year (e.g. 2003.)


-have a join between this table and your original table (on the name --> n-1 relation) and order on the autonum (doenst' need to be returned)
- the ASP processing is then the same as you currently have
I see. The totals are ordered in the first table, which is then joined to the existing data table by the mun id. Then just order the result by the autonumber in the first table. And the result will hold the municipalities ordered by size, hopefully with no disruption of the data. Makes sense.

The only potential problem I see is the one I mentioned above - having to create the table in Access beforehand. That's not so elegant, and might or might not work well.

raf
07-23-2004, 11:39 PM
Would that entail creating the table beforehand in Access? That's what it sounds like....
No. You can create the table on the fly. Check out the MsAccess helpfunction (the SQL section about DDL) if you'r not familiar with the create table syntax.

will be something simple like

CREATE TABLE temp (ordervar COUNTER constraint pk_temp primary key, nameFK varchar)

This way you can also give the table a sessionspecific name (so that it is only used for your current user at that time --> "temp" + session.sessionID)

But even if it was a premade table, then there is no problem, since you'll alays just realy need 2 column : the autonumcolumn (required for the sorting afterwards) and the name-column (require for the joining). The last year's totals are only used to sort on when you insert the records in this table. But they shouldn't even be inthere.

The downside of werking with a premad table is that you'll needto put a lock on it during the complete proces (which will have an impact on serverperformance if there's a lott of trafic)

Grant Palin
07-24-2004, 12:35 AM
No. You can create the table on the fly. Check out the MsAccess helpfunction (the SQL section about DDL) if you'r not familiar with the create table syntax.

will be something simple like

CREATE TABLE temp (ordervar COUNTER constraint pk_temp primary key, nameFK varchar)

This way you can also give the table a sessionspecific name (so that it is only used for your current user at that time --> "temp" + session.sessionID)

But even if it was a premade table, then there is no problem, since you'll alays just realy need 2 column : the autonumcolumn (required for the sorting afterwards) and the name-column (require for the joining). The last year's totals are only used to sort on when you insert the records in this table. But they shouldn't even be inthere.

The downside of werking with a premad table is that you'll needto put a lock on it during the complete proces (which will have an impact on serverperformance if there's a lott of trafic)

I'm with you. No problem with creating the table. Would I populate it by running a SELECT INTO... statement on the temp table afterwards, with the given data being a query?

raf
07-24-2004, 12:56 AM
I'm with you. No problem with creating the table. Would I populate it by running a SELECT INTO... statement on the temp table afterwards, with the given data being a query?
I would use an INSERT INTO

Just
INSERT INTO temptable (namevar) SELECT namevar FROM originaltable WHERE yearvar=2003 ORDER BY totalvar;
change column and tablenames + the '2003' will be filled in dynamically

Wirh a SELECT INTO you could directly create the table and insert the records, but i do't see how you can then ad the autonum columns to it (which you need for the sorting in your second query)

Grant Palin
07-24-2004, 01:38 AM
I would use an INSERT INTO

INSERT INTO temptable (namevar) SELECT namevar FROM originaltable WHERE yearvar=2003 ORDER BY totalvar;
change column and tablenames + the '2003' will be filled in dynamically

that's what I was thinking of...just got the names wrong...lol :p

Well, I've got partial success...The municipalities are now ordered by size, from largest to smallest, and the municipalities are being kept together by name for the most part, but not always (some municipalities are still being seperated by others, so extra rows appear).

Where the data is coming up correctly, it's sometimes being repeated across the table in extra columns.

I'm not sure what to do about this, but it's time to go home...I'll pick this up again on Monday.

Grant Palin
07-26-2004, 11:11 PM
Well, I've fixed one problem, but still have the other one.

The problem I fixed was the fact that extra columns werre appearing for each year when I selected multiple years, witht the number of columns growing exponentially! The number of year columns was equal to the square of the number of selected years! For example, if I selected 1 year, only one year column would appear. Fine. But if I selected 2 years, I would get 4 columns! And 9 columns for 3 selected years, etc...All the way up to 28 years, which resulted in a VERY wide page!

I believe this was because when joining the tables (the existing data table and the temp one), I was getting the extra columns because of not completely matching columns (ie the year columns). So what I did was to modify the temp table create statement to include a column for years, and included the years column when I filled the table dynamically. Then I just added an extra section to my where clause, matching the year column in my temp table to the year column in my data table. And that problem was fixed! No more extra columns!

I'd had that problem with databases before, getting extra results because I wasn't being specific enough about what I wanted. I believe this was the problem here as well.


The other problem is that municipalities are being seperated when ordering by size. When I order by size, I LITERALLY order by the size of the municipalities involved. So two municipalities may have sizes very close to other, to the point where the sizes overlap. When this happens, I get extra rows for those municipalitioes, because the query is ordering municipalities by size.

Perhaps an example would help...
(the first column is for the municipality name, the second is for 2002, and the third is for 2003)


Kamloops 80,749 80,416
Victoria 76,987 (empty)
Nanaimo 76,736 (empty)
Victoria 76,387 (empty)
Nanaimo 76,344 (empty)
Prince George 75,609 74,849

See what I mean? Victoria and Nanaimo have overlapping size for the years involved, so they are ordered by size. But Kamloops and Prince George don't have that problem, because their sizes are fairly different. So they are kept together, but others are not.

This is not a problem for one year, because there's only one year column. But the problem becomes apparent when multiple years are involved, since there is more chance for disruption.

Anyway, that's the gist of the problem now. I'm working on analyzing the way I'm doing the querying, and hopefully I'll be able to determine a way to solve the problem.

Any thoughts?

raf
07-26-2004, 11:17 PM
post the query you use to join the original table with the temporarely one.

Grant Palin
07-26-2004, 11:33 PM
SELECT
mun.munid AS munid,
mun.munname AS munname,
types.muntypeabbrev AS muntype,
mundata.yr,
mundata.total AS total,
temp.ordervar
FROM
municipalities mun,
muntypes AS types,
mundata, temp
WHERE
mundata.munid = mun.munid AND
mun.muntypeid = types.muntypeid AND
((mun.munid = 35010) OR (mun.munid = 21007) OR (mun.munid = 53023) OR (mun.munid = 17034)) AND
((mundata.yr = 2002) OR (mundata.yr = 2003)) AND
temp.yr = mundata.yr AND
temp.munid = mundata.munid AND
temp.munid = mun.munid
ORDER BY
temp.ordervar DESC

(formatted so it's easier to read. I prefer to read it this way than in a long line!)

I get no errors with this, but not quite the right output either :/

The query generates the same results as in the example I gave before.

raf
07-26-2004, 11:54 PM
Qhould be something like

SELECT
municipalities.munid AS munid,
municipalities.munname AS munname,
muntypes.muntypeabbrev AS muntype,
mundata.yr,
mundata.total AS total,
FROM
((mundata INNER JOIN temp ON mundata.munid = temp.munid) INNER JOIN municipalities ON mundata.munid = municipalities.munid) INNER JOIN muntypes ON municipalities.muntypeid = muntypes.muntypeid
WHERE
municipalities.munid In (35010, 21007, 53023, 17034) AND
mundata.yr In (2002, 2003)
ORDER BY temp.ordervar DESC


and temp should only contain ordervar and munid. When inserting the records, you'r gonna order on a total for a year or so, but that shouldn' be stored in the temp

Grant Palin
07-27-2004, 12:14 AM
Okay, I tried that. I didn't think it was necessary to explicitly join like that, since I was selecting from multiple tables, but it seems to have made a difference...

Here is what I get now...notice the extra columns and rows (and repeated data).


Kelowna 101,708 103,421 101,708 103,421
Victoria 76,987 76,387
Nanaimo 76,344 76,736
Victoria 76,987 76,387
Nanaimo 76,344 76,736
Prince George 74,849 75,609 74,849 75,609

raf
07-27-2004, 12:57 AM
If you only have the ordervar and munid in your temp-table, then you can never get that result.

Just to make sure: inside the temp-table --> each munid should only appear here once. else there is something wrong with the query to populate this table

if that is OK. then try the query without the 2 descriptive table --> just the temp and mundata.

else post the whole thing:
- new db-graph
- query to create the temp-table
- query to populate it
- query to join the tables

Grant Palin
07-27-2004, 01:40 AM
I had just the munid and total columns in the temp table...but each selected munid appeared twice in the table. I tried selecting just Kelowna, and it's id was the only one in the table, but it appeared twice.

Here's the sub to create and populate the temp table:


Sub CreateTempTable(strRegions, strYears)
Dim strCreateTempTable, strDeleteTempTable, strPopulateTempTable

strDeleteTempTable = "DROP TABLE temp"

strCreateTempTable = "CREATE TABLE temp (ordervar COUNTER constraint pk_temp primary key, MunID NUMBER)"

Dim strInsertQuery
strInsertQuery = "INSERT INTO temp (MunID) SELECT munid FROM mundata WHERE munid IN(" & Join(strRegions, ",") & ") AND yr IN(" & Join(strYears, ",") & ") ORDER BY total"

Dim objConn, objCommand

Set objConn = Server.CreateObject("ADODB.Connection")
Set objCommand = Server.CreateObject("ADODB.Command")

objConn.Open strConnect

objCommand.ActiveConnection = objConn

objCommand.CommandText = strDeleteTempTable
objCommand.Execute

objCommand.CommandText = strCreateTempTable
objCommand.Execute

objCommand.CommandText = strInsertQuery
objCommand.Execute

Set objCommand = Nothing
objConn.Close
Set objConn = Nothing
End Sub


The query for results:


SELECT
municipalities.munid AS munid,
municipalities.munname AS munname,
muntypes.muntypeabbrev AS muntype,
mundata.yr,
mundata.total AS total
FROM
((mundata INNER JOIN temp ON mundata.munid = temp.munid) INNER JOIN municipalities ON mundata.munid = municipalities.munid) INNER JOIN muntypes ON municipalities.muntypeid = muntypes.muntypeid
WHERE
municipalities.munid IN (35010) AND
mundata.yr IN(2002,2003)
ORDER BY
temp.ordervar DESC,
mundata.yr


When I run that, with Kelowna and 2002 and 2003 selected, I get the following output:


Municipality 2002 2003 (empty) (empty)
Kelowna 101,708 103,421 101,708 103,421

Again, note the repeated data.

raf
07-27-2004, 08:33 AM
The problem originates here

strInsertQuery = "INSERT INTO temp (MunID) SELECT munid FROM mundata WHERE munid IN(" & Join(strRegions, ",") & ") AND yr IN(" & Join(strYears, ",") & ") ORDER BY total"

Why isn't it simply

strInsertQuery = "INSERT INTO temp (MunID) SELECT munid FROM mundata WHERE yr = strYears ORDER BY total"


where strYears is then the most recent of the selected years?

Grant Palin
07-27-2004, 07:47 PM
I must have forgotten to take my brain out of it's jar on the day I wrote that code!:p

I eliminated the criteria for the region. I had put that in there since I thought it would be better to limit the number of regions added to the temp table to those that were selected. Go figure.

I also removed that criteria for all those years, and added the criteria for the last year in the selected range. I actually had to get the value at the upper bound of that variable, since that strYears variable is an array of selected years.

So my insert query was more like:


strInsertQuery = "INSERT INTO temp (MunID) SELECT munid FROM mundata WHERE yr = " & strYears(UBound(strYears)) & " ORDER BY total"


After all that, I crossed my fingers and ran the query. And it worked. After all that trouble, that's all I had to do! :eek:

So now the municipalities are being kept together, and sorted by size. So I get at most 157 rows in the resulting table (which is the number of municipalities).

So I've got my order-by-size query working. I had to go back and fix the code a little bit for my order-by-name query (what we talked about before), since the queries are being built in the same function; the query returned depends on variables based on the user-selected options. Because I had made changes to variables to get the query working for size, I had to make similar changes to the code that generates the name query. But it's working now!

Now onto order-by-regional district! :p

Thanks again for all your help and patience! :thumbsup:

raf
07-27-2004, 08:44 PM
You're welcome. Glad you got it running :thumbsup:

needhelp26
08-02-2004, 03:46 AM
Raf and Grant palin,

Sorry to interrupt, but I have question,

"(access had a query wizard for crosstabs) and store the result in a new table. Then you can just select this agregated table."

The above statement I took from Raf's answer to Grant palin. After Crosstabs query, to store the result as new table, have to use the "maketable query", but my question is once we connect to aggregate table after using the maketable query, will the data get updated as and when the original table is getting updated?

Again sorry to poke my nose in, since I have a similar project to do, I am curious.

raf
08-02-2004, 09:40 AM
Note sure i understand the question completely + my intensive-access-using-days are way in the past.

but:
- normally, in MsAccess, a maketable query will automatically first drop an existing table with that name.
- if you need a datastructure that gets automatically updated each time you call it, then you need to use a view.
A view is basically a table without data. It only contains the datadefinition and will create a temporarely datacollection when the view is used. Don't know for sure if the crostabwizard can create views but i suppose it can.
Incremental refreshes are not supported in MsAccess (as far as i know). Only used these in Oracle.
- for Grant Palins situation, the question is irrelevant since he has a "drop table" statement in his script + each temporarely table might contain different data
<edit>It is also important to note hat the crosstab-query returns agregated data of your underlying tables. If there is frequent datamanipulation on the underlying tables, then using a crosstab is not a good option (read the argument about storing agregated data here and in the initial thread).
But since it's real historical, fixed, data, there is no real argument against creating this sort of agregated results</edit>

needhelp26
08-03-2004, 12:17 AM
Thankyou for explaining. I am not sure of the views in MS Access, have to check them out.



EZ Archive Ads Plugin for vBulletin Copyright 2006 Computer Help Forum