View Full Version : SQL - multiple joins
Spudhead
09-03-2003, 12:36 PM
Tables:
teams - id, teamname
players - id, playername
team_players - team_id, player_id
player_scores - id, player_id, player_score
SELECT
teams.teamname,
players.playername
FROM teams
INNER JOIN team_players ON team_players.team_id=teams.id
INNER JOIN players ON team_players.player_id=players.id
WHERE teams.id=19
Access 97 gives me a "Syntax error (missing operator)" and I'm not sure why.
Once I've had it explained patiently for the umpteenth time how to do multiple joins, I'd like to expand the query above so it includes the sum of the player scores; so I get a table returned with a row for each player; a column for the team name, a column for the player name and a column that sums all entries in the player_scores table for that player id.
And I'm buggered if I can work out how to do it. Anyone? Please? :)
The query should be
SELECT teams.teamname, players.playername, Sum(player_scores.player_score) AS totscore
FROM (players INNER JOIN player_scores ON players.id = player_scores.player_id) INNER JOIN (teams INNER JOIN team_players ON teams.id = team_players.team_id) ON players.id = team_players.player_id where team_players.team_id = 19 GROUP BY teams.teamname, players.playername, player_scores.player_id
But your db-design should better be changed. Because the "total player scores" is a players attribute, so they should be a variable in the players table, unless you have a 'game' variable inside that player_scores table. But even then. If you need this total score frequently, you best have it as a variable in the players table, and update that variable each time that player gets a new record in the player_scores table.
Spudhead
09-05-2003, 10:21 AM
Aha - I think I understand the syntax, thanks.
If you need this total score frequently, you best have it as a variable in the players table, and update that variable each time that player gets a new record in the player_scores table.
Thats what I've done. A player_scores table seemed the best schema - it doesn't carry any sort of game id, date value, but it could if required.
I still don't think that the player total score belongs, technically, in the players table; it's an aggregate function on the player_scores table, there's no point in storing the data twice. But it is handy to have in there :)
All I have to worry about now is entering the player scores; every week when the new scores go in, I'm firing off about 350 insert statements....
:thumbsup:
I still don't think that the player total score belongs, technically, in the players table; it's an aggregate function on the player_scores table, there's no point in storing the data twice. But it is handy to have in there
You will change your mind in time. I'm pretty sure of that.
You probably learned some db-design and they teached you that you should not have any data-redundancy, in your db. From that point of view, you're rigth.
But db-design and a perfectly normalized db isn't a goal in itself. There are other, technical, rules you need to observe when designing a db (luckyly for db-admins with a fat paycheck, cause otherwise everyone just needed to learn a few rules of thumb to do it themselves ...)
I tried to explain it in this thread :
http://www.codingforums.com/showthread.php?s=&threadid=17608&highlight=compute
It comes down to :
- identify which query's need to be ran frequently, are critical and should be fast
- identify the query's that need to be ran periodicaly, and don't need to be fast
- then build your db-design so that it is optimized for your identified an prioritised data-processen
Just think of it. Say you only add new score weekly --> this means 1 field to update each week for players that scored. These inserts and updates could be ran at nigth, when trafic is low. Tablelocking will be minimal and wount noticably disturb other data-querys.
But the values (total scores) will be printed much more frequently, daily, at peakmoments etc. So you migth waste a nanosecond weekly, but you will save seconds, daily, when you need to pull the totscores from the db.
In time, the scorestable will get big and will contain a lott of data from past seasons.
What would be best? Having another extra variable in the playerstable with his 'overall career score' that you weekly update like the 'this seasons score' + archive the records (when the new season starts) in another table OR maintaining this player_scores table and filter on the date to get the records from this season + sum() them and get the records from pastand current seasons and sum on that ?
I don't know if you actually need this sort of functionality and if you are interested in archiving (or if you just delete the scores table at the start of the new season), but i just try to ilustrate that data-redundancy isn't always the main issue when you design a db + that completely normalizing the db sometimes restricts your flexability and db-performance.
whammy
09-06-2003, 01:50 AM
raf has a good point - and if you're dealing with a LOT of records, then SQL Server is much better to use.
I gleaned a lot of useful stuff from taking the "2073" class from Microsoft - "Programming SQL Server", regarding database optimization, especially indexing etc..
One thing with denormalization that raf is probably referring to is regarding SQL Server's ability to create a non-clustered compound index on a table.
This basically acts like a clustered index - everything is stored in the leaf level of the balance tree, and therefore runs much faster since SQL Server doesn't have to go to the "heap" (since we're referring to "trees" you could say "dig around in the dirt") to get data. I believe this is what raf is referring to when he references commonly run queries.
I don't understand it absolutely myself (and neither did my instructor, if you want to get technical), however I do understand it conceptually, and have put indexing to good use!
Sorry if that's confusing, but it really does help to know!
I wasn't exactly thinking about any specific db-format.
It's just the concept or principle i wanted to emphasize.
The thing s i wrote will go for Ms Access as for any db-format. If you use bigger db's like Ms Server or MySQL, Oracle, DB2, ..., then you'll see they all have special tricks for this sort of thing.
For instance, MYSQL has a query-buffer feacher, which means it can keep the resultset of a query in it's buffer, so that he can instantly return the resultset when the same query is asked again.
In Spudheads app, you could for instance have a "top scores"-list. Now, this list wount change between two play-days, so you could buffer the resultset for your top-scores-query.
The more perceptive mind will now think :" Yeah, but with that feature, i don't need the redundancy of the agregates score in the players table, since i can run the three table join query once a week and buffer it." And (s)he would be rigth ! Unless ... you also need the total score per player for other non buffered-querys etc.
You see, there no "one rule fits all" guidelines. It's a matter of analysing, prioritising, knowing what your db has for optimalisation tricks OR choosing the right db-format for your specific needs.
I only wanted to point out that there are other things to concider, then "avoid dataredundancy". The simple fact that indexes exist, is a good illustration that datareduncancy is sometimes better. From data point of view, an index is nothing else then a 'redundant' reference-list. But from performance point of view, it's your best friend.
whammy
09-07-2003, 01:04 AM
Exactly... at times there are reasons you should denormalize your database, for performance. And this has to do with commonly run queries, indexing, database structure, etc.
From your description of how MySQL works, it must be pretty similar to SQL Server in some regards (specifically caching queries, etc.). But that's not surprising. :)
However indexing is not directly correlated with data redunancy in the manner you're talking about, as far as I know.
You can still do a multiple join on a couple of tables and as long as the database engine is taking advantage of indexes on BOTH tables or whatever, then it's still going to run pretty fast...
Of course, noone is right or wrong here - with databases you have to follow one rule:
"It depends"
:D
vBulletin® v3.8.2, Copyright ©2000-2012, Jelsoft Enterprises Ltd.