Enjoy an ad free experience by logging in. Not a member yet? Register.
Results 1 to 8 of 8
09-19-2013, 04:46 PM #1
Junction Table and (Physical) Primary Key
This past Summer I came up with a new strategy for dealing with Keys in my tables. And while it seems to be working just fine, I want to backtrack and get an outside opinion on whether what I am doing makes sense.
Specifically, I am interested in my approach and how it relates to Junction Tables.
I have always had mixed feelings about Derived AutoIncrement Primary Keys.
On one hand, they are good because the database ensures you always have a unique primary key.
On the other hand, they tend to be meaningless, and while AutoIncrement ensures your PKs are unique, it does nothing to ensure you don't have this...
ID User --- ----- 1 DoubleDee 2 DoubleDee 3 DoubleDee 4 DoubleDee
As such, I started using this approach to get the best of both worlds.
- Every table has an ID that is AutoIncrement
- The ID field is indexed as Unique
- Now I am guaranteed to get my ID=1, 2, 3, ...
- Then I try and find something "physical" that naturally describes the record and *should* ensure uniqueness (e.g. Username, Email, etc.)
This has served me well on single tables, but I'd like an outside opinion on how it impacts my Junction Tables...
Imagine this scenario...
member - id (UK) - username (PK)Code:
article - id (UK) - slug (PK)Code:
comment - id (UK) - article_id (PK1)(FK) - member_id (PK2)(FK) - created_on (PK3)
In the example above, I chose to use the ID's for linking to be more efficient.
My question is whether I am gaining any great benefit (or causing any undo harm) in my Junction Table by having the ID be a UK, and then creating a Composite PK from the two Foreign Key fields along with the Timestamp??
A lot of people would do this....
comment - id (PK) - article_id (FK) - member_id (FK) - created_on
To me, it is better to create the Composite Key because it (physically) ensures unique records. (Although since this example uses a Timestamp, that is less relevant.)
What do you think?
09-19-2013, 10:24 PM #2
The only real disadvantage of a composite key in MySQL is that MySQL has some severe limitations on when it can use such a key.
In particular, given your composite key of (article_id, member_id, created_on), MySQL would only be able to take advantage of that key (e.g., when one or more of those fields appeared in your WHERE clause) if the conditions in your query included a test for:
article_id and member_id and create_on (in any order, but all three)
article_id and member_id (ditto)
Any queries that used conditions other than those (e.g., a WHERE clause that used only member_id, or one that used article_id and create_on) would *NOT* use the index, so you'd be reduced to a full table scan.
This is well documented in the MySQL docs and is a limitation that may not appear in other databases.
Be yourself. No one else is as qualified.
09-19-2013, 10:38 PM #3
I guess I was under the impression you should always do that on a Junction Table...
(Again, I just get this funny feeling about blindly having a field labeled "id", making it a PK, and calling it quits. To me, an Index/Constraint help ensure you have "good" data in your tables to some degree - as I am trying to do.)
1.) Name: Primary
Fields: article_id + member_id + created_on
2.) Name: index_u_id:
3.) Name: idx_article_id
4.) Name: idx_member_id
If I had queries that needed to look for all Comments for Article 123, or all Comments by DoubleDee, wouldn't Indexed #3 and #4 work?
09-19-2013, 10:41 PM #4
Oh, sure, with all those added indexes you are just fine.
And you can actually DROP the idx_article_id since MySQL could and would use the PK for it, anyway.
Be yourself. No one else is as qualified.
09-19-2013, 11:23 PM #5
Sorry to *mushroom* this thread, BUT here are some issues that have me upset...
What should be the Primary Key of any table?
Should it always be the "ID" field with AutoIncrement set?
Or should it be a "Natural" key if one exists?
When you helped me with my Section/Subsection/Article stuff, I opted for making all of the PK's "natural" for readability during joins.
(I'm not sure I feel so strongly in cases like my Comments table...)
Does a Junction Table need a Composite Key on the Foreign Key fields to prevent dups?
By having a Composite Primary Key on (article_id + member_id + created+on) I am able to prevent this scenario...
(Maybe that is unlikely, but it just seems weak to have the ID be the PK and no other constraints in place to keep your data honest.)Code:id article_id member_id subscribed_on --- ----------- ---------- -------------- 1 5678 38 2013-09-19 2 5678 38 2013-09-19 3 5678 38 2013-09-19 4 5678 38 2013-09-19
Some people have shot down using Unique Indexes because they allow NULLs, and thus are deemed "weak".
My feeling was that if every table starts off with an "ID" field set to AutoIncrement and Null = No, then my UNIQUE Index could never be NULL and so it should serve as a good "pseudo" Primary Key.
As time goes on, I am flipping more on what should be the Primary Key: Surrogate ID field vs. Some Natural Field
I think that is what has me most confused and upset.
For my Section/SubSection/Article thingy, I think making all of the "slug" fields (e.g. Section Slug = "Finance") the Primary Key was totally the right decision.
But do I really want a Comment table which has a Composite Primary Key like this...
Code:"postage-meters-can-save-you-time-and-money" + "DoubleDee" + "2013-09-19 15:29"
Maybe just switching from this...
Code:comment - id (UK) - article_id (PK1)(FK) - member_id (PK2)(FK) - subscribed_on (PK3)
To this would be enough...
Code:comment - id (PK) - article_id (UK1)(FK) - member_id (UK2)(FK) - subscribed_on (UK3)
Regardless, I still think this way is a bad idea...
Code:comment - id (PK) - article_id (FK) - member_id (FK) - subscribed_on
Code:- article_id (PK1)(FK) - member_id (PK2)(FK) - subscribed_on (PK3)
09-19-2013, 11:30 PM #6
#2: Yes, of course, if you indeed don't want to allow duplicates.
#3: Silly to worry about NULLs in unique keys if you simply disallow NULL in your fields, which in that above case you clearly will be doing (for all fields, presumably).
Have to leave for the day. Will think on the others.
Be yourself. No one else is as qualified.
09-19-2013, 11:44 PM #7
- Join Date
- Sep 2005
- Sydney, Australia
- Thanked 693 Times in 682 Posts
Any table that contains a derived key as well as a natural key is not fully normalised. Normalising it will get rid of the derived key.
While there are sometimes reasons for using a derived key in addition to a natural one (eg the natural key is very large) but the decision to undo that part of the normalisation should be done after the logical database design is completed when applying those changes to the logical design needed to make your physical design practical. Where the logical design can be converted to a practical physical design without undoing this normalisation then the derived key should not be added. Everywhere that your physical design differs from a fully normalised logical design you should have a specific reason for that change.
09-20-2013, 12:01 AM #8
It is also entirely possible to use *just* Natural Keys for my Primary and Foreign Keys.
So that meets what you are saying about the "Logical Model Design Phase".
I am now in implementation, and I have chosen to add an "ID" field that is AutoIncrement, Null = No, and has a Unique Index to these tables because having such a field in each table...
1.) Makes it easier to identify a particular record. (Why try and find an Article Slug that is a VARCHAR(100) when I can simply look for ArticleID=3975?!)
2.) It makes joins much easier.
3.) As things grow, it *may* have performance implications, although that is yet to be seen.
4.) There is virtually *no overhead* in having a ubiquitous "ID" field with AutoIncrement and a Unique Index, so why not?!
BTW, thanks for the thoughts, but you clearly must be from the "Natural Key Camp", because there are millions of DBA's out there that would vehemently disagree.
They would say that you should always have an AutoIncrement ID field on every table - in addition to the Natural Keys. Further, they would insist that you *always* use a Surrogate Key to join tables!!
To me, that is as extreme as saying to never use Surrogate Keys - which I am not implying you said.
For me, I want to balance these two extremes, and ideally, have it both ways!