View Full Version : Foreign Key Optimization
StupidRalph
02-17-2008, 12:48 AM
I have a table for keywords. Right now I have it set up with "id, keyword" columns id obviously being the primary key (pk). Since the keywords are going to be unique anyway should I:
just ditch the `id` field and use the `keyword` field as the pk?
use a composite primary key?
leave the table as is with id as the pk?
I realize it may not be that big of a difference but I was wondering what the advantages/disadvantages would be when it comes to adding foreign keys (fk).
On one hand I'd have to use the whole keyword (or phrase) as the fk but will bloat the table using it since there are a few phrases. On the other hand, I may have 250,000 or more keywords so I'd have to at least use mediumInt which is 3 bytes. On my third hand...yes third :) I don't know about the primary composite key other than adding to both indexes every time but its read/write ratio would be no more than 80/20.
I don't expect drastic results just wondering others opinions.
Fumigator
02-17-2008, 06:39 AM
You're actually tapping into a hotly debated database design topic: Natural key vs. Surrogate key. Google the two terms together and you'll get an idea of what the debate is all about.
Some say you should always construct the primary key from fields that exist in the table, no matter what, even if you have to glue several columns together to make a unique key. These "naturalists" claim that surrogate (i.e. sequential unique ID) keys shouldn't be used because the value doesn't logically relate to the data in the row. They claim that surrogates make the design non-relational, and that it makes it possible to insert duplicate rows, because it would allow you to insert a "Mary Jane" with SSN "999-99-9999" in twice, where if your key were made up of last name + first name + SSN it would be impossible to insert that value twice.
The "surrogatists" point out the advantages of using a surrogate key. It never changes, therefore you don't have to worry about updating foreign key values in dozens of tables when, for example, someone's last name or SSN changes. A surrogate key is also easy to use as a foreign key. Imagine a natural primary key made up of last name + first name + date of birth + place of birth (and that still isn't unique), then having to carry those four columns as a forgein key in several other tables. (UGH)
For me it boils down to a personal choice, and keeping the database functional. I prefer surrogate keys in just about all situations, but you've presented a situation where I think using a natural key actually makes sense. It's a single column, so foreign key values in other tables would be straight-forward, and as an added bonus, your foreign key contains the keyword value so you don't have to join with the keyword table to get that value. Handy.
I've gone on long enough... my vote is use the keyword column for a primary key. The only issue is when a keyword value is modified you'll have to update foreign keys in other tables but if you can deal with that, then I think it's the better option.
I found that to be very useful Fumigator.
As you (probably) know, I am new to MySQL and already I have been wondering about this. Currently my queries check to see if the values I want to add to a table are already there, so as not to allow duplicates.
Say for example, I want to add a record to a table. I check (usually) three of the values-for-entering against the appropriate field values already in the table. if there is a match the script errors, and says that I have just tried to make a duplicate record. So accidentally, I think I was doing something similar but perhaps less efficiently.
great post
bazz
StupidRalph
02-17-2008, 07:46 PM
I'm actually reading up on the debate on this. It seems to be a case based decision. For the most part a lot of experts recommend using a surrogate key but do concede that using a natural key might be best for some lookup tables.
Currently my queries check to see if the values I want to add to a table are already there, so as not to allow duplicates.
MySQL has the keyword IGNORE that could be used for rows that are duplicate.
http://dev.mysql.com/doc/mysql/en/insert.html
vBulletin® v3.8.2, Copyright ©2000-2009, Jelsoft Enterprises Ltd.