...

View Full Version : Database Design



masterz2k
06-17-2011, 09:07 AM
Hi All,

Newbie coder here, but looking for advice on a project that im currently playing with.

Im after your thoughts on SQL Table Design.

I have aprox 2 Million Rows which has potential to increase to about 5 Million.

Feilds i need to store are.

ID - 7 Character ID
Long ID - Up to 20 Character long ID

now the part im stuck on.

Codes - this will contain alot of 3 digit codes, which will all be seperated by a delimiter. (aprox 200 3 digit codes, like 100;102;102;103 etc.

am I better of storing codes in one long field? (is there a char limit?) or a separate column for each code. is 200 columns?



the other way i thought was to go downwards and just have ID, Long ID, Linked to another table with ID, 3 Digit Code, but this would mean I would have 2million x 200 rows = 400million rows. (and alot of duplication of ID)

Can anyone suggest another way ? and aprox how much storage space this is likely to use?
:)

guelphdad
06-17-2011, 05:30 PM
Don't store multiple values in a single column.

Go with the second idea you have. Don't worry about the number of rows. Properly normalized and indexed you won't worry about performance like you would in the first design.

masterz2k
06-17-2011, 05:47 PM
my only reason for the multipul values in one column was because the value would never need to be searchable...

im literally puling out the values to display on a web page.

i guess the advantage of the second design is the code row can be limited to 3 chars and might help save space!


thanks for your input :)

any idea what sort of space or spec hosting this would require?

bazz
06-18-2011, 03:20 AM
my only reason for the multipul values in one column was because the value would never need to be searchable...

Never is quite a long time. Always design a database by following the rules of normalisation. That way, if your requirements ever change, the db will be best designed to facilitate them.

Out of curiosity; what's with the seven digit ID? do you want it to be a minimum of seven digits or are they all to be seven digits? for IDs, I use INT or one of its derivatives.

masterz2k
06-18-2011, 10:50 AM
Never is quite a long time. Always design a database by following the rules of normalisation. That way, if your requirements ever change, the db will be best designed to facilitate them.

Out of curiosity; what's with the seven digit ID? do you want it to be a minimum of seven digits or are they all to be seven digits? for IDs, I use INT or one of its derivatives.



The 7 Digit ID will be alphanumeric, up to a max of 7 digits;

one of the issues i have is that the 7 digit id could change, however the 20 Char ID will remain the same, so il probably split them into 2 tables with a INT as a link between the 2, that way if i decide to go the long row option, i will only be duplicating the ID, rather than the 20 char ID


one remaining question, am i better off having 200 columns keeping each 3 digit code seperate, or should i go long rows with the ID duplicated down through?

ie, 2 Million Rows, 200 columns wide,

or 400Milion rows 2 of 3 columns wide.

Old Pedant
06-18-2011, 08:57 PM
400 million rows. It would be a NIGHTMARE to have 200 columns in a table.

I do have to say that if you are 100% absolutely sure that you will *NEVER NEVER NEVER* need to search for any values in these 3-digit codes, I might be tempted to say go ahead with the delimited list. It's a truly crappy database design, but there are practical considerations here.

For one think, retrieving 100 rows in your secondary table in order to then display the resultant values as a delimited list on your web site will be a pain in the patootie (though MySQL makes it not *too* hard, if you use GROUP_CONCAT( ) function).

It's kind of a case of good DB design vs. good web usage.

masterz2k
06-20-2011, 02:58 PM
Hi Old Pedant,

Thankyou for the input,

my issue is i'l idealy be running this on as smaller hosting package as poss.

im confident id never want to search by individual code in the future,

also thinking that using my delimited theory, im essentialy needing to query 1 line of data out of 2 million, rather than 200 lines of data out of 400million lines, surely from that point of view it will be quicker/less resource using the delimited option.

i will only ever want to put in 1 id and get the resulting results. never need to pull out a range of anything seperate

i know its not brilliant DB design, but like you say, its DB Design vs praticality for the web project.



EZ Archive Ads Plugin for vBulletin Copyright 2006 Computer Help Forum