PDA

View Full Version : Correct way to store this


Michiel
02-14-2008, 11:52 AM
Hi,

I have a question about how to store my data in a smart way. Let me explain my situation:

I want to be able to create news messages for a website I'm building. A newsmessage will contain general information, like title, message, date etc. and I'm currently storing that as follows:

id#title#message#date

Now I also want to be able to add a category to the news message, in the form of an id number. I planned to simply store this id-number in a seperate field:

id#title#message#date#category

However, I've come to the conclusion that several newsitems can belong to different categories. So I figured to store them as follows: "1,5,6,7,9,67". But then I run into problems with fetching the appropriate newsitems for each category .. Up untill now I have about 500 different categories. And it is likely that the number of categories will expand on a regular basis.

People already pointed me out that this way of storing the data is not smart and is likely to cause problems later on .. but up until now it isn't clear what is the better route to follow.

Can anybody share his/her thoughts on this?

Any help would be greatly appriciated!

Cheers, Michiel

nikos101
02-14-2008, 11:59 AM
You should definitely create a separate table that joins your articles with another table that stores the different types of article. this allows you to have your articles assigned to many Different types.



CREATE TABLE `articleDefs` (
`ArticleID` smallint(5) unsigned NOT NULL,
`TypeID` smallint(5) unsigned NOT NULL,
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

You can then use join statements to retrieve your metadata as you wish.
I hope this helps

Michiel
02-14-2008, 01:25 PM
Ok .. And then I would end up with something like this for example:

ArticleID#TypeID
2#1
2#5
2#6
2#7
2#9
2#67
3#1
3#8
etc.

Excuse me for being novice .. but what would my select statements look like? To get correct news articles based on the category (TypeID) and to get all the categories for a specific article?

Thanks very much so far!

Michiel

nikos101
02-14-2008, 01:35 PM
The easiest is the natural join

$query="SELECT * FROM articles NATURAL JOIN articleDefs NATURAL JOIN articleTypes where typeID= 'your_ID'";


This gives you the flexibility you need. However you are not just Limited to articles types, you could create any other type of metadata table to describe your articles!

nikos101
02-14-2008, 01:39 PM
or this which would be better semantically

$query="SELECT * FROM articles NATURAL JOIN articleDefs NATURAL JOIN articleTypes where articletype = 'your_type'";

which would save you having to remember ids

nikos101
02-14-2008, 01:43 PM
for all categories for an article use

$query="SELECT * FROM articles NATURAL JOIN articleDefs NATURAL JOIN articleTypes where articleID= 'your_ID'";

Michiel
02-14-2008, 01:52 PM
Ok cool! I think I'm starting to understand this method of working. Indeed it seems as a much smarter way of doing this!

Since we're at the topic, I'm also thinking of adding relationships between the categories. So for example category 1 is related to category 2, 6, 10, 25 etc. I think a similar approach would be appropriate as well. The only difference with the news articles, is that there is a reverse relationship as well. So when I lookup category 6 I want to know that it is related to category 1 as well.

What is a smart way to store this then, without having tons of redundant data?

Thanks for thinking along with me!

Michiel

nikos101
02-14-2008, 03:40 PM
Yes I would keep with the same idea

Michiel
02-14-2008, 03:54 PM
Ok .. I'm now thinking of setting it up as follows:

category#related_category

but then by default I would be storing the same relationship twice:

category#related_category
1#2
1#3
1#5
2#1
3#1
5#1

This doesn't seem to be a good practice, right? Or is it better to rename the columns to category1 and category2 and store once. That means my SELECT query would then become 'WHERE category1 = $id || category2 = $id' ..

Can you advise me on which setup to use?

nikos101
02-14-2008, 04:05 PM
I see what you mean. Its getting complex now :)

If your category data is not huge say < 60 relationships do it as your doing above and try and avoid duplicating rows

1#2
2#1 eg

but if there is going to be a whole load of relationships your going to find it hard to keep track of in one table, and in that case I would define a separate table for each category that only has one column which has the links to the other categories.

Michiel
02-14-2008, 04:18 PM
Well .. I'll have let's say 500 different categories (which may easily double over time, so already thinking about that). As far as the relations go, I estimate each category will have no more that 10 - 20 related categories. So that would result in a table with a maximum of 500 * 20 = 10,000 rows in the current situation (if I strip out the doubles) and 1000*20 = 20,000 rows over time ..

That wouldn't cause problems, would it?