Go Back   CodingForums.com > :: Server side development > MySQL

Before you post, read our: Rules & Posting Guidelines

Reply
 
Thread Tools Rate Thread
Enjoy an ad free experience by logging in. Not a member yet? Register.
Old 10-19-2012, 03:16 PM   PM User | #1
Stooshie
Regular Coder

 
Stooshie's Avatar
 
Join Date: Mar 2008
Location: Dundee, Scotland
Posts: 376
Thanks: 9
Thanked 39 Times in 39 Posts
Stooshie is on a distinguished road
Question Many REAL columns with 100,000 rows

Hi there,

I am working on a system that will be analysing medical trial data. One table needs multiple columns(maybe up to 50) of type REAL and will have 10s of 1000s if not 100s of 1000s of rows.

I know I can index real columns and I may need to. Is this going to make the DB vastly huge and unworkable for such large datasets?

Regards,

Andrew.
__________________
Regards, Stooshie
O
Stooshie is offline   Reply With Quote
Old 10-19-2012, 07:39 PM   PM User | #2
Old Pedant
Supreme Master coder!

 
Old Pedant's Avatar
 
Join Date: Feb 2009
Posts: 23,217
Thanks: 59
Thanked 3,996 Times in 3,965 Posts
Old Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to all
50 type reals (at 8 bytes each) is 400 bytes.

100,000 records at 400 bytes each is 40 MB.

The largest MySQL table I've created (so far!) was ONE THOUSAND TIMES LARGER at 40 GIGABYTES. And, with proper indexes, I could get anything I needed from that table in about 100 milliseconds.

Pardon me for laughing, but why would you possibly be worried abut such a very tiny database?
__________________
An optimist sees the glass as half full.
A pessimist sees the glass as half empty.
A realist drinks it no matter how much there is.
Old Pedant is offline   Reply With Quote
Old 10-23-2012, 01:13 PM   PM User | #3
Stooshie
Regular Coder

 
Stooshie's Avatar
 
Join Date: Mar 2008
Location: Dundee, Scotland
Posts: 376
Thanks: 9
Thanked 39 Times in 39 Posts
Stooshie is on a distinguished road
It's not the size of the DB itself I am worried about. It's the indexing of 50 or so REAL or DECIMAL columns in a DB with 100,000 rows that is making me wonder about the size issue.
__________________
Regards, Stooshie
O
Stooshie is offline   Reply With Quote
Old 10-23-2012, 08:08 PM   PM User | #4
Old Pedant
Supreme Master coder!

 
Old Pedant's Avatar
 
Join Date: Feb 2009
Posts: 23,217
Thanks: 59
Thanked 3,996 Times in 3,965 Posts
Old Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to all
You need to index *ALL* the columns? WOW. I have to wonder if that's the best design then. Maybe an associated table, instead?

Example:
Code:
CREATE maintable (
    id INT AUTO_INCREMENT PRIMARY KEY,
    ... various fields but not the REAL data ...
);

CREATE realdatatable (
    id INT REFERENCES maintable(id),
    flag INT, /* describes the type of the data
    data REAL,
    INDEX masterindex (flag, data)
);
Now you have only the one index. Granted, it's a big index, but that's not an issue for MySQL. So your queries would do something like
Code:
SELECT list, of, fields 
FROM maintable AS M, realdatattable AS R
WHERE M.id = R.id
AND R.flag = 33
AND R.data BETWEEN 3.771 AND 4.032
Just a stab at a maybe solution. Dunno if it fits your model or not, at all.

********

Having said all the above: Though 50 indexes seems excessive, MySQL should be able to handle them just fine. If you use EXPLAIN and find it using the wrong index in some cases, you can use "hints" in the SELECT (usually) to force it to use the right one(s).
__________________
An optimist sees the glass as half full.
A pessimist sees the glass as half empty.
A realist drinks it no matter how much there is.
Old Pedant is offline   Reply With Quote
Old 10-25-2012, 01:19 AM   PM User | #5
Stooshie
Regular Coder

 
Stooshie's Avatar
 
Join Date: Mar 2008
Location: Dundee, Scotland
Posts: 376
Thanks: 9
Thanked 39 Times in 39 Posts
Stooshie is on a distinguished road
Thanks for that.

It's a DB of medical research data so each column of data needs searched on for CHI Squared tests etc...

Andrew.
__________________
Regards, Stooshie
O
Stooshie is offline   Reply With Quote
Old 10-25-2012, 03:24 AM   PM User | #6
Old Pedant
Supreme Master coder!

 
Old Pedant's Avatar
 
Join Date: Feb 2009
Posts: 23,217
Thanks: 59
Thanked 3,996 Times in 3,965 Posts
Old Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to all
Quote:
Originally Posted by Stooshie View Post
It's a DB of medical research data so each column of data needs searched on for CHI Squared tests etc...
Yes, but nobody said you have to *STORE* the data in 50 columns in a single table, did they?

*DO* consider the idea of "tagged" data, instead.

Not saying you have to end up choosing that scheme, but there are advantages to it.
__________________
An optimist sees the glass as half full.
A pessimist sees the glass as half empty.
A realist drinks it no matter how much there is.
Old Pedant is offline   Reply With Quote
Reply

Bookmarks

Tags
datasets, index, large, real

Jump To Top of Thread


Thread Tools
Rate This Thread
Rate This Thread:

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off

Forum Jump


All times are GMT +1. The time now is 09:23 PM.


Advertisement
Log in to turn off these ads.