Go Back   CodingForums.com > :: Server side development > MySQL

Before you post, read our: Rules & Posting Guidelines

Reply
 
Thread Tools Rate Thread
Enjoy an ad free experience by logging in. Not a member yet? Register.
Old 11-17-2012, 09:10 PM   PM User | #1
KULP
Regular Coder

 
Join Date: Mar 2012
Posts: 166
Thanks: 5
Thanked 11 Times in 11 Posts
KULP is an unknown quantity at this point
How to set up tables. Dynamic content?

My question involves a scenario, so I'll use twitter as the example. We have a table with user id, pass, etc. Now, we have a table with all tweets? Does each user have their own table with tweets or, when a page loads, does the server search through EVERY tweet from EVERY user and match user ids? I'm just confused how else you would find a specific user's tweets than to search through every tweet. Hope this makes sense, currently learning the basics but stuck on this idea.
KULP is offline   Reply With Quote
Old 11-17-2012, 10:02 PM   PM User | #2
felgall
Master Coder

 
felgall's Avatar
 
Join Date: Sep 2005
Location: Sydney, Australia
Posts: 5,530
Thanks: 0
Thanked 503 Times in 494 Posts
felgall is a jewel in the roughfelgall is a jewel in the roughfelgall is a jewel in the rough
Make the userid field a part of the primary key for the table and then they can all go in the same table without having to search through everything.
__________________
Stephen
Learn Modern JavaScript - http://javascriptexample.net/
Helping others to solve their computer problem at http://www.felgall.com/
felgall is offline   Reply With Quote
Old 11-17-2012, 11:39 PM   PM User | #3
KULP
Regular Coder

 
Join Date: Mar 2012
Posts: 166
Thanks: 5
Thanked 11 Times in 11 Posts
KULP is an unknown quantity at this point
So I have userid for primary key in user table, and then I have a table for 'tweets.' Don't I now have to search through every tweet to match up userid's?
KULP is offline   Reply With Quote
Old 11-19-2012, 09:40 PM   PM User | #4
Old Pedant
Supreme Master coder!

 
Old Pedant's Avatar
 
Join Date: Feb 2009
Posts: 23,556
Thanks: 62
Thanked 4,055 Times in 4,024 Posts
Old Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to all
Quote:
Originally Posted by KULP View Post
So I have userid for primary key in user table, and then I have a table for 'tweets.' Don't I now have to search through every tweet to match up userid's?
Yes and no. *YOU* don't have to "search" at all. MySQL does.

And the trick to making that search efficient is to create an index on the USERID field in the TWEETS table.

Something like this:
Code:
CREATE TABLE tweets (
    tweetid INT AUTO_INCREMENT PRIMARY KEY, /* optional */
    userid INT,
    tweet VARCHAR( 150 ),
    whenTweeted DATETIME,
    CONSTRAINT FOREIGN KEY tweet_users_fk (userid) REFERENCES users(userid),
    INDEX tweet_userid_idx ( userid )
) ENGINE=INNODB;
The CONSTRAINT says that userid is intimately linked to the primary key userid in the users table; the INDEX makes sure that MySQL can get essentially instant access to all tweets, by userid.

So a typical query might then be:
Code:
SELECT T.tweet
FROM users AS U, tweets A T
WHERE U.userid = T.userid
AND U.username = 'John Doe'
ORDER BY whenTweeted DESC
LIMIT 10
to get the most recent 10 tweets by "John Doe".

And it *will* be a very fast query.
__________________
An optimist sees the glass as half full.
A pessimist sees the glass as half empty.
A realist drinks it no matter how much there is.
Old Pedant is offline   Reply With Quote
Users who have thanked Old Pedant for this post:
KULP (11-27-2012)
Old 11-20-2012, 04:19 PM   PM User | #5
KULP
Regular Coder

 
Join Date: Mar 2012
Posts: 166
Thanks: 5
Thanked 11 Times in 11 Posts
KULP is an unknown quantity at this point
Quote:
Originally Posted by Old Pedant View Post
Yes and no. *YOU* don't have to "search" at all. MySQL does.

And the trick to making that search efficient is to create an index on the USERID field in the TWEETS table.

Something like this:
Code:
CREATE TABLE tweets (
    tweetid INT AUTO_INCREMENT PRIMARY KEY, /* optional */
    userid INT,
    tweet VARCHAR( 150 ),
    whenTweeted DATETIME,
    CONSTRAINT FOREIGN KEY tweet_users_fk (userid) REFERENCES users(userid),
    INDEX tweet_userid_idx ( userid )
) ENGINE=INNODB;
The CONSTRAINT says that userid is intimately linked to the primary key userid in the users table; the INDEX makes sure that MySQL can get essentially instant access to all tweets, by userid.

So a typical query might then be:
Code:
SELECT T.tweet
FROM users AS U, tweets A T
WHERE U.userid = T.userid
AND U.username = 'John Doe'
ORDER BY whenTweeted DESC
LIMIT 10
to get the most recent 10 tweets by "John Doe".

And it *will* be a very fast query.
That's what I was thinking. Now assume I have 20,000 tweets in a database. Or twitter's seemingly infinite number. Is this when php is no longer an option? I mean, must take forever to search through every tweet (server-side, not myself).
KULP is offline   Reply With Quote
Old 11-20-2012, 06:11 PM   PM User | #6
billzo
New to the CF scene

 
Join Date: Aug 2012
Posts: 6
Thanks: 0
Thanked 0 Times in 0 Posts
billzo is an unknown quantity at this point
Quote:
Originally Posted by KULP View Post
That's what I was thinking. Now assume I have 20,000 tweets in a database. Or twitter's seemingly infinite number. Is this when php is no longer an option? I mean, must take forever to search through every tweet (server-side, not myself).
MySQL keeps the indexes sorted and will search for the index to retrieve the requested records. It doesn't have to do a full table scan. And 20,000 records is not very much. I have a table with GPS coordinates for about 2,300 records and MySQL can do a full table scan while performing geometric calculations on the GPS coordinates in a few thousandths of a second. This is much more computationally demanding than a basic select query.
billzo is offline   Reply With Quote
Old 11-20-2012, 09:20 PM   PM User | #7
Old Pedant
Supreme Master coder!

 
Old Pedant's Avatar
 
Join Date: Feb 2009
Posts: 23,556
Thanks: 62
Thanked 4,055 Times in 4,024 Posts
Old Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to all
I used to manage a MySQL DB that *GREW* at the rate of roughly ONE GIGABYTE every 20 minutes. There were several tables in the DB, but probably 90% of each gigabyte was concentrated in one of three tables. And those three tables each had two indexes. And *AT THE SAME TIME* the DB was growing at that rate, we could make a SQL query to find 100 particular records from *EACH* of those three tables, use Java code (not JS) to create a GRAPH of the 2100 data points thus represented (7 data points per record) and present that graph to the use in CONSIDERABLY less than one second.

Oh...did I mention that the database was also trimming old data at the same time, keeping the total DB size to a user-specified size between 20GB and 50GB?

Meaning that the total number of records in those three tables ranged from 100 million to 250 million records at any given time.

And you are worried about a piddling 20,000 records? Really?
__________________
An optimist sees the glass as half full.
A pessimist sees the glass as half empty.
A realist drinks it no matter how much there is.
Old Pedant is offline   Reply With Quote
Old 11-20-2012, 09:32 PM   PM User | #8
Old Pedant
Supreme Master coder!

 
Old Pedant's Avatar
 
Join Date: Feb 2009
Posts: 23,556
Thanks: 62
Thanked 4,055 Times in 4,024 Posts
Old Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to all
It's a bit of a let down. The current DB I am working with has only two moderately large tables. One about 12 million records, one about 14.5 million.

Hmmm...Let me try a query on that one with 12 million records.

It's a record of page hits by page name for the last 3 years, roughly.

Let me try counting how many hits a given page got.

Okay, done. MySQL took 10 milliseconds to do that. About 12,000 records out of the 12 million...counted in 0.01 seconds.

In the 14.5 million record table, things are organized by zip code. Selecting a count of all records in a single zip code takes 0.02 seconds. 20 milliseconds.

Or how about this one:
Code:
SELECT COUNT(*) FROM tablename WHERE phone LIKE '360%';
A bit more complex, with the LIKE in there, right?

220 milliseconds.

And remember, MySQL is *NOT* even the most efficient database on the planet...by far! SQL Server and Oracle can run rings around it in many situations.

People constantly amaze me by not realizing just how powerful these query engines are! Hundreds of very very smart programmers have poured their life's work into making these things. THEY ARE FAST! Really.
__________________
An optimist sees the glass as half full.
A pessimist sees the glass as half empty.
A realist drinks it no matter how much there is.
Old Pedant is offline   Reply With Quote
Old 11-20-2012, 09:37 PM   PM User | #9
Old Pedant
Supreme Master coder!

 
Old Pedant's Avatar
 
Join Date: Feb 2009
Posts: 23,556
Thanks: 62
Thanked 4,055 Times in 4,024 Posts
Old Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to all
One more example.

Again, from the table with 14.5 million records.

select count(*) from sometable where zip = '98296' AND name like '%clearview%';

Matching on zip code and then, using LIKE, on the name.

Ready?

TEN MILLISECONDS.

Or how about actually *getting* the records, not just a count?

select name, phone from tablename where zip = '98296' AND name like '%clearview%';

STILL TEN MILLISECONDS.

0.01 second.

You could make ONE HUNDRED such queries in ONE SECOND.

Do you think maybe, just maybe, MySQL could handle a table with 20,000 records?
__________________
An optimist sees the glass as half full.
A pessimist sees the glass as half empty.
A realist drinks it no matter how much there is.
Old Pedant is offline   Reply With Quote
Old 11-20-2012, 10:26 PM   PM User | #10
Old Pedant
Supreme Master coder!

 
Old Pedant's Avatar
 
Join Date: Feb 2009
Posts: 23,556
Thanks: 62
Thanked 4,055 Times in 4,024 Posts
Old Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to all
I will say that if, for example, you wanted to search for (say) the word "people" in the TEXT of all 20,000 tweets, there is no good way to use an INDEX for that with MySQL. So, yes, MySQL would have to actually read all 20,000 records and look in the tweet text field to try to find that word. That is:
Code:
SELECT * FROM tweets WHERE tweet LIKE '%people%'
will not be particularly fast.

BUT...

But say you put an index on the WHENTWEETED field and you knew that the tweet you are looking for was tweeted in the last two days:
Code:
SELECT * FROM tweets WHERE tweet LIKE '%people%' AND whenTweeted > DATE_SUB( NOW(), INTERVAL 2 DAY )
Now MySQL will only need to look at the full test of the tweet in records added in the last 2 days. The selection to find all records in the last 2 days will take perhaps 5 milliseconds and then the time to search the test will depend on how many records were posted in those 2 days.

In other word, take advantage of your indexes! And don't be afraid to add indexes if it looks like they might help.
__________________
An optimist sees the glass as half full.
A pessimist sees the glass as half empty.
A realist drinks it no matter how much there is.
Old Pedant is offline   Reply With Quote
Old 11-21-2012, 12:57 AM   PM User | #11
KULP
Regular Coder

 
Join Date: Mar 2012
Posts: 166
Thanks: 5
Thanked 11 Times in 11 Posts
KULP is an unknown quantity at this point
Well no not 20,000, but twitter has billions, and hundreds of millions a day. Anyways, very interesting statistics for me to base off of. Had no idea how powerful mysql actually is!
KULP is offline   Reply With Quote
Old 11-21-2012, 01:14 AM   PM User | #12
Old Pedant
Supreme Master coder!

 
Old Pedant's Avatar
 
Join Date: Feb 2009
Posts: 23,556
Thanks: 62
Thanked 4,055 Times in 4,024 Posts
Old Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to all
Quote:
Originally Posted by KULP View Post
Well no not 20,000, but twitter has billions, and hundreds of millions a day.
Sure. And more than likely Twitter doesn't allow you to do something like "find me all the tweets made in the last week that have the word people in them." (Though they might! I have seen "full text" search engines that could do that.)

But they *might* allow you to say "Find me all the tweets that @WhiteHouse has made in the last year that contain the work debt."

And that because limiting the search to a single user cuts the millions and billions of records to be searched down to at most a few thousand.

It's all a matter of organizing the data and providing indexes that make the searches reasonably efficient.
__________________
An optimist sees the glass as half full.
A pessimist sees the glass as half empty.
A realist drinks it no matter how much there is.
Old Pedant is offline   Reply With Quote
Old 11-21-2012, 01:59 PM   PM User | #13
guelphdad
Super Moderator


 
guelphdad's Avatar
 
Join Date: Mar 2006
Location: St. Catharines, Ontario Canada
Posts: 2,629
Thanks: 4
Thanked 147 Times in 138 Posts
guelphdad will become famous soon enoughguelphdad will become famous soon enough
No matter who is in the White House, if they can blame the DEBT on the other guy you'll find lots of posts, if the DEBT is their fault.... not so much.

Happens in Canada too!
guelphdad is offline   Reply With Quote
Reply

Bookmarks

Jump To Top of Thread


Thread Tools
Rate This Thread
Rate This Thread:

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off

Forum Jump


All times are GMT +1. The time now is 02:41 PM.


Advertisement
Log in to turn off these ads.