  1. #1
    Regular Coder

    How to set up tables. Dynamic content?

My question involves a scenario, so I'll use Twitter as the example. We have a table with user id, password, etc. Now, is there one table with all tweets? Does each user have their own table of tweets, or, when a page loads, does the server search through EVERY tweet from EVERY user and match user ids? I'm just confused about how else you would find a specific user's tweets other than searching through every tweet. Hope this makes sense; I'm currently learning the basics but stuck on this idea.

  • #2
    Master Coder felgall's Avatar
    Make the userid field a part of the primary key for the table and then they can all go in the same table without having to search through everything.
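For reference, a minimal users table for that scheme might look something like this (column names and types here are assumptions for illustration, not a prescription):
Code:
CREATE TABLE users (
    userid INT AUTO_INCREMENT PRIMARY KEY,   /* one row per user */
    username VARCHAR( 50 ) NOT NULL,
    passhash VARCHAR( 255 ) NOT NULL,        /* store a password hash, never the raw password */
    UNIQUE INDEX users_username_idx ( username )
) ENGINE=INNODB;
The UNIQUE index on username also makes lookups by name fast, which matters for the queries later in this thread.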
    Stephen
    Learn Modern JavaScript - http://javascriptexample.net/
    Helping others to solve their computer problem at http://www.felgall.com/

    Don't forget to start your JavaScript code with "use strict"; which makes it easier to find errors in your code.

  • #3
    Regular Coder
So I have userid as the primary key in the user table, and then I have a table for 'tweets.' Don't I now have to search through every tweet to match up userids?

  • #4
    Supreme Master coder! Old Pedant's Avatar
    Quote Originally Posted by KULP View Post
So I have userid as the primary key in the user table, and then I have a table for 'tweets.' Don't I now have to search through every tweet to match up userids?
    Yes and no. *YOU* don't have to "search" at all. MySQL does.

    And the trick to making that search efficient is to create an index on the USERID field in the TWEETS table.

    Something like this:
    Code:
    CREATE TABLE tweets (
        tweetid INT AUTO_INCREMENT PRIMARY KEY, /* optional */
        userid INT,
        tweet VARCHAR( 150 ),
        whenTweeted DATETIME,
        CONSTRAINT FOREIGN KEY tweet_users_fk (userid) REFERENCES users(userid),
        INDEX tweet_userid_idx ( userid )
    ) ENGINE=INNODB;
    The CONSTRAINT says that userid is intimately linked to the primary key userid in the users table; the INDEX makes sure that MySQL can get essentially instant access to all tweets, by userid.

    So a typical query might then be:
    Code:
    SELECT T.tweet
FROM users AS U, tweets AS T
    WHERE U.userid = T.userid
    AND U.username = 'John Doe'
    ORDER BY whenTweeted DESC
    LIMIT 10
    to get the most recent 10 tweets by "John Doe".

    And it *will* be a very fast query.
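If you want to convince yourself the index is actually being used, EXPLAIN will show you (the exact output columns vary by MySQL version, so treat this as a sketch):
Code:
EXPLAIN SELECT T.tweet
FROM users AS U, tweets AS T
WHERE U.userid = T.userid
AND U.username = 'John Doe';
A row showing tweet_userid_idx in the key column means MySQL is doing an index lookup on tweets rather than a full table scan.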
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.

  • Users who have thanked Old Pedant for this post:

    KULP (11-27-2012)

  • #5
    Regular Coder
Quote Originally Posted by Old Pedant View Post
Yes and no. *YOU* don't have to "search" at all. MySQL does. [...] And it *will* be a very fast query.
That's what I was thinking. Now assume I have 20,000 tweets in a database, or Twitter's seemingly infinite number. Is this when PHP is no longer an option? I mean, it must take forever to search through every tweet (server-side, not myself).

  • #6
    New to the CF scene
    Quote Originally Posted by KULP View Post
That's what I was thinking. Now assume I have 20,000 tweets in a database, or Twitter's seemingly infinite number. Is this when PHP is no longer an option? I mean, it must take forever to search through every tweet (server-side, not myself).
MySQL keeps the indexes sorted and searches the index to retrieve the requested records; it doesn't have to do a full table scan. And 20,000 records is not very much. I have a table with GPS coordinates for about 2,300 records, and MySQL can do a full table scan while performing geometric calculations on the GPS coordinates in a few thousandths of a second. That is much more computationally demanding than a basic select query.
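To make that concrete: an index lookup is a tree search, so going from 20,000 rows to 20 million mostly adds a few levels to the tree, not a proportionally longer scan. And an index doesn't have to exist from day one; you can add one later and check whether a query uses it. A sketch, reusing the tweets table from earlier in the thread:
Code:
/* add an index after the fact, if the table was created without one */
CREATE INDEX tweet_userid_idx ON tweets ( userid );

/* EXPLAIN shows whether the index is used for a given query */
EXPLAIN SELECT * FROM tweets WHERE userid = 42;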

  • #7
    Supreme Master coder! Old Pedant's Avatar
I used to manage a MySQL DB that *GREW* at the rate of roughly ONE GIGABYTE every 20 minutes. There were several tables in the DB, but probably 90% of each gigabyte was concentrated in one of three tables, and those three tables each had two indexes. And *AT THE SAME TIME* the DB was growing at that rate, we could make a SQL query to find 100 particular records from *EACH* of those three tables, use Java code (not JS) to create a GRAPH of the 2,100 data points thus represented (7 data points per record), and present that graph to the user in CONSIDERABLY less than one second.

    Oh...did I mention that the database was also trimming old data at the same time, keeping the total DB size to a user-specified size between 20GB and 50GB?

    Meaning that the total number of records in those three tables ranged from 100 million to 250 million records at any given time.

    And you are worried about a piddling 20,000 records? Really?

  • #8
    Supreme Master coder! Old Pedant's Avatar
It's a bit of a letdown. The current DB I am working with has only two moderately large tables: one with about 12 million records, one with about 14.5 million.

    Hmmm...Let me try a query on that one with 12 million records.

    It's a record of page hits by page name for the last 3 years, roughly.

    Let me try counting how many hits a given page got.

    Okay, done. MySQL took 10 milliseconds to do that. About 12,000 records out of the 12 million...counted in 0.01 seconds.

    In the 14.5 million record table, things are organized by zip code. Selecting a count of all records in a single zip code takes 0.02 seconds. 20 milliseconds.

    Or how about this one:
    Code:
    SELECT COUNT(*) FROM tablename WHERE phone LIKE '360%';
    A bit more complex, with the LIKE in there, right?

    220 milliseconds.
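Worth knowing about LIKE and indexes: a pattern with only a trailing wildcard ('360%') is effectively a range scan, so an ordinary index on the column can serve it, while a leading wildcard ('%360') cannot use a plain index at all. So if that 220 milliseconds ever mattered, an index on phone would likely cut it further. A sketch (the index name is made up):
Code:
CREATE INDEX phone_idx ON tablename ( phone );

/* can use phone_idx: the prefix pins down a range in the sorted index */
SELECT COUNT(*) FROM tablename WHERE phone LIKE '360%';

/* cannot use phone_idx: a leading wildcard forces a scan */
SELECT COUNT(*) FROM tablename WHERE phone LIKE '%360';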

    And remember, MySQL is *NOT* even the most efficient database on the planet...by far! SQL Server and Oracle can run rings around it in many situations.

    People constantly amaze me by not realizing just how powerful these query engines are! Hundreds of very very smart programmers have poured their life's work into making these things. THEY ARE FAST! Really.

  • #9
    Supreme Master coder! Old Pedant's Avatar
    One more example.

    Again, from the table with 14.5 million records.

Code:
select count(*) from sometable where zip = '98296' AND name like '%clearview%';

    Matching on zip code and then, using LIKE, on the name.

    Ready?

    TEN MILLISECONDS.

    Or how about actually *getting* the records, not just a count?

Code:
select name, phone from tablename where zip = '98296' AND name like '%clearview%';

    STILL TEN MILLISECONDS.

    0.01 second.

    You could make ONE HUNDRED such queries in ONE SECOND.

    Do you think maybe, just maybe, MySQL could handle a table with 20,000 records?

  • #10
    Supreme Master coder! Old Pedant's Avatar
    I will say that if, for example, you wanted to search for (say) the word "people" in the TEXT of all 20,000 tweets, there is no good way to use an INDEX for that with MySQL. So, yes, MySQL would have to actually read all 20,000 records and look in the tweet text field to try to find that word. That is:
    Code:
    SELECT * FROM tweets WHERE tweet LIKE '%people%'
    will not be particularly fast.

    BUT...

    But say you put an index on the WHENTWEETED field and you knew that the tweet you are looking for was tweeted in the last two days:
    Code:
    SELECT * FROM tweets WHERE tweet LIKE '%people%' AND whenTweeted > DATE_SUB( NOW(), INTERVAL 2 DAY )
Now MySQL will only need to look at the full text of the tweet in records added in the last 2 days. The selection to find all records from the last 2 days will take perhaps 5 milliseconds, and then the time to search the text will depend on how many records were posted in those 2 days.

In other words, take advantage of your indexes! And don't be afraid to add indexes if it looks like they might help.
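If word searches like '%people%' become common, MySQL also offers FULLTEXT indexes (historically MyISAM only; InnoDB gained them in MySQL 5.6), which index whole words rather than character prefixes. A sketch, assuming a MySQL version that supports FULLTEXT on this table's engine:
Code:
ALTER TABLE tweets ADD FULLTEXT INDEX tweet_text_ftx ( tweet );

/* word-based search that can use the FULLTEXT index */
SELECT * FROM tweets
WHERE MATCH( tweet ) AGAINST( 'people' );
Note that MATCH ... AGAINST finds the word "people", not arbitrary substrings, so it is a different (and usually more useful) behavior than LIKE '%people%'.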

  • #11
    Regular Coder
Well, no, not 20,000, but Twitter has billions, and hundreds of millions a day. Anyway, these are very interesting statistics for me to work from. I had no idea how powerful MySQL actually is!

  • #12
    Supreme Master coder! Old Pedant's Avatar
    Quote Originally Posted by KULP View Post
Well, no, not 20,000, but Twitter has billions, and hundreds of millions a day.
    Sure. And more than likely Twitter doesn't allow you to do something like "find me all the tweets made in the last week that have the word people in them." (Though they might! I have seen "full text" search engines that could do that.)

But they *might* allow you to say "Find me all the tweets that @WhiteHouse has made in the last year that contain the word debt."

And that's because limiting the search to a single user cuts the millions and billions of records to be searched down to at most a few thousand.

    It's all a matter of organizing the data and providing indexes that make the searches reasonably efficient.
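For that @WhiteHouse-style query, a composite index covering both the user and the date lets MySQL jump straight to one user's recent tweets before it ever looks at the text. A sketch against the tweets table from post #4 (the index name and the example userid are made up):
Code:
CREATE INDEX tweet_user_when_idx ON tweets ( userid, whenTweeted );

/* one user's tweets in a date range: both conditions resolved from the index,
   then only those few rows get the LIKE test */
SELECT tweet FROM tweets
WHERE userid = 12345
AND whenTweeted > DATE_SUB( NOW(), INTERVAL 1 YEAR )
AND tweet LIKE '%debt%';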

  • #13
    Super Moderator guelphdad's Avatar
    No matter who is in the White House, if they can blame the DEBT on the other guy you'll find lots of posts, if the DEBT is their fault.... not so much.

    Happens in Canada too!

