Hello and welcome to our community! Is this your first visit?
Register
Enjoy an ad free experience by logging in. Not a member yet? Register.
Results 1 to 4 of 4
  1. #1
    New Coder
    Join Date
    Sep 2012
    Posts
    76
    Thanks
    61
    Thanked 0 Times in 0 Posts

    Multi-column indexing?

    Hello I'm trying to get to grips with indexing and for the majority of it I think I understand it but am confused by one thing which is When to use multi-column indexing and when not to. So I have a few questions here please.

    Question A
    Say I have a query like so

    Code:
    SELECT starts FROM table WHERE starts > '2013-03-11'
    So here from what I understand I would index the starts column
    But then also on that page I have another query later on which is

    Code:
    SELECT starts, ends FROM table WHERE starts > '2013-03-11' AND ends < '2013-03-12'
    So here would I create a multi-column index called say startsends even though I have already created an index on starts already earlier?

    Question B
    With this query again if it was adapted to be say

    Code:
    SELECT starts, ends, active FROM table WHERE starts > '2013-03-11' AND ends < '2013-03-12' AND active = 1
    Would it be best to create a multi-column index called say called startsends_active (on 3 columns)
    OR with one index created called startsends and one index called active?
    OR index them all individually?

    Question C

    Also, one last thing say I have a query like so with AND & OR

    Code:
    SELECT first_name, last_name, age FROM table WHERE (first_name = '???' AND last_name = '???') OR age > 31
    What would you index here? Would you use a multi-column index on all three conditions together? Thank you
    Last edited by Oatley; 03-12-2013 at 08:01 PM.

  • #2
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    25,155
    Thanks
    75
    Thanked 4,338 Times in 4,304 Posts
    A. So here would I create a multi-column index called say startsends even though I have already created an index on starts already earlier?
    Yes. But then drop the index on starts. MySQL will use the multi-column index for starts alone, provided that starts is listed first in the index definition. (So if you had a query based on ends, alone, you would need a separate index for ends.)

    MySQL can only use partial multi-column indexes in the order the columns are defined.

    Say you had an index on (col1, col7, col3). Then MySQL could use that index for col1. Or for col1 and col7. Or for col1 and col7 and col3. But it could not use it for col7 or col3 alone nor for col7 and col3 without col1. It's a fair stupid limitation, but if you are aware of it, you can plan for it.

    ***********

    B. The three column index will be the most efficient. But if you often make queries on (say) just starts and active then you would want active to be a separate index, as describe above.

    *EXCEPT*. Except if active is a 1/0 (true/false) column, it doesn't really make sense to index it separately if there are roughly the same number of 1's and 0's. You would only index it if, say, there were only 5% 1's and you often queried for the 1's. There would be no point in indexing it if there were 95% 1's and you almost always queried for 1's: You will end up having to fetch 95% of the full records, anyway. In general, I don't index boolean fields. There are exceptions, and tacking a boolean onto the end of a multi-column key can and does make sense.

    *************

    C. Kind of too general a question. If, say, 70% of your records have age > 31, then there's no point in indexing, at all. If 5% do, then yes. And whether it is multi-column or not depends on what other queries you will do.
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.

  • Users who have thanked Old Pedant for this post:

    Oatley (03-13-2013)

  • #3
    New Coder
    Join Date
    Sep 2012
    Posts
    76
    Thanks
    61
    Thanked 0 Times in 0 Posts
    That's an excellent explanation and I thank you very much for the posting here. Can I check something though please that I am not sure about. In the following query
    Code:
    SELECT starts, ends, name FROM table WHERE starts > '2013-03-11' AND ends < '2013-03-12' AND name = 'bob'
    Then I created an index called idx_starts_ends_name Am I right in thinking this index would then cover any queries on my page like

    - SELECT starts, ends, name FROM table WHERE starts > '2013-03-11' AND ends < '2013-03-12' AND name = 'bob'
    - SELECT starts FROM table WHERE starts > '2013-03-11'
    - SELECT starts, ends FROM table WHERE starts > '2013-03-11' AND ends < '2013-03-12'

    But would it cover a queries like?

    A) SELECT ends FROM table WHERE ends < '2013-03-11'
    (I guess i'd have to create an index called idx_ends here as well?)

    B) SELECT ends, name FROM table WHERE ends < '2013-03-11' AND name = 'bob'
    (I guess i'd have to create an index called idx_ends_name here as well and remove the index I created for (A)?

    C) SELECT starts, name FROM table WHERE starts > '2013-03-11' AND name = 'bob'
    (I guess i'd have to create an index called idx_starts_name here as well?)

    Thank you

  • #4
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    25,155
    Thanks
    75
    Thanked 4,338 Times in 4,304 Posts
    Quote Originally Posted by Oatley View Post
    In the following query
    Code:
    SELECT starts, ends, name FROM table WHERE starts > '2013-03-11' AND ends < '2013-03-12' AND name = 'bob'
    Then I created an index called idx_starts_ends_name Am I right in thinking this index would then cover any queries on my page like

    - SELECT starts, ends, name FROM table WHERE starts > '2013-03-11' AND ends < '2013-03-12' AND name = 'bob'
    - SELECT starts FROM table WHERE starts > '2013-03-11'
    - SELECT starts, ends FROM table WHERE starts > '2013-03-11' AND ends < '2013-03-12'
    Yes, to all of the above.

    But would it cover a queries like?
    A) SELECT ends FROM table WHERE ends < '2013-03-11'
    (I guess i'd have to create an index called idx_ends here as well?)
    Yes.
    B) SELECT ends, name FROM table WHERE ends < '2013-03-11' AND name = 'bob'
    (I guess i'd have to create an index called idx_ends_name here as well and remove the index I created for (A)?
    You could, but given (C) I would probably just add a separate index for name.
    C) SELECT starts, name FROM table WHERE starts > '2013-03-11' AND name = 'bob'
    (I guess i'd have to create an index called idx_starts_name here as well?)
    See, this is when you have to actually do some performance measurements and think hard about what your most common queries will be.

    If you really will have, pretty much equally, all kinds of queries involving starts, ends, and name, in various combinations, then it's likely that the best solution is just 3 separate indexes, one each on starts and on ends and on name.

    Probably, the only reason you would create a multiple column index is because your really will be making a lot of queries that use *ALL* the columns of the index. Or, perhaps, a lot using the first two out of three, etc.

    There's no one-size-fits-all answer. You can easily over-index a table. If it's a table that is updated a lot, that's especially true: All the advantage you gain in SELECT performance you lose in INSERT and UPDATE performance, as MySQL has to continually update all those indexes.

    Data base indexing--especially with MySQL--is as much an art as it is a science. Don't be afraid to revise your indexing decisions if testing indicates you have a bottleneck in some area.



    Thank you[/QUOTE]
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.


  •  

    Posting Permissions

    • You may not post new threads
    • You may not post replies
    • You may not post attachments
    • You may not edit your posts
    •