Hello and welcome to our community! Is this your first visit?
Register
Enjoy an ad free experience by logging in. Not a member yet? Register.
Results 1 to 11 of 11
  1. #1
    New Coder
    Join Date
    May 2012
    Posts
    17
    Thanks
    2
    Thanked 0 Times in 0 Posts

    Query help with a poorly structured table

    I have a table and query which is causing performance issues. It was fine when small, but now has about 6 million rows in production and it's growing fast. Before I start rewriting stuff or creating an archive I am wondering if there is a quick win possible here with better indexing and would be very thankful if anyone can help I think there is a quick fix because the query is fast without the "AND visible = 'Y'", but I need help if anyone can please

    My profiling tool tells me "Very bad, very slow: Join uses key prefix or non-unique key. Key compared with = or <=>)".

    The query is:

    Code:
    SELECT type FROM orderhistory WHERE (order_id = '943581' OR ticket_id = '943581') AND visible = 'Y' limit 1
    Table structure:

    Code:
    CREATE TABLE `orderhistory` (
      `uniqueid` int(11) NOT NULL AUTO_INCREMENT,
      `order_id` int(16) NOT NULL,
      `ticket_id` int(16) NOT NULL,
      `type` char(8) NOT NULL,
      `date` int(11) unsigned NOT NULL,
      `notes` text NOT NULL,
      `ontime` char(1) NOT NULL,
      `postedby` varchar(96) NOT NULL,
      `visible` char(1) NOT NULL DEFAULT 'Y',
      `ip` char(16) NOT NULL,
      PRIMARY KEY (`uniqueid`),
      KEY `postedby` (`postedby`),
      KEY `type` (`type`,`ontime`,`date`),
      KEY `order_id` (`order_id`,`type`,`uniqueid`),
      KEY `lookup order` (`order_id`,`ticket_id`,`visible`,`date`),
      KEY `ticket_id` (`ticket_id`),
      KEY `visible` (`visible`)
    ) ENGINE=InnoDB AUTO_INCREMENT=5907756 DEFAULT CHARSET=utf8
    Thanks

  • #2
    Senior Coder
    Join Date
    Dec 2005
    Location
    Slovenia
    Posts
    1,959
    Thanks
    120
    Thanked 76 Times in 76 Posts
    What does this shows ?

    Code:
    EXPLAIN SELECT type FROM orderhistory WHERE (order_id = '943581' OR ticket_id = '943581') AND visible = 'Y' limit 1
    Found a flower or bug and don't know what it is ?
    agrozoo.net galery
    if you don't spot search button at once, there is search form:
    agrozoo.net galery search

  • #3
    New Coder
    Join Date
    May 2012
    Posts
    17
    Thanks
    2
    Thanked 0 Times in 0 Posts
    Hi, that command shows:

    Code:
    id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
    1	SIMPLE	orderhistory	ref	order_id,lookup order,ticket_id,visible	visible	3	const	2709231	Using where

  • #4
    Senior Coder
    Join Date
    Dec 2005
    Location
    Slovenia
    Posts
    1,959
    Thanks
    120
    Thanked 76 Times in 76 Posts
    The first rule whenever such simple query gets slow, this is my huntch, is to make a key on cols in question, that is only on order,ticket_id,visible which is not current case.

    Try that, and see if it helps.

    If explain shows that it does not use that new key of yours, you can force use of your key via sql command, google for that, doh I'm not certain that that works.
    Found a flower or bug and don't know what it is ?
    agrozoo.net galery
    if you don't spot search button at once, there is search form:
    agrozoo.net galery search

  • #5
    New Coder
    Join Date
    May 2012
    Posts
    17
    Thanks
    2
    Thanked 0 Times in 0 Posts
    Thanks -- I just tried your suggestion.

    I indexed: Visible, order_id, ticket_id

    and

    explain SELECT type FROM orderhistory FORCE INDEX (`visible_order_id_ticket_id`) WHERE (order_id = '944704' OR ticket_id = '944704') AND visible = 'Y' limit 1

    Code:
    id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
    1	SIMPLE	orderhistory	ref	visible_order_id_ticket_id	visible_order_id_ticket_id	3	const	1922326	Using where
    we now have a faster query (but it's still scanning 1.9 million rows for some reason).

    If I removed and visible = 'Y' I get a good quality query:

    explain SELECT type FROM orderhistory WHERE (order_id = '944704' OR ticket_id = '944704') limit 1


    Code:
    id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
    1	SIMPLE	orderhistory	index_merge	order_id,lookup order,ticket_id	lookup order,ticket_id	4,4	NULL	5	Using sort_union(lookup order,ticket_id); Using where
    What am I missing? I've googled so hard for the solution here.

    I should have paid more attention in database class

  • #6
    Senior Coder
    Join Date
    Dec 2005
    Location
    Slovenia
    Posts
    1,959
    Thanks
    120
    Thanked 76 Times in 76 Posts
    just a thought:

    order_id = '944704' OR ticket_id = '944704'
    order_id = 944704 OR ticket_id = 944704

    would do the same, mybe mysql is doing some conversion there

    also I would try

    `visible` char(1) NOT NULL DEFAULT 'Y',
    instead try:
    `visible` INT NOT NULL DEFAULT 1//as 1 = 'Y', 0 = 'N'


    EDIT: since you mentioned database class, do you have connection pooling ?
    If not, opening connection each time can eat substantial amount of that x second of yours.
    Found a flower or bug and don't know what it is ?
    agrozoo.net galery
    if you don't spot search button at once, there is search form:
    agrozoo.net galery search

  • #7
    New Coder
    Join Date
    May 2012
    Posts
    17
    Thanks
    2
    Thanked 0 Times in 0 Posts
    Hi,

    Thanks again for the input. I just tried it with INT matching as you suggested but this benchmarks the same as with the apostrophes so I guess mysql is correcting the query before processing.

    Also, I just tried your other idea creating another column called visible2 and populated that with INT 1 /0, added index but the query still scans 2 million rows.

    I am not sure about connection pooling - that's new to me - I'll read up on it, but I am only connecting to the DB one time per pageview. I'm using Amazon AWS RDS on default settings. Btw, this query is run only occassionally and I then cache it in memcached to prevent repeat visits from hammering the DB. But, 80% are cache misses as they are new customers. It's just a sloooow query, and I need that visible flag.

    Thanks for your help so far

  • #8
    New Coder
    Join Date
    May 2012
    Posts
    17
    Thanks
    2
    Thanked 0 Times in 0 Posts
    So, I just thought of a really basic work around to reduce production load for now --

    1) query for order_id = X AND visible2 = 1
    2) query for ticket_id = X AND visible2 = 1
    3) if either exists, continue

    This solves it, I guess, but I'd really like to learn the correct solution with Mysql without kludging it with 2 separate queries

  • #9
    New Coder
    Join Date
    May 2012
    Posts
    17
    Thanks
    2
    Thanked 0 Times in 0 Posts
    Guys, clutching at straws here, but I feel like I've exhausted my possibilities (or at least my time) trying to fix this. I don't suppose upgrading from 5.1.57 to 5.5.x is likely to just solve this with a better preprocesor , is it? Or am I still still misunderstanding mysql fundamentals?

    Thanks again

  • #10
    Supreme Master coder! Old Pedant's Avatar
    Join Date
    Feb 2009
    Posts
    25,032
    Thanks
    75
    Thanked 4,325 Times in 4,291 Posts
    I think you have the essence of the problem: MySQL can't decide WHICH index to use when you are looking at an OR condition.

    One of the HUGE problems with MySQL is that it can *NOT* use just a PART of an index unless that PART is at the beginning of the index.

    So if you had (for example) an index on ( orderi_id, ticket_id, visible2 ) [all 3 fields], MySQL would NOT be able to use that index with your OR where clause *except* for order_id. It wouldn't be able to use ( order_id, visibie2 ) because the "parts" there are not contiguous. It wouldn't be able to use ( ticket_id ) at all, because that "part" isn't at the start of the index.

    It's a real deficiency in MySQL, not shared by all other DBs.

    So your only real choice is to have two indexes: ( order_id, visible2 ) and ( ticket_id, visible_2 ) and then, as you noted, make two separate queries.

    Now..

    You *could* do:
    Code:
    SELECT U.* FROM 
       ( SELECT * FROM orderhistory WHERE order_id = 555444 AND visible2 =1
         UNION
         SELECT * FROM orderhistory WHERE ticket_id = 555444 AND visible2 =1
       ) AS U
    And that should work, assuming you ndeed have two indexes: (order_id, visible2) and (ticket_id,visible2).

    But it won't likely be the most efficient way. Assume, for the sake of argument, that indeed you find a match based on order_id and visible2. With the UNION query, MySQL will *still* have to make the second query based on ticket_id.

    So, indeed, the two separate queries is likely the best choice. If you have *ANY* inkling which is more likely to match--ticket_id or order_id--then make that query first, so you only have to make the second less likely query if you don't get a match on the first.

    NOTE: You *could* do all this in a STORED PROCEDURE. And if you really have no idea which is most likely to match (ticket_id or order_id) then that would be a fine choice. But, again, if you do have (say) some way of guessing which is more likely in any given circumstance, you'd be better off with the two separate queries, in most-likely order.
    An optimist sees the glass as half full.
    A pessimist sees the glass as half empty.
    A realist drinks it no matter how much there is.

  • Users who have thanked Old Pedant for this post:

    shoryuken (11-26-2012)

  • #11
    New Coder
    Join Date
    May 2012
    Posts
    17
    Thanks
    2
    Thanked 0 Times in 0 Posts
    Old Pedant, thanks very much for the comprehensive response. I totally get it now - this all makes sense now.

    I really appreciate you explaining this concept to me.


  •  

    Posting Permissions

    • You may not post new threads
    • You may not post replies
    • You may not post attachments
    • You may not edit your posts
    •