Go Back   CodingForums.com > :: Server side development > MySQL

Before you post, read our: Rules & Posting Guidelines

Reply
 
Thread Tools Rate Thread
Enjoy an ad free experience by logging in. Not a member yet? Register.
Old 11-24-2012, 05:45 PM   PM User | #1
shoryuken
New Coder

 
Join Date: May 2012
Posts: 17
Thanks: 2
Thanked 0 Times in 0 Posts
shoryuken is an unknown quantity at this point
Query help with a poorly structured table

I have a table and query which is causing performance issues. It was fine when small, but now has about 6 million rows in production and it's growing fast. Before I start rewriting stuff or creating an archive I am wondering if there is a quick win possible here with better indexing and would be very thankful if anyone can help I think there is a quick fix because the query is fast without the "AND visible = 'Y'", but I need help if anyone can please

My profiling tool tells me "Very bad, very slow: Join uses key prefix or non-unique key. Key compared with = or <=>)".

The query is:

Code:
SELECT type FROM orderhistory WHERE (order_id = '943581' OR ticket_id = '943581') AND visible = 'Y' limit 1
Table structure:

Code:
CREATE TABLE `orderhistory` (
  `uniqueid` int(11) NOT NULL AUTO_INCREMENT,
  `order_id` int(16) NOT NULL,
  `ticket_id` int(16) NOT NULL,
  `type` char(8) NOT NULL,
  `date` int(11) unsigned NOT NULL,
  `notes` text NOT NULL,
  `ontime` char(1) NOT NULL,
  `postedby` varchar(96) NOT NULL,
  `visible` char(1) NOT NULL DEFAULT 'Y',
  `ip` char(16) NOT NULL,
  PRIMARY KEY (`uniqueid`),
  KEY `postedby` (`postedby`),
  KEY `type` (`type`,`ontime`,`date`),
  KEY `order_id` (`order_id`,`type`,`uniqueid`),
  KEY `lookup order` (`order_id`,`ticket_id`,`visible`,`date`),
  KEY `ticket_id` (`ticket_id`),
  KEY `visible` (`visible`)
) ENGINE=InnoDB AUTO_INCREMENT=5907756 DEFAULT CHARSET=utf8
Thanks
shoryuken is offline   Reply With Quote
Old 11-24-2012, 10:48 PM   PM User | #2
BubikolRamios
Senior Coder

 
Join Date: Dec 2005
Location: Slovenia
Posts: 1,876
Thanks: 114
Thanked 76 Times in 76 Posts
BubikolRamios is on a distinguished road
What does this shows ?

Code:
EXPLAIN SELECT type FROM orderhistory WHERE (order_id = '943581' OR ticket_id = '943581') AND visible = 'Y' limit 1
__________________
Found a flower or bug and don't know what it is ?
agrozoo.net galery
if you don't spot search button at once, there is search form:
agrozoo.net galery search
BubikolRamios is offline   Reply With Quote
Old 11-25-2012, 04:20 AM   PM User | #3
shoryuken
New Coder

 
Join Date: May 2012
Posts: 17
Thanks: 2
Thanked 0 Times in 0 Posts
shoryuken is an unknown quantity at this point
Hi, that command shows:

Code:
id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	SIMPLE	orderhistory	ref	order_id,lookup order,ticket_id,visible	visible	3	const	2709231	Using where
shoryuken is offline   Reply With Quote
Old 11-25-2012, 10:43 AM   PM User | #4
BubikolRamios
Senior Coder

 
Join Date: Dec 2005
Location: Slovenia
Posts: 1,876
Thanks: 114
Thanked 76 Times in 76 Posts
BubikolRamios is on a distinguished road
The first rule whenever such simple query gets slow, this is my huntch, is to make a key on cols in question, that is only on order,ticket_id,visible which is not current case.

Try that, and see if it helps.

If explain shows that it does not use that new key of yours, you can force use of your key via sql command, google for that, doh I'm not certain that that works.
__________________
Found a flower or bug and don't know what it is ?
agrozoo.net galery
if you don't spot search button at once, there is search form:
agrozoo.net galery search
BubikolRamios is offline   Reply With Quote
Old 11-25-2012, 12:39 PM   PM User | #5
shoryuken
New Coder

 
Join Date: May 2012
Posts: 17
Thanks: 2
Thanked 0 Times in 0 Posts
shoryuken is an unknown quantity at this point
Thanks -- I just tried your suggestion.

I indexed: Visible, order_id, ticket_id

and

explain SELECT type FROM orderhistory FORCE INDEX (`visible_order_id_ticket_id`) WHERE (order_id = '944704' OR ticket_id = '944704') AND visible = 'Y' limit 1

Code:
id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	SIMPLE	orderhistory	ref	visible_order_id_ticket_id	visible_order_id_ticket_id	3	const	1922326	Using where
we now have a faster query (but it's still scanning 1.9 million rows for some reason).

If I removed and visible = 'Y' I get a good quality query:

explain SELECT type FROM orderhistory WHERE (order_id = '944704' OR ticket_id = '944704') limit 1


Code:
id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	SIMPLE	orderhistory	index_merge	order_id,lookup order,ticket_id	lookup order,ticket_id	4,4	NULL	5	Using sort_union(lookup order,ticket_id); Using where
What am I missing? I've googled so hard for the solution here.

I should have paid more attention in database class
shoryuken is offline   Reply With Quote
Old 11-25-2012, 02:58 PM   PM User | #6
BubikolRamios
Senior Coder

 
Join Date: Dec 2005
Location: Slovenia
Posts: 1,876
Thanks: 114
Thanked 76 Times in 76 Posts
BubikolRamios is on a distinguished road
just a thought:

order_id = '944704' OR ticket_id = '944704'
order_id = 944704 OR ticket_id = 944704

would do the same, mybe mysql is doing some conversion there

also I would try

`visible` char(1) NOT NULL DEFAULT 'Y',
instead try:
`visible` INT NOT NULL DEFAULT 1//as 1 = 'Y', 0 = 'N'


EDIT: since you mentioned database class, do you have connection pooling ?
If not, opening connection each time can eat substantial amount of that x second of yours.
__________________
Found a flower or bug and don't know what it is ?
agrozoo.net galery
if you don't spot search button at once, there is search form:
agrozoo.net galery search
BubikolRamios is offline   Reply With Quote
Old 11-25-2012, 04:32 PM   PM User | #7
shoryuken
New Coder

 
Join Date: May 2012
Posts: 17
Thanks: 2
Thanked 0 Times in 0 Posts
shoryuken is an unknown quantity at this point
Hi,

Thanks again for the input. I just tried it with INT matching as you suggested but this benchmarks the same as with the apostrophes so I guess mysql is correcting the query before processing.

Also, I just tried your other idea creating another column called visible2 and populated that with INT 1 /0, added index but the query still scans 2 million rows.

I am not sure about connection pooling - that's new to me - I'll read up on it, but I am only connecting to the DB one time per pageview. I'm using Amazon AWS RDS on default settings. Btw, this query is run only occassionally and I then cache it in memcached to prevent repeat visits from hammering the DB. But, 80% are cache misses as they are new customers. It's just a sloooow query, and I need that visible flag.

Thanks for your help so far
shoryuken is offline   Reply With Quote
Old 11-25-2012, 04:39 PM   PM User | #8
shoryuken
New Coder

 
Join Date: May 2012
Posts: 17
Thanks: 2
Thanked 0 Times in 0 Posts
shoryuken is an unknown quantity at this point
So, I just thought of a really basic work around to reduce production load for now --

1) query for order_id = X AND visible2 = 1
2) query for ticket_id = X AND visible2 = 1
3) if either exists, continue

This solves it, I guess, but I'd really like to learn the correct solution with Mysql without kludging it with 2 separate queries
shoryuken is offline   Reply With Quote
Old 11-25-2012, 11:51 PM   PM User | #9
shoryuken
New Coder

 
Join Date: May 2012
Posts: 17
Thanks: 2
Thanked 0 Times in 0 Posts
shoryuken is an unknown quantity at this point
Guys, clutching at straws here, but I feel like I've exhausted my possibilities (or at least my time) trying to fix this. I don't suppose upgrading from 5.1.57 to 5.5.x is likely to just solve this with a better preprocesor , is it? Or am I still still misunderstanding mysql fundamentals?

Thanks again
shoryuken is offline   Reply With Quote
Old 11-26-2012, 08:12 PM   PM User | #10
Old Pedant
Supreme Master coder!

 
Old Pedant's Avatar
 
Join Date: Feb 2009
Posts: 23,249
Thanks: 59
Thanked 3,999 Times in 3,968 Posts
Old Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to allOld Pedant is a name known to all
I think you have the essence of the problem: MySQL can't decide WHICH index to use when you are looking at an OR condition.

One of the HUGE problems with MySQL is that it can *NOT* use just a PART of an index unless that PART is at the beginning of the index.

So if you had (for example) an index on ( orderi_id, ticket_id, visible2 ) [all 3 fields], MySQL would NOT be able to use that index with your OR where clause *except* for order_id. It wouldn't be able to use ( order_id, visibie2 ) because the "parts" there are not contiguous. It wouldn't be able to use ( ticket_id ) at all, because that "part" isn't at the start of the index.

It's a real deficiency in MySQL, not shared by all other DBs.

So your only real choice is to have two indexes: ( order_id, visible2 ) and ( ticket_id, visible_2 ) and then, as you noted, make two separate queries.

Now..

You *could* do:
Code:
SELECT U.* FROM 
   ( SELECT * FROM orderhistory WHERE order_id = 555444 AND visible2 =1
     UNION
     SELECT * FROM orderhistory WHERE ticket_id = 555444 AND visible2 =1
   ) AS U
And that should work, assuming you ndeed have two indexes: (order_id, visible2) and (ticket_id,visible2).

But it won't likely be the most efficient way. Assume, for the sake of argument, that indeed you find a match based on order_id and visible2. With the UNION query, MySQL will *still* have to make the second query based on ticket_id.

So, indeed, the two separate queries is likely the best choice. If you have *ANY* inkling which is more likely to match--ticket_id or order_id--then make that query first, so you only have to make the second less likely query if you don't get a match on the first.

NOTE: You *could* do all this in a STORED PROCEDURE. And if you really have no idea which is most likely to match (ticket_id or order_id) then that would be a fine choice. But, again, if you do have (say) some way of guessing which is more likely in any given circumstance, you'd be better off with the two separate queries, in most-likely order.
__________________
An optimist sees the glass as half full.
A pessimist sees the glass as half empty.
A realist drinks it no matter how much there is.
Old Pedant is offline   Reply With Quote
Users who have thanked Old Pedant for this post:
shoryuken (11-26-2012)
Old 11-26-2012, 08:21 PM   PM User | #11
shoryuken
New Coder

 
Join Date: May 2012
Posts: 17
Thanks: 2
Thanked 0 Times in 0 Posts
shoryuken is an unknown quantity at this point
Old Pedant, thanks very much for the comprehensive response. I totally get it now - this all makes sense now.

I really appreciate you explaining this concept to me.
shoryuken is offline   Reply With Quote
Reply

Bookmarks

Jump To Top of Thread


Thread Tools
Rate This Thread
Rate This Thread:

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off

Forum Jump


All times are GMT +1. The time now is 05:39 AM.


Advertisement
Log in to turn off these ads.