PDA

View Full Version : WHERE processing


thecaligarmo
05-29-2009, 06:04 PM
So I was wondering if MySQL/SQL followed the following principle as other languages:

Say you have a small if statement in PHP:

if($true && $true_too){
execute();
}

In this instance if $true is false, then $true_too is not evaluated and the code moves on.

I was wondering if the same thing happened in MySQL. I.E. Say we have the following mini query:

SELECT * FROM people WHERE person='M' AND height<66;

As MySQL is processing each row, does it always do all WHERE requirements or does it only do one at a time? (so if person!='M' then it moves on to the next row automatically without checking if height<66... or would it check height<66 anyway?)

(I'm also assuming this is going to be the same for all SQL, but it could not... I'm using MyISAM if that helps too)
Thanks!

Old Pedant
05-29-2009, 07:55 PM
You are assuming that the SQL engine processes things a row at a time.

Not necessarily true, at all.

Especially when you use indexed fields in the WHERE clause, it's quite possible (probable, even) that instead the engine will prefer to look in the index, grab the internal record id's (pointers, whatever) for all that match and then make a separate run through all the records looking ONLY at those recordids. Or, if both fields are indexed, it might just get two lists of record ids and then simply do a merge on them.

SQL processing is never guaranteed to be "serial" and the best engines try to do as much as they can in parallel.

In your example, say both person and height are indexed. And the two indexes look like this:

personIndex:
F 773
F 774
F 777
M 775
M 776

heightIndex:
64 773
65 774
65 775
66 776
66 777

So the engine would get two lists:

M 775
M 776

66 776
66 777

and simply do a matching merge on those to arrive at only recordid 776 and only then go get the full contents of that record.

Now, *IF* nothing is indexed and the engine is forced to do a full table scan, then yes, I would imagine that any good engine would do a "shortcut" operation. Unlike other languages, there should not be any side effects associated with evaluating a subexpression, so it's not important that both subexpressions actually be executed.

I should note that all this is from my own experience in working on the internals of an Object Oriented DBMS, not a relational one, but it would be hard to imagine that any modern RDBMS wouldn't do AT LEAST all I have described and likely more.