View Full Version : Need help with mysql 'IN' statement...

12-03-2011, 12:39 AM
Hi folks, I'm no expert with mysql so I'm hoping one of you geniuses will be able to spot the flaw in the statement that I'm about to show:

FROM profiles
WHERE coursearea IN (3)

| id | name | coursearea |
| 1 | John | 3,16 |
| 2 | Ram | 3,16,21 |
| 3 | Jack | 3 |
| 4 | Jill | 3,12,24 |

This gives me 4 results

(Yes I know you will tell me off because one of the cells contains multiple values but bear with me)

If I then change the query to this:

FROM profiles
WHERE coursearea IN (16)

I'm expecting 2 results but not getting any.

What query should I be using to get the correct results?

Old Pedant
12-03-2011, 03:36 AM
Well, you goofed on your DB design and now you pay the price for it.

First of all IN() isn't at all usable for this. Even if it were, you would want to be using something like

WHERE 16 IN ( coursearea )

But that's not how IN works. In can *ONLY* find things in a list of values. And you do not *HAVE* a "list".

It may look like a list to human eyes, but to SQL it is only *one* field. It's just a field that HAPPENS to have commas in it. SQL makes no distinction whatsoever about characters *with* field values. You could as well have had 3$16$21 or 3!16!21 and it would be all the same to the database.

The *PROPER* design for this kind of thing is to use a separate table.

That is, you would have:

TABLE: profiles
| id | name |
| 1 | John |
| 2 | Ram |

TABLE: profileCourseareas
| id | area |
| 1 | 3 |
| 1 | 16 |
| 2 | 3 |
| 2 | 16 |
| 2 | 21 |

Your profileCoursearea table would be defined something like this:

CREATE TABLE profileCoursearea (
id int REFERENCES profiles(id),
area int REFERENCES coursearea(id) /* or whatever the field name is in the coursearea table */

Is it too late to convince you to fix your DB design?

Old Pedant
12-03-2011, 03:44 AM
If you are unwilling to correct the DB design, then the best you can do is something ugly like this:

SELECT * FROM profiles
WHERE CONCAT(',', courseareas,',') LIKE '%,16,%'

If doing this from PHP, for example, you'd code something like

$sql = "SELECT * FROM profiles WHERE CONCAT(',', courseareas,',') LIKE '%," . $areanum . "16,%'";

The reason you need to add the commas to the front and back of both courseareas and the value are looking for is to avoid "false positives".

Suppose, for example, the areanumber you are looking for is 1.

If you didn't do as I show, you'd code something such as

SELECT * FROM profiles WHERE courseareas LIKE '%1%'

But then the '%1%' would match 16 and 21 and 17 and...

With the commas, ,16, matches only ,16,

See why you should change your DB design?

And this is just the tip of the iceberg.

Using YOUR db design, how would you answer a question such as this:

How many students are there that have two or more courseareas in common?

or even

How many students have exactly two courseareas in common and have at least one of them in the range of 21 through 30?

You can't do it with your design. Not in any even close to efficient way.

Whereas with a NORMALIZED design both queries are trivial.

Old Pedant
12-03-2011, 03:51 AM
If you care, the reason that even *THIS* query works

FROM profiles
WHERE coursearea IN (3)
Is because MySQL is so "sloppy" (some would say "forgiving").

What happens: Because your IN() expression was a number (that is, the 3), MySQL attempts to convert all your coursearea field values to a number.

In the process, it does its best to convert strings such as '3,16,21' to a number and succeeds in converting '3' before it encounters the comma, which causes a conversion error. Some other DBs might get this far, but then they would stop on that conversion error and report a type mismatch. MySQL ignores the error and accepts what has been converted so far and so indeed finds that, *AS A NUMBER*, the value '3,16,21' is just 3.

As I said sloppy. I think MySQL is way way too forgiving and so it leads many people (yourself included) into believing that their code is okay when, in fact, you find out later (sometimes years later) that it is not.

12-03-2011, 11:59 AM

Yes, I had a sneaking suspicion that this would be the case. The database has been passed on to me to sort and this was my fear. However, it's not too late to change the design and add in a couple more tables (one for the courseareas and one for the relationship).

Thanks for confirming.

Old Pedant
12-03-2011, 09:56 PM
Whether or not to rebuild the DB is up to you. If it's not used much, and you know you will never have to answer questions such as those I showed you, you can probably get away with leaving it alone.

But if it's used a lot, in the long run you'll find that rebuilding it will pay huge dividends.

12-03-2011, 11:11 PM
Thanks, Have rebuilt the database now and learnt something new. A good day.

Thanks for your help.