View Full Version : SQL syntax problem
Mhtml
02-16-2006, 11:49 AM
Is it possible to reference another table without a WHERE clause? Or perhaps a conditional WHERE clause or something?
I have a table, images. I have another table, rides. In rides I have several fields which may OR may NOT specify index's into the images table..
Now I know I can do:
"SELECT i.path AS img1 FROM images AS i, rides AS r WHERE r.img1 = i.id"
r.img1 always corresponds to an index in the images table, however r.img2, r.img3 and r.img4 may be 0. I need to resolve i.path for these as well if they are > 0.. How can I achieve this?
I hope I explained this well enough.
[edit:] I guess what I'm looking for is like, a multiple, conditional INNER JOIN sort of thing.. Can this be done?
ronaldb66
02-16-2006, 01:47 PM
Er... I hope I explained this well enough. Not for my tired brain, I'm afraid.
Could you give away a bit more about these two tables, the columns involved and how they relate to oneanother? I'm having a really hard time figuring out what you're storing and what you're actually trying to retrieve.
Mhtml
02-16-2006, 01:59 PM
Yeah, sorry.. I should have just posted the structure.
Hopefully you'll understand my psuedo:
images:
id int
path text
rides:
id int
img1 int
img2 int
img3 int
rides.img1 is always an index into images, always.
rides.img2 and rides.img3 however may be 0 or may be an images.id ... I need to resolve images.path for each rides.img2 and rides.img3. So basically I need to select several rows in images, minimum of 1 (for rides.img1 which defaults to 1) or may be 3 rows total.
ronaldb66
02-16-2006, 02:11 PM
Okay, clear; let me put that one on the back burner for a little while: I feel something stewing, but it's still a bit raw... :D I'll get back to you when it's tender enough! ;)
Mhtml
02-16-2006, 02:21 PM
Haha, cheers. No real rush. :thumbsup:
ronaldb66
02-16-2006, 02:51 PM
Hmmm...
You might try these:
select path from images
where id = (select img1 from rides)
or id = (select img2 from rides where img2 <> 0)
or id = (select img3 from rides where img3 <> 0)
or
select i.path from images i, rides r
where i.id = r.img1
or (r.img2 <> 0 and i.id = r.img2)
or (r.img3 <> 0 and i.id = r.img3)
BE AWARE : I haven't tested any of these!!!
Also, you may have to correct the syntax; I'm really a IBM/DB2 guy at heart, I just sort of "transplanted" my knowledge to MySQL.
If I find the time I will actually try this out; I like a challenge... ;)
Mhtml
02-16-2006, 03:03 PM
Ah subqueries... good idea, I'll give em a whirl and see what comes of it.
ronaldb66
02-16-2006, 03:24 PM
Okay: check this out (still IBM/DB2 on a mainframe, no MySQL!):
IMAGES table:
SELECT ID, PATH FROM IMAGES
;
---------+---------+-------
ID PATH
---------+---------+-------
1 image1
2 image11
3 image2
4 image3
5 image33
6 image4
7 image44
8 image5
9 image6
10 image7
11 image8
12 image88
RIDES table:
SELECT ID, IMG1, IMG2, IMG3 FROM RIDES
;
---------+---------+---------+---------+---------+
ID IMG1 IMG2 IMG3
---------+---------+---------+---------+---------+
1 1 0 0
2 3 4 0
3 6 0 8
4 9 10 11
Query plus result:
SELECT PATH FROM IMAGES
WHERE ID IN (SELECT IMG1 FROM RIDES)
OR ID IN (SELECT IMG2 FROM RIDES WHERE IMG2 <> 0)
OR ID IN (SELECT IMG3 FROM RIDES WHERE IMG3 <> 0)
;
---------+---------+---------+---------+---------+---
PATH
---------+---------+---------+---------+---------+---
image1
image2
image3
image4
image5
image6
image7
image8
Note: mind the "in" instead of "=" in my original example; this may differ in MySQL, but DB2 started whining about a subselect returning multiple rows.
Also, each image appears once in the example; if an image appears more then once, you'll get multiple entries in the result.
And this one:
SELECT I.PATH FROM IMAGES I, RIDES R
WHERE I.ID = R.IMG1
OR (R.IMG2 <> 0 AND I.ID = R.IMG2)
OR (R.IMG3 <> 0 AND I.ID = R.IMG3)
;
---------+---------+---------+--------
PATH
---------+---------+---------+--------
image1
image2
image3
image4
image5
image6
image7
image8
seems to work equally well. Might be interesting to see which one performs better, though.
Mhtml
02-16-2006, 03:29 PM
I don't get any errors, after a limiting the rows to r.id=$rid that is... But I don't get the paths either, just the integers still.
[edit:] Oops, didn't see your new post.. I'll read it/get back to you.
i'd suggst getting your db-design normalised. repeating the images in the columns is a classic error and it's not the way to go (as you probably know). why not move these into a factstable with 1 row for each imag-ride combination?
then it's a simple three table join...
Mhtml
02-16-2006, 03:35 PM
Yeah, same result. No errors, but I don't get anything returned.
[edit:]Perhaps adding a ride_id in the images table you mean?
ronaldb66
02-16-2006, 03:43 PM
What does your query look like? DB2 doesn't mind columns used in the where clause not actually being selected in the result, but maybe MySQL does.
@raf:
Yeah, I thought about that too, but when it's limited to only three columns, that may be over-normalising.
[edit:]Perhaps adding a ride_id in the images table you mean?
if this is directed to me --> no. the normalised design would be
images:
id int primary key
other images stuff
rides:
id int primary key
other rides stuff
images_rides:
image_id int foreign key referencing image.id
rides_id int foreign key referencing rides.id
other suff that is associated with the image-ride combination
images_rides then contains 1 record for each image-ride combination.
Mhtml
02-16-2006, 03:47 PM
My query was exactly your query. I just pasted it into sql query in phpmyadmin.
Mhtml
02-16-2006, 03:49 PM
Ahh I see, all this formal database language I've never encountered before. Never actually had to do much with databases apart from maybe like a simple join. Perhaps that is the best way to do it, at least I can then have the option of having a lot more images associated with each ride. Hmm.
Yeah, I thought about that too, but when it's limited to only three columns, that may be over-normalising.
sure. given you now only need 3 subselects to get the info :)
it always starts like that: it's only for three columns so it would be overkill to normalise it etc. but from personal experience, i've learned that handling repeated FK's in columns is never a good idea...
Mhtml
02-16-2006, 04:03 PM
Ok, so I think I need to start again on this. Everything just went right over my head. So raf, let's take your approach.
images:
id int primary_key
rides:
id int primary_key
image_rides:
img_id int foreign_key
ride_id int foreign_key
So say I have this. When it comes to needing pretty much all the information from rides, and all images associated with a rides, how would my sql query look? Roughly...
ronaldb66
02-16-2006, 04:08 PM
@raf:
True, true, but Mhtml's "data model" was a bit thin to base such assumptions on--although he actually seems to just want to relate one or more images to a ride, in which case an image-ride lookup table definitely is the way to go.
@Mhtml:
I can then have the option of having a lot more images associated with each ride. Exactly: you only have to create those relations between rides and images that you actually need; no more empty columns!
By the way: did I interpret your earlier data model correctly? I assumed img1, img2 and img3 contain the id's in the images table (if not 0). Again, you may need to add images.id, rides.img1, rides.img2 and rides.img3 to the select to please MySQL.
Or, first work out the solution suggested by raf so we can mull over an entirely different query. :D
Mhtml
02-16-2006, 04:12 PM
Hah, I can see why you like this sql stuff.. It actually requires a bit of thinking. I'll see how I go working out raf's way.
So say I have this. When it comes to needing pretty much all the information from rides, and all images associated with a rides, how would my sql query look? Roughly...
your query would be something like
SELECT rides.id, rides.othercolumn1, rides.othercolumn2, images.id, images.path
FROM (images_rides INNER JOIN rides ON images_rides.ride_id=rides.id) INNER JOIN images ON images_rides.img_id=images.id
<edit>i of course used some possibly non-existing column, but the only thing you realy need to look at is the joining. you can of course add a where claus like WHERE rides.id = 123456 or an ORDER BY or whatever.</edit>
Mhtml
02-16-2006, 04:26 PM
Ok, so doing that will give me a lot of rows containing all the same data except for path which will be different in each row. Now my problem, and quite possibly the reason I used 3 columns to begin with, is that I need just a single row with all that same information.
Hah, I can see why you like this sql stuff.. It actually requires a bit of thinking.
indeed :D
without being to nerdy: db-design is an intresting field since you can first mess around a bit until you start realising that following some designprinciples (like db-normalisation) actually creates db's that are faster, easier to query and have less limitation. and then you get into a phase where you see that these following these principles isn't allways the best sollution and that you sometimes need to diverge from them to better meet your clients needs.
and then you get to the point where you need to know each db-servers particular features and optimalisations to be able to work out the most performant and least restrictive model.
but for this situation, i would instantly choose for a normalised design with 2 dimensiontables (images and rides) and a facttable (images_rides).
Ok, so doing that will give me a lot of rows containing all the same data except for path which will be different in each row. Now my problem, and quite possibly the reason I used 3 columns to begin with, is that I need just a single row with all that same information.
well, you just throw out the columns you don't need from the fieldlist, add a GROUP BY clause and use a GROUP CONCAT()
like
SELECT rides.id, GROUP_CONCAT(images.path SEPARATOR ', ')
FROM (images_rides INNER JOIN rides ON images_rides.ride_id=rides.id) INNER JOIN images ON images_rides.img_id=images.id GROUP BY rides.id
<edit> or if you need all other columns as well
SELECT rides.id, rides.othercolumn1, rides.othercolumn2, images.id, GROUP_CONCAT(images.path SEPARATOR ', ')
FROM (images_rides INNER JOIN rides ON images_rides.ride_id=rides.id) INNER JOIN images ON images_rides.img_id=images.id GROUP BY rides.id, rides.othercolumn1, rides.othercolumn2, images.id
Note: for mysql, this would work as well, but i would recommend using the full group by:
SELECT rides.id, rides.othercolumn1, rides.othercolumn2, images.id, GROUP_CONCAT(images.path SEPARATOR ', ')
FROM (images_rides INNER JOIN rides ON images_rides.ride_id=rides.id) INNER JOIN images ON images_rides.img_id=images.id GROUP BY rides.id </edit>
Mhtml
02-16-2006, 04:43 PM
:eek: geez, that's pretty sweet! All these new things I've never seen before, I can't sleep now and it's nearly 3am... I'm going to be up all night!
[edit:] Ok, so I have a problem now.. What if ride doesn't have any images associated?
Mhtml
02-16-2006, 05:20 PM
Alright, now I've needed to add a lot more information to my page. And my query looks like this:
SELECT rides.id, rides.year, rides.longdesc, rides.kms, rides.price, rides.color, GROUP_CONCAT(
images.path) AS imgs, u.firstname, u.lastname, u.state, u.suburb, u.phone, u.mobile, mks.name AS make, mdl.name AS model
FROM userinfo AS u, manfs AS mks, models AS mdl, (
images_rides
INNER JOIN rides ON images_rides.ride_id = rides.id
)
INNER JOIN images ON images_rides.img_id = images.id
WHERE rides.id=$rid AND mks.id=rides.make AND mdl.id = rides.model GROUP BY rides.id"
Now I'm used to just simple, single line queries... Is this query bad? Am I asking too much? Can I ask a lot more? What should I think about doing to optimize it?
:eek: geez, that's pretty sweet! All these new things I've never seen before, I can't sleep now and it's nearly 3am... I'm going to be up all night!
[edit:] Ok, so I have a problem now.. What if ride doesn't have any images associated?
No problem there. You just use a outer join. That way, there'll always be at least one record that is returned, before the group by is executed.
Mhtml
02-16-2006, 08:23 PM
Is that the same as left join?
Alright, now I've needed to add a lot more information to my page. And my query looks like this:
SELECT rides.id, rides.year, rides.longdesc, rides.kms, rides.price, rides.color, GROUP_CONCAT(
images.path) AS imgs, u.firstname, u.lastname, u.state, u.suburb, u.phone, u.mobile, mks.name AS make, mdl.name AS model
FROM userinfo AS u, manfs AS mks, models AS mdl, (
images_rides
INNER JOIN rides ON images_rides.ride_id = rides.id
)
INNER JOIN images ON images_rides.img_id = images.id
WHERE rides.id=$rid AND mks.id=rides.make AND mdl.id = rides.model GROUP BY rides.id"
Now I'm used to just simple, single line queries... Is this query bad? Am I asking too much? Can I ask a lot more? What should I think about doing to optimize it?
Well, i don't know your app so maybe there would be better wys to hande this. Like do you need to get all the info in one query? and couldn't you have collected some of the info on previous page etc?
My main comment on the query would be to not mix the two ways of defining your joins --> either use the 'modern' INNER JOIN etc kind of syntax, or either use the WHERE-clause form. But don't mix the two.
As for optimization --> makes sure that all fields that you use to join the tables + the rides.id are of a numeric type + that they are indexed. If the tabls are indexed and sorted, then there probably isn't any performanceissue (unless some of these tables are realy big...)
Is that the same as left join?
it can be a LEFT OUTER JOIN or a RIGHT OUTER JOIN
you would need
SELECT ... FROM (rides LEFT OUTER JOIN rides_images ON ...) INNER JOIN images ON ...
this way, all records from rides (that meet your where clause of course...) would be included.
but you need to realise that outer joins are much slower then inner joins, so it might be more performant to use 2 qeurys (1 with the columns from rides, models, makes (all inner joins but without the 'group by') and 1 with the columns from rides_images and images (inner join and group by') I don't quite see how you get the userinfo mixed in there...
but again, if your tables aren't to big and properly indexed, then your query will surely be performant enough.
Mhtml
02-16-2006, 09:15 PM
Ahah. I just read a crap load on sql syntax so I think I get where you're coming from now. I'll just have to see how I go, hopefully I'll be able to make it work....granted, it's now 7:15am and I still haven't slept but I'll keep at it...lol but now I can't sleep anyway, need to tell my sister to grab me some absinth from the bottleshop before coming home from work...
[edit:] Also, what's the general consensus on multiple select queries for related data? I mean if it is possible to get all the data in the same query, does it just come down to optimization if you either do just that or use a second query? Or should it just be done within the first query? Will the time taken to query the second time and return the results negate the performance of a single query? Or does this again just come down to trial and error optimization?
need to tell my sister to grab me some absinth from the bottleshop before coming home from work...
seems ypu've already been drinking some (refering to your edit below) --> remember to let is drip slowly though your sugarcube!
[edit:] Also, what's the general consensus on multiple select queries for related data?
I'd say it's to keep the number of querys as low as possibe.
If you frequently need to join the same (high) number of tables or if it's a complex join, then you'd normally define a view, and just select from that. Doesn't make the querys more performant (--> usually the views are to large so this reduces the speed) but it makes the sql-statement simpler and guarantees that you always define the joins correctly.
It also enables you to change the structure of the underlying tables, without you needing to change your selectstatements. You then only need to change the view-definition.
You also need to realize that the current db-servers are so fast, that you can tolerate some inefficiency, simply because you'll never notice it in the applications you're probably building.
I mean if it is possible to get all the data in the same query, does it just come down to optimization if you either do just that or use a second query? Or should it just be done within the first query? Will the time taken to query the second time and return the results negate the performance of a single query? Or does this again just come down to trial and error optimization?
Euh...
1 query will probably be faster. But the difference between 1 'inefficient' query with the outer join pulling data from 5 tables, and 2 querys with all inner joins and a lower number of tables for each query probably wount be noticable.
You need to realize that opening the connection to the db will take the largest portion of the processingtime. Executing the query will be a matter of 0.001 seconds or so (depending on tablesize, serverspecs etc).
What you absolutely need to avoid as much as possible are nested querys (where you loop through the first resultset and then run a new select for each record) and try to avoid subselects because they are usually less performant then joins.
Query-optimization can be verry hard, and there's a big difference between the best performing statement and the overall best sollution. Sometimes, the most optimized version of a query can even lead to performance-drops for other parts of your application (because tables are locked to long etc).
Also, be carefull with optimizations --> current hardware and db-servers are so fast that it's sometimes better to tolerate some inefficiency, if it makes your statements or code a lott simpler.
Mhtml
02-16-2006, 10:04 PM
seems ypu've already been drinking some (refering to your edit below) --> remember to let is drip slowly though your sugarcube!
lol, no not yet.. But it is my drink of choice for parties, and tomorrow is saturday so partytime! I'll finish a whole bottle in 4 hours easy, get's u totally messed up.
Thanks for all the info so far raf, you've been nothing short of outstanding! Especially to put up with my load of nonsense. :thumbsup:
ronaldb66
02-17-2006, 09:17 AM
Seems like I missed out on a whole lot here; damn time difference... :(
...tomorrow is saturday:
hmm. it's fridaymorning overhere so i've still got a whole workday to go :(
Thanks for all the info so far raf, you've been nothing short of outstanding! Especially to put up with my load of nonsense. :thumbsup:
You're welcome :thumbsup:
Seems like I missed out on a whole lot here; damn time difference...
you're in the same timezone as me, so you were probably just to drunk yesterday evening :D
Mhtml
02-17-2006, 10:03 AM
Don't worry Ronald, there will be plenty more to come..These problems were simple compared to my next lot :). And then, I have to optimize heaps of crap.
But that'll have to wait for another day, I'm just finishing some freelance work and I'm done for the week... Time to party!
http://www.mithoric.com/images/absinthyay.jpg
ronaldb66
02-17-2006, 12:45 PM
you were probably just to drunk yesterday evening That, and after spending a whole day at work gazing at a monitor, I don't always feel like getting behind one at home as well. :o
Anyway: it's past lunch on friday on this side of the globe; just a couple of hours and it's weekend for me. :thumbsup:
Mhtml,
Keep 'em coming: it's great fun trying to solve these conundrums of yours, and a good learning experience to boot.
vBulletin® v3.8.2, Copyright ©2000-2009, Jelsoft Enterprises Ltd.