...

View Full Version : Slow inefficent query (FIXED)



Tynan
03-20-2007, 12:11 PM
Hello, rank stumbling newbie here

I have this query which pulls members from a table of 1200 members details, filtering out those without and email address and those that don't want Newsletters, and also any whose email address exists in a second table (emails that have hard bounced previously)

My first attempt took an age and returned about 10,000 members, clearly loads of repeated rows

My 'final' attempt with a DISTINCT shoved in works perfectly but is really slow, PHPAdmin says it takes 1.1s but on the website the data doesn't appear for over 30s, all the other sundry queries on the same table work nice and quick on the same website using the same php to generate the table of results

The googling I've done suggests (to me) that there's nothing wrong with the query but I suspect that there's far too much checking of fields going on from the pre DISTINCT query's results

Anyhow


SELECT DISTINCT `Email` , `Title` , `FirstName` , `LastName`
FROM All_Members, HB
WHERE All_Members.Email != HB.EmailHB
AND `ENewsletter` = 'y'
AND `Email` != '' ''


MYSQL Client 4.1.15
PHPAdmin 2.9.1.1

Thanks

SSJ
03-20-2007, 01:17 PM
There is nothing wrong with your query...

Tynan
03-20-2007, 01:28 PM
Thanks, that's good news I suppose

Does it make sense then that the query, without the DISTINCT, returns a vast number of rows? like 10,000

Does that suggest something not right in one of the tables?

30s to query a table with 1200 rows and a table with perhaps a 100 rows is a warning sign isn't it?

SSJ
03-20-2007, 01:42 PM
Does it make sense then that the query, without the DISTINCT, returns a vast number of rows? like 10,000

Does that suggest something not right in one of the tables?

30s to query a table with 1200 rows and a table with perhaps a 100 rows is a warning sign isn't it?

Really didn't get this...

Tynan
03-20-2007, 01:49 PM
Sorry

If I run that exact same query but without the DISTINCT in it, the query gives me over 10,000 results, ie loads of the rows repeated over and over again

That suggest to me that the query isn't at all ideal, I assume in my ignorance that the DISTINCT is just patching up a bad query and the right query would return the right amount of rows without having to use DISTINCT

So either the query isn't as good as it should be or I've done something silly in my dbs

In all of the reasoning above, I'm guessing and assuming

SSJ
03-20-2007, 01:54 PM
That's the power of DISTINCT keyword.

If you want un-repeated results then you have to use DISTINCT in your query

NancyJ
03-20-2007, 01:58 PM
If your description of what you want is correct then your query is wrong, you're asking for those that are NOT in the second table - but the way in which you're doing it is causing the 10k rows - how many rows do you have in the HB table?

Tynan
03-20-2007, 02:08 PM
Sorry Nancy, my slightly vague sentence I think, I'm filtering out any rows in All_Members that have an email address that exists in the HB tables (the query is defo right)

There's 382 'bad' email addresses in the HB table
There's 1,134 rows in the main All_Members table

Some of the email addresses fields in the All_Members table are dups where people share a common email address, the HB Emails are unique

I'm ecstatic about the way DISTINCT fixed the results, I just thought/think that there should be a query that brings back what I want cleanly without having to discard all the unwanted results using DISTINCT, so faster

guelphdad
03-20-2007, 02:13 PM
Your problem is here:

FROM All_Members, HB
WHERE All_Members.Email != HB.EmailHB


This is a cross join effect and will result in your query having to check every row for HB for each new row of All_Members.

so if All_members has 1000 rows and HB has 1000 rows then your cross join will have an effect of looking through 1 million rows.

You want this instead:


SELECT DISTINCT `Email` , `Title` , `FirstName` , `LastName`
FROM All_Members
WHERE
All_Members.Email NOT IN
(SELECT HB.EmailHB FROM HB)
AND `ENewsletter` = 'y'
AND `Email` != '' ''


One other note, is that DISTINCT works on all columns in your select statement. So if you have two rows with the same Title, Firstname and Lastname but two different email addresses, both rows are returned.

That is most likely what you want, but just pointing out that distinct does not work on just the one column.

Tynan
03-20-2007, 02:16 PM
aaaah!

I was dimly fumbling towards something like that and suspected soemthing horrid like that was happening

Thank-you and apologies to all for my despicable level of knowledge, trying to crawl before I'm born

Tynan
03-20-2007, 02:45 PM
That's in and working like a dream, down from 30s to almost instant

Got a few other suspect queries to rewrite now

thank-you very much all of you

not only does it work but I understand why too, I think I might have grasped the basics of the subquery too

:thumbsup:



EZ Archive Ads Plugin for vBulletin Copyright 2006 Computer Help Forum