PDA

View Full Version : First Thread! - MS SQL Index question



bcarl314
07-30-2004, 02:45 PM
I've got several tables with tens of thousands of rows in a SQL Server DB which I'm using a SELECT statment on with the IN clause.

Something like "
SELECT answer FROM tblAnswer WHERE responseID IN(1,2,3,4,5,6,7,8,12,15,22,27,32,...)"

Well, there are sometime 1000s of things in the IN clause and it makes the query VERY slow.

Some people have suggested adding an Index, but I'm not sure how to do that. I've tried a few things and they don't seem to speed up the query at all.

Any help would be greatly appreciated.

Thanks

(I got the first thread! :p) :D

allida77
07-30-2004, 03:13 PM
Check out number #2 from this article:
http://www.fawcette.com/vsm/2004_07/magazine/columns/databasedesign/

Basically using LEFT OUTER JOINS instead of WHERE IN on longer lists can improve performance.

raf
07-30-2004, 03:22 PM
My first idea was indeed using a join.

But for your case (i'm assuming that the respose8D is the PK for that table), i don't see why it should be an outer join. a regular inner join would do just fine.

Creating an index will definitely speed up things. 'All' variables that you use to select or filter on should be numerical and indexed.

But i fail to see the use of a query that will return thousand records. Doesn't sound like anything that could be usefull in a webenvironment, and if it's or offline processing, then the performance-issue shouldn't be so important.

bcarl314
07-30-2004, 04:59 PM
Actually, I don't think those will help in my situation. I posted about this earlier here: http://www.codingforums.com/showthread.php?t=42431

Basically I have a table s_results_multiple_analsis set up as follows:

answerID int not null
questionID int not null
answerText text
dateStamp datetime
responseID int not null

This table stores all the results from a surevey system I have created. When I display the results from the survey (in the admin interface) I run some simple "SELECT count(*) FROM s_results_multiple_analysis WHERE answerID=$a" queires for each answer in a loop. Where $a is the answerID in a PHP script.

Now, as I mentioned in the previous thread (above) I also allow the user to click on one answer to see a "filtered" result set for people who answered that answer. To accomplish this, I first find the responseIDs for people who answered a question. Using "SELECT responseID FROM s_results_multiple_analysis WHERE answerID=533" and put this into my next query... "SELECT count(*) FROM s_results_multiple_analysis WHERE answerID=$a AND responseID IN (list from above)".

This is where I'm getting the 100s - 1000s of items in the IN part and I need to know, should I be looking for an SQL solution, or is there a login problem? One of the surveys on the system currently has 2000 responses, at 16 questions, that means I've got 16 x 2000 = 32,000 rows in this table for that survey alone. If I'm having problems at 2000 responses, I'd hate to see where I'm at with 100,000 responses :eek:

PS: You can see a demo version of this at http://24.118.25.194/~bcarl314/dev/survey/login.php. Username = demouser, passwd = demopass

bcarl314
07-30-2004, 08:54 PM
Well, I found one programming improvement that speeds up the results returned from over 5 minutes to just under 25 seconds.

Basically it involves using the IN statement once for all questions rather than invoking that for each question.

More improvements to come I hope.