Hello and welcome to our community! Is this your first visit?
Register
Enjoy an ad free experience by logging in. Not a member yet? Register.
Results 1 to 5 of 5
  1. #1
    Mega-ultimate member
    Join Date
    Jun 2002
    Location
    Winona, MN - The land of 10,000 lakes
    Posts
    1,855
    Thanks
    1
    Thanked 45 Times in 42 Posts

    First Thread! - MS SQL Index question

    I've got several tables with tens of thousands of rows in a SQL Server DB which I'm using a SELECT statment on with the IN clause.

    Something like "
    SELECT answer FROM tblAnswer WHERE responseID IN(1,2,3,4,5,6,7,8,12,15,22,27,32,...)"

    Well, there are sometime 1000s of things in the IN clause and it makes the query VERY slow.

    Some people have suggested adding an Index, but I'm not sure how to do that. I've tried a few things and they don't seem to speed up the query at all.

    Any help would be greatly appreciated.

    Thanks

    (I got the first thread! )

  • #2
    Regular Coder
    Join Date
    Jun 2002
    Location
    Cincinnati, OH
    Posts
    545
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Check out number #2 from this article:
    http://www.fawcette.com/vsm/2004_07/...atabasedesign/

    Basically using LEFT OUTER JOINS instead of WHERE IN on longer lists can improve performance.
    does this sig match?

  • #3
    raf
    raf is offline
    Master Coder
    Join Date
    Jul 2002
    Posts
    6,589
    Thanks
    0
    Thanked 0 Times in 0 Posts
    My first idea was indeed using a join.

    But for your case (i'm assuming that the respose8D is the PK for that table), i don't see why it should be an outer join. a regular inner join would do just fine.

    Creating an index will definitely speed up things. 'All' variables that you use to select or filter on should be numerical and indexed.

    But i fail to see the use of a query that will return thousand records. Doesn't sound like anything that could be usefull in a webenvironment, and if it's or offline processing, then the performance-issue shouldn't be so important.
    Posting guidelines I use to see if I will spend time to answer your question : http://www.catb.org/~esr/faqs/smart-questions.html

  • #4
    Mega-ultimate member
    Join Date
    Jun 2002
    Location
    Winona, MN - The land of 10,000 lakes
    Posts
    1,855
    Thanks
    1
    Thanked 45 Times in 42 Posts
    Actually, I don't think those will help in my situation. I posted about this earlier here: Maximizing techniques for web apps

    Basically I have a table s_results_multiple_analsis set up as follows:

    answerID int not null
    questionID int not null
    answerText text
    dateStamp datetime
    responseID int not null

    This table stores all the results from a surevey system I have created. When I display the results from the survey (in the admin interface) I run some simple "SELECT count(*) FROM s_results_multiple_analysis WHERE answerID=$a" queires for each answer in a loop. Where $a is the answerID in a PHP script.

    Now, as I mentioned in the previous thread (above) I also allow the user to click on one answer to see a "filtered" result set for people who answered that answer. To accomplish this, I first find the responseIDs for people who answered a question. Using "SELECT responseID FROM s_results_multiple_analysis WHERE answerID=533" and put this into my next query... "SELECT count(*) FROM s_results_multiple_analysis WHERE answerID=$a AND responseID IN (list from above)".

    This is where I'm getting the 100s - 1000s of items in the IN part and I need to know, should I be looking for an SQL solution, or is there a login problem? One of the surveys on the system currently has 2000 responses, at 16 questions, that means I've got 16 x 2000 = 32,000 rows in this table for that survey alone. If I'm having problems at 2000 responses, I'd hate to see where I'm at with 100,000 responses

    PS: You can see a demo version of this at http://24.118.25.194/~bcarl314/dev/survey/login.php. Username = demouser, passwd = demopass

  • #5
    Mega-ultimate member
    Join Date
    Jun 2002
    Location
    Winona, MN - The land of 10,000 lakes
    Posts
    1,855
    Thanks
    1
    Thanked 45 Times in 42 Posts
    Well, I found one programming improvement that speeds up the results returned from over 5 minutes to just under 25 seconds.

    Basically it involves using the IN statement once for all questions rather than invoking that for each question.

    More improvements to come I hope.


  •  

    Posting Permissions

    • You may not post new threads
    • You may not post replies
    • You may not post attachments
    • You may not edit your posts
    •