
Faster SQL check for duplicate/insert query?



Darksbane
01-30-2003, 05:40 PM
I am currently working with ASP and MS SQL, using this query to check for duplicate entries in the database and insert them if they don't exist.
---------------------------------------------------------------------
SQL = "IF EXISTS(SELECT 'True' FROM "& ptype &" WHERE Id_no = '" & Id_no & "') "
SQL = SQL & "BEGIN "
SQL = SQL & "SELECT 'This Id_no already exists!', pallet_no, time_stamp FROM " & ptype & " Where Id_no = '"& Id_no & "' "

SQL = SQL & "END ELSE BEGIN "
SQL = SQL & "SELECT 'Record Added' "
SQL = SQL & "INSERT INTO "& ptype &" (pallet_no, Id_no,"

SQL = SQL & " time_stamp) VALUES ("
SQL=SQL & "'" & pal & "', "
SQL=SQL & "'"& Id_no & "', "
SQL=SQL & "GETDATE()) "
SQL = SQL & "END"
-------------------------------------------------------------------------------

The problem is that with ~350 thousand entries in the table, the search time lengthens from an acceptable 1-2 seconds to an unacceptable 4-5 seconds. Does anyone know offhand if there is a way to speed up this query without sacrificing accuracy? This might be impossible, but my company would prefer not to see this kind of slowdown until about 1 million records are being searched. The field being searched on (Id_no) is the key and is a character sequence between 10 and 15 characters.

ecnarongi
01-30-2003, 10:00 PM
Are they numbers in sequential order? If so, you could use a binary search scheme. Just a thought.

whammy
01-31-2003, 12:04 AM
I would look instead at using a DTS package for this kind of mass insert - you could create a temporary table with all the data to be inserted, then compare that temporary table with your "live" table and mark some "extra" field in the temporary table with an "X" or something wherever a row is duplicated.

Then, in the next step, you could insert all of the records that were NOT marked with an "X" in the temporary table into your live table. Another advantage of using a DTS package is that you can schedule it to run automatically. Not to mention they run REALLY fast.
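Roughly, the duplicate-flagging and insert steps could be plain T-SQL along these lines (just a sketch - the staging and live table names here are made up for illustration, not taken from your schema):

-------------------------------------------------------------------------------
-- Assumes the DTS import step has already loaded staging_pallets
-- (pallet_no, Id_no, dup_flag) with dup_flag left NULL.

-- Mark staging rows whose Id_no already exists in the live table.
UPDATE s
SET s.dup_flag = 'X'
FROM staging_pallets AS s
INNER JOIN live_pallets AS l ON l.Id_no = s.Id_no

-- Insert only the rows that were not flagged as duplicates.
INSERT INTO live_pallets (pallet_no, Id_no, time_stamp)
SELECT s.pallet_no, s.Id_no, GETDATE()
FROM staging_pallets AS s
WHERE s.dup_flag IS NULL
-------------------------------------------------------------------------------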

You can find more information on DTS packages by doing a Google search. :)

Darksbane
02-13-2003, 03:50 PM
"are they numbers in sequential order? if so you can use a binary search scheme. just a thought."

Unfortunately no, the input is actually an ID barcode consisting of letters and numbers in no particular order...


"I would look instead at using a DTS package for this kind of mass insert "

Unfortunately it isn't all inserted at once. Each number is inserted individually and must return whether it is a duplicate immediately after it has been scanned. This is critical because if a number has been duplicated it must be immediately removed and replaced by product with a good number.

Again any help or ideas would be appreciated.

Roy Sinclair
02-13-2003, 04:10 PM
Have you checked and made sure you have indexes created on the table? Without indexes you get a slow search; create an index on the key field(s) you're searching on and your queries should run much quicker.
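For example, something like this (the table name is just a placeholder, since your real table names come from the ptype variable):

-------------------------------------------------------------------------------
-- Index the column the duplicate check searches on.
CREATE INDEX IX_pallets_Id_no ON pallets (Id_no)
-------------------------------------------------------------------------------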

Secondly, you should be using a stored procedure for your inserts, with all the duplicate-checking logic in the procedure instead of in your ASP code.
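A rough sketch of what such a procedure might look like (the procedure and table names are invented here, and the logic simply mirrors the IF EXISTS query from the first post):

-------------------------------------------------------------------------------
CREATE PROCEDURE dbo.AddPallet
    @pallet_no varchar(20),
    @Id_no     varchar(15)
AS
BEGIN
    IF EXISTS (SELECT 1 FROM pallets WHERE Id_no = @Id_no)
    BEGIN
        -- Duplicate: return the existing row's details instead of inserting.
        SELECT 'This Id_no already exists!', pallet_no, time_stamp
        FROM pallets
        WHERE Id_no = @Id_no
    END
    ELSE
    BEGIN
        INSERT INTO pallets (pallet_no, Id_no, time_stamp)
        VALUES (@pallet_no, @Id_no, GETDATE())

        SELECT 'Record Added'
    END
END
-------------------------------------------------------------------------------

The ASP page then just executes the procedure with the two values and reads back whichever message it returns, instead of building the whole IF EXISTS string itself.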

Darksbane
02-13-2003, 04:22 PM
Originally posted by Roy Sinclair
Have you checked and made sure you have indexes created on the table? Without indexes you get a slow search; create an index on the key field(s) you're searching on and your queries should run much quicker.

...OK, this is where my lack of DB knowledge shows up. I have been trying to read up on indexing but have not really found any articles that explain what it is and how it decreases search time. Depending on the type of product being scanned, the tables will have either 1 or 2 fields which need to be unique (each individually unique, so if either one is duplicated I can't insert). Does anyone know of any good places to read up on this?



Originally posted by Roy Sinclair
Secondarily, you should be using a stored procedure for your inserts with all the duplicate checking logic in the procedure instead of having that in your code.

Again, not something I am familiar with. I'll do some looking into it, but if you have any good sites to learn from, let me know.

Roy Sinclair
02-13-2003, 06:50 PM
Do you have a Database Administrator in charge of things or are you filling all the roles?

A good DBA would know these things and be able to show you exactly how and where to do both of these.

I don't have any online resources I use for that information since I learned much of it over several years of experience with several different database products. I can readily find what I want in the SQL Server Books Online documentation which comes with SQL Server. Someone else here, though, ought to be able to give references to their favorite resources.

Darksbane
02-13-2003, 06:54 PM
Originally posted by Roy Sinclair
Have you checked and made sure you have indexes created on the table? Without indexes you get a slow search; create an index on the key field(s) you're searching on and your queries should run much quicker.

OK, I figured out a little bit more about how indexing works. It seems that when I index a field, it stores everything in sequential order. I am confused, though: when I index 2 fields, the first field is still put in order but the second is still out of order. How does it then make searching the second field faster?

Roy Sinclair
02-13-2003, 07:00 PM
When you index on multiple fields you should index them in order of importance; a combined index only sorts the second field within each value of the first, so if each field needs to be searched on its own it should get its own index. I notice that your basic problem is preventing duplicate entries. The wrong way to handle that is to add an entry, check whether it's a duplicate, and then delete it. The right way is to set up an index on the key fields which cannot be allowed to duplicate and put a uniqueness constraint on them, which will prevent duplicate records from ever being added. You then need to make sure you've got error handling turned on so you can handle an attempt to add a duplicate without leaving your user looking at an error message where your script crashed.
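Concretely, something along these lines (table and constraint names are placeholders; if two fields each need to be unique on their own, give each one its own constraint rather than one combined constraint):

-------------------------------------------------------------------------------
-- Let the database itself reject duplicate Id_no values.
ALTER TABLE pallets ADD CONSTRAINT UQ_pallets_Id_no UNIQUE (Id_no)

-- A second independently-unique field gets its own constraint.
ALTER TABLE pallets ADD CONSTRAINT UQ_pallets_Sub_no UNIQUE (Sub_no)
-------------------------------------------------------------------------------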

Darksbane
02-13-2003, 07:03 PM
Originally posted by Roy Sinclair
Do you have a Database Administrator in charge of things or are you filling all the roles?


I'm pretty much in charge of this, and have limited experience with databases (in general) as large as this one. My employer found out I know how to code and know a little about SQL and dropped this in my lap about a week after I started. I got the project done but unfortunately am running into the above problems (which I expected to happen) where I am shaky on DB stuff. On the upside, though, I guess I am learning quite a bit :)

Roy Sinclair
02-13-2003, 08:13 PM
I'd seriously look into a book on administering SQL Server; the things you need to know are simply too much to handle on a board like this.

Darksbane
02-13-2003, 09:27 PM
Thanks for the advice, I appreciate it. I'll look into getting a book this weekend.



The wrong way to handle that is to add an entry, check whether it's a duplicate, and then delete it. The right way is to set up an index on the key fields which cannot be allowed to duplicate and put a uniqueness constraint on them, which will prevent duplicate records from ever being added. You then need to make sure you've got error handling turned on so you can handle an attempt to add a duplicate without leaving your user looking at an error message where your script crashed.


Is something more like this what you were talking about above?

=============================================
SQL = "INSERT INTO Test1 (Id_no,Sub_no) values ('00000000004','TF5012RD0z')"


DBconn.Errors.Clear ' clear out any existing errors

On Error Resume Next ' then IGNORE errors!
set RS = DBconn.Execute(SQL)
On Error GoTo 0 ' turn off the ignoring of errors!

' now see if we got any errors from the insert!

For Each oops In DBconn.Errors
If oops.number = -2147217900 Then
Response.Write "That item already exists in that table!<BR>"
Else
Response.Write "Unexpected error: " & oops.number & " -- " & oops.description
End If
Next
==============================================

After I index all my tables, this should prevent duplicates and catch the error on the insert attempt itself; so far in my test DB it seems to be working. Hopefully it will also make things a little faster :)

Thanks again

Drew

Roy Sinclair
02-13-2003, 09:34 PM
That looks like a good start. When you get around to doing a stored procedure you can have the stored procedure return the text value and leave out the error checking entirely.


