Go Back   CodingForums.com > :: Server side development > MySQL

Before you post, read our: Rules & Posting Guidelines

Reply
 
Thread Tools Rate Thread
Enjoy an ad free experience by logging in. Not a member yet? Register.
Old 08-10-2006, 05:08 PM   PM User | #1
NancyJ
Senior Coder

 
NancyJ's Avatar
 
Join Date: Feb 2005
Location: Bradford, UK
Posts: 3,162
Thanks: 19
Thanked 65 Times in 64 Posts
NancyJ will become famous soon enough
Grouped Uniques & Duplicates

I know there are quite a few of you who are very knowledgable in things beyond, just making it work So I hope you can advise me with this.

We update our database from external data sources, but they often contain duplicates, the files we work with a fairly large - 100k-800k records per file. To remove the duplicates, what I have been doing is doing a select distinct into another table then taking everything out of that to put into the main table - this is slow and resource hoggy.
Now the idea I had (and have been trialling) is to instead, add a unique index containing all the fields that we output to the website, so the duplicates never get inserted into the database in the first place.

SO the index looks like
Code:
PRIMARY        PRIMARY 547154          fldOfferID
fldClientID_2 	UNIQUE 	547154  	fldClientID
                                                   fldDepartureDate
                                                   fldDestination
                                                   fldDepartureAPC
                                                   fldDestinationAPC
                                                   fldBoardBasis
                                                   fldDuration
                                                   fldPrice
                                                   fldOperatorsRating
                                                   fldAccomName
                                                   fldQuoteRef
                                                   fldAccomRef
fldDepartureDate  INDEX 	6754   fldDepartureDate
fldDestinationAPC INDEX 	32185  fldDestinationAPC
fldDepartureAPC  INDEX 	31  	fldDepartureAPC
fldDuration 	   INDEX 	31  		fldDuration
fldBoardBasis 	  INDEX 	14  	fldBoardBasis
fldClientID 	    INDEX 	10  		fldClientID
Can anyone advise me how this might effect performance - particularly of selects and joins etc.
__________________
http://www.hazelryan.co.uk
NancyJ is offline   Reply With Quote
Reply

Bookmarks

Jump To Top of Thread


Thread Tools
Rate This Thread
Rate This Thread:

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off

Forum Jump


All times are GMT +1. The time now is 12:44 PM.


Advertisement
Log in to turn off these ads.