I know there are quite a few of you who are very knowledgeable in things beyond just making it work, so I hope you can advise me on this.

We update our database from external data sources, but they often contain duplicates, and the files we work with are fairly large - 100k-800k records per file. To remove the duplicates, what I have been doing is a SELECT DISTINCT into another table, then taking everything out of that and putting it into the main table - this is slow and resource-hungry.
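Roughly what I mean is something like this - the table names (tblImport, tblDistinct, tblOffers) and the column list are just placeholders, not our real schema:
Code:
-- Current approach: SELECT DISTINCT the raw import into a holding table,
-- then move everything from there into the main table.
CREATE TABLE tblDistinct AS
SELECT DISTINCT fldClientID, fldDepartureDate, fldDestination, fldPrice
FROM tblImport;

INSERT INTO tblOffers (fldClientID, fldDepartureDate, fldDestination, fldPrice)
SELECT fldClientID, fldDepartureDate, fldDestination, fldPrice
FROM tblDistinct;

DROP TABLE tblDistinct;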
Now the idea I had (and have been trialling) is to instead add a unique index containing all the fields that we output to the website, so the duplicates never get inserted into the database in the first place.
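For example, something along these lines - assuming MySQL, with placeholder table/column names; INSERT IGNORE is one way of skipping rows that would violate the unique index instead of failing the whole insert:
Code:
-- With the unique index in place, duplicate rows from the import are
-- silently skipped (they raise a warning rather than an error).
INSERT IGNORE INTO tblOffers (fldClientID, fldDepartureDate, fldDestination, fldPrice)
SELECT fldClientID, fldDepartureDate, fldDestination, fldPrice
FROM tblImport;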

So the indexes look like this:
Code:
Keyname            Type     Cardinality  Field
PRIMARY            PRIMARY  547154       fldOfferID
fldClientID_2      UNIQUE   547154       fldClientID
                                         fldDepartureDate
                                         fldDestination
                                         fldDepartureAPC
                                         fldDestinationAPC
                                         fldBoardBasis
                                         fldDuration
                                         fldPrice
                                         fldOperatorsRating
                                         fldAccomName
                                         fldQuoteRef
                                         fldAccomRef
fldDepartureDate   INDEX    6754         fldDepartureDate
fldDestinationAPC  INDEX    32185        fldDestinationAPC
fldDepartureAPC    INDEX    31           fldDepartureAPC
fldDuration        INDEX    31           fldDuration
fldBoardBasis      INDEX    14           fldBoardBasis
fldClientID        INDEX    10           fldClientID
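In DDL terms, that unique index is roughly the following (tblOffers is a placeholder for the real table name):
Code:
ALTER TABLE tblOffers
  ADD UNIQUE INDEX fldClientID_2 (
    fldClientID, fldDepartureDate, fldDestination, fldDepartureAPC,
    fldDestinationAPC, fldBoardBasis, fldDuration, fldPrice,
    fldOperatorsRating, fldAccomName, fldQuoteRef, fldAccomRef
  );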
Can anyone advise me how this might affect performance - particularly of selects, joins, etc.?