View Full Version : Removing duplicate records can't use SELECT DISTINCT
kgjeremytw
03-19-2003, 07:59 PM
I'm looking to get input for how approach removing duplicate records from the results of an SQL Query. Any sample code would be greatly appreciated.
Here is an example
Columns
Offerings, Technologies, Roles
-Results From DB
Offering_1, technology_1, Role_1
Offering_1, technology_1, Role_2
Offering_1, technology_1, Role_3
Offering_1, technology_1, Role_4
Offering_1, technology_1, Role_5
-Results to User
Offering_1, technology_1
Offering_1, technology_1
Offering_1, technology_1
Offering_1, technology_1
Offering_1, technology_1
Since 'offerings to technologies' and 'technologies to roles' have a one to mulitiple relationship, when a user wants to search by technology all of the offerings and roles are returned resulting in duplicate results. The DB doesn't see them as duplicates so, SELECT DISTINCT doesn't work.
I'm not sure if I need to create a temporary table or is there another way for this to be done.
Roy Sinclair
03-19-2003, 09:08 PM
If you don't show Roles to the user why are you selecting that field? Without that field DISTICT would work. If you've got a reason for needing that field but only need one you could use the MIN or MAX function on the roles field to get the lowest or highest value without the rest of the values. Note that MIN or MAX will also require a GROUP BY clause listing the fields not used in the MIN or MAX function.
Spookster
03-20-2003, 10:48 AM
SELECT BLAH FROM YAKS WHERE FOO = 'BAR' LIMIT 1
but as Roy said don't select roles if you don't need it then just use DISTINCT.
kgjeremytw
03-20-2003, 01:02 PM
Thanks for your help.
The form that creates the query has an option to search by roles also. The user can use roles as a criteria for searching the DB which can get messy when they pick more than one role.
I will give your suggestions a try, thanks again. Jeremy
Roy Sinclair
03-20-2003, 03:21 PM
Try using a different query when you search by roles.
SELECT BLAH FROM YAKS WHERE FOO = 'BAR' LIMIT 1
Whats that "LIMIT 1" ? Never seen that + it doesn't work. (Or am i stupid ?)
Roy Sinclair
03-20-2003, 05:12 PM
I think the "LIMIT 1" may not be standard, however
"select top 1 * from mytable" is standard sql.
Spookster
03-20-2003, 06:14 PM
Originally posted by Roy Sinclair
I think the "LIMIT 1" may not be standard, however
"select top 1 * from mytable" is standard sql.
Well actually I believe that the keyword "LIMIT" became part of the standard in SQL3. The "TOP" keyword however I believe is proprietary to Microsoft database systems and not standard SQL.
Roy Sinclair
03-20-2003, 06:52 PM
Originally posted by Spookster
Well actually I believe that the keyword "LIMIT" became part of the standard in SQL3. The "TOP" keyword however I believe is proprietary to Microsoft database systems and not standard SQL.
Well TOP works with MS databases and with DB2 neither TOP nor LIMIT seems to work. I don't have easy access to our Oracle database right now so I can't say what may or may not work with it.
TOP is part of JetSQL (as Spookster indicates).
LIMIT indeed doesn't work with Microsoft databases (since it's not part of JetSQL).
And this is what my access2000 herlpfile says
[quote]
Mogelijkheden in ANSI SQL die niet worden ondersteund in Microsoft Jet SQL
Microsoft Jet SQL ondersteunt niet de volgende ANSI SQL-functies:
Verwijzingen met DISTINCT in statistische functies. Microsoft Jet SQL ondersteunt bijvoorbeeld niet SUM(DISTINCT kolomnaam).
De component LIMIT TO nn ROWS waarmee het aantal rijen wordt beperkt dat door een query als resultaat wordt gegeven. Het resultaat van een query kan alleen worden beperkt met de component WHERE.
/quote]
brief translation : ANSI SQL functions that are not supported by JetSQL
...
the component LIMIT TO nn ROWS which can be used to limit the number of records ... results of a query can only be limited with the WHERE component.
So you can use something like Spookster said in ANSI SQL.
Spookster
03-20-2003, 07:23 PM
I don't believe LIMIT nor TOP works with Oracle either. At lease not as of v.8 which is what I use at work. For Oracle I generally use the rownumber to limit the records. WHERE ROWNUMBER = 1.
It's kind of hard to say there is really a standard SQL as every database system decides whether or not they wish to support certain functions and clauses. In a given day I work with Oracle, mySQL, Access and msSQL so it does get confusing as to which functions work with what.
whammy
03-22-2003, 03:15 AM
That's all very good information. However I'm curious as to when you're trying to detect dupes.
Depending upon what you're trying to do it (such as insert data into a table making sure that you don't duplicate anything) it may be more complicated, requiring the use of a temporary table to use to compare data (and perhaps mark records that were duplicated in another table). Can you explain further?
Spookster
03-22-2003, 04:14 AM
Originally posted by whammy
That's all very good information. However I'm curious as to when you're trying to detect dupes.
Depending upon what you're trying to do it (such as insert data into a table making sure that you don't duplicate anything) it may be more complicated, requiring the use of a temporary table to use to compare data (and perhaps mark records that were duplicated in another table). Can you explain further?
If you don't wish to have duplicates then you set a primary key field and make it unique. In the case of msSQL or MS Access there is an option that allows you set it to allow duplicates or not.
vBulletin® v3.8.2, Copyright ©2000-2012, Jelsoft Enterprises Ltd.