PDA

View Full Version : Eliminate rows in multiple table query


mertdaglaroglu
05-13-2009, 11:24 PM
I'm trying to make a query to get unique results and also eliminate some of them.

So I'm trying to make a system like last.fm structure.
members can add any content to their favorites (for example a song). and all songs can be linked by other songs. (A song linked by B song. like last.fm) Then if member add "A song" to his favorites, system suggest him "B song". but because off cross linking in database, results contain A and B song together. so system need to eliminate A song because of it's already in user's favorite.

My linked contents and favorites table same as following :


CREATE TABLE `content_link` (
`id` int(10) unsigned NOT NULL auto_increment,
`content_id` int(10) unsigned NOT NULL default '0',
`linked_id` int(10) unsigned NOT NULL default '0',
)



CREATE TABLE `content_favorites` (
`favorite_id` int(11) NOT NULL auto_increment,
`content_id` int(11) NOT NULL,
`members` int(11) NOT NULL,
)



I'm trying to use the line below to get results (in fact I tried many others but I couldnt achieve):

SELECT DISTINCT olink.content_id FROM content_link AS olink, content_favorites AS ofav WHERE ofav.content_id=olink.linked_id AND olove.members_id ='2'

Only that I need eliminate results which are same with "content_favorites"'s content_id. But I couldnt achieve this.(I tried inner, outer, right, left join combinations and my mind is outer joined now!)

I hope anyone has an idea on that


EDIT:

my tables content example like following :


CONTENT_LINK (I'm using cross linking for every linked items)
content_id :1 - linked_id : 3
content_id :3 - linked_id : 1
content_id :1 - linked_id : 5
content_id :5 - linked_id : 1

CONTENT_favorites
content_id :1 - members_id : 2
content_id :3 - members_id : 2

Old Pedant
05-14-2009, 12:50 AM
Try this:

SELECT DISTINCT olink.content_id
FROM content_link AS olink, content_favorites AS ofav
WHERE ofav.content_id = olink.linked_id
AND ofav.members_id = 2
AND olink.contents_id NOT IN (
SELECT content_id FROM content_favorites WHERE members_id = 2
)


You should *NOT* use '2' to match members_id if members_id is an INT field. You just cause the system to waste time converting the string to a number.

mertdaglaroglu
05-14-2009, 01:02 AM
oh my god! its working perfect!

Really I appreaciate to this. I've spent my whole day for this.

By the way thank you for your suggestion about INT values. I havent noticed before.

mertdaglaroglu
05-14-2009, 12:10 PM
Hello again. This case is solved but I need a little advise.

I'm setting up like a system as I told previous message. users adds to their favorites contents(songs) and contents linked with some of other contents.

So, There is a question like a splinter in my mind and I'm wondering how I can resolve that.

I've set up content_link and content_favorites tables. my next target is finding "user similarity".

I mean :
A user has 1,3,6,9 content_id in his content_favorites,
B user has 6,3,16,73 content_id in his content_favorites,
C user has 1,42,584,786 content_id in his content_favorites,

how can I find A and B, A and C user's similarity ratio?

do you suggest create a new table like :

CREATE TABLE `content_link` (
`members_id` int(10) unsigned NOT NULL,
`content_id` text(10) unsigned NOT NULL default '0',
)

in this way, "content_id" column holds all content_ids of user's :
members_id : 2
content_id : 1,3,6,9 (or it can be hold via serialize method)

afterwards, I should use MATCH AGAINST query for finding similarity


OR

content_favorites table is enough for this query?


I've made a research about finding similarity and match against query can do it as far as I know. But I haven't tried before a mysql system before and I need ideas of experienced minds :)

Old Pedant
05-14-2009, 08:10 PM
NEVER put a delimited list of values into a database field and expect to do reasonable queries against it!

You can do this all via content_favorites.

I assume you want to know the "similarity number" for one given member, the member who is looking at the page right now?

So just do:

SELECT CF1.members_id, COUNT(*) AS matchCount
FROM content_favorites AS CF1, content_favorites AS CF2
WHERE CF1.content_id = CF2.content_id
AND CF1.members_id <> CF2.members_id
AND CF2.members_id = 9999
GROUP BY CF1.members_id
ORDER BY matchCount DESC
LIMIT 10

You follow that? Replace 9999 with the actual members_id you want the answer for.

If you look at "CF2" alone, you can see that it will get all the records for user 9999, so it will get all the favorites for member 9999.

Then we join back to "CF1" (which is the same table, but by using a different alias it is treated as a *separate* table), looking to find all records with matching content_id (matching songs) but *not* from that same member.

Finally, we count all those matches and group them by the members_id.

The ORDER BY is arbitrary...you choose what you want.

The LIMIT is also arbitrary: This will give you the top 10 matches.

Make sense?