View Full Version : Search engine with random
04-17-2011, 07:53 AM
I don't know if this is the right place for this topic.
I don't need code, I can write it myself.
I only need the algorithm.
I want to create a search engine that returns results in random order, but there are some problems. A search may return 50 000 results or more, and there is also a "load more" function. So I need to remember which results have already been displayed, because otherwise the script may display the same results again when the user clicks the "load more" button.
"SELECT * FROM table WHERE id NOT IN ($displayed)"
But when a user loads 10 000 results, the array element count will be more than 200 000.
Is there any better way?
Thanks a lot.
04-17-2011, 08:51 PM
You really expect some human being to load and look at 10,000 results?????
In general, I try to never show a user more than a screen full of data. So no more than 10 to 20 records at a time, though I suppose you might get to 100 with some kinds of displays.
Yet, 10000 record ids at, say, 6 characters each, is a lot, but not overwhelming. I don't think that's a really bad solution, at all.
You will have to use <form method=post> of course, as that's way too much data for a query string, but I can't see that this is bad.
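A minimal sketch of that exclude-what-was-shown approach, in Python with the standard sqlite3 module standing in for whatever database is actually used. The `items` table and the helper names `payload_size` and `load_more` are assumptions made for illustration, not anything from the original post.

```python
import sqlite3

def payload_size(ids):
    """Characters needed to send the ids as one comma-separated POST field."""
    return len(",".join(str(i) for i in ids))

def load_more(conn, displayed, batch_size=20):
    """Fetch a random batch of ids that are not in `displayed`.

    `items` is a hypothetical table name. Databases cap the number of
    bound parameters per statement, so a very long exclusion list may
    need a different mechanism (for example a temporary table).
    """
    if displayed:
        placeholders = ",".join("?" * len(displayed))
        where = f"WHERE id NOT IN ({placeholders})"
    else:
        where = ""
    sql = f"SELECT id FROM items {where} ORDER BY RANDOM() LIMIT ?"
    rows = conn.execute(sql, [*displayed, batch_size]).fetchall()
    return [row[0] for row in rows]
```

With 10,000 six-digit ids, `payload_size` comes to 69,999 characters (60,000 digits plus 9,999 commas), which is why this has to go in a POST body rather than a query string.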
04-18-2011, 06:57 AM
Sounds like your search results already have an ID associated with them, and depending on how random you need it, or whether it needs to be random between users and sessions, you can approach it slightly differently.
The way I would do it is store the search ID, order ID and row ID in a table, then paginate that. Then the user's session can match a search ID, the order ID would be what it sorts by and the row ID will point to the result you want to display.
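Roughly, that cache table could look like the sketch below. It uses Python's sqlite3 module only so the example is self-contained; the table name `search_cache`, the column names, and the helper functions are assumptions for illustration.

```python
import random
import sqlite3

def create_cache(conn):
    """Create the hypothetical cache table: one row per cached result."""
    conn.execute(
        "CREATE TABLE search_cache ("
        " search_id INTEGER, order_id INTEGER, row_id INTEGER,"
        " PRIMARY KEY (search_id, order_id))"
    )

def cache_search(conn, search_id, row_ids, seed=None):
    """Shuffle the matching row ids once and store their positions."""
    shuffled = list(row_ids)
    random.Random(seed).shuffle(shuffled)
    conn.executemany(
        "INSERT INTO search_cache (search_id, order_id, row_id)"
        " VALUES (?, ?, ?)",
        [(search_id, pos, rid) for pos, rid in enumerate(shuffled)],
    )

def load_page(conn, search_id, offset, page_size=20):
    """Return the next page of row ids, starting at position `offset`."""
    rows = conn.execute(
        "SELECT row_id FROM search_cache"
        " WHERE search_id = ? AND order_id >= ?"
        " ORDER BY order_id LIMIT ?",
        (search_id, offset, page_size),
    ).fetchall()
    return [row[0] for row in rows]
```

Because the random order is fixed when the search is cached, "load more" only needs to remember one number (the offset) instead of every id already shown.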
Depending on how random it can be, you can reuse the search ID by creating a new table when the time comes and giving new users the results stored in the new table. When it comes time to rotate again (the old results expire), you rotate the tables and drop the oldest one. The reason for this is the cost of a DELETE operation on a large number of rows in a single table. Creating and dropping tables, or truncating them, is faster than selectively deleting rows. You could use InnoDB for the table and not care, but InnoDB will probably be slower for this operation than MyISAM (assuming you're using MySQL).
04-18-2011, 06:36 PM
Wojjie: And what happens if you have 3,875 users? You would create a temporary table, in random order, for each one of them???? Not practical.
04-18-2011, 07:03 PM
Each table holds results that are reused for all users, not one table per user. The second table is for new results, for newer users or users whose results have expired.
If you want random results for EACH user, you can use a single table, or you can use the same two-table setup so that you don't have to delete expired results. Instead you drop the second, older table, rename the remaining table to be the second table, and create a new table.
At no point do you have more than 2 tables at a time.
table a: new results
table b: old results, on the brink of being deleted when table a rotates to table b
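As a rough sketch of the rotation, here it is in Python with sqlite3 standing in for MySQL. The table names `results_a` and `results_b`, the column layout, and the function names are assumptions for illustration; on MySQL the same three statements would be DROP TABLE, RENAME TABLE and CREATE TABLE.

```python
import sqlite3

# Hypothetical column layout shared by both tables.
SCHEMA = (
    "(search_id INTEGER, order_id INTEGER, row_id INTEGER,"
    " PRIMARY KEY (search_id, order_id))"
)

def init_tables(conn):
    """Create table a (new results) and table b (old results)."""
    with conn:
        conn.execute(f"CREATE TABLE IF NOT EXISTS results_a {SCHEMA}")
        conn.execute(f"CREATE TABLE IF NOT EXISTS results_b {SCHEMA}")

def rotate(conn):
    """Expire the oldest results without a row-by-row DELETE."""
    with conn:
        # Drop the old results in one cheap operation.
        conn.execute("DROP TABLE IF EXISTS results_b")
        # The current table becomes the old one.
        conn.execute("ALTER TABLE results_a RENAME TO results_b")
        # Start a fresh, empty table for new results.
        conn.execute(f"CREATE TABLE results_a {SCHEMA}")
```

Lookups would check table a first and fall back to table b, so a user who is partway through an older result set can keep paging until the next rotation.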
04-18-2011, 07:20 PM
Okay, if his specifications would allow it, that's a reasonable scheme.
But I can easily envision specs that wouldn't allow that.
04-18-2011, 07:26 PM
It can easily be modified to follow most specs, except where there is no ID relating to the row that needs to be displayed.
The main problem with search results is speed, and you need to cache it to a table, and this is the least costly way to DELETE old rows.