Quote:
Originally Posted by Philip M
I have the idea in my mind that this sort function is biased. I may be wrong. Felgall may comment.
|
Yes - I ran tens of millions of tests of that sort code and found a definite bias toward values staying in the same place. You need to use a value of about 0.7 something to get closer to an unbiased result - 0.7 was still biased toward staying in place and 0.8 was biased the other way and I decided to not bother trying to run the trillions of tests that would be necessary to properly narrow it down further. Presumably there would be a way to calculate the value that would produce an unbiased result with that formula but it is easier to just replace it with a shuffle instead.
With the code I posted above that initially displayed a bias to the first two numbers (each of which appeared several times before a 5 appeared) I just kept refreshing the page and before long was getting a more even result - you would expect that you might sometimes get a bias like that with a properly random function and only when you check a large number of times would you expect the appearances of each to start to even out. Since there is nothing in that code that I know of that could introduce a bias to the results I can't see any point in setting up an automated test to record the outcomes of millions of tests the way I did to check whether the sort random function. I hadn't initially realised that the random sort was biased until someone suggested to me that a random compare of pairs of entries did not necessarily lead to random results being returned from the sort - it was then that I set up an automated test that sorted 10 numbers a million times and recorded how many times each number ended up in each position - which I then ran about a dozen times just to check that the bias that one test showed wasn't a fluke. I then started increasing the value from 0.5 to 0.6 and then 0.7 and 0.8 to see that the bias gradually reduced and then reversed. By that point I'd manually run the test about 50 times with each test running a million sorts.