Friday, November 7, 2008

Picking Random Rows from A Table

Being that I work for a number of restaurant chains, I get some interesting requests/requirements for extracting data that has been collected through various marketing campaigns or contests. A common request in the past has been to randomly pick a winner or randomly pick a list of winners from a particular set of data. This seems like it should be pretty easy to do, but I was never able to figure out how to pick a truly random set of records without looping. Until today, and the solution is really simple:


USE [tempdb]
GO
CREATE TABLE randomresults
(id INT IDENTITY PRIMARY KEY)
GO
INSERT INTO randomresults
DEFAULT VALUES
GO 100000 -- execute batch 100000 times
-- Return rows in a random order
SELECT TOP 10 id
FROM randomresults
ORDER BY checksum(NEWID())
GO


It doesn't matter how many times you run this, it will always return a completely random resultset.

3 comments:

  1. I wish i had something intelligent to say... but... that's pretty cool.

    ReplyDelete
  2. Note that this technique isn't statistically guaranteed to be "completely random".

    Please read this article for details
    http://msdn.microsoft.com/en-us/library/aa175776.aspx

    ReplyDelete
  3. Arnoud,

    That article does not cover this attempt and if you actually run the query code, you can not create a prediction as to what rows are being returned for any given execution. The CHECKSUM() hash of the NEWID() is not covered in your reference so I don't think its correct compare the two and come to the conclusion that you have.

    ReplyDelete