Working Around ORDER BY RANDOM()

Here’s what I’m doing to work around ORDER BY RANDOM() in PostgreSQL, which is generally slow:

  1. Get the maximum value for the primary key, eg, “SELECT MAX(id) FROM tbl”
  2. Get a random number within that range. In PHP, you would do “$rand_id = rand(1, $max_id);”
  3. Use the random number to fetch data, eg, “SELECT * FROM tbl WHERE id >= $max_id LIMIT 1″

I use “>=” as opposed to “=” to ensure that, if there’s any holes in the ID’s, it wont return an empty dataset.

Of course, this is only useful when obtaining a single record, as the starting position is the only thing that’s random. If you need any more records, you’d have to get a new random number and perform another query.

A quick and dirty benchmark shows a pretty good performance increase. On a 600 row table with PostgreSQL 8.2, ORDER BY RANDOM takes ~2.1ms, where the other method takes between 0.200ms and 0.300ms.

Optionally, you could replace “SELECT MAX(id) FROM tbl” with “SELECT last_value FROM tbl_seq” for a performance increase, assuming you know the name of the sequence for the table you’re working on.

One Response to “Working Around ORDER BY RANDOM()”

  1. Ian Eure Says:

    You can also use the pattern:

    SELECT * FROM foo LIMIT 1 OFFSET (RAND([1…ROW_COUNT]));

Leave a Reply