[sql] How to randomly select rows in SQL?

I am using MSSQL Server 2005. In my db, I have a table "customerNames" which has two columns "Id" and "Name" and approx. 1,000 results.

I am creating a functionality where I have to pick 5 customers randomly every time. Can anyone tell me how to create a query which will get random 5 rows (Id, and Name) every time when query is executed?

This question is related to sql database random

The answer is


SELECT TOP 5 Id, Name FROM customerNames ORDER BY NEWID()

If you have a table with millions of rows and care about the performance, this could be a better answer:

SELECT * FROM Table1
WHERE (ABS(CAST(
  (BINARY_CHECKSUM
  (keycol1, NEWID())) as int))
  % 100) < 10

https://msdn.microsoft.com/en-us/library/cc441928.aspx


If you are using large table and want to access of 10 percent of the data then run this following command: SELECT TOP 10 PERCENT * FROM Table1 ORDER BY NEWID();


Maybe this site will be of assistance.

For those who don't want to click through:

SELECT TOP 1 column FROM table
ORDER BY NEWID()

This is an old question, but attempting to apply a new field (either NEWID() or ORDER BY rand()) to a table with a large number of rows would be prohibitively expensive. If you have incremental, unique IDs (and do not have any holes) it will be more efficient to calculate the X # of IDs to be selected instead of applying a GUID or similar to every single row and then taking the top X # of.

DECLARE @minValue int;
DECLARE @maxValue int;
SELECT @minValue = min(id), @maxValue = max(id) from [TABLE];

DECLARE @randomId1 int, @randomId2 int, @randomId3 int, @randomId4 int, @randomId5 int
SET @randomId1 = ((@maxValue + 1) - @minValue) * Rand() + @minValue
SET @randomId2 = ((@maxValue + 1) - @minValue) * Rand() + @minValue
SET @randomId3 = ((@maxValue + 1) - @minValue) * Rand() + @minValue
SET @randomId4 = ((@maxValue + 1) - @minValue) * Rand() + @minValue
SET @randomId5 = ((@maxValue + 1) - @minValue) * Rand() + @minValue

--select @maxValue as MaxValue, @minValue as MinValue
--  , @randomId1 as SelectedId1
--  , @randomId2 as SelectedId2
--  , @randomId3 as SelectedId3
--  , @randomId4 as SelectedId4
--  , @randomId5 as SelectedId5

select * from [TABLE] el
where el.id in (@randomId1, @randomId2, @randomId3, @randomId4, @randomId5)

If you wanted to select many more rows I would look into populating a #tempTable with an ID and a bunch of rand() values then using each rand() value to scale to the min-max values. That way you do not have to define all of the @randomId1...n parameters. I've included an example below using a CTE to populate the initial table.

DECLARE @NumItems int = 100;

DECLARE @minValue int;
DECLARE @maxValue int;
SELECT @minValue = min(id), @maxValue = max(id) from [TABLE];
DECLARE @range int = @maxValue+1 - @minValue;

with cte (n) as (
   select 1 union all
   select n+1 from cte
   where n < @NumItems
)
select cast( @range * rand(cast(newid() as varbinary(100))) + @minValue as int) tp
into #Nt
from cte;

select * from #Nt ntt
inner join [TABLE] i on i.id = ntt.tp;

drop table #Nt;

There is a nice Microsoft SQL Server 2005 specific solution here. Deals with the problem where you are working with a large result set (not the question I know).

Selecting Rows Randomly from a Large Table http://msdn.microsoft.com/en-us/library/cc441928.aspx


In case someone wants a PostgreSQL solution:

select id, name
from customer
order by random()
limit 5;

SELECT * FROM TABLENAME ORDER BY random() LIMIT 5; 

In order to shuffle the SQL result set, you need to use a database-specific function call.

Note that sorting a large result set using a RANDOM function might turn out to be very slow, so make sure you do that on small result sets.

If you have to shuffle a large result set and limit it afterward, then it's better to use something like the Oracle SAMPLE(N) or the TABLESAMPLE in SQL Server or PostgreSQL instead of a random function in the ORDER BY clause.

So, assuming we have the following database table:

Song database table

And the following rows in the song table:

| id | artist                          | title                              |
|----|---------------------------------|------------------------------------|
| 1  | Miyagi & ???????? ft. ??? ????? | I Got Love                         |
| 2  | HAIM                            | Don't Save Me (Cyril Hahn Remix)   |
| 3  | 2Pac ft. DMX                    | Rise Of A Champion (GalilHD Remix) |
| 4  | Ed Sheeran & Passenger          | No Diggity (Kygo Remix)            |
| 5  | JP Cooper ft. Mali-Koa          | All This Love                      |

Oracle

On Oracle, you need to use the DBMS_RANDOM.VALUE function, as illustrated by the following example:

SELECT
    artist||' - '||title AS song
FROM song
ORDER BY DBMS_RANDOM.VALUE

When running the aforementioned SQL query on Oracle, we are going to get the following result set:

| song                                              |
|---------------------------------------------------|
| JP Cooper ft. Mali-Koa - All This Love            |
| 2Pac ft. DMX - Rise Of A Champion (GalilHD Remix) |
| HAIM - Don't Save Me (Cyril Hahn Remix)           |
| Ed Sheeran & Passenger - No Diggity (Kygo Remix)  |
| Miyagi & ???????? ft. ??? ????? - I Got Love      |

Notice that the songs are being listed in random order, thanks to the DBMS_RANDOM.VALUE function call used by the ORDER BY clause.

SQL Server

On SQL Server, you need to use the NEWID function, as illustrated by the following example:

SELECT
    CONCAT(CONCAT(artist, ' - '), title) AS song
FROM song
ORDER BY NEWID()

When running the aforementioned SQL query on SQL Server, we are going to get the following result set:

| song                                              |
|---------------------------------------------------|
| Miyagi & ???????? ft. ??? ????? - I Got Love      |
| JP Cooper ft. Mali-Koa - All This Love            |
| HAIM - Don't Save Me (Cyril Hahn Remix)           |
| Ed Sheeran & Passenger - No Diggity (Kygo Remix)  |
| 2Pac ft. DMX - Rise Of A Champion (GalilHD Remix) |

Notice that the songs are being listed in random order, thanks to the NEWID function call used by the ORDER BY clause.

PostgreSQL

On PostgreSQL, you need to use the random function, as illustrated by the following example:

SELECT
    artist||' - '||title AS song
FROM song
ORDER BY random()

When running the aforementioned SQL query on PostgreSQL, we are going to get the following result set:

| song                                              |
|---------------------------------------------------|
| 2Pac ft. DMX - Rise Of A Champion (GalilHD Remix) |
| JP Cooper ft. Mali-Koa - All This Love            |
| Ed Sheeran & Passenger - No Diggity (Kygo Remix)  |
| HAIM - Don't Save Me (Cyril Hahn Remix)           |
| Miyagi & ???????? ft. ??? ????? - I Got Love      |

Notice that the songs are being listed in random order, thanks to the random function call used by the ORDER BY clause.

MySQL

On MySQL, you need to use the RAND function, as illustrated by the following example:

SELECT
  CONCAT(CONCAT(artist, ' - '), title) AS song
FROM song
ORDER BY RAND()

When running the aforementioned SQL query on MySQL, we are going to get the following result set:

| song                                              |
|---------------------------------------------------|
| HAIM - Don't Save Me (Cyril Hahn Remix)           |
| Ed Sheeran & Passenger - No Diggity (Kygo Remix)  |
| Miyagi & ???????? ft. ??? ????? - I Got Love      |
| 2Pac ft. DMX - Rise Of A Champion (GalilHD Remix) |
| JP Cooper ft. Mali-Koa - All This Love            |

Notice that the songs are being listed in random order, thanks to the RAND function call used by the ORDER BY clause.


I have found this to work best for big data.

SELECT TOP 1 Column_Name FROM dbo.Table TABLESAMPLE(1 PERCENT);

TABLESAMPLE(n ROWS) or TABLESAMPLE(n PERCENT) is random but need to add the TOP n to get the correct sample size.

Using NEWID() is very slow on large tables.


Examples related to sql

Passing multiple values for same variable in stored procedure SQL permissions for roles Generic XSLT Search and Replace template Access And/Or exclusions Pyspark: Filter dataframe based on multiple conditions Subtracting 1 day from a timestamp date PYODBC--Data source name not found and no default driver specified select rows in sql with latest date for each ID repeated multiple times ALTER TABLE DROP COLUMN failed because one or more objects access this column Create Local SQL Server database

Examples related to database

Implement specialization in ER diagram phpMyAdmin - Error > Incorrect format parameter? Authentication plugin 'caching_sha2_password' cannot be loaded Room - Schema export directory is not provided to the annotation processor so we cannot export the schema SQL Query Where Date = Today Minus 7 Days MySQL Error: : 'Access denied for user 'root'@'localhost' SQL Server date format yyyymmdd How to create a foreign key in phpmyadmin WooCommerce: Finding the products in database TypeError: tuple indices must be integers, not str

Examples related to random

How can I get a random number in Kotlin? scikit-learn random state in splitting dataset Random number between 0 and 1 in python In python, what is the difference between random.uniform() and random.random()? Generate random colors (RGB) Random state (Pseudo-random number) in Scikit learn How does one generate a random number in Apple's Swift language? How to generate a random string of a fixed length in Go? Generate 'n' unique random numbers within a range What does random.sample() method in python do?