[sql] Generating random strings with T-SQL

When generating random data, specially for test, it is very useful to make the data random, but reproducible. The secret is to use explicit seeds for the random function, so that when the test is run again with the same seed, it produces again exactly the same strings. Here is a simplified example of a function that generates object names in a reproducible manner:

alter procedure usp_generateIdentifier
    @minLen int = 1
    , @maxLen int = 256
    , @seed int output
    , @string varchar(8000) output
as
begin
    set nocount on;
    declare @length int;
    declare @alpha varchar(8000)
        , @digit varchar(8000)
        , @specials varchar(8000)
        , @first varchar(8000)
    declare @step bigint = rand(@seed) * 2147483647;

    select @alpha = 'qwertyuiopasdfghjklzxcvbnm'
        , @digit = '1234567890'
        , @specials = '_@# '
    select @first = @alpha + '_@';

    set  @seed = (rand((@seed+@step)%2147483647)*2147483647);

    select @length = @minLen + rand(@seed) * (@maxLen-@minLen)
        , @seed = (rand((@seed+@step)%2147483647)*2147483647);

    declare @dice int;
    select @dice = rand(@seed) * len(@first),
        @seed = (rand((@seed+@step)%2147483647)*2147483647);
    select @string = substring(@first, @dice, 1);

    while 0 < @length 
    begin
        select @dice = rand(@seed) * 100
            , @seed = (rand((@seed+@step)%2147483647)*2147483647);
        if (@dice < 10) -- 10% special chars
        begin
            select @dice = rand(@seed) * len(@specials)+1
                , @seed = (rand((@seed+@step)%2147483647)*2147483647);
            select @string = @string + substring(@specials, @dice, 1);
        end
        else if (@dice < 10+10) -- 10% digits
        begin
            select @dice = rand(@seed) * len(@digit)+1
                , @seed = (rand((@seed+@step)%2147483647)*2147483647);
            select @string = @string + substring(@digit, @dice, 1);
        end
        else -- rest 80% alpha
        begin
            declare @preseed int = @seed;
            select @dice = rand(@seed) * len(@alpha)+1
                , @seed = (rand((@seed+@step)%2147483647)*2147483647);

            select @string = @string + substring(@alpha, @dice, 1);
        end

        select @length = @length - 1;   
    end
end
go

When running the tests the caller generates a random seed it associates with the test run (saves it in the results table), then passed along the seed, similar to this:

declare @seed int;
declare @string varchar(256);

select @seed = 1234; -- saved start seed

exec usp_generateIdentifier 
    @seed = @seed output
    , @string = @string output;
print @string;  
exec usp_generateIdentifier 
    @seed = @seed output
    , @string = @string output;
print @string;  
exec usp_generateIdentifier 
    @seed = @seed output
    , @string = @string output;
print @string;  

Update 2016-02-17: See the comments bellow, the original procedure had an issue in the way it advanced the random seed. I updated the code, and also fixed the mentioned off-by-one issue.

Examples related to sql

Passing multiple values for same variable in stored procedure SQL permissions for roles Generic XSLT Search and Replace template Access And/Or exclusions Pyspark: Filter dataframe based on multiple conditions Subtracting 1 day from a timestamp date PYODBC--Data source name not found and no default driver specified select rows in sql with latest date for each ID repeated multiple times ALTER TABLE DROP COLUMN failed because one or more objects access this column Create Local SQL Server database

Examples related to tsql

Passing multiple values for same variable in stored procedure Count the Number of Tables in a SQL Server Database Change Date Format(DD/MM/YYYY) in SQL SELECT Statement Stored procedure with default parameters Format number as percent in MS SQL Server EXEC sp_executesql with multiple parameters SQL Server after update trigger How to compare datetime with only date in SQL Server Text was truncated or one or more characters had no match in the target code page including the primary key in an unpivot Printing integer variable and string on same line in SQL

Examples related to random

How can I get a random number in Kotlin? scikit-learn random state in splitting dataset Random number between 0 and 1 in python In python, what is the difference between random.uniform() and random.random()? Generate random colors (RGB) Random state (Pseudo-random number) in Scikit learn How does one generate a random number in Apple's Swift language? How to generate a random string of a fixed length in Go? Generate 'n' unique random numbers within a range What does random.sample() method in python do?