What's an easy way to read a random line from a file on the Unix command line?

Question

User · Answer

perlfaq5: "How do I select a random line from a file?" Here's a reservoir-sampling algorithm from the Camel Book:

    perl -e 'srand; rand($.) < 1 && ($line = $_) while <>; print $line;' file

This has a significant advantage in space over reading the whole file in. You can find a proof of this method in The Art of Computer Programming, Volume 2, Section 3.4.2, by Donald E. Knuth.
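
For readers who find the one-liner terse, here is a minimal Python sketch of the same single-pass idea. The function name `random_line` and the `rng` parameter are illustrative assumptions, not part of perlfaq5: each line replaces the current choice with probability 1/line_number, so only one line is held in memory at a time.

```python
import io
import random


def random_line(stream, rng=random):
    """Return one line chosen uniformly at random, reading the stream once."""
    chosen = None
    for line_number, line in enumerate(stream, start=1):
        # Keep this line with probability 1/line_number,
        # mirroring `rand($.) < 1` in the Perl one-liner.
        if rng.random() * line_number < 1:
            chosen = line
    return chosen


if __name__ == "__main__":
    sample = io.StringIO("alpha\nbeta\ngamma\n")
    print(random_line(sample), end="")
```

An empty input yields `None`, since no line is ever selected.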

User · Answer

Using a bash script:

    #!/bin/bash
    # replace with file to read
    FILE=tmp.txt
    # count number of lines
    NUM=$(wc -l < "${FILE}")
    # generate random number in range 1..NUM
    X=$(( RANDOM % NUM + 1 ))
    # extract X-th line
    sed -n "${X}p" "${FILE}"
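
One thing to keep in mind with any `RANDOM % NUM` approach: bash's `$RANDOM` yields integers from 0 to 32767, so lines beyond the 32768th can never be picked, and the choice is slightly uneven whenever the line count does not divide 32768. The following Python sketch (the function name `line_probabilities` is an illustrative assumption) computes the exact selection probabilities to make that visible.

```python
from fractions import Fraction

RANDOM_VALUES = 32768  # bash's $RANDOM yields integers 0..32767


def line_probabilities(num_lines):
    """Exact probability that `RANDOM % num_lines + 1` selects each line."""
    counts = [0] * num_lines
    for value in range(RANDOM_VALUES):
        counts[value % num_lines] += 1
    return [Fraction(c, RANDOM_VALUES) for c in counts]


if __name__ == "__main__":
    # With 3 lines, lines 1 and 2 are very slightly favoured over line 3.
    print(line_probabilities(3))
    # With 40000 lines, some lines have probability zero.
    unreachable = sum(1 for p in line_probabilities(40000) if p == 0)
    print(unreachable, "of 40000 lines can never be chosen")
```

For small files the bias is negligible; for files with tens of thousands of lines or more, a tool such as shuf avoids the problem.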

User · Answer

Single bash line:

    sed -n $((1+$RANDOM%`wc -l test.txt | cut -f 1 -d ' '`))p test.txt

Slight problem: duplicate filename.

User · Answer

Another way using awk:

    awk NR==$((${RANDOM} % `wc -l < file.name` + 1)) file.name

User · Answer

This is simple:

    cat file.txt | shuf -n 1

Granted, this is just a tad slower than "shuf -n 1 file.txt" on its own.

User · Answer

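Here's a simple Python script that will do the job:

    import random, sys

    # Read all lines into memory, then print one picked uniformly at random.
    with open(sys.argv[1]) as f:
        lines = f.readlines()
    print(lines[random.randrange(len(lines))], end="")

Usage:

    python randline.py file_to_get_random_line_from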

User · Answer

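    #!/bin/bash

    # Read the file named by $1 into an array, one line per element.
    IFS=$'\n' wordsArray=($(<"$1"))
    numWords=${#wordsArray[@]}
    sizeOfNumWords=${#numWords}

    while true
    do
        # Build a random number with as many decimal digits as numWords has.
        for ((i=0; i<sizeOfNumWords; i++))
        do
            ranNumArray[i]=$(( RANDOM % 10 ))
            ranNumStr="$ranNumStr${ranNumArray[i]}"
        done
        # Array indices run from 0 to numWords-1, so reject anything larger.
        if [ "$ranNumStr" -lt "$numWords" ]
        then
            break
        fi
        ranNumStr=""
    done

    # Force base 10 so leading zeros are not read as octal.
    noLeadZeroStr=$((10#$ranNumStr))
    echo "${wordsArray[$noLeadZeroStr]}"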
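
The script above amounts to rejection sampling: it draws one random decimal digit for each digit of the line count and retries until the resulting number is a usable array index. A minimal Python sketch of that idea follows; the function name `random_index` and the `rng` parameter are illustrative assumptions, not part of the original answer.

```python
import random


def random_index(num_lines, rng=random):
    """Pick an index in range(num_lines) by rejection sampling on decimal digits."""
    if num_lines <= 0:
        raise ValueError("need at least one line")
    width = len(str(num_lines))  # number of digits to draw per attempt
    while True:
        digits = "".join(str(rng.randrange(10)) for _ in range(width))
        candidate = int(digits)  # int() reads leading zeros as base 10
        if candidate < num_lines:
            return candidate     # accept; otherwise draw all digits again


if __name__ == "__main__":
    words = ["alpha", "beta", "gamma"]
    print(words[random_index(len(words))])
```

Because every accepted candidate is equally likely, this avoids the 0..32767 ceiling of a single `$RANDOM` draw, at the cost of occasional retries.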

User · Answer

A solution that also works on Mac OS X (and should also work on Linux, given that jot is installed):

    N=5
    awk 'NR==FNR {lineN[$1]; next}(FNR in lineN)' <(jot -r $N 1 $(wc -l < $file)) $file

Where:

- N is the number of random lines you want.
- NR==FNR {lineN[$1]; next}(FNR in lineN) file1 file2 --> saves the line numbers listed in file1, then prints the corresponding lines of file2.
- jot -r $N 1 $(wc -l < $file) --> draws N numbers randomly (-r) in the range (1, number of lines in the file) with jot. The process substitution <() makes the output look like a file to the interpreter, so it plays the role of file1 in the previous example.
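
The same two-step idea (choose line numbers first, then print the matching lines in one pass) can be sketched in Python. This is an illustration under stated assumptions, not a translation: the function name `sample_lines` is hypothetical, and it uses `random.sample` so the chosen line numbers are distinct, whereas independent random draws such as those from `jot -r` may repeat a number and so print fewer than N lines.

```python
import random


def sample_lines(path, n, rng=random):
    """Return up to n distinct lines from the file at `path`, in file order."""
    with open(path) as f:
        total = sum(1 for _ in f)  # line count, like `wc -l < $file`
    # Choose distinct 1-based line numbers (assumption: no repeats wanted).
    wanted = set(rng.sample(range(1, total + 1), min(n, total)))
    with open(path) as f:
        # Second pass: keep only the lines whose number was chosen.
        return [line for number, line in enumerate(f, start=1) if number in wanted]


if __name__ == "__main__":
    import sys
    if len(sys.argv) > 1:
        sys.stdout.writelines(sample_lines(sys.argv[1], 5))
```

Asking for more lines than the file contains simply returns every line.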

User · Answer

Here is what I discovered, since my Mac OS doesn't support all the easy answers. I used the jot command to generate a number, since the $RANDOM variable solutions did not seem very random in my tests. When testing my solution I had a wide variance in the output.

    RANDOM1=`jot -r 1 1 235886`
    # range of jot (1 235886) found from earlier wc -w /usr/share/dict/web2
    echo $RANDOM1
    head -n $RANDOM1 /usr/share/dict/web2 | tail -n 1

The echo of the variable is there to give a visual of the generated random number.

User · Answer

You can use shuf:

    shuf -n 1 $FILE

There is also a utility called rl. In Debian it's in the randomize-lines package, and it does exactly what you want, though it is not available in all distros. On its home page it actually recommends the use of shuf instead (which didn't exist when rl was created, I believe). shuf is part of the GNU coreutils, rl is not.

    rl -c 1 $FILE

User · Answer

Another alternative:

    head -$((${RANDOM} % `wc -l < file` + 1)) file | tail -1

User · Answer

Using only vanilla sed and awk, and without using $RANDOM, a simple, space-efficient and reasonably fast "one-liner" for selecting a single line pseudo-randomly from a file named FILENAME is as follows:

    sed -n $(awk 'END {srand(); r=rand()*NR; if (r<NR) {sub(/\..*/,"",r); r++;}; print r}' FILENAME)p FILENAME

(This works even if FILENAME is empty, in which case no line is emitted.)

One possible advantage of this approach is that it only calls rand() once.

As pointed out by @AdamKatz in the comments, another possibility would be to call rand() for each line:

    awk 'rand() * NR < 1 { line = $0 } END { print line }' FILENAME

(A simple proof of correctness can be given based on induction.)

Caveat about rand(): "In most awk implementations, including gawk, rand() starts generating numbers from the same starting number, or seed, each time you run awk." -- https://www.gnu.org/software/gawk/manual/html_node/Numeric-Functions.html

In practice this means the second command, which never calls srand(), may print the same line on every run unless srand() is called first, for example in a BEGIN block.
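
The induction mentioned for the per-line variant can be sketched as follows. The notation $L_n$ (the line held after $n$ lines have been read) is introduced here for illustration, and the argument assumes rand() is uniform on $[0, 1)$.

```latex
\textbf{Claim.} After reading $n \ge 1$ lines,
$\Pr[L_n = i] = \tfrac{1}{n}$ for every $1 \le i \le n$.

\textbf{Base case.} For $n = 1$ the test $\mathrm{rand}() \cdot 1 < 1$ always
succeeds, so $\Pr[L_1 = 1] = 1$.

\textbf{Inductive step.} Assume the claim holds for $n$. Line $n+1$ replaces
the held line exactly when $\mathrm{rand}() \cdot (n+1) < 1$, which happens
with probability $\tfrac{1}{n+1}$, so $\Pr[L_{n+1} = n+1] = \tfrac{1}{n+1}$.
For any earlier line $i \le n$,
\[
  \Pr[L_{n+1} = i]
  = \Pr[L_n = i]\left(1 - \frac{1}{n+1}\right)
  = \frac{1}{n} \cdot \frac{n}{n+1}
  = \frac{1}{n+1}.
\]
```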

User · Answer

    sort --random-sort $FILE | head -n 1

(I like the shuf approach above even better though. I didn't even know it existed, and I would never have found that tool on my own.)
