Select unique or distinct values from a list in UNIX shell script

Question

I have a ksh script that returns a long list of values, newline separated, and I want to see only the unique (distinct) values. Is it possible to do this?

For example, say my output is file suffixes in a directory:

    tar
    gz
    java
    gz
    java
    tar
    class
    class

I want to see a list like:

    tar
    gz
    java
    class

User · Answer

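Unique, as requested, but not sorted. It uses fewer system resources for fewer than about 70 elements (as tested with time). It is written to take input from stdin, or it can be modified and included in another script. (Bash)

    bag2set () {
        # Reduce a_bag to a_set.
        local -i i j n=${#a_bag[@]}
        for ((i=0; i < n; i++)); do
            if [[ -n ${a_bag[i]} ]]; then
                a_set[i]=${a_bag[i]}
                # $'\0' is an empty string in Bash; it marks the slot as consumed.
                a_bag[i]=$'\0'
                for ((j=i+1; j < n; j++)); do
                    # Quote the right-hand side so it is compared literally, not as a pattern.
                    [[ ${a_set[i]} == "${a_bag[j]}" ]] && a_bag[j]=$'\0'
                done
            fi
        done
    }
    declare -a a_bag=() a_set=()
    stdin="$(</dev/stdin)"
    declare -i i=0
    # Unquoted on purpose: word splitting turns the input into elements.
    for e in $stdin; do
        a_bag[i]=$e
        i=$i+1
    done
    bag2set
    echo "${a_set[@]}"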

User · Answer

Pipe them through sort and uniq. This removes all duplicates.

uniq -d gives only the duplicates; uniq -u gives only the unique ones (strips duplicates).

User · Answer

With AWK you can do the following (I find it faster than sort):

    ./yourscript.ksh | awk '!a[$0]++'
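As a rough illustration of how the AWK one-liner above works, here is a minimal sketch. The function name `dedupe_keep_order` is hypothetical and the sample data is taken from the question; the only real ingredient is the `!a[$0]++` pattern.

```shell
#!/bin/sh
# Hypothetical helper: print each input line only the first time it appears.
dedupe_keep_order() {
    # a[$0]++ evaluates to the count seen so far and then increments it.
    # The count is 0 (false) on first sight, so !a[$0]++ is true only then,
    # and a true pattern with no action prints the line.
    awk '!a[$0]++'
}

# Sample input from the question; prints tar, gz, java, class (first-seen order).
printf '%s\n' tar gz java gz java tar class class | dedupe_keep_order
```

Unlike the sort-based pipelines, this keeps the lines in the order they first appeared, at the cost of holding every distinct line in memory.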

User · Answer

You might want to look at the uniq and sort applications.

    ./yourscript.ksh | sort | uniq

(FYI, yes, the sort is necessary in this command line; uniq only strips duplicate lines that are immediately after each other.)

EDIT:

Contrary to what has been posted by Aaron Digulla in relation to uniq's command-line options:

Given the following input:

    class
    jar
    jar
    jar
    bin
    bin
    java

uniq will output all lines exactly once:

    class
    jar
    bin
    java

uniq -d will output all lines that appear more than once, and it will print them once:

    jar
    bin

uniq -u will output all lines that appear exactly once, and it will print them once:

    class
    java
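To see why the sort matters, the following small sketch runs uniq on the unsorted sample list from the question, with and without sorting first. The `sample` function is a hypothetical stand-in for `./yourscript.ksh`.

```shell
#!/bin/sh
# Hypothetical stand-in for the script's output (data from the question).
sample() {
    printf '%s\n' tar gz java gz java tar class class
}

# Without sort, only adjacent repeats collapse, so just the trailing
# "class class" pair is merged and 7 lines remain:
# tar gz java gz java tar class
sample | uniq

echo "---"

# With sort, equal lines become adjacent, so uniq sees every duplicate:
# class gz java tar
sample | sort | uniq
```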

User · Answer

I found a better tip for getting non-duplicate entries in a file:

    awk '$0 != x ":FOO" && NR>1 {print x} {x=$0} END {print}' file_name | uniq -f1 -u

User · Answer

With zsh you can do this:

    % cat infile
    tar
    more than one word
    gz
    java
    gz
    java
    tar
    class
    class

    zsh-5.0.0[t]% print -l "${(fu)$(<infile)}"
    tar
    more than one word
    gz
    java
    class

Or you can use AWK:

    % awk '!_[$0]++' infile
    tar
    more than one word
    gz
    java
    class

User · Answer

    ./script.sh | sort -u

This is the same as monoxide's answer, but a bit more concise.
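A quick way to check that the two forms agree is to compare their output on the same input. This is only a sketch; `sample` is a hypothetical stand-in for `./script.sh`, using the data from the question.

```shell
#!/bin/sh
# Hypothetical stand-in for the script's output (data from the question).
sample() {
    printf '%s\n' tar gz java gz java tar class class
}

long_form=$(sample | sort | uniq)
short_form=$(sample | sort -u)

# Both pipelines should produce the same sorted, de-duplicated list.
if [ "$long_form" = "$short_form" ]; then
    echo "sort -u matches sort | uniq"
fi

printf '%s\n' "$short_form"
```

The long form is still needed when you want uniq's own options, such as -d or -u, which `sort -u` does not provide.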

User · Answer

For larger data sets where sorting may not be desirable, you can also use the following perl script:

    ./yourscript.ksh | perl -ne 'if (!defined $x{$_}) { print $_; $x{$_} = 1; }'

This basically just remembers every line output so that it doesn't output it again.

It has the advantage over the "sort | uniq" solution in that there's no sorting required up front.
