[bash] Select unique or distinct values from a list in UNIX shell script

I have a ksh script that returns a long list of values, newline separated, and I want to see only the unique/distinct values. It is possible to do this?

For example, say my output is file suffixes in a directory:

tar
gz
java
gz
java
tar
class
class

I want to see a list like:

tar
gz
java
class

This question is related to bash unique distinct ksh sh

The answer is


Pipe them through sort and uniq. This removes all duplicates.

uniq -d gives only the duplicates, uniq -u gives only the unique ones (strips duplicates).


./script.sh | sort -u

This is the same as monoxide's answer, but a bit more concise.


With zsh you can do this:

% cat infile 
tar
more than one word
gz
java
gz
java
tar
class
class
zsh-5.0.0[t]% print -l "${(fu)$(<infile)}"
tar
more than one word
gz
java
class

Or you can use AWK:

% awk '!_[$0]++' infile    
tar
more than one word
gz
java
class

Unique, as requested, (but not sorted);
uses fewer system resources for less than ~70 elements (as tested with time);
written to take input from stdin,
(or modify and include in another script):
(Bash)

bag2set () {
    # Reduce a_bag to a_set.
    local -i i j n=${#a_bag[@]}
    for ((i=0; i < n; i++)); do
        if [[ -n ${a_bag[i]} ]]; then
            a_set[i]=${a_bag[i]}
            a_bag[i]=$'\0'
            for ((j=i+1; j < n; j++)); do
                [[ ${a_set[i]} == ${a_bag[j]} ]] && a_bag[j]=$'\0'
            done
        fi
    done
}
declare -a a_bag=() a_set=()
stdin="$(</dev/stdin)"
declare -i i=0
for e in $stdin; do
    a_bag[i]=$e
    i=$i+1
done
bag2set
echo "${a_set[@]}"

With AWK you can do, I find it faster than sort

 ./yourscript.ksh | awk '!a[$0]++'

I get a better tips to get non-duplicate entries in a file

awk '$0 != x ":FOO" && NR>1 {print x} {x=$0} END {print}' file_name | uniq -f1 -u


For larger data sets where sorting may not be desirable, you can also use the following perl script:

./yourscript.ksh | perl -ne 'if (!defined $x{$_}) { print $_; $x{$_} = 1; }'

This basically just remembers every line output so that it doesn't output it again.

It has the advantage over the "sort | uniq" solution in that there's no sorting required up front.


Examples related to bash

Comparing a variable with a string python not working when redirecting from bash script Zipping a file in bash fails How do I prevent Conda from activating the base environment by default? Get first line of a shell command's output Fixing a systemd service 203/EXEC failure (no such file or directory) /bin/sh: apt-get: not found VSCode Change Default Terminal Run bash command on jenkins pipeline How to check if the docker engine and a docker container are running? How to switch Python versions in Terminal?

Examples related to unique

Count unique values with pandas per groups Find the unique values in a column and then sort them How can I check if the array of objects have duplicate property values? Firebase: how to generate a unique numeric ID for key? pandas unique values multiple columns Select unique values with 'select' function in 'dplyr' library Generate 'n' unique random numbers within a range SQL - select distinct only on one column Can I use VARCHAR as the PRIMARY KEY? Count unique values in a column in Excel

Examples related to distinct

Using DISTINCT along with GROUP BY in SQL Server How to "select distinct" across multiple data frame columns in pandas? Laravel Eloquent - distinct() and count() not working properly together SQL - select distinct only on one column SQL: Group by minimum value in one field while selecting distinct rows Count distinct value pairs in multiple columns in SQL sql query distinct with Row_Number Eliminating duplicate values based on only one column of the table MongoDB distinct aggregation Pandas count(distinct) equivalent

Examples related to ksh

How can I find a file/directory that could be anywhere on linux command line? Shell Script: How to write a string to file and to stdout on console? Send password when using scp to copy files from one server to another Reading file line by line (with space) in Unix Shell scripting - Issue How to get the second column from command output? Get exit code for command in bash/ksh Check if file exists and whether it contains a specific string In a unix shell, how to get yesterday's date into a variable? How to set the From email address for mailx command? How to mkdir only if a directory does not already exist?

Examples related to sh

How to run a cron job inside a docker container? I just assigned a variable, but echo $variable shows something else How to run .sh on Windows Command Prompt? Shell Script: How to write a string to file and to stdout on console? How to cat <<EOF >> a file containing code? Assigning the output of a command to a variable What does set -e mean in a bash script? Get specific line from text file using just shell script Printing PDFs from Windows Command Line Ubuntu says "bash: ./program Permission denied"