Sort & uniq in Linux shell

Question

What is the difference between the following two commands?

    sort -u FILE
    sort FILE | uniq

User · Accepted Answer

Using sort -u does less I/O than sort | uniq, but the end result is the same. In particular, if the file is big enough that sort has to create intermediate files, there's a decent chance that sort -u will use slightly fewer or slightly smaller intermediate files, as it could eliminate duplicates while it is sorting each set. If the data is highly duplicative, this could be beneficial; if there are in fact few duplicates, it won't make much difference (definitely a second-order performance effect, compared to the first-order effect of the pipe).

Note that there are times when the piping is appropriate. For example:

    sort FILE | uniq -c | sort -n

This sorts the file into order of the number of occurrences of each line in the file, with the most repeated lines appearing last. (It wouldn't surprise me to find that this combination, which is idiomatic for Unix or POSIX, can be squished into one complex sort command with GNU sort.)

There are times when not using the pipe is important. For example:

    sort -u -o FILE FILE

This sorts the file in situ; that is, the output file is specified by -o FILE, and this operation is guaranteed safe (the file is read before being overwritten for output).
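As a rough illustration of the two cases above, here is a small sketch. The sample data and the names tmp, counts, and inplace are made up for this example; only the sort and uniq invocations come from the answer.

```shell
# Hypothetical sample data in a temporary file (name chosen by mktemp).
tmp=$(mktemp)
printf '%s\n' pear apple pear fig pear apple > "$tmp"

# Case 1: the pipe is appropriate. Count each distinct line, then order
# the counts numerically so the most repeated line comes last.
counts=$(sort "$tmp" | uniq -c | sort -n)
printf '%s\n' "$counts"

# Case 2: no pipe. Sort and deduplicate the file in place; sort reads
# its whole input before it opens the -o file for writing.
sort -u -o "$tmp" "$tmp"
inplace=$(cat "$tmp")
printf '%s\n' "$inplace"

rm -f "$tmp"
```

A shell redirection such as sort -u FILE > FILE would not be a safe substitute for the second case, because the shell truncates FILE before sort gets to read it.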

User · Answer

There is one slight difference: the return code.

Unless set -o pipefail is in effect, the return code of a pipeline is the return code of its last command, and uniq, which only reads from the pipe, returns zero (success) even when sort has failed.

Try examining the exit code, and you'll see something like this (pipefail is not set here):

    pavel@lonely ~ $ sort -u file_that_doesnt_exist ; echo $?
    sort: open failed: file_that_doesnt_exist: No such file or directory
    2
    pavel@lonely ~ $ sort file_that_doesnt_exist | uniq ; echo $?
    sort: open failed: file_that_doesnt_exist: No such file or directory
    0

Other than this, the commands are equivalent.
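The same effect can be captured in a script. The sketch below assumes that bash is installed (pipefail is run through bash -c so the rest works in a plain POSIX sh) and that the path in missing, a name invented for this example, really does not exist.

```shell
# Hypothetical path that is assumed not to exist.
missing=/nonexistent/file_that_doesnt_exist

# Single command: the failure of sort is visible directly.
rc_single=0
sort -u "$missing" 2>/dev/null || rc_single=$?

# Pipeline without pipefail: the status is that of uniq, the last command.
sort "$missing" 2>/dev/null | uniq
rc_pipe=$?

# Pipeline with pipefail (run in bash): the failure of sort propagates.
rc_pipefail=$(bash -c 'set -o pipefail; sort "$1" 2>/dev/null | uniq; echo $?' _ "$missing")

echo "sort -u: $rc_single"
echo "sort | uniq: $rc_pipe"
echo "sort | uniq with pipefail: $rc_pipefail"
```

In a script that checks exit codes, either form can be made to report the failure; the pipe just needs pipefail (or an explicit check) to do so.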

User · Answer

sort -u will be slightly faster, because it does not need to pipe the output between two commands. Also see my question on the topic: "calling uniq and sort in different orders in shell".

User · Answer

I have worked on some servers where sort doesn't support the -u option. There we have to use:

    sort xyz | uniq

User · Answer

Nothing; they will produce the same result.
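A quick way to check this claim on a given input is to compare the two outputs directly. This is only a sketch: the sample words and the variable names input, a, and b are invented here, and LC_ALL=C is set so the ordering does not depend on the locale.

```shell
# Hypothetical input with some duplicate lines.
input=$(printf '%s\n' delta alpha charlie alpha bravo delta)

# Same data through both forms, with no extra sort options.
a=$(printf '%s\n' "$input" | LC_ALL=C sort -u)
b=$(printf '%s\n' "$input" | LC_ALL=C sort | uniq)

if [ "$a" = "$b" ]; then
    echo "identical output"
else
    echo "outputs differ"
fi
```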

User · Answer

Beware! While it's true that "sort -u" and "sort | uniq" are equivalent, any additional options to sort can break the equivalence. Here's an example from the coreutils manual:

    For example, 'sort -n -u' inspects only the value of the initial numeric string when checking for uniqueness, whereas 'sort -n | uniq' inspects the entire line.

Similarly, if you sort on key fields, the uniqueness test used by sort won't necessarily look at the entire line anymore.

After being bitten by that bug in the past, these days I tend to use "sort | uniq" when writing Bash scripts. I'd rather have higher I/O overhead than run the risk that someone else in the shop won't know about that particular pitfall when they modify my code to add additional sort parameters.
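The pitfall is easy to reproduce. In this sketch the input lines and the variable names input, with_u, and with_pipe are made up for illustration; three of the four lines share the numeric prefix 1 but only two of them are identical as whole lines.

```shell
# Hypothetical input: numeric prefix followed by a word.
input='1 apple
1 banana
2 cherry
1 apple'

# Uniqueness judged on the numeric key only: one line per distinct number.
with_u=$(printf '%s\n' "$input" | sort -n -u)

# Uniqueness judged on the entire line: only exact duplicates collapse.
with_pipe=$(printf '%s\n' "$input" | sort -n | uniq)

echo "sort -n -u keeps $(printf '%s\n' "$with_u" | grep -c .) lines:"
printf '%s\n' "$with_u"
echo "sort -n | uniq keeps $(printf '%s\n' "$with_pipe" | grep -c .) lines:"
printf '%s\n' "$with_pipe"
```

With sort -n -u, the line "1 banana" is silently dropped along with the duplicate "1 apple", which is the kind of data loss the answer warns about.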
