Find unique lines

Question

How can I find the unique lines and remove all duplicates from a file? My input file is:

1
1
2
3
5
5
7
7

I would like the result to be:

2
3

sort file | uniq will not do the job. It will show all values one time.

User · Answer

uniq -u has been driving me crazy because it did not work.

So instead of that, if you have Python (most Linux distros and servers already have it):

Assuming you have the data file in notUnique.txt:

# Python 3
# Assumes the file has one value per line;
# otherwise adjust split() accordingly.

uniqueData = []
with open('notUnique.txt') as f:
    fileData = f.read().split('\n')

for i in fileData:
    # Skip blank lines and anything already collected.
    if i.strip() != '' and i not in uniqueData:
        uniqueData.append(i)

print(uniqueData)

# Another option (fewer keystrokes):
print(set(open('notUnique.txt').read().split('\n')))

Note that due to empty lines, the final set may contain '' or only-space strings. You can remove that later. Or just get away with copying from the terminal ;)
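
Neither variant above returns only the lines that occur exactly once, which is what the question asks for (2 and 3 in the sample). A minimal sketch of that, assuming Python 3; the function name lines_seen_once is made up for illustration, and the sample list stands in for the contents of notUnique.txt:

```python
from collections import Counter

def lines_seen_once(lines):
    """Return the non-blank lines that occur exactly once, in input order."""
    # Surrounding whitespace is stripped so that '2' and '2 ' count as equal.
    stripped = [line.strip() for line in lines if line.strip()]
    counts = Counter(stripped)
    return [line for line in stripped if counts[line] == 1]

# Sample data from the question; for a real file, pass
# open('notUnique.txt').read().split('\n') instead.
sample = ['1', '1', '2', '3', '5', '5', '7', '7']
print(lines_seen_once(sample))  # ['2', '3']
```

Counting first and filtering afterwards means the input does not have to be sorted.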

Just FYI, from the uniq man page:

"Note: 'uniq' does not detect repeated lines unless they are adjacent. You may want to sort the input first, or use 'sort -u' without 'uniq'. Also, comparisons honor the rules specified by 'LC_COLLATE'."

One correct way to invoke it: sort notUnique.txt | uniq

Example run:

$ cat x
3
1
2
2
2
3
1
3

$ uniq x
3
1
2
3
1
3

$ uniq -u x
3
1
3
1
3

$ sort x | uniq
1
2
3

Spaces might be printed, so be prepared!
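
The example run can be reproduced in Python to see why adjacency matters. This is only a sketch of the behaviour described in the man page note, built on itertools.groupby, which likewise groups adjacent equal items only. The helper names are hypothetical, and this is not how uniq itself is implemented.

```python
from itertools import groupby

def uniq(lines):
    """Like `uniq`: collapse each run of adjacent equal lines to one line."""
    return [key for key, _ in groupby(lines)]

def uniq_u(lines):
    """Like `uniq -u`: keep only lines whose adjacent run has length 1."""
    return [key for key, run in groupby(lines) if len(list(run)) == 1]

x = ['3', '1', '2', '2', '2', '3', '1', '3']
print(uniq(x))            # ['3', '1', '2', '3', '1', '3']
print(uniq_u(x))          # ['3', '1', '3', '1', '3']
print(uniq(sorted(x)))    # ['1', '2', '3']
print(uniq_u(sorted(x)))  # []
```

Once the input is sorted, equal lines sit next to each other and both helpers see every duplicate.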

User · Answer

I find this easier:

sort -u input_filename > output_filename

-u stands for unique.

User · Answer

Use as follows:

sort < filea | uniq > fileb

User · Answer

You can use:

sort data.txt | uniq -u

This sorts the data and filters by unique values.

User · Answer

This was the first I tried:

skilla:~# uniq -u all.sorted
76679787
76679787
76794979
76794979
76869286
76869286

After doing a cat -e all.sorted:

skilla:~# cat -e all.sorted
76679787$
76679787 $
76701427$
76701427$
76794979$
76794979 $
76869286$
76869286 $

Every second line has a trailing space. After removing all trailing spaces it worked. Thank you!
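
The trailing-space problem described here is easy to reproduce. A small sketch, assuming Python 3, with values modelled on the listing above and a trailing space on some of the lines; seen_once is a hypothetical helper:

```python
from collections import Counter

def seen_once(lines):
    """Return the lines that occur exactly once, in input order."""
    counts = Counter(lines)
    return [line for line in lines if counts[line] == 1]

lines = ['76679787', '76679787 ', '76701427', '76701427',
         '76794979', '76794979 ', '76869286', '76869286 ']

# Compared as-is, '76679787' and '76679787 ' are different lines,
# so both survive the filter.
print(seen_once(lines))

# After stripping trailing whitespace every value has a duplicate.
print(seen_once([line.rstrip() for line in lines]))  # []
```

Normalising whitespace before comparing has the same effect as cleaning the file first.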

User · Answer

uniq -u < file will do the job.

User · Answer

uniq should do fine if your file is sorted or can be sorted. If you can't sort the file for some reason, you can use awk:

awk '{a[$0]++} END {for (i in a) if (a[i] < 2) print i}'

User · Answer

sort -d "file name" | uniq -u

This worked for me for a similar case. Use this if the file is not sorted. You can leave out sort if it is already sorted.

User · Answer

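While sort takes O(n log(n)) time, I prefer using

awk '!seen[$0]++'

awk '!seen[$0]++' is an abbreviation for awk '!seen[$0]++ {print}': it prints the line ($0) if seen[$0] is zero, that is, if the line has not been seen before, and then increments the counter. It takes more space but only O(n) time.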
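
For comparison, here is the same seen-before idea in Python. It is a sketch only (the function name is invented). Like the awk one-liner, it keeps the first occurrence of every line rather than dropping lines that have duplicates, so for the question's sample it yields 1 2 3 5 7, not 2 3.

```python
def first_occurrences(lines):
    """Yield each distinct line once, in order of first appearance."""
    seen = set()
    for line in lines:
        if line not in seen:   # same test as awk's !seen[$0]
            seen.add(line)     # plays the role of awk's ++
            yield line

sample = ['1', '1', '2', '3', '5', '5', '7', '7']
print(list(first_occurrences(sample)))  # ['1', '2', '3', '5', '7']
```

Set lookups take constant time on average, which gives a single O(n) pass at the cost of holding every distinct line in memory.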

User · Answer

You could also print out the unique values in "file" using the cat command by piping to sort and uniq:

cat file | sort | uniq -u

User · Answer

uniq has the option you need:

   -u, --unique
          only print unique lines

$ cat file.txt
1
1
2
3
5
5
7
7

$ uniq -u file.txt
2
3
