[command-line] how to show lines in common (reverse diff)?

I have a series of text files for which I'd like to know the lines in common rather than the lines which are different between them. Command line unix or windows is fine.

foo:

linux-vdso.so.1 =>  (0x00007fffccffe000)
libvlc.so.2 => /usr/lib/libvlc.so.2 (0x00007f0dc4b0b000)
libvlccore.so.0 => /usr/lib/libvlccore.so.0 (0x00007f0dc483f000)
libc.so.6 => /lib/libc.so.6 (0x00007f0dc44cd000)

bar:

libkdeui.so.5 => /usr/lib/libkdeui.so.5 (0x00007f716ae22000)
libkio.so.5 => /usr/lib/libkio.so.5 (0x00007f716a96d000)
linux-vdso.so.1 =>  (0x00007fffccffe000)

So, given these two files above the output of the desired utility would be akin to file1:line_number, file2:line_number == matching text (just a suggestion, I really don't care what the syntax is):

foo:1, bar:3 == linux-vdso.so.1 =>  (0x00007fffccffe000)

thanks.

This question is related to command-line diff

The answer is


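The easiest way is:

awk 'NR==FNR{a[$0]++;next} a[$0]' file1 file2

This prints every line of file2 that also appears in file1. The array is keyed on $0 (the whole line); keying on $1 would compare only the first whitespace-separated field. The files do not need to be sorted.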
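
To get output closer to the file:line_number format suggested in the question, the same two-pass idea can record line numbers as well. This is only a sketch: the temporary directory and the variable names (first, name, common) are illustrative, and it reports only the first occurrence of each line in foo.

```shell
# Recreate the question's sample files in a throwaway directory (illustrative setup).
dir=$(mktemp -d)
cat > "$dir/foo" <<'EOF'
linux-vdso.so.1 =>  (0x00007fffccffe000)
libvlc.so.2 => /usr/lib/libvlc.so.2 (0x00007f0dc4b0b000)
libvlccore.so.0 => /usr/lib/libvlccore.so.0 (0x00007f0dc483f000)
libc.so.6 => /lib/libc.so.6 (0x00007f0dc44cd000)
EOF
cat > "$dir/bar" <<'EOF'
libkdeui.so.5 => /usr/lib/libkdeui.so.5 (0x00007f716ae22000)
libkio.so.5 => /usr/lib/libkio.so.5 (0x00007f716a96d000)
linux-vdso.so.1 =>  (0x00007fffccffe000)
EOF

# Pass 1 (NR == FNR): remember the first line number of every line in foo.
# Pass 2: for each line of bar that was seen in foo, print both positions.
common=$(cd "$dir" && awk '
    NR == FNR { if (!($0 in first)) first[$0] = FNR; name = FILENAME; next }
    $0 in first { printf "%s:%d, %s:%d == %s\n", name, first[$0], FILENAME, FNR, $0 }
' foo bar)
printf '%s\n' "$common"
```

With the two sample files this prints the single line foo:1, bar:3 == linux-vdso.so.1 =>  (0x00007fffccffe000).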


Just for information, I made a small tool for Windows that does the same thing as "grep -F -x -f file1 file2" (I haven't found anything equivalent to this command on Windows).

Here it is: http://www.nerdzcore.com/?page=commonlines

Usage is "CommonLines inputFile1 inputFile2 outputFile"

Source code is also available (GPL)


In Windows you can use a PowerShell script with Compare-Object:

Compare-Object -IncludeEqual -ExcludeDifferent -PassThru (Get-Content A.txt) (Get-Content B.txt) > MATCHING.txt  # find matching lines

Compare-Object:

  • -IncludeEqual without -ExcludeDifferent: everything
  • -ExcludeDifferent without -IncludeEqual: nothing

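This was asked here before: Unix command to find lines common in two files

You could also try Perl (credit goes here):

perl -ne 'print if ($seen{$_} .= @ARGV) =~ /10$/'  file1 file2

While file1 is being read, @ARGV still holds one name (file2), so "1" is appended to the line's entry in %seen; while file2 is being read, @ARGV is empty, so "0" is appended. A line is printed when its entry ends in "10", that is, the first time it shows up in file2 after having appeared in file1.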

I found this answer on a question listed as a duplicate. I find grep more admin-friendly than comm, so if you just want the set of matching lines (useful for comparing CSVs, for instance), simply use

grep -F -x -f file1 file2

or the shorter fgrep form (fgrep is a legacy alias for grep -F, and recent GNU grep releases print an obsolescence warning for it)

fgrep -xf file1 file2

Plus, you can use file2* to glob and look for lines in common with multiple files, rather than just two.

Some other handy variations include

  • -n flag to show the line number of each matched line
  • -c to only count the number of lines that match
  • -v to display only the lines in file2 that differ (or use diff).

Using comm is faster, but that speed comes at the expense of having to sort your files first. It isn't very useful as a 'reverse diff'.
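
As a small illustration of the -n and -c variations, here is a sketch using two made-up files (file1, file2 and their contents are hypothetical, created in a temporary directory). Note that the line numbers reported by -n refer to file2, the file being searched, and that -x keeps "alphabet" from matching the pattern "alpha".

```shell
# Illustrative input files; the contents are invented for this example.
dir=$(mktemp -d)
printf '%s\n' 'alpha' 'beta' 'gamma' > "$dir/file1"
printf '%s\n' 'gamma' 'delta' 'alpha' 'alphabet' > "$dir/file2"

# -F: patterns are fixed strings, -x: match whole lines only,
# -f: read the patterns from file1, -n: prefix each match with its line number in file2.
matches=$(grep -n -F -x -f "$dir/file1" "$dir/file2")
printf '%s\n' "$matches"

# -c: print only the number of matching lines in file2.
count=$(grep -c -F -x -f "$dir/file1" "$dir/file2")
printf '%s\n' "$count"
```

Here the first command prints 1:gamma and 3:alpha, and the second prints 2.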


I just learned the comm command from this thread, but wanted to add something extra: if the files are not sorted and you don't want to touch the original files, you can feed comm the output of the sort command through process substitution. This leaves the original files intact. It works in bash; I can't say about other shells.

comm -1 -2 <(sort file1) <(sort file2)

This can be extended to compare command output, instead of files:

comm -1 -2 <(ls /dir1 | sort) <(ls /dir2 | sort)
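
Since the question mentions a series of files, the same idea can be chained across more than two. The sketch below avoids process substitution so that it should also run in a plain POSIX sh: comm reads one input from standard input when given "-". The file names a, b, c and their contents are made up for illustration, and LC_ALL=C is set on every sort and comm call so that both tools agree on the ordering.

```shell
# Illustrative unsorted inputs in a throwaway directory.
dir=$(mktemp -d)
printf '%s\n' 'pear' 'apple' 'fig' > "$dir/a"
printf '%s\n' 'fig' 'kiwi' 'apple' > "$dir/b"
printf '%s\n' 'apple' 'plum' 'fig' 'pear' > "$dir/c"

# Sorted copies go to new files; the originals are left untouched.
LC_ALL=C sort "$dir/b" > "$dir/b.sorted"
LC_ALL=C sort "$dir/c" > "$dir/c.sorted"

# Each comm -12 keeps only lines present in both of its inputs;
# "-" makes comm read the previous stage from standard input.
common=$(LC_ALL=C sort "$dir/a" \
    | LC_ALL=C comm -12 - "$dir/b.sorted" \
    | LC_ALL=C comm -12 - "$dir/c.sorted")
printf '%s\n' "$common"
```

For these inputs the lines common to all three files are apple and fig.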