[linux] Fastest way of finding differences between two files in unix?

I want to find the differences between two files and write only those differences to a third file. I have seen approaches using awk, diff and comm. Are there any others?

e.g. Compare two files line by line and generate the difference in another file

e.g. Copy differences between two files in unix

I need to know which is the fastest way of finding all the differences and listing them in a file, for each of the cases below:

Case 1 - file2 = file1 + extra text appended.
Case 2 - file2 and file1 are different.

This question is related to linux bash unix

The answer is


You could try:

comm -13 <(sort file1) <(sort file2) > file3

or

grep -Fxvf file1 file2 > file3

or

diff file1 file2 | grep '^>' | sed 's/^> //' > file3

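or, when each line is a single field (join matches on the first whitespace-separated field by default):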

join -v 2 <(sort file1) <(sort file2) > file3
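
To see what these commands produce, here is a small sketch on two made-up sample files (the file contents, the output file names and the temporary directory are assumptions for illustration only). Both commands print the lines of file2 that do not occur in file1. comm needs sorted input, so its result comes out in sorted order, while grep keeps the order in which the lines appear in file2. The sketch sorts into temporary files instead of using <(...) so that it also runs in a plain POSIX shell.

```shell
# Work in a throwaway directory so no real files are touched.
cd "$(mktemp -d)"

# Hypothetical sample data: file2 holds everything in file1 plus two new lines.
printf 'apple\nbanana\ncherry\n' > file1
printf 'elder\napple\ndate\nbanana\ncherry\n' > file2

# comm requires sorted input, so sort both files first.
sort file1 > file1.sorted
sort file2 > file2.sorted

# -1 hides lines only in file1, -3 hides lines common to both,
# leaving the lines found only in file2.
comm -13 file1.sorted file2.sorted > file3_comm

# -F fixed strings, -x whole-line match, -v invert the match,
# -f read the patterns (one per line) from file1.
grep -Fxvf file1 file2 > file3_grep

cat file3_comm   # date, elder (sorted order)
cat file3_grep   # elder, date (order of file2)
```

Which of the two is faster depends on the file sizes and on whether the inputs are already sorted, so it is worth timing both on your own data.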

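Another option, which lists the lines that appear in only one of the two files (assuming no line is repeated within a single file):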

sort file1 file2 | uniq -u > file3

If you want to see only the lines that occur in both files, use the "uniq -d" option instead:

sort file1 file2 | uniq -d > file3
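
A small sketch of both variants on made-up sample files (contents and output file names are assumptions for illustration). It relies on neither file containing the same line twice; a line repeated inside one file would be reported by "uniq -d" even if the other file does not contain it.

```shell
# Work in a throwaway directory so no real files are touched.
cd "$(mktemp -d)"

# Hypothetical sample data with no repeated lines inside either file.
printf 'a\nb\nc\n' > file1
printf 'b\nc\nd\n' > file2

# Lines that occur exactly once across both files,
# i.e. present in one file but not the other.
sort file1 file2 | uniq -u > only_in_one

# Lines that occur more than once, i.e. present in both files here.
sort file1 file2 | uniq -d > in_both

cat only_in_one   # a, d
cat in_both       # b, c
```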

You could also compare MD5 checksums (or similar hashes) first to determine whether there are any differences at all, and then run a full comparison only on files whose hashes differ.
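
A minimal sketch of that hash pre-check, assuming the GNU md5sum tool is available (as it usually is on Linux) and using made-up sample files. The grep command used for the actual comparison is just one of the options listed above.

```shell
# Work in a throwaway directory so no real files are touched.
cd "$(mktemp -d)"

# Hypothetical sample data.
printf 'one\ntwo\n' > file1
printf 'one\ntwo\nthree\n' > file2

# md5sum prints "<hash>  <name>"; keep only the hash field.
hash1=$(md5sum < file1 | cut -d ' ' -f 1)
hash2=$(md5sum < file2 | cut -d ' ' -f 1)

if [ "$hash1" = "$hash2" ]; then
    # Same hash: treat the files as identical and write an empty result.
    : > file3
else
    # Different hash: run the full line-by-line comparison.
    grep -Fxvf file1 file2 > file3
fi

cat file3   # three
```

Hashing reads both files completely, so this mainly pays off when many files are compared or the hashes are already stored somewhere. For a single pair, "cmp -s file1 file2" answers the same yes/no question and can stop at the first differing byte.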


This will work fast:

Case 1 - File2 = File1 + extra text appended.

grep -Fxvf File1.txt File2.txt > File3.txt

File 1: 80 lines
File 2: 100 lines
File 3: 20 lines
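
If Case 1 holds strictly, meaning the second file begins with an exact byte-for-byte copy of the first, the appended part can also be cut out by offset rather than by comparing lines. This is not taken from the answers above; it is a sketch under that assumption, with made-up sample contents and the non-POSIX but widely available "head -c".

```shell
# Work in a throwaway directory so no real files are touched.
cd "$(mktemp -d)"

# Hypothetical sample data: File2 is File1 plus two appended lines.
printf 'line1\nline2\n' > File1.txt
printf 'line1\nline2\nextra1\nextra2\n' > File2.txt

# Size of File1 in bytes; the arithmetic expansion strips any padding
# that some wc implementations print.
size=$(( $(wc -c < File1.txt) ))

# Safety check: confirm File2 really starts with the contents of File1.
if head -c "$size" File2.txt | cmp -s - File1.txt; then
    # Copy everything from the first byte after the shared prefix.
    tail -c +"$((size + 1))" File2.txt > File3.txt
else
    echo "File2.txt does not start with File1.txt" >&2
fi

cat File3.txt   # extra1, extra2
```

Unlike the grep approach, this does not load File1 as a pattern list, and it keeps appended lines even when they happen to repeat a line that already exists in File1.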

