What is the easiest way (using a graphical tool or command line on Ubuntu Linux) to know if two binary files are the same or not (except for the time stamps)? I do not need to actually extract the difference. I just need to know whether they are the same or not.
I found Visual Binary Diff was what I was looking for, available on:
Ubuntu:
sudo apt install vbindiff
Arch Linux:
sudo pacman -S vbindiff
Mac OS X via MacPorts:
port install vbindiff
Mac OS X via Homebrew:
brew install vbindiff
md5sum binary1 binary2
If the md5sum is same, binaries are same
E.g
md5sum new*
89c60189c3fa7ab5c96ae121ec43bd4a new.txt
89c60189c3fa7ab5c96ae121ec43bd4a new1.txt
root@TinyDistro:~# cat new*
aa55 aa55 0000 8010 7738
aa55 aa55 0000 8010 7738
root@TinyDistro:~# cat new*
aa55 aa55 000 8010 7738
aa55 aa55 0000 8010 7738
root@TinyDistro:~# md5sum new*
4a7f86919d4ac00c6206e11fca462c6f new.txt
89c60189c3fa7ab5c96ae121ec43bd4a new1.txt
Diff with the following options would do a binary comparison to check just if the files are different at all and it'd output if the files are the same as well:
diff -qs {file1} {file2}
If you are comparing two files with the same name in different directories, you can use this form instead:
diff -qs {file1} --to-file={dir2}
OS X El Capitan
Radiff2 is a tool designed to compare binary files, similar to how regular diff compares text files.
Try radiff2
which is a part of radare2
disassembler. For instance, with this command:
radiff2 -x file1.bin file2.bin
You get pretty formatted two columns output where differences are highlighted.
For finding flash memory defects, I had to write this script which shows all 1K blocks which contain differences (not only the first one as cmp -b
does)
#!/bin/sh
f1=testinput.dat
f2=testoutput.dat
size=$(stat -c%s $f1)
i=0
while [ $i -lt $size ]; do
if ! r="`cmp -n 1024 -i $i -b $f1 $f2`"; then
printf "%8x: %s\n" $i "$r"
fi
i=$(expr $i + 1024)
done
Output:
2d400: testinput.dat testoutput.dat differ: byte 3, line 1 is 200 M-^@ 240 M-
2dc00: testinput.dat testoutput.dat differ: byte 8, line 1 is 327 M-W 127 W
4d000: testinput.dat testoutput.dat differ: byte 37, line 1 is 270 M-8 260 M-0
4d400: testinput.dat testoutput.dat differ: byte 19, line 1 is 46 & 44 $
Disclaimer: I hacked the script in 5 min. It doesn't support command line arguments nor does it support spaces in file names
I ended up using hexdump to convert the binary files to there hex representation and then opened them in meld / kompare / any other diff tool. Unlike you I was after the differences in the files.
hexdump tmp/Circle_24.png > tmp/hex1.txt
hexdump /tmp/Circle_24.png > tmp/hex2.txt
meld tmp/hex1.txt tmp/hex2.txt
My favourite ones using xxd hex-dumper from the vim package :
1) using vimdiff (part of vim)
#!/bin/bash
FILE1="$1"
FILE2="$2"
vimdiff <( xxd "$FILE1" ) <( xxd "$FILE2" )
2) using diff
#!/bin/bash
FILE1=$1
FILE2=$2
diff -W 140 -y <( xxd $FILE1 ) <( xxd $FILE2 ) | colordiff | less -R -p ' \| '
There is a relatively simple way to check if two binary files are the same.
If you use file input/output in a programming language; you can store each bit of both the binary files into their own arrays.
At this point the check is as simple as :
if(file1 != file2){
//do this
}else{
/do that
}
Use sha1 to generate checksum:
sha1 [FILENAME1]
sha1 [FILENAME2]
Use cmp command. Refer to Binary Files and Forcing Text Comparisons for more information.
cmp -b file1 file2
You can use MD5 hash function to check if two files are the same, with this you can not see the differences in a low level, but is a quick way to compare two files.
md5 <filename1>
md5 <filename2>
If both MD5 hashes (the command output) are the same, then, the two files are not different.
Use cmp
command. This will either exit cleanly if they are binary equal, or it will print out where the first difference occurs and exit.
Short answer: run diff
with the -s
switch.
Long answer: read on below.
Here's an example. Let's start by creating two files with random binary contents:
$ dd if=/dev/random bs=1k count=1 of=test1.bin
1+0 records in
1+0 records out
1024 bytes (1,0 kB, 1,0 KiB) copied, 0,0100332 s, 102 kB/s
$ dd if=/dev/random bs=1k count=1 of=test2.bin
1+0 records in
1+0 records out
1024 bytes (1,0 kB, 1,0 KiB) copied, 0,0102889 s, 99,5 kB/s
Now let's make a copy of the first file:
$ cp test1.bin copyoftest1.bin
Now test1.bin and test2.bin should be different:
$ diff test1.bin test2.bin
Binary files test1.bin and test2.bin differ
... and test1.bin and copyoftest1.bin should be identical:
$ diff test1.bin copyoftest1.bin
But wait! Why is there no output?!?
The answer is: this is by design. There is no output on identical files.
But there are different error codes:
$ diff test1.bin test2.bin
Binary files test1.bin and test2.bin differ
$ echo $?
1
$ diff test1.bin copyoftest1.bin
$ echo $?
0
Now fortunately you don't have to check error codes each and every time because you can just use the -s
(or --report-identical-files
) switch to make diff be more verbose:
$ diff -s test1.bin copyoftest1.bin
Files test1.bin and copyoftest1.bin are identical
Source: Stackoverflow.com