Every answer seems to suggest that you need to use secure hashes to do the job but all of these are tuned to be slow to force a bruteforce attacker to have lots of computing power and depending on your needs this may not be the best solution.
There are algorithms specifically designed to hash files as fast as possible to check integrity and comparison (murmur
, XXhash
...). Obviously these are not designed for security as they don't meet the requirements of a secure hash algorithm (i.e. randomness) but have low collision rates for large messages. This features make them ideal if you are not looking for security but speed.
Examples of this algorithms and comparison can be found in this excellent answer: Which hashing algorithm is best for uniqueness and speed?.
As an example, we at our Q&A site use murmur3
to hash the images uploaded by the users so we only store them once even if users upload the same image in several answers.