[python] Get MD5 hash of big files in Python

Using multiple comment/answers in this thread, here is my solution :

import hashlib
def md5_for_file(path, block_size=256*128, hr=False):
    '''
    Block size directly depends on the block size of your filesystem
    to avoid performances issues
    Here I have blocks of 4096 octets (Default NTFS)
    '''
    md5 = hashlib.md5()
    with open(path,'rb') as f: 
        for chunk in iter(lambda: f.read(block_size), b''): 
             md5.update(chunk)
    if hr:
        return md5.hexdigest()
    return md5.digest()
  • This is "pythonic"
  • This is a function
  • It avoids implicit values: always prefer explicit ones.
  • It allows (very important) performances optimizations

And finally,

- This has been built by a community, thanks all for your advices/ideas.