How to find the MIME type of a file in Python

Question

Let's say you want to save a bunch of files somewhere, for instance in BLOBs. Let's say you want to dish these files out via a web page and have the client automatically open the correct application/viewer.

Assumption: the browser figures out which application/viewer to use from the MIME type (Content-Type) header in the HTTP response.

Based on that assumption, in addition to the bytes of the file, you also want to save the MIME type.

How would you find the MIME type of a file? I'm currently on a Mac, but this should also work on Windows.

Does the browser add this information when posting the file to the web page?

Is there a neat Python library for finding this information? A web service or (even better) a downloadable database?

User · Answer

Python 3 reference: https://docs.python.org/3.2/library/mimetypes.html

mimetypes.guess_type(url, strict=True)

Guess the type of a file based on its filename or URL, given by url. The return value is a tuple (type, encoding) where type is None if the type can't be guessed (missing or unknown suffix) or a string of the form 'type/subtype', usable for a MIME content-type header.

encoding is None for no encoding or the name of the program used to encode (e.g. compress or gzip). The encoding is suitable for use as a Content-Encoding header, not as a Content-Transfer-Encoding header. The mappings are table driven. Encoding suffixes are case sensitive; type suffixes are first tried case sensitively, then case insensitively.

The optional strict argument is a flag specifying whether the list of known MIME types is limited to only the official types registered with IANA. When strict is True (the default), only the IANA types are supported; when strict is False, some additional non-standard but commonly used MIME types are also recognized.

    import mimetypes
    print(mimetypes.guess_type("sample.html"))
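As a small illustration of the (type, encoding) tuple described above, here is a sketch that uses only the standard library. The file names are made up for the example; nothing needs to exist on disk, because only the name is inspected.

```python
import mimetypes

# guess_type() looks only at the name, so these files do not need to exist.
html_type, html_encoding = mimetypes.guess_type("sample.html")
print(html_type, html_encoding)

# A compression suffix is reported separately, as the encoding.
tar_type, tar_encoding = mimetypes.guess_type("backup.tar.gz")
print(tar_type, tar_encoding)

# With no usable suffix, both elements of the tuple are None.
unknown = mimetypes.guess_type("README")
print(unknown)
```

The second call shows why the return value is a tuple: the type describes the archive itself, while the encoding describes how it was compressed.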

User · Answer

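In Python 2.6:

    import shlex
    import subprocess
    mime = subprocess.Popen("/usr/bin/file --mime " + shlex.quote(PATH), shell=True,
                            stdout=subprocess.PIPE).communicate()[0]

Note that shlex.quote only exists in Python 3.3 and later; on Python 2, pipes.quote does the same job.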
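On Python 3 the same idea can be written without going through a shell. The following is only a sketch: the helper name mime_from_file_command is made up for illustration, and it assumes that the installed file(1) supports the --brief and --mime-type options (current Linux and macOS versions do, as far as I know). It returns None when the command or the file is missing.

```python
import os
import shutil
import subprocess
from typing import Optional


def mime_from_file_command(path: str) -> Optional[str]:
    """Ask file(1) for the MIME type of path; None if it cannot be determined."""
    executable = shutil.which("file")
    if executable is None or not os.path.isfile(path):
        return None
    # Passing a list avoids the shell entirely, so no quoting is needed.
    result = subprocess.run(
        [executable, "--brief", "--mime-type", path],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None
```

Because the arguments are passed as a list, a path containing spaces or shell metacharacters is handed to file(1) unchanged.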

User · Answer

I'm surprised that nobody has mentioned it, but Pygments is able to make an educated guess about the MIME type of, particularly, text documents.

Pygments is actually a Python syntax highlighting library, but it has a method that will make an educated guess about which of 500 supported document types your document is, i.e. C++ vs C# vs Python vs etc.

    import inspect

    def _test(text: str):
        from pygments.lexers import guess_lexer
        lexer = guess_lexer(text)
        mimetype = lexer.mimetypes[0] if lexer.mimetypes else None
        print(mimetype)

    if __name__ == "__main__":
        # Set the text to the actual definition of _test(...) above
        text = inspect.getsource(_test)
        print('Text:')
        print(text)
        print()
        print('Result:')
        _test(text)

Output:

    Text:
    def _test(text: str):
        from pygments.lexers import guess_lexer
        lexer = guess_lexer(text)
        mimetype = lexer.mimetypes[0] if lexer.mimetypes else None
        print(mimetype)

    Result:
    text/x-python

Now, it's not perfect, but if you need to be able to tell which of 500 document formats are being used, this is pretty darn useful.

User · Answer

This seems to be very easy:

    >>> from mimetypes import MimeTypes
    >>> import urllib
    >>> mime = MimeTypes()
    >>> url = urllib.pathname2url('Upload.xml')
    >>> mime_type = mime.guess_type(url)
    >>> print mime_type
    ('application/xml', None)

Please refer to the old post.

Update - in Python 3+ it's more convenient now:

    import mimetypes
    print(mimetypes.guess_type("sample.html"))

User · Answer

There are 3 different libraries that wrap libmagic.

2 of them are available on PyPI (so pip install will work):

    filemagic
    python-magic

And another, similar to python-magic, is available directly in the latest libmagic sources, and it is the one you probably have in your Linux distribution.

In Debian the package python-magic is this one. It is used as toivotuo said, and it is not obsolete as Simon Zimmermann said (IMHO).

It seems to me to be another take (by the original author of libmagic).

Too bad it is not available directly on PyPI.

User · Answer

In Python 3.x, for a web app with a URL to a file that may have no extension or a fake extension, you should install python-magic using:

    pip3 install python-magic

On Mac OS X, you should also install libmagic using:

    brew install libmagic

Code snippet:

    import urllib
    import magic
    from urllib.request import urlopen

    url = "http://...url to the file..."
    request = urllib.request.Request(url)
    response = urlopen(request)
    # mime=True asks for a MIME type rather than a textual description
    mime_type = magic.from_buffer(response.readline(), mime=True)
    print(mime_type)

Alternatively, you could pass a size to read:

    import urllib
    import magic
    from urllib.request import urlopen

    url = "http://...url to the file..."
    request = urllib.request.Request(url)
    response = urlopen(request)
    mime_type = magic.from_buffer(response.read(128), mime=True)
    print(mime_type)

User · Answer

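You can use the imghdr module from the standard library. It only recognizes image formats, and it returns a format name such as 'png' rather than a full MIME type. Note that imghdr was deprecated in Python 3.11 and removed in Python 3.13.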

User · Answer

The mimetypes module recognises a file type based only on the file extension. If you try to determine the type of a file that has no extension, mimetypes will not work.
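To make the limitation concrete, here is a sketch that falls back to looking at the first bytes of the content when the name gives nothing away. The function sniff_mime_type and its signature table are made up for this example and cover only a handful of well-known formats; a real application would more likely rely on a libmagic binding, as other answers describe.

```python
import mimetypes
from typing import Optional

# A few well-known leading byte signatures ("magic numbers").
_SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"%PDF-", "application/pdf"),
    (b"\x1f\x8b", "application/gzip"),
    (b"PK\x03\x04", "application/zip"),
]


def sniff_mime_type(head: bytes) -> Optional[str]:
    """Guess a MIME type from the first bytes of a file, or return None."""
    for signature, mime_type in _SIGNATURES:
        if head.startswith(signature):
            return mime_type
    return None


# The name alone is useless here: there is no extension to look up.
by_name = mimetypes.guess_type("upload_without_extension")[0]
# The content still identifies itself through its leading bytes.
by_content = sniff_mime_type(b"%PDF-1.7\n...")
print(by_name, by_content)
```

Keep in mind that container formats share signatures: many document formats are ZIP archives internally, so a table like this one would report them all as application/zip.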

User · Answer

2017 Update

No need to go to GitHub; it is on PyPI under a different name:

    pip3 install --user python-magic
    # or:
    sudo apt install python3-magic  # Ubuntu distro package

The code can be simplified as well:

    >>> import magic
    >>> magic.from_file('/tmp/img_3304.jpg', mime=True)
    'image/jpeg'

User · Answer

This may be old already, but why not use UploadedFile.content_type directly from Django? Isn't it the same? (https://docs.djangoproject.com/en/1.11/ref/files/uploads/#django.core.files.uploadedfile.UploadedFile.content_type)

User · Answer

I try the mimetypes library first. If that doesn't work, I use the python-magic library instead.

    import mimetypes

    def guess_type(filename, buffer=None):
        mimetype, encoding = mimetypes.guess_type(filename)
        if mimetype is None:
            try:
                import magic
                if buffer:
                    mimetype = magic.from_buffer(buffer, mime=True)
                else:
                    mimetype = magic.from_file(filename, mime=True)
            except ImportError:
                pass
        return mimetype

User · Answer

A more reliable way than using the mimetypes library is to use the python-magic package.

    import magic
    m = magic.open(magic.MAGIC_MIME)
    m.load()
    m.file("/tmp/document.pdf")

This would be equivalent to using file(1).

On Django one could also make sure that the MIME type matches that of UploadedFile.content_type.

User · Answer

I've tried a lot of examples, but with Django, mutagen plays nicely.

Example checking whether a file is an MP3:

    from mutagen.mp3 import MP3, HeaderNotFoundError

    try:
        audio = MP3(file)
    except HeaderNotFoundError:
        raise ValidationError('This file should be mp3')

The downside is that your ability to check file types is limited, but it's a great way if you want not only to check the file type but also to access additional information.

User · Answer

toivotuo's method worked best and most reliably for me under Python 3. My goal was to identify gzipped files which do not have a reliable .gz extension. I installed python3-magic.

    import magic

    filename = "./datasets/test"

    def file_mime_type(filename):
        m = magic.open(magic.MAGIC_MIME)
        m.load()
        return m.file(filename)

    print(file_mime_type(filename))

For a gzipped file it returns: application/gzip; charset=binary

For an unzipped txt file (iostat data): text/plain; charset=us-ascii

For a tar file: application/x-tar; charset=binary

For a bz2 file: application/x-bzip2; charset=binary

And last but not least for me, a .zip file: application/zip; charset=binary

User · Answer

The python-magic method suggested by toivotuo is outdated. python-magic's current trunk is on GitHub and, based on the readme there, finding the MIME type is done like this:

    # For MIME types
    import magic
    mime = magic.Magic(mime=True)
    mime.from_file("testdata/test.pdf")  # 'application/pdf'

User · Answer

The mimetypes module in the standard library will determine/guess the MIME type from a file extension.

If users are uploading files, the HTTP POST will contain the MIME type of the file alongside the data. For example, Django makes this data available as an attribute of the UploadedFile object.
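To show where that type travels in the request, here is a sketch that parses a hand-written multipart/form-data body with the standard library email package. The boundary, field name, file name and payload are all invented for illustration; a web framework normally does this parsing for you.

```python
from email import message_from_bytes

# A hand-written upload body, shaped like what a browser sends for a file field.
raw_request = (
    b"Content-Type: multipart/form-data; boundary=XYZ\r\n"
    b"\r\n"
    b"--XYZ\r\n"
    b'Content-Disposition: form-data; name="upload"; filename="photo.png"\r\n'
    b"Content-Type: image/png\r\n"
    b"\r\n"
    b"fake image bytes\r\n"
    b"--XYZ--\r\n"
)

message = message_from_bytes(raw_request)

# Each part carries its own Content-Type header, as declared by the client.
declared = [
    (part.get_filename(), part.get_content_type())
    for part in message.get_payload()
]
print(declared)
```

The value is whatever the client chose to send, so it is a claim rather than a verified fact; checking it against the content is a separate step.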

User · Answer

13 years later...

Most of the answers for Python 3 on this page are either outdated or incomplete. To get the MIME type of a file on Python 3 I normally use:

    import mimetypes
    mt = mimetypes.guess_type("file.ext")[0]

From the Python docs:

mimetypes.guess_type(url, strict=True)

Guess the type of a file based on its filename, path or URL, given by url. URL can be a string or a path-like object.

The return value is a tuple (type, encoding) where type is None if the type can't be guessed (missing or unknown suffix) or a string of the form 'type/subtype', usable for a MIME content-type header.

encoding is None for no encoding or the name of the program used to encode (e.g. compress or gzip). The encoding is suitable for use as a Content-Encoding header, not as a Content-Transfer-Encoding header. The mappings are table driven. Encoding suffixes are case sensitive; type suffixes are first tried case sensitively, then case insensitively.

The optional strict argument is a flag specifying whether the list of known MIME types is limited to only the official types registered with IANA. When strict is True (the default), only the IANA types are supported; when strict is False, some additional non-standard but commonly used MIME types are also recognized.

Changed in version 3.8: Added support for url being a path-like object.
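Two details from the quoted documentation can be tried out directly: path-like objects are accepted (Python 3.8+), and the lookup table can be extended. This is only a sketch; the ".ext" suffix and the "application/x-example" type are invented for the example.

```python
import mimetypes
from pathlib import Path

# Path-like objects are accepted since Python 3.8; the file need not exist.
pdf_type = mimetypes.guess_type(Path("reports") / "summary.pdf")[0]
print(pdf_type)

# A made-up suffix is normally not in the table, so this is usually None.
before = mimetypes.guess_type("file.ext")[0]

# Register the made-up type for the made-up suffix, then look it up again.
mimetypes.add_type("application/x-example", ".ext")
after = mimetypes.guess_type("file.ext")[0]
print(before, after)
```

Registering types this way only affects the current process, which is handy for application-specific formats that will never appear in the built-in table.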

User · Answer

Python bindings to libmagic

All the different answers on this topic are very confusing, so I'm hoping to give a bit more clarity with this overview of the different bindings of libmagic. Previously mammadori gave a short answer listing the available options.

libmagic

    module name: magic
    pypi: file-magic
    source: https://github.com/file/file/tree/master/python

When determining a file's MIME type, the tool of choice is simply called file, and its back end is called libmagic. (See the project home page.) The project is developed in a private CVS repository, but there is a read-only git mirror on GitHub.

Now this tool, which you will need if you want to use any of the libmagic bindings with Python, already comes with its own Python bindings called file-magic. There is not much dedicated documentation for them, but you can always have a look at the man page of the C library: man libmagic. The basic usage is described in the readme file:

    import magic

    detected = magic.detect_from_filename('magic.py')
    print('Detected MIME type: {}'.format(detected.mime_type))
    print('Detected encoding: {}'.format(detected.encoding))
    print('Detected file type name: {}'.format(detected.name))

Apart from this, you can also use the library by creating a Magic object using magic.open(flags), as shown in the example file.

Both toivotuo and ewr2san use these file-magic bindings included in the file tool. They mistakenly assume they are using the python-magic package. This seems to indicate that if both file and python-magic are installed, the Python module magic refers to the former one.

python-magic

    module name: magic
    pypi: python-magic
    source: https://github.com/ahupp/python-magic

This is the library that Simon Zimmermann talks about in his answer and which is also employed by Claude COULOMBE as well as Gringo Suave.

filemagic

    module name: magic
    pypi: filemagic
    source: https://github.com/aliles/filemagic

Note: This project was last updated in 2013!

Because it is based on the same C API, this library has some similarity with the file-magic bindings included in libmagic. It is only mentioned by mammadori, and no other answer employs it.

User · Answer

You didn't state what web server you were using, but Apache has a nice little module called Mime Magic which it uses to determine the type of a file when told to do so. It reads some of the file's content and tries to figure out what type it is based on the characters found. And as Dave Webb mentioned, the mimetypes module under Python will work, provided an extension is handy.

Alternatively, if you are sitting on a UNIX box you can use os.popen('file -i ' + fileName, mode='r') to grab the MIME type. Windows should have an equivalent command, but I'm unsure as to what it is.

User · Answer

For byte-array data you can use magic.from_buffer(byte_array, mime=True).
