Get encoding of a file in Windows

Question

This isn't really a programming question: is there a command line or Windows tool (Windows 7) to get the current encoding of a text file? Sure, I can write a little C# app, but I wanted to know if there is something already built in.

User · Answer

A simple solution might be opening the file in Firefox.

1. Drag and drop the file into Firefox
2. Right click on the page
3. Select "View Page Info"

and the text encoding will appear on the "Page Info" window.

Note: If the file is not in txt format, just rename it to txt and try again.

P.S. For more info see this article.

User · Answer

Here's my take on how to detect the Unicode family of text encodings via BOM. The accuracy of this method is low, as it only works on text files (specifically Unicode files) and defaults to ascii when no BOM is present (like most text editors; the default would be UTF8 if you want to match the HTTP/web ecosystem).

Update 2018: I no longer recommend this method. I recommend using file.exe from GIT or *nix tools as recommended by @Sybren, and I show how to do that via PowerShell in a later answer.

    # from https://gist.github.com/zommarin/1480974
    function Get-FileEncoding($Path) {
        # Read the first four bytes. "-Encoding byte" is Windows PowerShell syntax;
        # PowerShell 6+ uses -AsByteStream instead.
        $bytes = [byte[]](Get-Content $Path -Encoding byte -ReadCount 4 -TotalCount 4)

        # Empty file: nothing to inspect
        if (!$bytes) { return 'utf8' }

        # Compare the bytes, formatted as hex, against known BOM signatures
        switch -regex ('{0:x2}{1:x2}{2:x2}{3:x2}' -f $bytes[0],$bytes[1],$bytes[2],$bytes[3]) {
            '^efbbbf'   { return 'utf8' }
            '^2b2f76'   { return 'utf7' }
            '^fffe'     { return 'unicode' }
            '^feff'     { return 'bigendianunicode' }
            '^0000feff' { return 'utf32' }
            default     { return 'ascii' }
        }
    }

    dir ~\Documents\WindowsPowershell -File |
        select Name,@{Name='Encoding';Expression={Get-FileEncoding $_.FullName}} |
        ft -AutoSize

Recommendation: This can work reasonably well if the dir, ls, or Get-ChildItem only checks known text files, and when you're only looking for "bad encodings" from a known list of tools (e.g. SQL Management Studio defaults to UTF16, which broke GIT auto-cr-lf for Windows, which was the default for many years).
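For comparison, the same BOM check can be sketched in Python. This is only an illustration, not part of the original gist: the names sniff_bom and file_encoding_by_bom are made up here, and the returned labels are Python codec names rather than the PowerShell ones. It tries the longer UTF-32 signatures before the UTF-16 ones, because the UTF-32 LE BOM begins with the same two bytes as the UTF-16 LE BOM.

```python
import codecs

# Longer signatures come first: the UTF-32 LE BOM (FF FE 00 00) starts with
# the UTF-16 LE BOM (FF FE), so the order of this list matters.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes, default: str = "ascii") -> str:
    """Return a codec name based on a leading BOM, else `default`."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return default


def file_encoding_by_bom(path: str, default: str = "ascii") -> str:
    """Read the first four bytes of `path` and classify them by BOM."""
    with open(path, "rb") as fh:
        return sniff_bom(fh.read(4), default)
```

Like the PowerShell version, this says nothing about files without a BOM; the default is just a guess.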

User · Answer

Open up your file using regular old vanilla Notepad that comes with Windows. It will show you the encoding of the file when you click "Save As...".

Whatever the default-selected encoding is, that is what your current encoding is for the file. If it is UTF-8, you can change it to ANSI and click save to change the encoding (or vice versa).

I realize there are many different types of encoding, but this was all I needed when I was informed our export files were in UTF-8 and they required ANSI. It was a one-time export, so Notepad fit the bill for me.

FYI: From my understanding, "Unicode" (as listed in Notepad) is a misnomer for UTF-16. More here on Notepad's "Unicode" option: Windows 7 - UTF-8 and Unicdoe
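If the same UTF-8 to ANSI conversion has to be repeated, it could be scripted instead of done through Notepad. The sketch below is an assumption-laden illustration: the function name reencode is made up, and "ANSI" is taken to mean Windows code page 1252, which is common on Western European systems but is not the ANSI code page everywhere.

```python
def reencode(src, dst, src_encoding="utf-8-sig", dst_encoding="cp1252"):
    """Rewrite a text file in another encoding, keeping line endings as-is.

    "utf-8-sig" accepts UTF-8 with or without a BOM. "cp1252" is an
    assumption about what "ANSI" means on the target machine.
    """
    # newline="" turns off newline translation for both reading and writing
    with open(src, "r", encoding=src_encoding, newline="") as fin:
        text = fin.read()
    with open(dst, "w", encoding=dst_encoding, newline="") as fout:
        fout.write(text)
```

Characters that the target code page cannot represent raise UnicodeEncodeError rather than being silently replaced.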

User · Answer

Another tool that I found useful: https://archive.codeplex.com/?p=encodingchecker. The EXE can be found here.

User · Answer

Some C code here for reliable ascii, bom's, and utf8 detection: https://unicodebook.readthedocs.io/guess_encoding.html

    Only ASCII, UTF-8 and encodings using a BOM (UTF-7 with BOM, UTF-8 with BOM,
    UTF-16, and UTF-32) have reliable algorithms to get the encoding of a document.
    For all other encodings, you have to trust heuristics based on statistics.

EDIT: A powershell version of a C# answer from "Effective way to find any file's Encoding". Only works with signatures (boms).

    # get-encoding.ps1
    param([Parameter(ValueFromPipeline=$True)] $filename)
    begin {
      # set .net current directory
      [Environment]::CurrentDirectory = (pwd).path
    }
    process {
      $reader = [System.IO.StreamReader]::new($filename,
        [System.Text.Encoding]::default,$true)
      $peek = $reader.Peek()
      $encoding = $reader.currentencoding
      $reader.close()
      [pscustomobject]@{Name=split-path $filename -leaf
                    BodyName=$encoding.BodyName
                    EncodingName=$encoding.EncodingName}
    }

    .\get-encoding chinese8.txt

    Name         BodyName EncodingName
    ----         -------- ------------
    chinese8.txt utf-8    Unicode (UTF-8)

    get-childitem -file | .\get-encoding
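The "reliable" cases from the quote above (BOM, pure ASCII, valid UTF-8) can be sketched in a few lines of Python. This is not the C code from the linked page, just an illustration of the same idea; the function name guess_encoding and the "unknown" label are made up here.

```python
import codecs


def guess_encoding(data: bytes) -> str:
    """Classify bytes using only the checks that are reliable."""
    # 1. BOM signatures (UTF-32 before UTF-16, since their LE BOMs overlap)
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    # 2. Pure ASCII: every byte is below 0x80
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    # 3. Strict UTF-8 validation: random 8-bit text rarely passes this
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Anything else (ANSI code pages etc.) needs statistical heuristics
        return "unknown"
```

A result of "unknown" is where heuristic detectors take over; this sketch deliberately stops there.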

User · Answer

Similar to the solution listed above with Notepad, you can also open the file in Visual Studio, if you're using that. In Visual Studio, you can select "File > Advanced Save Options..."

The "Encoding:" combo box will tell you specifically which encoding is currently being used for the file. It has a lot more text encodings listed in there than Notepad does, so it's useful when dealing with various files from around the world and whatever else.

Just like Notepad, you can also change the encoding from the list of options there, and then save the file after hitting "OK". You can also select the encoding you want through the "Save with Encoding..." option in the Save As dialog, by clicking the arrow next to the Save button.

User · Answer

The (Linux) command-line tool "file" is available on Windows via GnuWin32: http://gnuwin32.sourceforge.net/packages/file.htm

If you have git installed, it's located in C:\Program Files\git\usr\bin.

Example:

    C:\Users\SH\Downloads\SquareRoot>file *
    _UpgradeReport_Files;         directory
    Debug;                        directory
    duration.h;                   ASCII C++ program text, with CRLF line terminators
    ipch;                         directory
    main.cpp;                     ASCII C program text, with CRLF line terminators
    Precision.txt;                ASCII text, with CRLF line terminators
    Release;                      directory
    Speed.txt;                    ASCII text, with CRLF line terminators
    SquareRoot.sdf;               data
    SquareRoot.sln;               UTF-8 Unicode (with BOM) text, with CRLF line terminators
    SquareRoot.sln.docstates.suo; PCX ver. 2.5 image data
    SquareRoot.suo;               CDF V2 Document, corrupt: Cannot read summary info
    SquareRoot.vcproj;            XML  document text
    SquareRoot.vcxproj;           XML document text
    SquareRoot.vcxproj.filters;   XML document text
    SquareRoot.vcxproj.user;      XML document text
    squarerootmethods.h;          ASCII C program text, with CRLF line terminators
    UpgradeLog.XML;               XML  document text

    C:\Users\SH\Downloads\SquareRoot>file --mime-encoding *
    _UpgradeReport_Files;         binary
    Debug;                        binary
    duration.h;                   us-ascii
    ipch;                         binary
    main.cpp;                     us-ascii
    Precision.txt;                us-ascii
    Release;                      binary
    Speed.txt;                    us-ascii
    SquareRoot.sdf;               binary
    SquareRoot.sln;               utf-8
    SquareRoot.sln.docstates.suo; binary
    SquareRoot.suo;               CDF V2 Document, corrupt: Cannot read summary infobinary
    SquareRoot.vcproj;            us-ascii
    SquareRoot.vcxproj;           utf-8
    SquareRoot.vcxproj.filters;   utf-8
    SquareRoot.vcxproj.user;      utf-8
    squarerootmethods.h;          us-ascii
    UpgradeLog.XML;               us-ascii
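To use the same tool from a script, one option is to call it through a subprocess. The Python sketch below assumes a "file" executable is on the PATH (or that a full path such as the Git one mentioned above is passed in); the helper name mime_encoding is made up. It uses the -b (brief) flag so that only the charset is printed, without the file name.

```python
import shutil
import subprocess


def mime_encoding(path, file_exe="file"):
    """Ask the `file` utility for the charset of `path`.

    Returns None when the executable cannot be found. `file_exe` may be a
    bare name looked up on PATH or a full path to file.exe.
    """
    exe = shutil.which(file_exe)
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "-b", "--mime-encoding", path],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if `file` exits with an error
    )
    return result.stdout.strip()
```

The return value is whatever the tool prints, for example the us-ascii and utf-8 labels in the listing above.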

User · Answer

I wrote the #4 answer (at time of writing). But lately I have git installed on all my computers, so now I use @Sybren's solution. Here is a new answer that makes that solution handy from powershell (without putting all of git/usr/bin in the PATH, which is too much clutter for me).

Add this to your profile.ps1:

    $global:gitbin = 'C:\Program Files\Git\usr\bin'
    Set-Alias file.exe $gitbin\file.exe

And used like: file.exe --mime-encoding *. You must include .exe in the command for the PS alias to work.

But if you don't customize your PowerShell profile.ps1, I suggest you start with mine: https://gist.github.com/yzorg/8215221/8e38fd722a3dfc526bbe4668d1f3b08eb7c08be0 and save it to ~\Documents\WindowsPowerShell. It's safe to use on a computer without git, but will write warnings when git is not found.

The .exe in the command is also how I use C:\WINDOWS\system32\where.exe from powershell, and many other OS CLI commands that are "hidden by default" by powershell. *shrug*

User · Answer

EncodingChecker

File Encoding Checker is a GUI tool that allows you to validate the text encoding of one or more files. The tool can display the encoding for all selected files, or only the files that do not have the encodings you specify.

File Encoding Checker requires .NET 4 or above to run.

User · Answer

The only way that I have found to do this is VIM or Notepad.

User · Answer

Install git (on Windows, you have to use the git bash console). Type:

    file *

for all files in the current directory, or

    file */*

for the files in all subdirectories.

User · Answer

Looking for a Node.js/npm solution? Try encoding-checker:

    npm install -g encoding-checker

Usage

    Usage: encoding-checker [-p pattern] [-i encoding] [-v]

    Options:
      --help                 Show help                                     [boolean]
      --version              Show version number                           [boolean]
      --pattern, -p, -d                                               [default: "*"]
      --ignore-encoding, -i                                            [default: ""]
      --verbose, -v                                                 [default: false]

Examples

Get encoding of all files in current directory:

    encoding-checker

Return encoding of all md files in current directory:

    encoding-checker -p "*.md"

Get encoding of all files in current directory and its subfolders (will take quite some time for huge folders; seemingly unresponsive):

    encoding-checker -p "**"

For more examples refer to the npm docu or the official repository.

User · Answer

If you have "git" or "Cygwin" on your Windows machine, then go to the folder where your file is present and execute the command:

    file *

This will give you the encoding details of all the files in that folder.
