What charset does Microsoft Excel use when saving files?

Question

I have a Java app which reads CSV files that have been created in Excel (e.g. 2007). Does anyone know what charset MS Excel uses to save these files?

I would have guessed one of: windows-1255 (Cp1255), ISO-8859-1, UTF8. But I am unable to decode extended chars (e.g. French accented letters) using any of these charsets.

User · Answer

I had a similar problem last week. I received a number of CSV files with varying encodings. Before importing them into the database, I used the chardet library to automatically sniff out the correct encoding.

Chardet is a port of Mozilla's character detection engine and, if the sample size is large enough (one accented character will not do), it works really well.
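chardet is a Python library, so for the Java app in the question a much cruder, stdlib-only check may be enough as a first pass. The sketch below is not chardet's statistical detection: it only tries a strict UTF-8 decode and falls back to a single-byte code page. The class name `CsvCharsetGuess` and the windows-1252 fallback are assumptions for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: strict UTF-8 trial decode with a single-byte fallback. */
final class CsvCharsetGuess {

    // Assumed fallback for a Western European Windows machine; adjust per locale.
    static final Charset FALLBACK = Charset.forName("windows-1252");

    /** Returns UTF-8 if the bytes are valid UTF-8, otherwise the fallback charset. */
    static Charset guess(byte[] fileBytes) {
        try {
            // REPORT makes the decoder throw instead of silently substituting U+FFFD.
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(fileBytes));
            return StandardCharsets.UTF_8;
        } catch (CharacterCodingException e) {
            return FALLBACK;
        }
    }

    public static void main(String[] args) {
        String word = "R\u00e9sum\u00e9";
        System.out.println(guess(word.getBytes(FALLBACK)));                // windows-1252
        System.out.println(guess(word.getBytes(StandardCharsets.UTF_8)));  // UTF-8
    }
}
```

Pass the whole file rather than a slice, since a multi-byte sequence cut off at the end of a sample also counts as malformed. Pure ASCII input is reported as UTF-8, which is harmless because the two encodings agree on those bytes.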

User · Answer

Excel 2010 saves a UTF-16/UCS-2 TSV file if you select File > Save As > Unicode Text (.txt). The file is forcibly suffixed ".txt", which you can change to ".tsv".

If you need CSV, you can then convert the TSV file in a text editor like Notepad++, Ultra Edit, Crimson Editor etc., replacing tabs with semicolons, commas or the like. Note that, e.g. for reading into a DB table, TSV often works fine already (and it is often easier to read manually).

If you need a different code page like UTF-8, use one of the above-mentioned editors for converting.
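If the file ends up in a Java program anyway, the same conversion can be done in code instead of a text editor. This is a minimal sketch, assuming the fields contain no embedded tabs, quotes or line breaks (which Excel would wrap in quotes and this code does not parse); `UnicodeTextToCsv` and its method names are made up for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.StringJoiner;

/** Hypothetical converter: UTF-16 tab-separated text in, UTF-8 comma-separated text out. */
final class UnicodeTextToCsv {

    /** Quotes a field only when it contains a comma, a quote or a line break. */
    static String quote(String field) {
        if (field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }

    /** Converts one tab-separated line to a comma-separated line. */
    static String toCsvLine(String tsvLine) {
        StringJoiner csv = new StringJoiner(",");
        for (String field : tsvLine.split("\t", -1)) { // -1 keeps trailing empty fields
            csv.add(quote(field));
        }
        return csv.toString();
    }

    /** Decodes UTF-16 bytes (with BOM) and re-encodes the rows as UTF-8 CSV. */
    static byte[] convert(byte[] utf16Tsv) {
        // The "UTF-16" charset uses the BOM to pick the byte order and drops it.
        String text = new String(utf16Tsv, StandardCharsets.UTF_16);
        StringBuilder out = new StringBuilder();
        for (String line : text.split("\r?\n")) {
            out.append(toCsvLine(line)).append("\r\n");
        }
        return out.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Simulated export: BOM + little-endian UTF-16, two rows.
        byte[] sample = "\uFEFFname\tcity\r\nR\u00e9sum\u00e9\tParis, FR\r\n"
                .getBytes(StandardCharsets.UTF_16LE);
        System.out.print(new String(convert(sample), StandardCharsets.UTF_8));
    }
}
```

For real files, `java.nio.file.Files.readAllBytes` and `Files.write` can supply and store the byte arrays.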

User · Answer

While it is true that exporting an Excel file that contains special characters to CSV can be a pain in the ass, there is a simple workaround: copy and paste the cells into a Google Docs spreadsheet and then save from there.

User · Answer

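You could use this Visual Studio VB.Net code to get the encoding:

    Dim strEncryptionType As String = String.Empty
    ' True = detect the encoding from a byte order mark, if the file has one
    Dim myStreamRdr As System.IO.StreamReader = New System.IO.StreamReader(myFileName, True)
    ' CurrentEncoding is only reliable after the first read
    Dim myString As String = myStreamRdr.ReadToEnd()
    strEncryptionType = myStreamRdr.CurrentEncoding.EncodingName

Note that this only recognizes files that start with a byte order mark. Without one, StreamReader assumes UTF-8, so a plain ANSI CSV from Excel will not be identified correctly.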
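Java has no direct counterpart to that StreamReader flag, but the BOM check is easy to write by hand. A rough sketch follows; `BomSniffer` is a made-up name, and it only knows the UTF-8 and UTF-16 marks, so a UTF-32 file would be misreported.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that maps a leading byte order mark to a charset. */
final class BomSniffer {

    /** Returns the charset announced by a leading BOM, or null when there is none. */
    static Charset fromBom(byte[] head) {
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) return StandardCharsets.UTF_8;
        if (startsWith(head, 0xFF, 0xFE)) return StandardCharsets.UTF_16LE;
        if (startsWith(head, 0xFE, 0xFF)) return StandardCharsets.UTF_16BE;
        return null; // no BOM: the encoding must be known or guessed some other way
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            // Mask to compare as unsigned values, since Java bytes are signed.
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'a', ',', 'b'};
        byte[] plain = {'a', ',', 'b'};
        System.out.println(fromBom(withBom)); // UTF-8
        System.out.println(fromBom(plain));   // null
    }
}
```

A `null` result is the common case for Excel's plain CSV export, which is where the ANSI code page or a detection heuristic comes in.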

User · Answer

OOXML files like those that come from Excel 2007 are encoded in UTF-8, according to Wikipedia. I don't know about CSV files, but it stands to reason it would use the same format.

User · Answer

The Russian edition offers CSV, CSV (Macintosh) and CSV (DOS).

When saving in plain CSV, it uses windows-1251.

I just tried to save the French word "Résumé" along with the Russian text. It was saved as the hex bytes 52 3F 73 75 6D 3F, 3F being the ASCII code for a question mark.

When I opened the CSV file, the word, of course, had become unreadable (R?sum?).
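The same byte sequence can be reproduced from Java, which shows the damage happens at encoding time: windows-1251 has no code point for "é", so the encoder substitutes a question mark. The class and method names below are invented for the demonstration.

```java
import java.nio.charset.Charset;

/** Hypothetical demo: what happens to characters a code page cannot represent. */
final class UnmappableDemo {

    /** Formats bytes as space-separated upper-case hex pairs. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    /** Encodes text with the named charset; unmappable characters become '?'. */
    static String encodeToHex(String text, String charsetName) {
        return hex(text.getBytes(Charset.forName(charsetName)));
    }

    public static void main(String[] args) {
        String word = "R\u00e9sum\u00e9"; // "Résumé", escaped to avoid source-encoding issues
        System.out.println(encodeToHex(word, "windows-1251")); // 52 3F 73 75 6D 3F
        System.out.println(encodeToHex(word, "windows-1252")); // 52 E9 73 75 6D E9
    }
}
```

Once the file holds 3F, the original letter is gone, and no choice of charset on the reading side can bring it back.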

User · Answer

From memory, Excel uses the machine-specific ANSI encoding. So this would be Windows-1252 for an EN-US installation, 1251 for Russian, etc.
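On the Java side that means passing the matching charset explicitly instead of relying on the platform default. A minimal sketch, assuming the file came from a Western European or US machine (hence windows-1252); `AnsiCsvReader` is a made-up name and the semicolon-separated sample row is invented.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical reader that decodes CSV lines with an explicit charset. */
final class AnsiCsvReader {

    // Assumption: the exporting machine used the Western European ANSI code page.
    static final Charset EXCEL_ANSI = Charset.forName("windows-1252");

    /** Reads all lines from the stream, decoding bytes with the given charset. */
    static List<String> readLines(InputStream in, Charset charset) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Raw bytes of "Résumé;café" as windows-1252 would store them.
        byte[] raw = {0x52, (byte) 0xE9, 0x73, 0x75, 0x6D, (byte) 0xE9, 0x3B,
                      0x63, 0x61, 0x66, (byte) 0xE9, 0x0D, 0x0A};
        for (String line : readLines(new ByteArrayInputStream(raw), EXCEL_ANSI)) {
            System.out.println(line);
        }
    }
}
```

For a file on disk, `Files.newBufferedReader(path, charset)` gives the same kind of reader. For the French letters in the question, ISO-8859-1 decodes the same bytes; the two charsets differ only in the 0x80 to 0x9F range, which holds characters such as the euro sign and curly quotes in windows-1252.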

User · Answer

cp1250 is used extensively in Microsoft Office documents, including Word and Excel 2003.

http://en.wikipedia.org/wiki/Windows-1250

A simple way to confirm this would be to:

1. Create a spreadsheet with higher order characters, e.g. "Veszprém", in one of the cells;
2. Use your favourite scripting language to parse and decode the spreadsheet;
3. Look at what your script produces when you print out the decoded data.

Example perl script:

    #!perl

    use strict;

    use Spreadsheet::ParseExcel::Simple;
    use Encode qw( decode );

    my $file  = "my_spreadsheet.xls";

    my $xls   = Spreadsheet::ParseExcel::Simple->read( $file );
    # sheets() returns a list, so take its first element
    my $sheet = ( $xls->sheets )[0];

    while ( $sheet->has_data ) {

        my @data = $sheet->next_row;

        for my $datum ( @data ) {
            print decode( 'cp1250', $datum );
        }

    }

User · Answer

Waking up this old thread... We are now in 2017, and Excel is still unable to save a simple spreadsheet into CSV format while preserving the original encoding. Just amazing.

Luckily Google Docs lives in the right century. The solution for me is just to open the spreadsheet using Google Docs, then download it back down as CSV. The result is a correctly encoded CSV file (with all strings encoded in UTF8).

User · Answer

You can create a CSV file using the encoding UTF8 + BOM (https://en.wikipedia.org/wiki/Byte_order_mark).

The first three bytes are the BOM (0xEF, 0xBB, 0xBF), followed by the UTF8 content.
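From Java, the BOM does not have to be written as raw bytes: putting U+FEFF at the front of the text and encoding as UTF-8 produces those three bytes. A small sketch with made-up names (`Utf8BomCsv`, `withBom`):

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that produces UTF-8 CSV bytes with a leading BOM. */
final class Utf8BomCsv {

    /** Encodes the CSV text as UTF-8, preceded by the byte order mark. */
    static byte[] withBom(String csv) {
        // U+FEFF encodes to EF BB BF in UTF-8.
        return ("\uFEFF" + csv).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] bytes = withBom("name;city\r\nR\u00e9sum\u00e9;Paris\r\n");
        System.out.printf("%02X %02X %02X%n",
                bytes[0] & 0xFF, bytes[1] & 0xFF, bytes[2] & 0xFF); // EF BB BF
    }
}
```

Going the other way, Java's UTF-8 decoder does not drop the mark, so a reader of such a file will see U+FEFF as the first character unless it is skipped explicitly.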

User · Answer

CSV files could be in any format, depending on what encoding option was specified during the export from Excel (Save dialog, Tools button, Web Options item, Encoding tab).

UPDATE: Excel (including Office 2013) doesn't actually respect the web options selected in the "Save As..." dialog, so this is a bug of some sort. I just use OpenOffice Calc now to open my XLSX files and export them as CSV files (edit filter settings, choose UTF-8 encoding).
