UTF-8 problems while reading CSV file with fgetcsv

Question

I am trying to read a CSV file and echo its content, but the special characters are displayed wrong: the name "Mäx Müstermänn" comes out with every umlaut replaced by several garbage characters.

The CSV file is encoded as UTF-8 without BOM (checked with Notepad++).

This is the content of the CSV file:

    "Mäx";"Müstermänn"

My PHP script:

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    </head>
    <body>
    <?php
    $handle = fopen("specialchars.csv", "r");
    echo '<table border="1"><tr><td>First name</td><td>Last name</td></tr><tr>';
    while ($data = fgetcsv($handle, 1000, ";")) {
        $num = count($data);
        for ($c = 0; $c < $num; $c++) {
            // output data
            echo "<td>$data[$c]</td>";
        }
        echo "</tr><tr>";
    }
    ?>
    </body>
    </html>

I tried setlocale(LC_ALL, 'de_DE.utf8'); as suggested elsewhere, without success. The content is still displayed wrong.

What am I missing?

Edit:

    echo mb_detect_encoding($data[$c], 'UTF-8');

gives me UTF-8 UTF-8.

    echo file_get_contents("specialchars.csv");

gives me the same line with the umlauts garbled.

And

    print_r(str_getcsv(reset(explode("\n", file_get_contents("specialchars.csv"))), ';'));

gives me an Array with [0] and [1] holding the first and last name, again with the umlauts garbled.

What does that mean?

User · Answer

Try this:

    <?php
    $handle = fopen("specialchars.csv", "r");
    echo '<table border="1"><tr><td>First name</td><td>Last name</td></tr><tr>';
    while ($data = fgetcsv($handle, 1000, ";")) {
        $data = array_map("utf8_encode", $data); // added
        $num = count($data);
        for ($c = 0; $c < $num; $c++) {
            // output data
            echo "<td>$data[$c]</td>";
        }
        echo "</tr><tr>";
    }
    ?>
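Note that utf8_encode assumes its input is ISO-8859-1 and is deprecated as of PHP 8.2. Applied to data that is already UTF-8, it encodes the bytes a second time. A minimal sketch of a more defensive variant follows, assuming the mbstring extension is available. The function name `rowToUtf8` and the default source encoding are hypothetical choices for illustration, not part of the answer above.

```php
<?php
/**
 * Convert every field of a CSV row to UTF-8, leaving fields that are
 * already valid UTF-8 untouched (this avoids double encoding).
 * rowToUtf8() and the $from default are illustrative assumptions.
 */
function rowToUtf8(array $row, string $from = 'ISO-8859-1'): array
{
    return array_map(
        static function ($field) use ($from) {
            $field = (string) $field; // fgetcsv() may yield null for blank lines
            return mb_check_encoding($field, 'UTF-8')
                ? $field
                : mb_convert_encoding($field, 'UTF-8', $from);
        },
        $row
    );
}

// Usage inside the loop above: $data = rowToUtf8($data);
```

The validity check is a heuristic: a single-byte string can occasionally look like valid UTF-8 by chance, so knowing the real source encoding is still preferable to guessing.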

User · Answer

I encountered a similar problem when parsing a CSV file with special (accented) characters.

The following worked fine for me.

To display the characters correctly on the HTML page, this header was needed:

    header('Content-Type: text/html; charset=UTF-8');

To parse every character correctly, I used:

    utf8_encode(fgets($file));

Don't forget to use the Multibyte String Functions in all subsequent string operations, for example:

    mb_strtolower($value, 'UTF-8');
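To illustrate why the multibyte functions matter, here is a small sketch, assuming the mbstring extension is loaded. The sample string is made up for the demonstration and is written with escape sequences so the result does not depend on how the script file itself is saved.

```php
<?php
$name = "M\u{00DC}STERM\u{00C4}NN"; // "MÜSTERMÄNN" encoded as UTF-8

// strlen() counts bytes, mb_strlen() counts characters.
$bytes = strlen($name);              // 12, each umlaut takes two bytes
$chars = mb_strlen($name, 'UTF-8');  // 10

// mb_strtolower() lowercases the umlauts as well as the ASCII letters.
$lower = mb_strtolower($name, 'UTF-8'); // "müstermänn"

echo $bytes, ' ', $chars, ' ', $lower, "\n";
```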

User · Answer

In my case the source file is encoded as windows-1250, and iconv prints tons of notices about illegal characters in the input string.

So this solution helped me a lot:

    /**
     * Get a CSV row as an array with UTF-8 encoding
     *
     * @param   resource    &$handle
     * @param   integer     $length
     * @param   string      $separator
     *
     * @return  array|false
     */
    private function fgetcsvUTF8(&$handle, $length, $separator = ';')
    {
        if (($buffer = fgets($handle, $length)) !== false)
        {
            $buffer = $this->autoUTF($buffer);
            return str_getcsv($buffer, $separator);
        }
        return false;
    }

    /**
     * Automatic conversion of windows-1250 and iso-8859-2 into a UTF-8 string
     *
     * @param   string  $s
     *
     * @return  string
     */
    private function autoUTF($s)
    {
        // detect UTF-8
        if (preg_match('#[\x80-\x{1FF}\x{2000}-\x{3FFF}]#u', $s))
            return $s;

        // detect WINDOWS-1250
        if (preg_match('#[\x7F-\x9F\xBC]#', $s))
            return iconv('WINDOWS-1250', 'UTF-8', $s);

        // assume ISO-8859-2
        return iconv('ISO-8859-2', 'UTF-8', $s);
    }

In response to @manvel's answer: use str_getcsv instead of explode, because of cases like this:

    some;nice;value;"and;here;comes;combinated;value";and;some;others

explode will split the string into these parts:

    some
    nice
    value
    "and
    here
    comes
    combinated
    value"
    and
    some
    others

but str_getcsv will split the string into these parts:

    some
    nice
    value
    and;here;comes;combinated;value
    and
    some
    others
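The difference between the two functions can be checked directly. This is a small sketch using the example line from this answer. The enclosure and escape arguments are passed explicitly, since newer PHP versions discourage relying on the default escape character.

```php
<?php
$line = 'some;nice;value;"and;here;comes;combinated;value";and;some;others';

// explode() knows nothing about quoting and splits on every semicolon.
$naive = explode(';', $line);

// str_getcsv() honours the double-quote enclosure.
$parsed = str_getcsv($line, ';', '"', '\\');

echo count($naive), "\n";   // 11
echo count($parsed), "\n";  // 7
echo $parsed[3], "\n";      // and;here;comes;combinated;value
```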

User · Answer

The problem is that the function reports UTF-8 (you can check this with mb_detect_encoding) but does not convert anything, so the characters are simply treated as UTF-8. Therefore it is necessary to convert from the original encoding (Windows-1251, also known as CP1251) using iconv. Since fgetcsv returns an array, I suggest writing a custom function (sorry for my English):

    function customfgetcsv(&$handle, $length, $separator = ';')
    {
        if (($buffer = fgets($handle, $length)) !== false) {
            return explode($separator, iconv("CP1251", "UTF-8", $buffer));
        }
        return false;
    }
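A variant of this idea that also handles quoted fields, as another answer in this thread recommends, could look like the sketch below. It assumes the iconv extension is available. The function name `fgetcsvConverted` and its default arguments are hypothetical, and the in-memory stream stands in for a real CP1251 file.

```php
<?php
/**
 * Read one line, convert it to UTF-8, then parse it as CSV.
 * fgetcsvConverted() and its defaults are illustrative assumptions.
 *
 * @param resource $handle
 * @return array|false
 */
function fgetcsvConverted($handle, string $from = 'CP1251', string $separator = ';')
{
    $buffer = fgets($handle);
    if ($buffer === false) {
        return false; // end of file or read error
    }
    $utf8 = iconv($from, 'UTF-8', rtrim($buffer, "\r\n"));
    if ($utf8 === false) {
        return false; // conversion failed
    }
    return str_getcsv($utf8, $separator, '"', '\\');
}

// Demo: an in-memory stream holding two CP1251-encoded Russian words.
$h = fopen('php://memory', 'r+');
fwrite($h, "\xCF\xF0\xE8\xE2\xE5\xF2;\xEC\xE8\xF0\n");
rewind($h);
$row = fgetcsvConverted($h);
fclose($h);
```

Because this reads line by line with fgets, a quoted field that contains a line break would still be split incorrectly. That limitation is shared with the function above.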

User · Answer

I got it working now (after removing the header command). I think the problem was that the PHP file itself was encoded as ISO-8859-1. I set it to UTF-8 without BOM. I thought I had already done that, but perhaps I undid it by accident.

Furthermore, I used SET NAMES 'utf8' for the database. Now it is also correct in the database.
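One way to check a script or CSV file for the two properties mentioned here (valid UTF-8, no BOM) is sketched below, assuming the mbstring extension. `describeEncoding` is a hypothetical helper name, and the file names in the usage comment are placeholders.

```php
<?php
/**
 * Report basic encoding facts about a byte string.
 * describeEncoding() is a hypothetical helper for illustration.
 */
function describeEncoding(string $bytes): array
{
    return [
        // A UTF-8 byte order mark is the three bytes EF BB BF.
        'has_bom'    => strncmp($bytes, "\xEF\xBB\xBF", 3) === 0,
        'valid_utf8' => mb_check_encoding($bytes, 'UTF-8'),
    ];
}

// Usage (file names are placeholders):
// print_r(describeEncoding(file_get_contents('specialchars.csv')));
// print_r(describeEncoding(file_get_contents(__FILE__)));
```

A file that contains only ASCII passes as valid UTF-8, so the check is informative only when the file actually contains non-ASCII bytes.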

User · Answer

Try putting this at the top of your file (before any other output):

    <?php header('Content-Type: text/html; charset=UTF-8'); ?>
