"for line in..." results in UnicodeDecodeError: 'utf-8' codec can't decode byte

Question

Here is my code:

    for line in open('u.item'):
        # Read each line

Whenever I run this code it gives the following error:

    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 2892: invalid continuation byte

I tried to solve this by adding an extra parameter to open(). The code looks like:

    for line in open('u.item', encoding='utf-8'):
        # Read each line

But again it gives the same error. What should I do then?

User · Answer

Open your file with Notepad++ and use the "Encoding" (or "Encodage") menu to identify the file's encoding, or to convert it from ANSI to UTF-8 or to the ISO 8859-1 code page.

User · Answer

This is an example for converting a CSV file in Python 3:

    try:
        inputReader = csv.reader(open(argv[1], encoding='ISO-8859-1'), delimiter=',', quotechar='"')
    except IOError:
        pass
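A slightly fuller sketch of the same idea, assuming a comma-delimited file. The sample file contents are made up for illustration. It uses a `with` block so the file is closed, and `newline=""`, which the `csv` module documentation recommends when opening files for `csv.reader`.

```python
import csv
import os
import tempfile

# Made-up sample data: an ISO-8859-1 encoded CSV where "é" is the single byte 0xe9.
with tempfile.NamedTemporaryFile(delete=False, suffix=".csv") as tmp:
    tmp.write(b'id,title\n1,"Caf\xe9, the movie"\n')
    csv_path = tmp.name

# Pass the encoding to open(); csv.reader itself only sees decoded text.
with open(csv_path, encoding="ISO-8859-1", newline="") as f:
    rows = list(csv.reader(f, delimiter=",", quotechar='"'))

os.remove(csv_path)
print(rows)
```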

User · Answer

This works:

    open(filename, encoding='latin-1')

Or:

    open(filename, encoding="ISO-8859-1")
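Worth knowing: latin-1 (ISO-8859-1) maps every one of the 256 byte values to a character, so decoding with it never raises, even when it is not the file's real encoding. If you are unsure, one possible approach is to try stricter encodings first and keep latin-1 as the last resort. The helper name `read_text_with_fallback`, the candidate list, and the sample data below are assumptions for illustration, not part of any library.

```python
import os
import tempfile


def read_text_with_fallback(path, encodings=("utf-8", "cp1252", "latin-1")):
    """Hypothetical helper: decode with the first candidate encoding that does not raise."""
    with open(path, "rb") as f:
        data = f.read()
    for enc in encodings:
        try:
            return data.decode(enc), enc
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the file")


# Made-up sample standing in for u.item: the byte 0xe9 is not valid UTF-8 here.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"1|Caf\xe9 (1995)\n")
    sample_path = tmp.name

text, encoding_used = read_text_with_fallback(sample_path)
os.remove(sample_path)
print(encoding_used, repr(text))
```

A clean decode only shows that the bytes are valid in that encoding, not that the encoding is the right one, so it is still worth eyeballing the accented characters in the result.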

User · Answer

Try this to read using Pandas:

    pd.read_csv('u.item', sep='|', names=m_cols, encoding='latin-1')

User · Answer

As suggested by Mark Ransom, I found the right encoding for that problem. The encoding was "ISO-8859-1", so replacing open("u.item", encoding="utf-8") with open('u.item', encoding="ISO-8859-1") will solve the problem.

User · Answer

Your file doesn't actually contain UTF-8 encoded data; it contains some other encoding. Figure out what that encoding is and use it in the open call. In Windows-1252 encoding, for example, the 0xe9 would be the character é.
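A small sketch of what is going on, using made-up sample bytes. In UTF-8, 0xe9 announces a multi-byte sequence, so when the next byte is not a continuation byte the decoder gives up with "invalid continuation byte". In Windows-1252 and ISO-8859-1 the same byte is simply é.

```python
# Made-up sample: "café au lait" stored as Windows-1252, so "é" is the single byte 0xe9.
raw = b"caf\xe9 au lait"

try:
    raw.decode("utf-8")
    utf8_error = None
except UnicodeDecodeError as exc:
    # Same kind of error as in the question, just at a different position.
    utf8_error = exc

as_cp1252 = raw.decode("cp1252")
as_latin1 = raw.decode("iso-8859-1")

print(utf8_error)
print(as_cp1252)
```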

User · Answer

You can try this way:

    open('u.item', encoding='utf8', errors='ignore')
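Keep in mind that errors='ignore' silently drops the bytes it cannot decode, so accented characters vanish from the text. A quick sketch with made-up sample data, comparing it with errors='replace', which leaves a visible U+FFFD marker instead:

```python
import os
import tempfile

# Made-up sample: one Latin-1 encoded line containing the byte 0xe9.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"caf\xe9 au lait\n")
    path = tmp.name

# 'ignore' drops the undecodable byte without any warning.
with open(path, encoding="utf-8", errors="ignore") as f:
    ignored = f.read()

# 'replace' substitutes U+FFFD so the damage stays visible.
with open(path, encoding="utf-8", errors="replace") as f:
    replaced = f.read()

os.remove(path)
print(repr(ignored))
print(repr(replaced))
```

If the accented characters matter, opening the file with its real encoding is the safer route.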

User · Answer

Sometimes you can get the same error from open(filepath) when filepath is not actually a file, so first make sure the file you're trying to open exists:

    import os
    assert os.path.isfile(filepath)

User · Answer

You could resolve the problem with:

    for line in open(your_file_path, 'rb'):

'rb' reads the file in binary mode, so no decoding is attempted and each line is a bytes object rather than a str.
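A short sketch of what binary mode gives you, with made-up sample data. Nothing is decoded while reading, so no UnicodeDecodeError can occur at that point, but you get bytes and still have to decode them yourself once you know the encoding (ISO-8859-1 is assumed here).

```python
import os
import tempfile

# Made-up sample standing in for the real file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"1|Toy Story\n2|Caf\xe9 Society\n")
    path = tmp.name

raw_lines = []
with open(path, "rb") as f:
    for raw_line in f:  # each raw_line is a bytes object
        raw_lines.append(raw_line)

os.remove(path)

# Decode explicitly, assuming the data is ISO-8859-1.
decoded = [raw_line.decode("iso-8859-1").rstrip("\n") for raw_line in raw_lines]
print(raw_lines[1])
print(decoded[1])
```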

User · Answer

The following also worked for me. ISO 8859-1 is going to save a lot of trouble, hahaha, mainly if using Speech Recognition APIs. Example:

    file = open('../Resources/' + filename, 'r', encoding="ISO-8859-1")

User · Answer

If you are using Python 2, the following will be the solution:

    import io
    for line in io.open("u.item", encoding="ISO-8859-1"):
        # Do something

Because the encoding parameter doesn't work with open(), you will be getting the following error:

    TypeError: 'encoding' is an invalid keyword argument for this function
