UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in position 3 2: ordinal not in range(128)

Question

I am parsing an XLS file using xlrd. Most things are working fine. I have a dictionary where the keys are strings and the values are lists of strings. All the keys and values are Unicode. I can print most of the keys and values using the str() method, but some values contain the Unicode character \u2013, for which I get the above error.

I suspect that this is happening because this is Unicode embedded in Unicode and the Python interpreter cannot decode it. So how can I get rid of this error?

Thanks in advance.

User · Answer

For me this works: unicode(data).encode('utf-8')
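Applied to the kind of dictionary described in the question, the same idea might look like the sketch below. The sample data and the helper name `encode_all` are made up for illustration, and the sketch assumes every key and list item is already a Unicode string, so it calls `.encode()` on them directly.

```python
# Hypothetical sample data standing in for what xlrd returned.
rows = {u'title': [u'2010\u20132012', u'plain']}


def encode_all(mapping, encoding='utf-8'):
    """Return a copy of mapping with every key and list item encoded to bytes."""
    return {
        key.encode(encoding): [item.encode(encoding) for item in items]
        for key, items in mapping.items()
    }


encoded = encode_all(rows)
```

The original dictionary is left untouched, so the Unicode values remain available for any further processing.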

User · Answer

You can print Unicode objects as well; you don't need to do str() around it.

Assuming you really want a str:

When you do str(u'\u2013') you are trying to convert the Unicode string to an 8-bit string. To do this you need to use an encoding, a mapping from Unicode data to 8-bit data. What str() does is use the system default encoding, which under Python 2 is ASCII. ASCII contains only the first 128 code points of Unicode, that is, \u0000 to \u007F. The result is that you get the above error: the ASCII codec just doesn't know what \u2013 is (it's a long dash, btw).

You therefore need to specify which encoding you want to use. Common ones are ISO-8859-1, most commonly known as Latin-1, which contains the first 256 code points; UTF-8, which can encode all code points by using a variable-length encoding; CP1252, which is common on Windows; and various Chinese and Japanese encodings.

You use them like this:

    u'\u2013'.encode('utf8')

The result is a str containing a sequence of bytes that is the UTF-8 representation of the character in question:

    '\xe2\x80\x93'

And you can print it:

    >>> print '\xe2\x80\x93'
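To compare the encodings mentioned above, here is a small sketch that tries each one on the same character. The helper name `try_encode` is hypothetical. The code is written to run on both Python 2.7 and Python 3, where the result of `.encode()` is a `str` and a `bytes` object respectively.

```python
dash = u'\u2013'


def try_encode(text, encoding):
    """Return the encoded bytes, or None if the codec cannot represent text."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


# ASCII and Latin-1 have no slot for U+2013; CP1252 and UTF-8 do.
results = {
    name: try_encode(dash, name)
    for name in ('ascii', 'latin-1', 'cp1252', 'utf-8')
}
```

Note that Latin-1 fails here as well, because U+2013 lies outside its 256 code points, while CP1252 maps the character to a single byte.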

User · Answer

You can also try this to get the text: foo.encode('ascii', 'ignore')
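Keep in mind that 'ignore' silently drops every character ASCII cannot represent. The sketch below contrasts it with the other standard error handlers accepted by `.encode()`; the sample string is made up for illustration.

```python
text = u'a\u2013b'

ignored = text.encode('ascii', 'ignore')              # drops the character
replaced = text.encode('ascii', 'replace')            # substitutes '?'
xml_refs = text.encode('ascii', 'xmlcharrefreplace')  # numeric XML reference
escaped = text.encode('ascii', 'backslashreplace')    # Python-style escape
```

Which handler fits depends on whether losing the dash is acceptable for the output in question.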

User · Answer

I had the same problem. This works fine for me: str(objdata).encode('utf-8')

User · Answer

I had exactly this issue in a recent project, which really is a pain in the rear. I finally found it's because the Python we used in Docker has the encoding "ansi_x3.4-1968" instead of "utf-8". So if anyone out there is using Docker and got this error, following these steps may thoroughly solve your problem.

Create a file named default_locale in the same directory as your Dockerfile and put this line in it:

    environment=LANG="es_ES.utf8", LC_ALL="es_ES.UTF-8", LC_LANG="es_ES.UTF-8"

Add these to your Dockerfile:

    RUN apt-get clean && apt-get update && apt-get install -y locales
    RUN locale-gen en_CA.UTF-8
    COPY ./default_locale /etc/default/locale
    RUN chmod 0755 /etc/default/locale
    ENV LC_ALL en_CA.UTF-8
    ENV LANG en_CA.UTF-8
    ENV LANGUAGE en_CA.UTF-8

This thoroughly solved my issue when I built and ran my Docker image again; hopefully this solves your issue too.
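To check whether a container is affected, one option is to ask Python which encodings it picked up from the environment. The following is a rough diagnostic sketch; the function names `describe_encodings` and `is_ascii_alias` are hypothetical. It relies on the fact that Python's codec registry treats ANSI_X3.4-1968 as another name for ASCII.

```python
import codecs
import locale
import sys


def describe_encodings():
    """Collect the encodings Python derived from the current environment."""
    return {
        'preferred': locale.getpreferredencoding(),
        # Some replaced stdout objects have no encoding attribute.
        'stdout': getattr(sys.stdout, 'encoding', None),
        'filesystem': sys.getfilesystemencoding(),
    }


def is_ascii_alias(name):
    """Return True if the codec name resolves to Python's ASCII codec."""
    try:
        return codecs.lookup(name).name == 'ascii'
    except LookupError:
        return False
```

Running `print(describe_encodings())` inside the container before and after the locale change should show whether the settings took effect.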

User · Answer

Since str(u'\u2013') is what causes the error here, use isinstance(foo, basestring) to check whether foo is already a string. If it is not of type basestring, convert it to Unicode first and then apply encode:

    if isinstance(foo, basestring):
        foo.encode('utf8')
    else:
        unicode(foo).encode('utf8')
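The check above can be wrapped in a reusable function. The version below is only a sketch and deviates from the snippet in two ways that are assumptions of mine: it returns byte strings unchanged instead of calling encode on them, and it falls back to `str` where `unicode` does not exist so that it also runs on Python 3. The names `text_type` and `to_utf8` are hypothetical.

```python
try:
    text_type = unicode  # Python 2
except NameError:
    text_type = str      # Python 3


def to_utf8(value):
    """Return value as UTF-8 encoded bytes."""
    if isinstance(value, bytes):
        # Already a byte string; assumed to be encoded correctly.
        return value
    if not isinstance(value, text_type):
        # Numbers and other objects: convert to text first.
        value = text_type(value)
    return value.encode('utf-8')
```

Leaving byte strings alone avoids the implicit ASCII decode that Python 2 performs when encode is called on a str containing non-ASCII bytes.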
