UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 23: ordinal not in range(128)

Question

When I try to concatenate this, I get the UnicodeDecodeError when the field contains 'ñ' or '´'. If the field that contains the 'ñ' or '´' is the last one, I get no error.

    # ...
    nombre = fabrica
    nombre = nombre.encode("utf-8") + '-' + sector.encode("utf-8")
    nombre = nombre.encode("utf-8") + '-' + unidad.encode("utf-8")
    # ...
    return nombre

Any idea? Many thanks!

User · Answer

I was getting this error when running the program under Python 3. I got the same program working simply by running it under Python 2.

User · Answer

When you get a UnicodeDecodeError, it means that somewhere in your code a byte string is being converted directly to a unicode one. In Python 2 that implicit conversion uses the ASCII codec; in Python 3, bytes.decode() defaults to UTF-8. Both may fail, because not every byte sequence is valid in either encoding.

To avoid that, you must use explicit decoding.

If your input may be in one of two encodings, and one of them accepts any byte (say UTF-8 and Latin-1), you can try to decode the string with the stricter one first and fall back to the second if a UnicodeDecodeError occurs.

    def robust_decode(bs):
        '''Takes a byte string as param and converts it into a unicode one.
        First tries UTF-8, and falls back to Latin-1 if that fails.'''
        cr = None
        try:
            cr = bs.decode('utf8')
        except UnicodeDecodeError:
            cr = bs.decode('latin1')
        return cr

If you do not know the original encoding and do not care about non-ASCII characters, you can set the optional errors parameter of the decode method to 'replace'. Any offending byte will be replaced (from the standard library documentation):

    Replace with a suitable replacement character; Python will use the official U+FFFD REPLACEMENT CHARACTER for the built-in Unicode codecs on decoding and '?' on encoding.

    bs.decode(errors='replace')
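As a rough illustration of the two options above, here is a minimal Python 3 sketch. The helper name decode_with_fallback and the sample value are made up for this example; they are not from the question. Keep in mind that Latin-1 accepts every byte, so the fallback never raises, but it can silently produce the wrong characters if the data was really in some other encoding.

```python
def decode_with_fallback(raw, encodings=("utf-8", "latin-1")):
    """Try each encoding in order and return the first successful decode.

    Hypothetical helper for illustration; the default order puts the
    strict codec (UTF-8) before the one that accepts any byte (Latin-1).
    """
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Only reached if no listed codec accepts the bytes.
    return raw.decode(encodings[0], errors="replace")


# The same text stored in two different encodings ("\u00f1" is the letter n with tilde).
utf8_bytes = "Pe\u00f1a".encode("utf-8")      # two bytes for the accented letter
latin1_bytes = "Pe\u00f1a".encode("latin-1")  # one byte for the accented letter

from_utf8 = decode_with_fallback(utf8_bytes)
from_latin1 = decode_with_fallback(latin1_bytes)

# errors="replace" swaps each undecodable byte for U+FFFD instead of raising.
lossy = latin1_bytes.decode("ascii", errors="replace")
```

Both from_utf8 and from_latin1 recover the original text, while lossy keeps the ASCII letters and loses the accented one.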

User · Answer

You are encoding to UTF-8, then re-encoding to UTF-8. Python can only do this if it first decodes again to Unicode, but it has to use the default ASCII codec:

    >>> u'ñ'
    u'\xf1'
    >>> u'ñ'.encode('utf8')
    '\xc3\xb1'
    >>> u'ñ'.encode('utf8').encode('utf8')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)

Don't keep encoding; leave encoding to UTF-8 to the last possible moment. Concatenate Unicode values instead.

You can use str.join() (or, rather, unicode.join()) here to concatenate the three values with dashes in between. Note that join() takes a single iterable:

    nombre = u'-'.join([fabrica, sector, unidad])
    return nombre.encode('utf-8')

but even encoding here might be too early.

Rule of thumb: decode the moment you receive the value (if the API does not already supply Unicode values), and encode only when you have to (if the destination API does not handle Unicode values directly).
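To see why the error depends on which field holds the accented character, the following Python 3 sketch imitates Python 2's behaviour. Everything here is an assumption made for illustration: py2_encode is a hypothetical helper that spells out the implicit ASCII decode Python 2 performs when .encode() is called on a byte string, and the field values are invented.

```python
def py2_encode(value, encoding="utf-8"):
    """Hypothetical stand-in for Python 2's .encode() on str/unicode.

    Text is encoded directly. A byte string is first decoded with the
    ASCII codec (the implicit step Python 2 performs), then encoded.
    """
    if isinstance(value, bytes):
        value = value.decode("ascii")  # raises UnicodeDecodeError on bytes >= 0x80
    return value.encode(encoding)


def build_name_buggy(fabrica, sector, unidad):
    """The question's pattern: encode, concatenate, then encode again."""
    nombre = fabrica
    nombre = py2_encode(nombre) + b"-" + py2_encode(sector)
    nombre = py2_encode(nombre) + b"-" + py2_encode(unidad)  # nombre is already bytes here
    return nombre


def build_name(fabrica, sector, unidad):
    """The suggested pattern: join the text values, encode once at the end."""
    return "-".join([fabrica, sector, unidad]).encode("utf-8")


def raises_decode_error(*fields):
    """Report whether the buggy version fails for the given fields."""
    try:
        build_name_buggy(*fields)
    except UnicodeDecodeError:
        return True
    return False


accent_first = raises_decode_error("Pe\u00f1a", "Norte", "Uno")
accent_last = raises_decode_error("Uno", "Norte", "Pe\u00f1a")
fixed = build_name("Pe\u00f1a", "Norte", "Uno")
```

With the accented value in the first field, the second call to py2_encode receives bytes containing 0xc3 and fails; with it in the last field, that value is encoded only once and never passes through the ASCII decode, which matches what the question describes.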
