UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 1

Question

I'm having a few issues trying to encode a string to UTF-8. I've tried numerous things, including using string.encode('utf-8') and unicode(string), but I get the error:

    UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 1: ordinal not in range(128)

This is my string:

    (｡･ω･｡)ﾉ

I don't see what's going wrong, any idea?

Edit: The problem is that printing the string as it is does not show properly. Also, I get this error when I try to convert it:

    Python 2.7.1+ (r271:86832, Apr 11 2011, 18:13:53)
    [GCC 4.5.2] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> s = "(\xef\xbd\xa1\xef\xbd\xa5\xcf\x89\xef\xbd\xa5\xef\xbd\xa1)\xef\xbe\x89"
    >>> s1 = s.decode('utf-8')
    >>> print s1
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeEncodeError: 'ascii' codec can't encode characters in position 1-5: ordinal not in range(128)
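A minimal sketch of what goes wrong, written in Python 3 syntax as an assumption (the session above is Python 2.7). In Python 2, calling .encode() on a byte string, or unicode(string) without an encoding, first decodes the bytes with the default ASCII codec. Doing that step explicitly reproduces the reported error, while decoding as UTF-8 succeeds. The names raw, ascii_error and text are illustrative only.

```python
# The question's string as raw UTF-8 bytes.
raw = b"(\xef\xbd\xa1\xef\xbd\xa5\xcf\x89\xef\xbd\xa5\xef\xbd\xa1)\xef\xbe\x89"

# Decoding with the ASCII codec fails on the first byte >= 0x80,
# which is 0xef at index 1 (index 0 is the ASCII parenthesis).
try:
    raw.decode("ascii")
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = exc

# Decoding with the codec the bytes were actually written in works.
text = raw.decode("utf-8")

print(ascii_error)
print(len(raw), "bytes ->", len(text), "characters")
```

The bytes themselves are fine; only the codec used to interpret them was wrong.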

User · Answer

This works for Ubuntu 15.10:

    sudo locale-gen "en_US.UTF-8"
    sudo dpkg-reconfigure locales

User · Answer

This is to do with the encoding of your terminal not being set to UTF-8. Here is my terminal:

    $ echo $LANG
    en_GB.UTF-8
    $ python
    Python 2.7.3 (default, Apr 20 2012, 22:39:59)
    [GCC 4.6.3] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> s = '(\xef\xbd\xa1\xef\xbd\xa5\xcf\x89\xef\xbd\xa5\xef\xbd\xa1)\xef\xbe\x89'
    >>> s1 = s.decode('utf-8')
    >>> print s1
    (｡･ω･｡)ﾉ
    >>>

On my terminal the example works with the above, but if I get rid of the LANG setting then it won't work:

    $ unset LANG
    $ python
    Python 2.7.3 (default, Apr 20 2012, 22:39:59)
    [GCC 4.6.3] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> s = '(\xef\xbd\xa1\xef\xbd\xa5\xcf\x89\xef\xbd\xa5\xef\xbd\xa1)\xef\xbe\x89'
    >>> s1 = s.decode('utf-8')
    >>> print s1
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeEncodeError: 'ascii' codec can't encode characters in position 1-5: ordinal not in range(128)
    >>>

Consult the docs for your Linux variant to discover how to make this change permanent.
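To check this from inside Python rather than from the shell, something like the following sketch may help. It is written in Python 3 syntax as an assumption, and can_encode is a hypothetical helper, not part of any library. It asks whether the decoded text can be encoded with whatever encoding the interpreter picked for stdout.

```python
import sys

text = u"(\uff61\uff65\u03c9\uff65\uff61)\uff89"


def can_encode(value, encoding):
    """Return True if value can be encoded with the given codec."""
    try:
        value.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# sys.stdout.encoding may be None when output is redirected; fall back to ASCII.
stdout_encoding = getattr(sys.stdout, "encoding", None) or "ascii"

print("stdout encoding:", stdout_encoding)
print("printable here:", can_encode(text, stdout_encoding))
```

With an ASCII stdout, print fails on exactly the characters at positions 1 to 5, which matches the traceback in the second session.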

User · Answer

This is the best answer: https://stackoverflow.com/a/4027726/2159089

In Linux:

    export PYTHONIOENCODING=utf-8

so sys.stdout.encoding is OK.
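One way to confirm the variable takes effect is to start a child interpreter with it set and ask what it reports. This is only a sketch: it assumes sys.executable points at a working Python 3 interpreter, and the variable names are made up for illustration.

```python
import os
import subprocess
import sys

# Copy the current environment and force the I/O encoding for the child process.
env = dict(os.environ, PYTHONIOENCODING="utf-8")

# Ask the child interpreter which encoding it chose for stdout.
reported = subprocess.check_output(
    [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
    env=env,
).decode("ascii").strip()

print("child stdout encoding:", reported)
```

The same check without the env argument shows what the interpreter would pick from the locale alone.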

User · Answer

I had the same error, with URLs containing non-ASCII chars (bytes with values > 128):

    url = url.decode('utf8').encode('utf-8')

Worked for me, in Python 2.7. I suppose this assignment changed "something" in the str internal representation, i.e., it forces the right decoding of the backed byte sequence in url and finally puts the string into a utf-8 str with all the magic in the right place. Unicode in Python is black magic for me. Hope useful.
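A small sketch of what that round trip does, in Python 3 syntax as an assumption and with a made-up example.com URL. For bytes that are already valid UTF-8, decoding and re-encoding returns the same bytes, so the main effect is validation: input that is not valid UTF-8 raises at the decode step instead of somewhere later.

```python
# Hypothetical URL whose path contains "é" encoded as UTF-8 (0xc3 0xa9).
url_bytes = b"http://example.com/caf\xc3\xa9"

# Decode to text, then encode back: a no-op for valid UTF-8 input.
round_tripped = url_bytes.decode("utf-8").encode("utf-8")

# The same character encoded as Latin-1 (0xe9) is not valid UTF-8.
try:
    b"http://example.com/caf\xe9".decode("utf-8")
    invalid_rejected = False
except UnicodeDecodeError:
    invalid_rejected = True

print(round_tripped == url_bytes, invalid_rejected)
```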

User · Answer

It looks like your string is encoded to utf-8, so what exactly is the problem? Or what are you trying to do here?

    Python 2.7.3 (default, Apr 20 2012, 22:39:59)
    [GCC 4.6.3] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> s = '(\xef\xbd\xa1\xef\xbd\xa5\xcf\x89\xef\xbd\xa5\xef\xbd\xa1)\xef\xbe\x89'
    >>> s1 = s.decode('utf-8')
    >>> print s1
    (｡･ω･｡)ﾉ
    >>> s2 = u'(｡･ω･｡)ﾉ'
    >>> s2 == s1
    True
    >>> s2
    u'(\uff61\uff65\u03c9\uff65\uff61)\uff89'

User · Answer

BOM, it's so often BOM for me. vi the file, use :set nobomb and save it. That nearly always fixes it in my case.

User · Answer

In my case, it was caused by my Unicode file being saved with a "BOM". To solve this, I cracked open the file using BBEdit and did a "Save as...", choosing for encoding "Unicode (UTF-8)" and not what it came with, which was "Unicode (UTF-8, with BOM)".
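The UTF-8 BOM is the three bytes EF BB BF, so a file that starts with one has byte 0xef at position 0. If re-saving the file is not an option, a possible alternative is to decode with Python's "utf-8-sig" codec, which drops a leading BOM when present. A minimal sketch in Python 3 syntax, with made-up sample data:

```python
import codecs

# Sample file content: a UTF-8 BOM followed by the text "café".
data = codecs.BOM_UTF8 + u"caf\u00e9".encode("utf-8")

# Plain "utf-8" keeps the BOM as an invisible U+FEFF character.
plain = data.decode("utf-8")

# "utf-8-sig" strips a leading BOM and is harmless when there is none.
stripped = data.decode("utf-8-sig")
no_bom = u"caf\u00e9".encode("utf-8").decode("utf-8-sig")

print(repr(plain), repr(stripped), repr(no_bom))
```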

User · Answer

I was getting the same type of error, and I found that the console is not capable of displaying the string in another language. Hence I made the below code changes to set the default charset to UTF-8.

    data_head = [('\x81\xa1\x8fo\x89\xef\x82\xa2\x95\xdb\x8f\xd8\x90\xa7\x93x\x81\xcb3\x8c\x8e\x8cp\x91\xb1\x92\x86(\x81\x86\x81\xde\x81\x85)\x81\xa1\x8f\x89\x89\xf1\x88\xc8\x8aO\x81A\x82\xa8\x8b\xe0\x82\xcc\x90S\x94z\x82\xcd\x88\xea\x90\xd8\x95s\x97v\x81\xa1\x83}\x83b\x83v\x82\xcc\x82\xa8\x8e\x8e\x82\xb5\x95\xdb\x8c\xaf\x82\xc5\x8fo\x89\xef\x82\xa2\x8am\x92\xe8\x81\xa1', 'shift_jis')]
    default_charset = 'UTF-8'  # can also try 'ascii' or other unicode type
    print ''.join([unicode(lin[0], lin[1] or default_charset) for lin in data_head])
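The same idea as a small Python 3 sketch: each chunk carries its own charset, and a default is used only when none is given. The function name decode_parts and the short sample are assumptions for illustration; the sample bytes are the `\x8fo\x89\xef\x82\xa2` run that appears in the data above.

```python
def decode_parts(parts, default_charset="utf-8"):
    """Decode (bytes, charset-or-None) pairs and join them into one string."""
    return u"".join(
        chunk.decode(charset or default_charset) for chunk, charset in parts
    )


sample = [
    (b"\x8fo\x89\xef\x82\xa2", "shift_jis"),  # charset declared explicitly
    (b" ok", None),                            # falls back to the default
]

decoded = decode_parts(sample)
print(repr(decoded))
```

Decoding with the declared charset is what matters here; picking the wrong one either raises or silently produces mojibake.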

User · Answer

If you are working on a remote host, look at /etc/ssh/ssh_config on your local PC. When this file contains the line:

    SendEnv LANG LC_*

comment it out by adding # at the head of the line. It might help. With this line, ssh sends the language-related environment variables of your PC to the remote host. It causes a lot of problems.

User · Answer

No problems with my terminal. The above answers helped me look in the right direction, but it didn't work for me until I added 'ignore':

    fix_encoding = lambda s: s.decode('utf8', 'ignore')

As indicated in the comment below, this may lead to undesired results. OTOH it also may just do the trick well enough to get things working if you don't care about losing some characters.
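To make the trade-off concrete, here is a short sketch (Python 3 syntax as an assumption, sample bytes made up) comparing the 'ignore' and 'replace' error handlers on input that contains one invalid byte. 'ignore' drops the byte silently, while 'replace' leaves a visible U+FFFD marker where data was lost.

```python
# "café ok" with the é stored as a single Latin-1 byte, which is invalid UTF-8.
damaged = b"caf\xe9 ok"

ignored = damaged.decode("utf8", "ignore")    # the bad byte disappears
replaced = damaged.decode("utf8", "replace")  # the bad byte becomes U+FFFD

print(repr(ignored))
print(repr(replaced))
```

If losing characters matters, 'replace' at least makes the loss detectable afterwards.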

User · Answer

It's fine to use the below code at the top of your script, as Andrei Krasutski suggested:

    import sys
    reload(sys)
    sys.setdefaultencoding('utf-8')

But I suggest you also add a # -*- coding: utf-8 -*- line at the very top of the script. Omitting it throws the below error in my case when I try to execute basic.py:

    $ python basic.py
      File "01_basic.py", line 14
    SyntaxError: Non-ASCII character '\xd9' in file basic.py on line 14, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details

The following is the code present in basic.py which throws the above error.

Code with error:

    from pylatex import Document, Section, Subsection, Command, Package
    from pylatex.utils import italic, NoEscape

    import sys
    reload(sys)
    sys.setdefaultencoding('utf-8')

    def fill_document(doc):
        with doc.create(Section('...')):
            doc.append('...')
            doc.append(italic('...'))

            with doc.create(Subsection('...')):
                doc.append('... & ...')

    if __name__ == '__main__':
        # Basic document
        doc = Document('basic')
        fill_document(doc)

Then I added the # -*- coding: utf-8 -*- line at the very top and executed it. It worked.

Code without error:

    # -*- coding: utf-8 -*-
    from pylatex import Document, Section, Subsection, Command, Package
    from pylatex.utils import italic, NoEscape

    import sys
    reload(sys)
    sys.setdefaultencoding('utf-8')

    def fill_document(doc):
        with doc.create(Section('...')):
            doc.append('...')
            doc.append(italic('...'))

            with doc.create(Subsection('...')):
                doc.append('... & ...')

    if __name__ == '__main__':
        # Basic document
        doc = Document('basic')
        fill_document(doc)

Thanks.
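The coding line matters because it tells the parser how to decode the bytes of the source file (PEP 263). A rough way to see the effect without creating files is to compile source bytes directly. This sketch uses Python 3, where the default source encoding is UTF-8 rather than ASCII, so it uses a Latin-1 byte to provoke the analogous failure; try_compile is a hypothetical helper.

```python
# Source bytes with a raw Latin-1 byte (0xE9, "é") inside a string literal.
body = b"s = '\xe9'\n"


def try_compile(source_bytes):
    """Return the value assigned to s, or None if the source cannot be parsed."""
    try:
        code = compile(source_bytes, "<demo>", "exec")
    except (SyntaxError, UnicodeDecodeError):
        return None
    namespace = {}
    exec(code, namespace)
    return namespace["s"]


# Without a declaration the bytes are not valid in the default source encoding.
without_declaration = try_compile(body)

# With a matching declaration on the first line the same bytes parse fine.
with_declaration = try_compile(b"# -*- coding: latin-1 -*-\n" + body)

print(without_declaration, repr(with_declaration))
```

The declaration only affects how the source file is read; it does not change how strings are encoded at runtime.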

User · Answer

My +1 to mata's comment at https://stackoverflow.com/a/10561979/1346705 and to Nick Craig-Wood's demonstration. You have decoded the string correctly. The problem is with the print command, as it converts the Unicode string to the console encoding, and the console is not capable of displaying the string. Try to write the string into a file and look at the result using some decent editor that supports Unicode:

    import codecs

    s = '(\xef\xbd\xa1\xef\xbd\xa5\xcf\x89\xef\xbd\xa5\xef\xbd\xa1)\xef\xbe\x89'
    s1 = s.decode('utf-8')
    f = codecs.open('out.txt', 'w', encoding='utf-8')
    f.write(s1)
    f.close()

Then you will see (｡･ω･｡)ﾉ.
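A variant of the same experiment that checks the result programmatically instead of opening an editor. This is a sketch only: it uses io.open in place of codecs.open and writes to a temporary directory, both of which are choices made for this illustration.

```python
import io
import os
import tempfile

raw = b"(\xef\xbd\xa1\xef\xbd\xa5\xcf\x89\xef\xbd\xa5\xef\xbd\xa1)\xef\xbe\x89"
text = raw.decode("utf-8")

# Write the decoded text with an explicit encoding, bypassing the console.
path = os.path.join(tempfile.mkdtemp(), "out.txt")
with io.open(path, "w", encoding="utf-8") as f:
    f.write(text)

# Read the file back as bytes to see exactly what landed on disk.
with io.open(path, "rb") as f:
    on_disk = f.read()

print(on_disk == raw)
```

The bytes on disk match the original byte string, which confirms the decode step was correct and only the console output was at fault.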

User · Answer

Try setting the system default encoding to utf-8 at the start of the script, so that all strings are encoded using that.

    # coding: utf-8
    import sys
    reload(sys)
    sys.setdefaultencoding('utf-8')

User · Answer

I solved that problem by changing the engine in settings.py to 'ENGINE': 'django.db.backends.mysql'. Don't use 'ENGINE': 'mysql.connector.django'.

User · Answer

Try:

    string.decode('utf-8')  # or:
    unicode(string, 'utf-8')

Edit:

'(\xef\xbd\xa1\xef\xbd\xa5\xcf\x89\xef\xbd\xa5\xef\xbd\xa1)\xef\xbe\x89'.decode('utf-8') gives u'(\uff61\uff65\u03c9\uff65\uff61)\uff89', which is correct.

So your problem must be somewhere else, possibly where you do something with the string that triggers an implicit conversion (could be printing, writing to a stream...).

To say more we'll need to see some code.
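The "implicit conversion when writing to a stream" case can be imitated with an in-memory stream, as in this Python 3 sketch. The helper write_text is hypothetical; io.TextIOWrapper stands in for a console or file whose encoding you do not control.

```python
import io

raw = b"(\xef\xbd\xa1\xef\xbd\xa5\xcf\x89\xef\xbd\xa5\xef\xbd\xa1)\xef\xbe\x89"
text = raw.decode("utf-8")


def write_text(value, encoding):
    """Write text to an in-memory stream with the given encoding; return the bytes."""
    buffer = io.BytesIO()
    stream = io.TextIOWrapper(buffer, encoding=encoding)
    stream.write(value)
    stream.flush()
    return buffer.getvalue()


# A UTF-8 stream accepts the text and produces the original bytes.
utf8_bytes = write_text(text, "utf-8")

# An ASCII stream fails, just like print on an ASCII console.
try:
    write_text(text, "ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

print(utf8_bytes == raw, ascii_failed)
```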

User · Answer

Just convert the text explicitly to a string using str(). Worked for me.
