[python] How to make unicode string with python3

I used this :

u = unicode(text, 'utf-8')

But getting error with Python 3 (or... maybe I just forgot to include something) :

NameError: global name 'unicode' is not defined

Thank you.

This question is related to python unicode python-3.x

The answer is


What's new in Python 3.0 says:

All text is Unicode; however encoded Unicode is represented as binary data

If you want to ensure you are outputting utf-8, here's an example from this page on unicode in 3.0:

b'\x80abc'.decode("utf-8", "strict")

In a Python 2 program that I used for many years there was this line:

ocd[i].namn=unicode(a[:b], 'utf-8')

This did not work in Python 3.

However, the program turned out to work with:

ocd[i].namn=a[:b]

I don't remember why I put unicode there in the first place, but I think it was because the name can contains Swedish letters åäöÅÄÖ. But even they work without "unicode".


As a workaround, I've been using this:

# Fix Python 2.x.
try:
    UNICODE_EXISTS = bool(type(unicode))
except NameError:
    unicode = lambda s: str(s)

the easiest way in python 3.x

text = "hi , I'm text"
text.encode('utf-8')

This how I solved my problem to convert chars like \uFE0F, \u000A, etc. And also emojis that encoded with 16 bytes.

example = 'raw vegan chocolate cocoa pie w chocolate & vanilla cream\\uD83D\\uDE0D\\uD83D\\uDE0D\\u2764\\uFE0F Present Moment Caf\\u00E8 in St.Augustine\\u2764\\uFE0F\\u2764\\uFE0F '
import codecs
new_str = codecs.unicode_escape_decode(example)[0]
print(new_str)
>>> 'raw vegan chocolate cocoa pie w chocolate & vanilla cream\ud83d\ude0d\ud83d\ude0d?? Present Moment Cafè in St.Augustine???? '
new_new_str = new_str.encode('utf-16', 'surrogatepass').decode('utf-16')
print(new_new_str)
>>> 'raw vegan chocolate cocoa pie w chocolate & vanilla cream?? Present Moment Cafè in St.Augustine???? '

Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to unicode

How to resolve TypeError: can only concatenate str (not "int") to str (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape UnicodeEncodeError: 'ascii' codec can't encode character at special name Python NLTK: SyntaxError: Non-ASCII character '\xc3' in file (Sentiment Analysis -NLP) HTML for the Pause symbol in audio and video control Javascript: Unicode string to hex Concrete Javascript Regex for Accented Characters (Diacritics) Replace non-ASCII characters with a single space UTF-8 in Windows 7 CMD NameError: global name 'unicode' is not defined - in Python 3

Examples related to python-3.x

Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation Replace specific text with a redacted version using Python Upgrade to python 3.8 using conda "Permission Denied" trying to run Python on Windows 10 Python: 'ModuleNotFoundError' when trying to import module from imported package What is the meaning of "Failed building wheel for X" in pip install? How to downgrade python from 3.7 to 3.6 I can't install pyaudio on Windows? How to solve "error: Microsoft Visual C++ 14.0 is required."? Iterating over arrays in Python 3 How to upgrade Python version to 3.7?