How can I check if a Python object is a string (either regular or Unicode)?
This question is related to
python
string
types
compatibility
For a nice duck-typing approach for string-likes that has the bonus of working with both Python 2.x and 3.x:
def is_string(obj):
try:
obj + ''
return True
except TypeError:
return False
wisefish was close with the duck-typing before he switched to the isinstance
approach, except that +=
has a different meaning for lists than +
does.
In Python 3.x basestring
is not available anymore, as str
is the sole string type (with the semantics of Python 2.x's unicode
).
So the check in Python 3.x is just:
isinstance(obj_to_test, str)
This follows the fix of the official 2to3
conversion tool: converting basestring
to str
.
You can test it by concatenating with an empty string:
def is_string(s):
try:
s += ''
except:
return False
return True
Edit:
Correcting my answer after comments pointing out that this fails with lists
def is_string(s):
return isinstance(s, basestring)
In order to check if your variable is something you could go like:
s='Hello World'
if isinstance(s,str):
#do something here,
The output of isistance will give you a boolean True or False value so you can adjust accordingly. You can check the expected acronym of your value by initially using: type(s) This will return you type 'str' so you can use it in the isistance function.
If one wants to stay away from explicit type-checking (and there are good reasons to stay away from it), probably the safest part of the string protocol to check is:
str(maybe_string) == maybe_string
It won't iterate through an iterable or iterator, it won't call a list-of-strings a string and it correctly detects a stringlike as a string.
Of course there are drawbacks. For example, str(maybe_string)
may be a heavy calculation. As so often, the answer is it depends.
EDIT: As @Tcll points out in the comments, the question actually asks for a way to detect both unicode strings and bytestrings. On Python 2 this answer will fail with an exception for unicode strings that contain non-ASCII characters, and on Python 3 it will return False
for all bytestrings.
I might deal with this in the duck-typing style, like others mention. How do I know a string is really a string? well, obviously by converting it to a string!
def myfunc(word):
word = unicode(word)
...
If the arg is already a string or unicode type, real_word will hold its value unmodified. If the object passed implements a __unicode__
method, that is used to get its unicode representation. If the object passed cannot be used as a string, the unicode
builtin raises an exception.
To check if an object o
is a string type of a subclass of a string type:
isinstance(o, basestring)
because both str
and unicode
are subclasses of basestring
.
To check if the type of o
is exactly str
:
type(o) is str
To check if o
is an instance of str
or any subclass of str
:
isinstance(o, str)
The above also work for Unicode strings if you replace str
with unicode
.
However, you may not need to do explicit type checking at all. "Duck typing" may fit your needs. See http://docs.python.org/glossary.html#term-duck-typing.
See also What’s the canonical way to check for type in python?
if type(varA) == str or type(varB) == str:
print 'string involved'
from EDX - online course MITx: 6.00.1x Introduction to Computer Science and Programming Using Python
This evening I ran into a situation in which I thought I was going to have to check against the str
type, but it turned out I did not.
My approach to solving the problem will probably work in many situations, so I offer it below in case others reading this question are interested (Python 3 only).
# NOTE: fields is an object that COULD be any number of things, including:
# - a single string-like object
# - a string-like object that needs to be converted to a sequence of
# string-like objects at some separator, sep
# - a sequence of string-like objects
def getfields(*fields, sep=' ', validator=lambda f: True):
'''Take a field sequence definition and yield from a validated
field sequence. Accepts a string, a string with separators,
or a sequence of strings'''
if fields:
try:
# single unpack in the case of a single argument
fieldseq, = fields
try:
# convert to string sequence if string
fieldseq = fieldseq.split(sep)
except AttributeError:
# not a string; assume other iterable
pass
except ValueError:
# not a single argument and not a string
fieldseq = fields
invalid_fields = [field for field in fieldseq if not validator(field)]
if invalid_fields:
raise ValueError('One or more field names is invalid:\n'
'{!r}'.format(invalid_fields))
else:
raise ValueError('No fields were provided')
try:
yield from fieldseq
except TypeError as e:
raise ValueError('Single field argument must be a string'
'or an interable') from e
Some tests:
from . import getfields
def test_getfields_novalidation():
result = ['a', 'b']
assert list(getfields('a b')) == result
assert list(getfields('a,b', sep=',')) == result
assert list(getfields('a', 'b')) == result
assert list(getfields(['a', 'b'])) == result
I found this ans more pythonic
:
if type(aObject) is str:
#do your stuff here
pass
since type objects are singleton, is can be used to do the compare the object to the str type
isinstance(your_object, basestring)
will be True if your object is indeed a string-type. 'str' is reserved word.
my apologies, the correct answer is using 'basestring' instead of 'str' in order of it to include unicode strings as well - as been noted above by one of the other responders.
If you want to check with no regard for Python version (2.x vs 3.x), use six
(PyPI) and its string_types
attribute:
import six
if isinstance(obj, six.string_types):
print('obj is a string!')
Within six
(a very light-weight single-file module), it's simply doing this:
import sys
PY3 = sys.version_info[0] == 3
if PY3:
string_types = str
else:
string_types = basestring
Its simple, use the following code (we assume the object mentioned to be obj)-
if type(obj) == str:
print('It is a string')
else:
print('It is not a string.')
Source: Stackoverflow.com