[python] How to create a numpy array of arbitrary length strings?

I'm a complete rookie to Python, but it seems like a given string is able to be (effectively) arbitrary length. i.e. you can take a string str and keeping adding to it: str += "some stuff...". Is there a way to make an array of such strings?

When I try this, each element only stores a single character

strArr = numpy.empty(10, dtype='string')
for i in range(0,10)
    strArr[i] = "test"

On the other hand, I know I can initialize an array of certain length strings, i.e.

strArr = numpy.empty(10, dtype='s256')

which can store 10 strings of up to 256 characters.

This question is related to python arrays string numpy

The answer is


You can do so by creating an array of dtype=object. If you try to assign a long string to a normal numpy array, it truncates the string:

>>> a = numpy.array(['apples', 'foobar', 'cowboy'])
>>> a[2] = 'bananas'
>>> a
array(['apples', 'foobar', 'banana'], 
      dtype='|S6')

But when you use dtype=object, you get an array of python object references. So you can have all the behaviors of python strings:

>>> a = numpy.array(['apples', 'foobar', 'cowboy'], dtype=object)
>>> a
array([apples, foobar, cowboy], dtype=object)
>>> a[2] = 'bananas'
>>> a
array([apples, foobar, bananas], dtype=object)

Indeed, because it's an array of objects, you can assign any kind of python object to the array:

>>> a[2] = {1:2, 3:4}
>>> a
array([apples, foobar, {1: 2, 3: 4}], dtype=object)

However, this undoes a lot of the benefits of using numpy, which is so fast because it works on large contiguous blocks of raw memory. Working with python objects adds a lot of overhead. A simple example:

>>> a = numpy.array(['abba' for _ in range(10000)])
>>> b = numpy.array(['abba' for _ in range(10000)], dtype=object)
>>> %timeit a.copy()
100000 loops, best of 3: 2.51 us per loop
>>> %timeit b.copy()
10000 loops, best of 3: 48.4 us per loop

You could use the object data type:

>>> import numpy
>>> s = numpy.array(['a', 'b', 'dude'], dtype='object')
>>> s[0] += 'bcdef'
>>> s
array([abcdef, b, dude], dtype=object)

Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to arrays

PHP array value passes to next row Use NSInteger as array index How do I show a message in the foreach loop? Objects are not valid as a React child. If you meant to render a collection of children, use an array instead Iterating over arrays in Python 3 Best way to "push" into C# array Sort Array of object by object field in Angular 6 Checking for duplicate strings in JavaScript array what does numpy ndarray shape do? How to round a numpy array?

Examples related to string

How to split a string in two and store it in a field String method cannot be found in a main class method Kotlin - How to correctly concatenate a String Replacing a character from a certain index Remove quotes from String in Python Detect whether a Python string is a number or a letter How does String substring work in Swift How does String.Index work in Swift swift 3.0 Data to String? How to parse JSON string in Typescript

Examples related to numpy

Unable to allocate array with shape and data type How to fix 'Object arrays cannot be loaded when allow_pickle=False' for imdb.load_data() function? Numpy, multiply array with scalar TypeError: only integer scalar arrays can be converted to a scalar index with 1D numpy indices array Could not install packages due to a "Environment error :[error 13]: permission denied : 'usr/local/bin/f2py'" Pytorch tensor to numpy array Numpy Resize/Rescale Image what does numpy ndarray shape do? How to round a numpy array? numpy array TypeError: only integer scalar arrays can be converted to a scalar index