[python] Initializing a list to a known number of elements in Python

Right now I am using a list, and was expecting something like:

verts = list (1000)

Should I use array instead?

This question is related to python arrays list

The answer is


Wanting to initalize an array of fixed size is a perfectly acceptable thing to do in any programming language; it isn't like the programmer wants to put a break statement in a while(true) loop. Believe me, especially if the elements are just going to be overwritten and not merely added/subtracted, like is the case of many dynamic programming algorithms, you don't want to mess around with append statements and checking if the element hasn't been initialized yet on the fly (that's a lot of code gents).

object = [0 for x in range(1000)]

This will work for what the programmer is trying to achieve.


This:

 lst = [8 for i in range(9)]

creates a list, elements are initialized 8

but this:

lst = [0] * 7

would create 7 lists which have one element


Not quite sure why everyone is giving you a hard time for wanting to do this - there are several scenarios where you'd want a fixed size initialised list. And you've correctly deduced that arrays are sensible in these cases.

import array
verts=array.array('i',(0,)*1000)

For the non-pythonistas, the (0,)*1000 term is creating a tuple containing 1000 zeros. The comma forces python to recognise (0) as a tuple, otherwise it would be evaluated as 0.

I've used a tuple instead of a list because they are generally have lower overhead.


@Steve already gave a good answer to your question:

verts = [None] * 1000

Warning: As @Joachim Wuttke pointed out, the list must be initialized with an immutable element. [[]] * 1000 does not work as expected because you will get a list of 1000 identical lists (similar to a list of 1000 points to the same list in C). Immutable objects like int, str or tuple will do fine.

Alternatives

Resizing lists is slow. The following results are not very surprising:

>>> N = 10**6

>>> %timeit a = [None] * N
100 loops, best of 3: 7.41 ms per loop

>>> %timeit a = [None for x in xrange(N)]
10 loops, best of 3: 30 ms per loop

>>> %timeit a = [None for x in range(N)]
10 loops, best of 3: 67.7 ms per loop

>>> a = []
>>> %timeit for x in xrange(N): a.append(None)
10 loops, best of 3: 85.6 ms per loop

But resizing is not very slow if you don't have very large lists. Instead of initializing the list with a single element (e.g. None) and a fixed length to avoid list resizing, you should consider using list comprehensions and directly fill the list with correct values. For example:

>>> %timeit a = [x**2 for x in xrange(N)]
10 loops, best of 3: 109 ms per loop

>>> def fill_list1():
    """Not too bad, but complicated code"""
    a = [None] * N
    for x in xrange(N):
        a[x] = x**2
>>> %timeit fill_list1()
10 loops, best of 3: 126 ms per loop

>>> def fill_list2():
    """This is slow, use only for small lists"""
    a = []
    for x in xrange(N):
        a.append(x**2)
>>> %timeit fill_list2()
10 loops, best of 3: 177 ms per loop

Comparison to numpy

For huge data set numpy or other optimized libraries are much faster:

from numpy import ndarray, zeros
%timeit empty((N,))
1000000 loops, best of 3: 788 ns per loop

%timeit zeros((N,))
100 loops, best of 3: 3.56 ms per loop

You could do this:

verts = list(xrange(1000))

That would give you a list of 1000 elements in size and which happens to be initialised with values from 0-999. As list does a __len__ first to size the new list it should be fairly efficient.


You should consider using a dict type instead of pre-initialized list. The cost of a dictionary look-up is small and comparable to the cost of accessing arbitrary list element.

And when using a mapping you can write:

aDict = {}
aDict[100] = fetchElement()
putElement(fetchElement(), fetchPosition(), aDict)

And the putElement function can store item at any given position. And if you need to check if your collection contains element at given index it is more Pythonic to write:

if anIndex in aDict:
    print "cool!"

Than:

if not myList[anIndex] is None:
    print "cool!"

Since the latter assumes that no real element in your collection can be None. And if that happens - your code misbehaves.

And if you desperately need performance and that's why you try to pre-initialize your variables, and write the fastest code possible - change your language. The fastest code can't be written in Python. You should try C instead and implement wrappers to call your pre-initialized and pre-compiled code from Python.


Without knowing more about the problem domain, it's hard to answer your question. Unless you are certain that you need to do something more, the pythonic way to initialize a list is:

verts = []

Are you actually seeing a performance problem? If so, what is the performance bottleneck? Don't try to solve a problem that you don't have. It's likely that performance cost to dynamically fill an array to 1000 elements is completely irrelevant to the program that you're really trying to write.

The array class is useful if the things in your list are always going to be a specific primitive fixed-length type (e.g. char, int, float). But, it doesn't require pre-initialization either.


One obvious and probably not efficient way is

verts = [0 for x in range(1000)]

Note that this can be extended to 2-dimension easily. For example, to get a 10x100 "array" you can do

verts = [[0 for x in range(100)] for y in range(10)]

Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to arrays

PHP array value passes to next row Use NSInteger as array index How do I show a message in the foreach loop? Objects are not valid as a React child. If you meant to render a collection of children, use an array instead Iterating over arrays in Python 3 Best way to "push" into C# array Sort Array of object by object field in Angular 6 Checking for duplicate strings in JavaScript array what does numpy ndarray shape do? How to round a numpy array?

Examples related to list

Convert List to Pandas Dataframe Column Python find elements in one list that are not in the other Sorting a list with stream.sorted() in Java Python Loop: List Index Out of Range How to combine two lists in R How do I multiply each element in a list by a number? Save a list to a .txt file The most efficient way to remove first N elements in a list? TypeError: list indices must be integers or slices, not str Parse JSON String into List<string>