[python] How to find all occurrences of an element in a list

index() will give the first occurrence of an item in a list. Is there a neat trick which returns all indices in a list for an element?

This question is related to python list

The answer is


You can use a list comprehension:

indices = [i for i, x in enumerate(my_list) if x == "whatever"]

Getting all the occurrences and the position of one or more (identical) items in a list

With enumerate(alist) you can store the first element (n) that is the index of the list when the element x is equal to what you look for.

>>> alist = ['foo', 'spam', 'egg', 'foo']
>>> foo_indexes = [n for n,x in enumerate(alist) if x=='foo']
>>> foo_indexes
[0, 3]
>>>

Let's make our function findindex

This function takes the item and the list as arguments and return the position of the item in the list, like we saw before.

def indexlist(item2find, list_or_string):
  "Returns all indexes of an item in a list or a string"
  return [n for n,item in enumerate(list_or_string) if item==item2find]

print(indexlist("1", "010101010"))

Output


[1, 3, 5, 7]

Simple

for n, i in enumerate([1, 2, 3, 4, 1]):
    if i == 1:
        print(n)

Output:

0
4

How about:

In [1]: l=[1,2,3,4,3,2,5,6,7]

In [2]: [i for i,val in enumerate(l) if val==3]
Out[2]: [2, 4]

import numpy as np
import random  # to create test list

# create sample list
random.seed(365)
l = [random.choice(['s1', 's2', 's3', 's4']) for _ in range(20)]

# convert the list to an array for use with these numpy methods
a = np.array(l)

# create a dict of each unique entry and the associated indices
idx = {v: np.where(a == v)[0].tolist() for v in np.unique(a)}

# print(idx)
{'s1': [7, 9, 10, 11, 17],
 's2': [1, 3, 6, 8, 14, 18, 19],
 's3': [0, 2, 13, 16],
 's4': [4, 5, 12, 15]}

# find a single element with 
idx = np.where(a == 's1')

print(idx)
[out]:
(array([ 7,  9, 10, 11, 17], dtype=int64),)

%timeit

  • Find indices of a single element in a 2M element list with 4 unique elements
# create 2M element list
random.seed(365)
l = [random.choice(['s1', 's2', 's3', 's4']) for _ in range(2000000)]

# create array
a = np.array(l)

# np.where
%timeit np.where(a == 's1')
[out]:
25.9 ms ± 827 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

# list-comprehension
%timeit [i for i, x in enumerate(l) if x == "s1"]
[out]:
175 ms ± 2.73 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Here is a time performance comparison between using np.where vs list_comprehension. Seems like np.where is faster on average.

# np.where
start_times = []
end_times = []
for i in range(10000):
    start = time.time()
    start_times.append(start)
    temp_list = np.array([1,2,3,3,5])
    ixs = np.where(temp_list==3)[0].tolist()
    end = time.time()
    end_times.append(end)
print("Took on average {} seconds".format(
    np.mean(end_times)-np.mean(start_times)))
Took on average 3.81469726562e-06 seconds
# list_comprehension
start_times = []
end_times = []
for i in range(10000):
    start = time.time()
    start_times.append(start)
    temp_list = np.array([1,2,3,3,5])
    ixs = [i for i in range(len(temp_list)) if temp_list[i]==3]
    end = time.time()
    end_times.append(end)
print("Took on average {} seconds".format(
    np.mean(end_times)-np.mean(start_times)))
Took on average 4.05311584473e-06 seconds

One more solution(sorry if duplicates) for all occurrences:

values = [1,2,3,1,2,4,5,6,3,2,1]
map(lambda val: (val, [i for i in xrange(len(values)) if values[i] == val]), values)

While not a solution for lists directly, numpy really shines for this sort of thing:

import numpy as np
values = np.array([1,2,3,1,2,4,5,6,3,2,1])
searchval = 3
ii = np.where(values == searchval)[0]

returns:

ii ==>array([2, 8])

This can be significantly faster for lists (arrays) with a large number of elements vs some of the other solutions.


If you need to search for all element's positions between certain indices, you can state them:

[i for i,x in enumerate([1,2,3,2]) if x==2 & 2<= i <=3] # -> [3]

If you are using Python 2, you can achieve the same functionality with this:

f = lambda my_list, value:filter(lambda x: my_list[x] == value, range(len(my_list)))

Where my_list is the list you want to get the indexes of, and value is the value searched. Usage:

f(some_list, some_element)

Using a for-loop:

  • Answers with enumerate and a list comprehension are more pythonic, not necessarily faster, however, this answer is aimed at students who may not be allowed to use some of those built-in functions.
  • create an empty list, indices
  • create the loop with for i in range(len(x)):, which essentially iterates through a list of index locations [0, 1, 2, 3, ..., len(x)-1]
  • in the loop, add any i, where x[i] is a match to value, to indices
def get_indices(x: list, value: int) -> list:
    indices = list()
    for i in range(len(x)):
        if x[i] == value:
            indices.append(i)
    return indices

n = [1, 2, 3, -50, -60, 0, 6, 9, -60, -60]
print(get_indices(n, -60))

>>> [4, 8, 9]
  • The functions, get_indices, are implemented with type hints. In this case, the list, n, is a bunch of ints, therefore we search for value, also defined as an int.

Using a while-loop and .index:

  • With .index, use try-except for error handling, because a ValueError will occur if value is not in the list.
def get_indices(x: list, value: int) -> list:
    indices = list()
    i = 0
    while True:
        try:
            # find an occurrence of value and update i to that index
            i = x.index(value, i)
            # add i to the list
            indices.append(i)
            # advance i by 1
            i += 1
        except ValueError as e:
            break
    return indices

print(get_indices(n, -60))
>>> [4, 8, 9]

more_itertools.locate finds indices for all items that satisfy a condition.

from more_itertools import locate


list(locate([0, 1, 1, 0, 1, 0, 0]))
# [1, 2, 4]

list(locate(['a', 'b', 'c', 'b'], lambda x: x == 'b'))
# [1, 3]

more_itertools is a third-party library > pip install more_itertools.


Using filter() in python2.

>>> q = ['Yeehaw', 'Yeehaw', 'Googol', 'B9', 'Googol', 'NSM', 'B9', 'NSM', 'Dont Ask', 'Googol']
>>> filter(lambda i: q[i]=="Googol", range(len(q)))
[2, 4, 9]

Or Use range (python 3):

l=[i for i in range(len(lst)) if lst[i]=='something...']

For (python 2):

l=[i for i in xrange(len(lst)) if lst[i]=='something...']

And then (both cases):

print(l)

Is as expected.


You can create a defaultdict

from collections import defaultdict
d1 = defaultdict(int)      # defaults to 0 values for keys
unq = set(lst1)              # lst1 = [1, 2, 2, 3, 4, 1, 2, 7]
for each in unq:
      d1[each] = lst1.count(each)
else:
      print(d1)

occurrences = lambda s, lst: (i for i,e in enumerate(lst) if e == s)
list(occurrences(1, [1,2,3,1])) # = [0, 3]

A solution using list.index:

def indices(lst, element):
    result = []
    offset = -1
    while True:
        try:
            offset = lst.index(element, offset+1)
        except ValueError:
            return result
        result.append(offset)

It's much faster than the list comprehension with enumerate, for large lists. It is also much slower than the numpy solution if you already have the array, otherwise the cost of converting outweighs the speed gain (tested on integer lists with 100, 1000 and 10000 elements).

NOTE: A note of caution based on Chris_Rands' comment: this solution is faster than the list comprehension if the results are sufficiently sparse, but if the list has many instances of the element that is being searched (more than ~15% of the list, on a test with a list of 1000 integers), the list comprehension is faster.