I have a list of values which I need to filter given the values in a list of booleans:
list_a = [1, 2, 4, 6]
filter = [True, False, True, False]
I generate a new filtered list with the following line:
filtered_list = [i for indx,i in enumerate(list_a) if filter[indx] == True]
which results in:
print(filtered_list)
[1, 4]
The line works but looks (to me) a bit overkill and I was wondering if there was a simpler way to achieve the same.
Summary of two good pieces of advice given in the answers below:
1- Don't name a list filter like I did, because it shadows the built-in function of the same name.
2- Don't compare things to True like I did with if filter[indx] == True, since it's unnecessary. Just using if filter[indx] is enough.
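A minimal sketch of why the first point matters. The helper name demo_shadowing is hypothetical and only serves to keep the shadowing local to one function:

```python
def demo_shadowing():
    # Inside this function the name "filter" now refers to a list,
    # so the built-in filter() is no longer reachable by that name.
    filter = [True, False, True, False]
    try:
        filter(None, [0, 1, 2])
    except TypeError as error:
        return str(error)


message = demo_shadowing()
print(message)

# Outside the function the built-in is untouched; filter(None, ...) keeps truthy items.
print(list(filter(None, [0, 1, 2])))  # [1, 2]
```

Calling the shadowed name fails because a list is not callable, which is the kind of confusing error the advice is meant to avoid.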
Like so:
filtered_list = [i for (i, v) in zip(list_a, filter) if v]
Using zip is the Pythonic way to iterate over multiple sequences in parallel, without needing any indexing. This assumes both sequences have the same length (zip stops after the shortest runs out). Using itertools for such a simple case is a bit overkill ...
One thing you do in your example that you should really stop doing is comparing things to True; this is usually not necessary. Instead of if filter[idx]==True: ..., you can simply write if filter[idx]: ...
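As a rough illustration of the zip approach and its length caveat, here is a small sketch. The names mask and short_mask are illustrative (chosen so the built-in filter is not shadowed), and the length check at the end is only one possible guard, not something the answer prescribes:

```python
list_a = [1, 2, 4, 6]
mask = [True, False, True, False]  # illustrative name; avoids shadowing filter()

# Pair each value with its flag and keep the values whose flag is truthy
filtered_list = [value for value, keep in zip(list_a, mask) if keep]
print(filtered_list)  # [1, 4]

# zip stops at the shorter input, so a short mask silently drops trailing values
short_mask = [True, False]
truncated = [value for value, keep in zip(list_a, short_mask) if keep]
print(truncated)  # [1]

# Hypothetical guard if a length mismatch should be an error instead
if len(list_a) != len(mask):
    raise ValueError("list_a and mask must have the same length")
```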
filtered_list = [list_a[i] for i in range(len(list_a)) if filter[i]]
To do this using NumPy, i.e., if you have an array, a, instead of list_a:
import numpy as np
a = np.array([1, 2, 4, 6])
my_filter = np.array([True, False, True, False], dtype=bool)
a[my_filter]
> array([1, 4])
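With Python 3, if list_a and filter are both NumPy arrays, you can use list_a[filter] to get the elements where filter is True. To get the elements where it is False, use list_a[~filter]. Neither expression works with plain Python lists.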
With numpy:
In [128]: list_a = np.array([1, 2, 4, 6])
In [129]: filter = np.array([True, False, True, False])
In [130]: list_a[filter]
Out[130]: array([1, 4])
Or see Alex Szatmary's answer if list_a can be a NumPy array but filter cannot.
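For the mixed case, where the data is a NumPy array but the mask is still a plain list, one possible sketch is to convert the mask with np.asarray before indexing. The names values, mask, and mask_np are illustrative and do not come from the answers:

```python
import numpy as np

values = np.array([1, 2, 4, 6])
mask = [True, False, True, False]  # plain Python list (illustrative name)

# Convert the list to a boolean array so it can be used as a mask and inverted
mask_np = np.asarray(mask, dtype=bool)

kept = values[mask_np]      # elements where the mask is True
dropped = values[~mask_np]  # elements where the mask is False

print(kept.tolist())     # [1, 4]
print(dropped.tolist())  # [2, 6]
```

The ~ operator inverts a boolean NumPy array element by element; applied to a Python list it raises a TypeError, which is why the conversion comes first.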
NumPy usually gives you a big speed boost as well (roughly 3.6x in the timing below, on 40,000 elements):
In [133]: list_a = [1, 2, 4, 6]*10000
In [134]: fil = [True, False, True, False]*10000
In [135]: list_a_np = np.array(list_a)
In [136]: fil_np = np.array(fil)
In [139]: %timeit list(itertools.compress(list_a, fil))
1000 loops, best of 3: 625 us per loop
In [140]: %timeit list_a_np[fil_np]
10000 loops, best of 3: 173 us per loop
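The timing above calls itertools.compress, the standard-library function that keeps the items of one iterable whose counterpart in a second iterable is truthy. A minimal sketch of what it does, with a timeit call for re-measuring outside IPython; the absolute numbers depend on the machine and Python version, so none are claimed here:

```python
import itertools
import timeit

list_a = [1, 2, 4, 6] * 10000
fil = [True, False, True, False] * 10000

# compress yields the items of list_a whose matching selector in fil is truthy
compressed = list(itertools.compress(list_a, fil))

# Same result as the zip-based list comprehension
zipped = [value for value, keep in zip(list_a, fil) if keep]
print(compressed == zipped)  # True

# Average time per call over 100 runs; results vary by machine
seconds = timeit.timeit(lambda: list(itertools.compress(list_a, fil)), number=100)
print(f"compress: {seconds / 100 * 1e6:.0f} us per call")
```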
Source: Stackoverflow.com