[python] Filtering a list of strings based on contents

Given the list ['a','ab','abc','bac'], I want to compute a list with strings that have 'ab' in them. I.e. the result is ['ab','abc']. How can this be done in Python?

This question is related to python list

The answer is


This simple filtering can be achieved in many ways with Python. The best approach is to use "list comprehensions" as follows:

>>> lst = ['a', 'ab', 'abc', 'bac']
>>> [k for k in lst if 'ab' in k]
['ab', 'abc']

Another way is to use the filter function. In Python 2:

>>> filter(lambda k: 'ab' in k, lst)
['ab', 'abc']

In Python 3, it returns an iterator instead of a list, but you can cast it:

>>> list(filter(lambda k: 'ab' in k, lst))
['ab', 'abc']

Though it's better practice to use a comprehension.


[x for x in L if 'ab' in x]

Tried this out quickly in the interactive shell:

>>> l = ['a', 'ab', 'abc', 'bac']
>>> [x for x in l if 'ab' in x]
['ab', 'abc']
>>>

Why does this work? Because the in operator is defined for strings to mean: "is substring of".

Also, you might want to consider writing out the loop as opposed to using the list comprehension syntax used above:

l = ['a', 'ab', 'abc', 'bac']
result = []
for s in l:
   if 'ab' in s:
       result.append(s)

# To support matches from the beginning, not any matches:

items = ['a', 'ab', 'abc', 'bac']
prefix = 'ab'

filter(lambda x: x.startswith(prefix), items)

mylist = ['a', 'ab', 'abc']
assert 'ab' in mylist