[python] How to search a list of tuples in Python

So I have a list of tuples such as this:

[(1,"juca"),(22,"james"),(53,"xuxa"),(44,"delicia")]

I want this list for a tuple whose number value is equal to something.

So that if I do search(53) it will return the index value of 2

Is there an easy way to do this?

This question is related to python search list tuples

The answer is


[i for i, v in enumerate(L) if v[0] == 53]

Your tuples are basically key-value pairs--a python dict--so:

l = [(1,"juca"),(22,"james"),(53,"xuxa"),(44,"delicia")]
val = dict(l)[53]

Edit -- aha, you say you want the index value of (53, "xuxa"). If this is really what you want, you'll have to iterate through the original list, or perhaps make a more complicated dictionary:

d = dict((n,i) for (i,n) in enumerate(e[0] for e in l))
idx = d[53]

tl;dr

A generator expression is probably the most performant and simple solution to your problem:

l = [(1,"juca"),(22,"james"),(53,"xuxa"),(44,"delicia")]

result = next((i for i, v in enumerate(l) if v[0] == 53), None)
# 2

Explanation

There are several answers that provide a simple solution to this question with list comprehensions. While these answers are perfectly correct, they are not optimal. Depending on your use case, there may be significant benefits to making a few simple modifications.

The main problem I see with using a list comprehension for this use case is that the entire list will be processed, although you only want to find 1 element.

Python provides a simple construct which is ideal here. It is called the generator expression. Here is an example:

# Our input list, same as before
l = [(1,"juca"),(22,"james"),(53,"xuxa"),(44,"delicia")]

# Call next on our generator expression.
next((i for i, v in enumerate(l) if v[0] == 53), None)

We can expect this method to perform basically the same as list comprehensions in our trivial example, but what if we're working with a larger data set? That's where the advantage of using the generator method comes into play. Rather than constructing a new list, we'll use your existing list as our iterable, and use next() to get the first item from our generator.

Lets look at how these methods perform differently on some larger data sets. These are large lists, made of 10000000 + 1 elements, with our target at the beginning (best) or end (worst). We can verify that both of these lists will perform equally using the following list comprehension:

List comprehensions

"Worst case"

worst_case = ([(False, 'F')] * 10000000) + [(True, 'T')]
print [i for i, v in enumerate(worst_case) if v[0] is True]

# [10000000]
#          2 function calls in 3.885 seconds
#
#    Ordered by: standard name
#
#    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
#         1    3.885    3.885    3.885    3.885 so_lc.py:1(<module>)
#         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

"Best case"

best_case = [(True, 'T')] + ([(False, 'F')] * 10000000)
print [i for i, v in enumerate(best_case) if v[0] is True]

# [0]
#          2 function calls in 3.864 seconds
#
#    Ordered by: standard name
#
#    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
#         1    3.864    3.864    3.864    3.864 so_lc.py:1(<module>)
#         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Generator expressions

Here's my hypothesis for generators: we'll see that generators will significantly perform better in the best case, but similarly in the worst case. This performance gain is mostly due to the fact that the generator is evaluated lazily, meaning it will only compute what is required to yield a value.

Worst case

# 10000000
#          5 function calls in 1.733 seconds
#
#    Ordered by: standard name
#
#    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
#         2    1.455    0.727    1.455    0.727 so_lc.py:10(<genexpr>)
#         1    0.278    0.278    1.733    1.733 so_lc.py:9(<module>)
#         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
#         1    0.000    0.000    1.455    1.455 {next}

Best case

best_case  = [(True, 'T')] + ([(False, 'F')] * 10000000)
print next((i for i, v in enumerate(best_case) if v[0] == True), None)

# 0
#          5 function calls in 0.316 seconds
#
#    Ordered by: standard name
#
#    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
#         1    0.316    0.316    0.316    0.316 so_lc.py:6(<module>)
#         2    0.000    0.000    0.000    0.000 so_lc.py:7(<genexpr>)
#         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
#         1    0.000    0.000    0.000    0.000 {next}

WHAT?! The best case blows away the list comprehensions, but I wasn't expecting the our worst case to outperform the list comprehensions to such an extent. How is that? Frankly, I could only speculate without further research.

Take all of this with a grain of salt, I have not run any robust profiling here, just some very basic testing. This should be sufficient to appreciate that a generator expression is more performant for this type of list searching.

Note that this is all basic, built-in python. We don't need to import anything or use any libraries.

I first saw this technique for searching in the Udacity cs212 course with Peter Norvig.


You can use a list comprehension:

>>> a = [(1,"juca"),(22,"james"),(53,"xuxa"),(44,"delicia")]
>>> [x[0] for x in a]
[1, 22, 53, 44]
>>> [x[0] for x in a].index(53)
2

[k for k,v in l if v =='delicia']

here l is the list of tuples-[(1,"juca"),(22,"james"),(53,"xuxa"),(44,"delicia")]

And instead of converting it to a dict, we are using llist comprehension.

*Key* in Key,Value in list, where value = **delicia**


Hmm... well, the simple way that comes to mind is to convert it to a dict

d = dict(thelist)

and access d[53].

EDIT: Oops, misread your question the first time. It sounds like you actually want to get the index where a given number is stored. In that case, try

dict((t[0], i) for i, t in enumerate(thelist))

instead of a plain old dict conversion. Then d[53] would be 2.


Just another way.

zip(*a)[0].index(53)

Supposing the list may be long and the numbers may repeat, consider using the SortedList type from the Python sortedcontainers module. The SortedList type will automatically maintain the tuples in order by number and allow for fast searching.

For example:

from sortedcontainers import SortedList
sl = SortedList([(1,"juca"),(22,"james"),(53,"xuxa"),(44,"delicia")])

# Get the index of 53:

index = sl.bisect((53,))

# With the index, get the tuple:

tup = sl[index]

This will work a lot faster than the list comprehension suggestion by doing a binary search. The dictionary suggestion will be faster still but won't work if there could be duplicate numbers with different strings.

If there are duplicate numbers with different strings then you need to take one more step:

end = sl.bisect((53 + 1,))

results = sl[index:end]

By bisecting for 54, we will find the end index for our slice. This will be significantly faster on long lists as compared with the accepted answer.


Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation Find a file by name in Visual Studio Code Search all the occurrences of a string in the entire project in Android Studio Java List.contains(Object with field value equal to x) Trigger an action after selection select2 How can I search for a commit message on GitHub? SQL search multiple values in same field Find a string by searching all tables in SQL Server Management Studio 2008 Search File And Find Exact Match And Print Line? Java - Search for files in a directory How to put a delay on AngularJS instant search?

Examples related to list

Convert List to Pandas Dataframe Column Python find elements in one list that are not in the other Sorting a list with stream.sorted() in Java Python Loop: List Index Out of Range How to combine two lists in R How do I multiply each element in a list by a number? Save a list to a .txt file The most efficient way to remove first N elements in a list? TypeError: list indices must be integers or slices, not str Parse JSON String into List<string>

Examples related to tuples

Append a tuple to a list - what's the difference between two ways? How can I access each element of a pair in a pair list? pop/remove items out of a python tuple Python convert tuple to string django: TypeError: 'tuple' object is not callable Why is there no tuple comprehension in Python? Python add item to the tuple Converting string to tuple without splitting characters Convert tuple to list and back How to form tuple column from two columns in Pandas