[python] Zip lists in Python

I am trying to learn how to "zip" lists. To this end, I have a program, where at a particular point, I do the following:

x1, x2, x3 = stuff.calculations(withdataa)

This gives me three lists, x1, x2, and x3, each of, say, size 20.

Now, I do:

zipall = zip(x1, x2, x3)

However, when I do:

print "len of zipall %s" % len(zipall)

I get 20, which is not what I expected. I expected three. I think I am doing something fundamentally wrong.

This question is related to python python-2.7

The answer is


In Python 3 zip returns an iterator instead and needs to be passed to a list function to get the zipped tuples:

x = [1, 2, 3]; y = ['a','b','c']
z = zip(x, y)
z = list(z)
print(z)
>>> [(1, 'a'), (2, 'b'), (3, 'c')]

Then to unzip them back just conjugate the zipped iterator:

x_back, y_back = zip(*z)
print(x_back); print(y_back)
>>> (1, 2, 3)
>>> ('a', 'b', 'c')

If the original form of list is needed instead of tuples:

x_back, y_back = zip(*z)
print(list(x_back)); print(list(y_back))
>>> [1,2,3]
>>> ['a','b','c']

I don't think zip returns a list. zip returns a generator. You have got to do list(zip(a, b)) to get a list of tuples.

x = [1, 2, 3]
y = [4, 5, 6]
zipped = zip(x, y)
list(zipped)

Source: My Blog Post (better formatting)

Example

numbers = [1,2,3]
letters = 'abcd'

zip(numbers, letters)
# [(1, 'a'), (2, 'b'), (3, 'c')]

Input

Zero or more iterables [1] (ex. list, string, tuple, dictionary)

Output (list)

1st tuple = (element_1 of numbers, element_1 of letters)

2nd tuple = (e_2 numbers, e_2 letters)

n-th tuple = (e_n numbers, e_n letters)

  1. List of n tuples: n is the length of the shortest argument (input)
    • len(numbers) == 3 < len(letters) == 4 ? short= 3 ? return 3 tuples
  2. Length each tuple = # of args (tuple takes an element from each arg)
    • args = (numbers,letters); len(args) == 2 ? tuple with 2 elements
  3. ith tuple = (element_i arg1, element_i arg2…, element_i argn)

Edge Cases

1) Empty String: len(str)= 0 = no tuples

2) Single String: len(str) == 2 tuples with len(args) == 1 element(s)

zip()
# []
zip('')
# []
zip('hi')
# [('h',), ('i',)]

Zip in Action!

1. Build a dictionary [2] out of two lists

keys = ["drink","band","food"]
values = ["La Croix", "Daft Punk", "Sushi"]

my_favorite = dict( zip(keys, values) )

my_favorite["drink"]
# 'La Croix'

my_faves = dict()
for i in range(len(keys)):
    my_faves[keys[i]] = values[i]
  • zip is an elegant, clear, & concise solution

2. Print columns in a table

"*" [3] is called "unpacking": f(*[arg1,arg2,arg3]) == f(arg1, arg2, arg3)

student_grades = [
[   'Morty'  ,  1   ,  "B"  ],
[   'Rick'   ,  4   ,  "A"  ],
[   'Jerry'  ,  3   ,  "M"  ],
[  'Kramer'  ,  0   ,  "F"  ],
]

row_1 = student_grades[0]
print row_1
# ['Morty', 1, 'B']

columns = zip(*student_grades)
names = columns[0]
print names
# ('Morty', 'Rick', 'Jerry', 'Kramer')

Extra Credit: Unzipping

zip(*args) is called “unzipping” because it has the inverse effect of zip

numbers = (1,2,3)
letters = ('a','b','c')

zipped = zip(numbers, letters)
print zipped
# [(1, 'a'), (2, 'b'), (3, 'c')]

unzipped = zip(*zipped)
print unzipped
# [(1, 2, 3), ('a', 'b', 'c')]
  • unzipped: tuple_1 = e1 of each zipped tuple. tuple_2 = e2 of each zipped

Footnotes

  1. An object capable of returning its members one at a time (ex. list [1,2,3], string 'I like codin', tuple (1,2,3), dictionary {'a':1, 'b':2})
  2. {key1:value1, key2:value2...}
  3. “Unpacking” (*)

* Code:

# foo - function, returns sum of two arguments
def foo(x,y):
    return x + y
print foo(3,4)
# 7

numbers = [1,2]
print foo(numbers)
# TypeError: foo() takes exactly 2 arguments (1 given)

print foo(*numbers)
# 3

* took numbers (1 arg) and “unpacked” its’ 2 elements into 2 args


zip takes a bunch of lists likes

a: a1 a2 a3 a4 a5 a6 a7...
b: b1 b2 b3 b4 b5 b6 b7...
c: c1 c2 c3 c4 c5 c6 c7...

and "zips" them into one list whose entries are 3-tuples (ai, bi, ci). Imagine drawing a zipper horizontally from left to right.


It's worth adding here as it is such a highly ranking question on zip. zip is great, idiomatic Python - but it doesn't scale very well at all for large lists.

Instead of:

books = ['AAAAAAA', 'BAAAAAAA', ... , 'ZZZZZZZ']
words = [345, 567, ... , 672]

for book, word in zip(books, words):
   print('{}: {}'.format(book, word))

Use izip. For modern processing, it stores it in L1 Cache memory and is far more performant for larger lists. Use it as simply as adding an i:

for book, word in izip(books, words):
   print('{}: {}'.format(book, word))

For the completeness's sake.

When zipped lists' lengths are not equal. The result list's length will become the shortest one without any error occurred

>>> a = [1]
>>> b = ["2", 3]
>>> zip(a,b)
[(1, '2')]

zip creates a new list, filled with tuples containing elements from the iterable arguments:

>>> zip ([1,2],[3,4])
[(1,3), (2,4)]

I expect what you try to so is create a tuple where each element is a list.


In Python 2.7 this might have worked fine:

>>> a = b = c = range(20)
>>> zip(a, b, c)

But in Python 3.4 it should be (otherwise, the result will be something like <zip object at 0x00000256124E7DC8>):

>>> a = b = c = range(20)
>>> list(zip(a, b, c))

Basically the zip function works on lists, tuples and dictionaries in Python. If you are using IPython then just type zip? And check what zip() is about.

If you are not using IPython then just install it: "pip install ipython"

For lists

a = ['a', 'b', 'c']
b = ['p', 'q', 'r']
zip(a, b)

The output is [('a', 'p'), ('b', 'q'), ('c', 'r')

For dictionary:

c = {'gaurav':'waghs', 'nilesh':'kashid', 'ramesh':'sawant', 'anu':'raje'}
d = {'amit':'wagh', 'swapnil':'dalavi', 'anish':'mane', 'raghu':'rokda'}
zip(c, d)

The output is:

[('gaurav', 'amit'),
 ('nilesh', 'swapnil'),
 ('ramesh', 'anish'),
 ('anu', 'raghu')]