[python] How to len(generator())

Python generators are very useful. They have advantages over functions that return lists. However, you could len(list_returning_function()). Is there a way to len(generator_function())?

UPDATE:
Of course len(list(generator_function())) would work.....
I'm trying to use a generator I've created inside a new generator I'm creating. As part of the calculation in the new generator it needs to know the length of the old one. However I would like to keep both of them together with the same properties as a generator, specifically - not maintain the entire list in memory as it may be very long.

UPDATE 2:
Assume the generator knows it's target length even from the first step. Also, there's no reason to maintain the len() syntax. Example - if functions in Python are objects, couldn't I assign the length to a variable of this object that would be accessible to the new generator?

This question is related to python generator

The answer is


You can use reduce.

For Python 3:

>>> import functools
>>> def gen():
...     yield 1
...     yield 2
...     yield 3
...
>>> functools.reduce(lambda x,y: x + 1, gen(), 0)

In Python 2, reduce is in the global namespace so the import is unnecessary.


You can use send as a hack:

def counter():
    length = 10
    i = 0
    while i < length:
        val = (yield i)
        if val == 'length':
            yield length
        i += 1

it = counter()
print(it.next())
#0
print(it.next())
#1
print(it.send('length'))
#10
print(it.next())
#2
print(it.next())
#3

You can use len(list(generator_function()). However, this consumes the generator, but that's the only way you can find out how many elements are generated. So you may want to save the list somewhere if you also want to use the items.

a = list(generator_function())
print(len(a))
print(a[0])

You can combine the benefits of generators with the certainty of len(), by creating your own iterable object:

class MyIterable(object):
    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __iter__(self):
        self._gen = self._generator()
        return self

    def _generator(self):
        # Put your generator code here
        i = 0
        while i < self.n:
            yield i
            i += 1

    def next(self):
        return next(self._gen)

mi = MyIterable(100)
print len(mi)
for i in mi:
    print i,

This is basically a simple implementation of xrange, which returns an object you can take the len of, but doesn't create an explicit list.


The conversion to list that's been suggested in the other answers is the best way if you still want to process the generator elements afterwards, but has one flaw: It uses O(n) memory. You can count the elements in a generator without using that much memory with:

sum(1 for x in generator)

Of course, be aware that this might be slower than len(list(generator)) in common Python implementations, and if the generators are long enough for the memory complexity to matter, the operation would take quite some time. Still, I personally prefer this solution as it describes what I want to get, and it doesn't give me anything extra that's not required (such as a list of all the elements).

Also listen to delnan's advice: If you're discarding the output of the generator it is very likely that there is a way to calculate the number of elements without running it, or by counting them in another manner.


You can len(list(generator)) but you could probably make something more efficient if you really intend to discard the results.


Suppose we have a generator:

def gen():
    for i in range(10):
        yield i

We can wrap the generator, along with the known length, in an object:

import itertools
class LenGen(object):
    def __init__(self,gen,length):
        self.gen=gen
        self.length=length
    def __call__(self):
        return itertools.islice(self.gen(),self.length)
    def __len__(self):
        return self.length

lgen=LenGen(gen,10)

Instances of LenGen are generators themselves, since calling them returns an iterator.

Now we can use the lgen generator in place of gen, and access len(lgen) as well:

def new_gen():
    for i in lgen():
        yield float(i)/len(lgen)

for i in new_gen():
    print(i)