# [python] Rolling or sliding window iterator?

I need a rolling window (aka sliding window) iterable over a sequence/iterator/generator. Default Python iteration can be considered a special case, where the window length is 1. I'm currently using the following code. Does anyone have a more Pythonic, less verbose, or more efficient method for doing this?

``````
def rolling_window(seq, window_size):
    it = iter(seq)
    win = [next(it) for cnt in range(window_size)]  # First window
    yield win
    for e in it:  # Subsequent windows
        win[:-1] = win[1:]
        win[-1] = e
        yield win

if __name__ == "__main__":
    for w in rolling_window(range(6), 3):
        print(w)

"""Example output:

[0, 1, 2]
[1, 2, 3]
[2, 3, 4]
[3, 4, 5]
"""
``````

This question is related to `python` `algorithm`

## The answer is

``````
# Importing the numpy library
import numpy as np

arr = np.arange(6)  # Sequence
window_size = 3
np.lib.stride_tricks.as_strided(arr, shape=(len(arr) - window_size + 1, window_size),
                                strides=arr.strides * 2)

"""Example output:

[0, 1, 2]
[1, 2, 3]
[2, 3, 4]
[3, 4, 5]
"""
``````

This seems tailor-made for a `collections.deque` since you essentially have a FIFO (add to one end, remove from the other). However, even if you use a `list` you shouldn't be slicing twice; instead, you should probably just `pop(0)` from the list and `append()` the new item.

Here is an optimized deque-based implementation patterned after your original:

``````
from collections import deque

def window(seq, n=2):
    it = iter(seq)
    win = deque((next(it, None) for _ in range(n)), maxlen=n)
    yield win
    append = win.append
    for e in it:
        append(e)
        yield win
``````

In my tests it handily beats everything else posted here most of the time, though pillmuncher's `tee` version beats it for large iterables and small windows. On larger windows, the `deque` pulls ahead again in raw speed.

Access to individual items in the `deque` may be faster or slower than with lists or tuples. (Items near the beginning are faster, or items near the end if you use a negative index.) I put a `sum(w)` in the body of my loop; this plays to the deque's strength (iterating from one item to the next is fast, so this loop ran a full 20% faster than the next fastest method, pillmuncher's). When I changed it to individually look up and add items in a window of ten, the tables turned and the `tee` method was 20% faster. I was able to recover some speed by using negative indexes for the last five terms in the addition, but `tee` was still a little faster. Overall I would estimate that either one is plenty fast for most uses, and if you need a little more performance, profile and pick the one that works best.
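To reproduce that kind of comparison yourself, here is a minimal timing sketch. The functions `window_deque` and `window_tee` are hypothetical stand-ins written for this example (not the exact code that was benchmarked above), and the two loop bodies only illustrate "iterate over the window" versus "index into the window". Absolute numbers will vary by machine and Python version.

```python
from collections import deque
from itertools import islice, tee
from timeit import timeit

def window_deque(seq, n=2):
    # Stand-in for the deque approach: one deque is reused for every window.
    it = iter(seq)
    win = deque(islice(it, n), maxlen=n)
    if len(win) < n:
        return
    yield win
    for e in it:
        win.append(e)
        yield win

def window_tee(iterable, n=2):
    # Stand-in for the tee approach: n staggered copies zipped together.
    iters = tee(iterable, n)
    for i, it in enumerate(iters):
        next(islice(it, i, i), None)  # advance copy i by i items
    return zip(*iters)

def total_by_iteration(make_windows, data, n):
    # Loop body that iterates over each window (sum walks item to item).
    return sum(sum(w) for w in make_windows(data, n))

def total_by_indexing(make_windows, data, n):
    # Loop body that looks items up by position instead.
    return sum(w[0] + w[-1] for w in make_windows(data, n))

if __name__ == "__main__":
    data = list(range(2000))
    for name, fn in [("deque", window_deque), ("tee", window_tee)]:
        for label, body in [("iterate", total_by_iteration),
                            ("index", total_by_indexing)]:
            seconds = timeit(lambda: body(fn, data, 10), number=5)
            print(f"{name:5s} {label:7s} {seconds:.4f}s")
```

Swap in your own loop body before drawing conclusions; as noted above, which version wins depends on how the window is consumed.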

Let's make it lazy!

``````
from itertools import islice, tee

def window(iterable, size):
    iterators = tee(iterable, size)
    iterators = [islice(iterator, i, None) for i, iterator in enumerate(iterators)]
    yield from zip(*iterators)

list(window(range(5), 3))
# [(0, 1, 2), (1, 2, 3), (2, 3, 4)]
``````

Optimized Function for sliding window data in Deep learning

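``````
import numpy as np

def SlidingWindow(X, window_length, stride):
    # Number of full windows that fit in X when stepping by `stride`
    n_windows = (len(X) - window_length) // stride + 1
    indexer = np.arange(window_length)[None, :] + stride * np.arange(n_windows)[:, None]
    return X.take(indexer)
``````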

to apply on multidimensional array

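``````
import numpy as np

def SlidingWindow(X, window_length, stride1):
    # X has shape (rows, features); take() indexes into the flattened array,
    # so window length and stride are both scaled by the number of features.
    stride = X.shape[1] * stride1
    n_windows = (len(X) - window_length) // stride1 + 1
    window_length = window_length * X.shape[1]
    indexer = np.arange(window_length)[None, :] + stride * np.arange(n_windows)[:, None]
    return X.take(indexer)
``````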

A slightly modified version of the deque window, to make it a true rolling window: it starts populated with just one element, grows to its maximum window size, and then shrinks as its left edge nears the end:

``````
from collections import deque

def window(seq, n=2):
    it = iter(seq)
    win = deque((next(it, None) for _ in range(1)), maxlen=n)
    yield win
    append = win.append
    for e in it:
        append(e)
        yield win
    for _ in range(len(win) - 1):
        win.popleft()
        yield win

for wnd in window(range(5), n=3):
    print(list(wnd))
``````

this gives

``````
[0]
[0, 1]
[0, 1, 2]
[1, 2, 3]
[2, 3, 4]
[3, 4]
[4]
``````

``````
def rolling_window(seq, degree):
    # seq must support len() and indexing (e.g. a list)
    for i in range(len(seq) - degree + 1):
        yield [seq[i + o] for o in range(degree)]
``````

Made this for a rolling average function
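As an illustration of the rolling-average use case, here is a minimal sketch. The name `rolling_average` and the running-sum approach are assumptions of this example, not the poster's code: instead of rebuilding each window, it keeps a `deque` plus a running total, so each step costs O(1) regardless of window size.

```python
from collections import deque

def rolling_average(iterable, size):
    """Yield the mean of each full window of `size` consecutive items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    win = deque(maxlen=size)
    total = 0.0
    for x in iterable:
        if len(win) == size:
            total -= win[0]  # this value is about to be evicted by append()
        win.append(x)
        total += x
        if len(win) == size:
            yield total / size

print(list(rolling_average([1, 2, 3, 4, 5], 3)))  # [2.0, 3.0, 4.0]
```

With floating-point input the running total can accumulate rounding error over very long streams; recomputing `sum(win)` per window avoids that at the cost of O(size) work per step.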

There is a library which does exactly what you need:

``````
import more_itertools
list(more_itertools.windowed([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15],n=3, step=3))

Out: [(1, 2, 3), (4, 5, 6), (7, 8, 9), (10, 11, 12), (13, 14, 15)]
``````

Modified DiPaolo's answer to allow arbitrary fill and variable step size

``````
import itertools

def window(seq, n=2, step=1, fill=None, keep=0):
    "Returns a sliding window (of width n) over data from the iterable"
    "   s -> (s0,s1,...s[n-1]), (s1,s2,...,sn), ...                   "
    it = iter(seq)
    result = tuple(itertools.islice(it, n))
    if len(result) == n:
        yield result
    while True:
        # for elem in it:
        elem = tuple(next(it, fill) for _ in range(step))
        result = result[step:] + elem
        if elem[-1] is fill:
            if keep:
                yield result
            break
        yield result
``````

How about using the following:

``````
mylist = [1, 2, 3, 4, 5, 6, 7]

def sliding_window(l, window_size=2):
    if window_size > len(l):
        raise ValueError("Window size must be smaller or equal to the number of elements in the list.")

    t = []
    for i in range(0, window_size):
        t.append(l[i:])

    return list(zip(*t))

print(sliding_window(mylist, 3))
``````

Output:

``````
[(1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 6), (5, 6, 7)]
``````

This is an old question but for those still interested there is a great implementation of a window slider using generators in this page (by Adrian Rosebrock).

It is an implementation for OpenCV; however, you can easily use it for any other purpose. For the eager ones I'll paste the code here, but to understand it better I recommend visiting the original page.

``````
def sliding_window(image, stepSize, windowSize):
    # slide a window across the image
    for y in range(0, image.shape[0], stepSize):
        for x in range(0, image.shape[1], stepSize):
            # yield the current window
            yield (x, y, image[y:y + windowSize[1], x:x + windowSize[0]])
``````

Tip: You can check the `.shape` of the window when iterating the generator to discard those that do not meet your requirements
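The shape check from the tip might look like the sketch below. It assumes NumPy arrays as images and treats `window_size` as `(width, height)`; the snake_case parameter names and the dummy image are choices made for this example only.

```python
import numpy as np

def sliding_window(image, step_size, window_size):
    """Yield (x, y, window) tuples; window_size is (width, height)."""
    for y in range(0, image.shape[0], step_size):
        for x in range(0, image.shape[1], step_size):
            yield x, y, image[y:y + window_size[1], x:x + window_size[0]]

image = np.zeros((5, 7))   # dummy image: 5 rows (height) x 7 columns (width)
win_w, win_h = 3, 2

# Keep only windows that were not clipped by the image border.
full = [(x, y) for x, y, w in sliding_window(image, 2, (win_w, win_h))
        if w.shape[:2] == (win_h, win_w)]
print(full)  # [(0, 0), (2, 0), (4, 0), (0, 2), (2, 2), (4, 2)]
```

Slicing past the edge of an array silently returns a smaller array instead of raising, which is why the explicit `.shape` comparison is needed.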

Cheers

``````
>>> n, m = 6, 3
>>> k = n - m + 1
>>> print(('{}\n' * k).format(*[list(range(i, i + m)) for i in range(k)]))
[0, 1, 2]
[1, 2, 3]
[2, 3, 4]
[3, 4, 5]
``````

I like `tee()`:

``````
from itertools import tee

def window(iterable, size):
    iters = tee(iterable, size)
    for i in range(1, size):
        for each in iters[i:]:
            next(each, None)
    return zip(*iters)

for each in window(range(6), 3):
    print(list(each))
``````

gives:

``````
[0, 1, 2]
[1, 2, 3]
[2, 3, 4]
[3, 4, 5]
``````

#### Multiple iterators!

``````
def window(seq, size, step=1):
    # initialize iterators (seq must be re-iterable, e.g. a list or range)
    iters = [iter(seq) for i in range(size)]
    try:
        # stagger iterators (without yielding)
        [next(iters[i]) for j in range(size) for i in range(-1, -j-1, -1)]
        while True:
            yield [next(i) for i in iters]
            # next line does nothing for step = 1 (skips iterations for step > 1)
            [next(i) for i in iters for j in range(step-1)]
    except StopIteration:
        return
``````

`next(it)` raises `StopIteration` when the sequence is finished. In older Python versions a `StopIteration` escaping a generator body simply ended the generator; since Python 3.7 (PEP 479) it is turned into a `RuntimeError`, so the code above catches it explicitly and returns, ignoring the leftover values that don't form a full window.

Anyway, this is the least-lines solution yet whose only requirement is that `seq` implement either `__iter__` or `__getitem__` and doesn't rely on `itertools` or `collections` besides @dansalmo's solution :)
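The `StopIteration` behaviour inside generators is easy to get wrong, so here is a small, self-contained sketch of it. The function names `leaky` and `safe` are made up for this example, and it assumes Python 3.7 or later, where PEP 479 is always active.

```python
def leaky():
    it = iter([1])
    while True:
        # The second next() raises StopIteration inside the generator body.
        yield next(it)

def safe():
    it = iter([1])
    while True:
        try:
            yield next(it)
        except StopIteration:
            return  # end the generator cleanly

try:
    print(list(leaky()))
except RuntimeError as exc:
    # Python 3.7+ converts the escaping StopIteration into RuntimeError.
    print("RuntimeError:", exc)

print(list(safe()))  # [1]
```

Any generator-based window that calls `next()` on an inner iterator needs the `try`/`except` form (or `next(it, default)`) to stop cleanly on modern Python.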

I use the following code as a simple sliding window that uses generators to drastically increase readability. Its speed has so far been sufficient for use in bioinformatics sequence analysis in my experience.

I include it here because I didn't see this method used yet. Again, I make no claims about its compared performance.

``````
def slidingWindow(sequence, winSize, step=1):
    """Returns a generator that will iterate through
    the defined chunks of input sequence. Input sequence
    must be sliceable."""

    # Verify the inputs
    if not ((type(winSize) == type(0)) and (type(step) == type(0))):
        raise Exception("**ERROR** type(winSize) and type(step) must be int.")
    if step > winSize:
        raise Exception("**ERROR** step must not be larger than winSize.")
    if winSize > len(sequence):
        raise Exception("**ERROR** winSize must not be larger than sequence length.")

    # Pre-compute number of chunks to emit (integer division so range() accepts it)
    numOfChunks = ((len(sequence) - winSize) // step) + 1

    # Do the work
    for i in range(0, numOfChunks * step, step):
        yield sequence[i:i + winSize]
``````

Just a quick contribution.

Since the current Python docs don't have "window" in the itertools examples (i.e., at the bottom of http://docs.python.org/library/itertools.html), here's a snippet based on the code for `grouper`, which is one of the examples given:

``````
import itertools as it

def window(iterable, size):
    shiftedStarts = [it.islice(iterable, s, None) for s in range(size)]
    return zip(*shiftedStarts)
``````

Basically, we create a series of sliced iterators, each with a starting point one spot further forward. Then, we zip these together. Note that this function returns an iterator (it is not itself a generator function).

Much like the appending-element and advancing-iterator versions above, the performance (i.e., which is best) varies with list size and window size. I like this one because it is a two-liner (it could be a one-liner, but I prefer naming concepts).

It turns out that the above code is wrong. It works if the parameter passed to iterable is a sequence but not if it is an iterator. If it is an iterator, the same iterator is shared (but not tee'd) among the islice calls and this breaks things badly.

Here is some fixed code:

``````
import itertools as it

def window(iterable, size):
    itrs = it.tee(iterable, size)
    shiftedStarts = [it.islice(anItr, s, None) for s, anItr in enumerate(itrs)]
    return zip(*shiftedStarts)
``````
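The failure mode described above can be reproduced with a short sketch. The names `window_shared` and `window_teed` are labels invented for this example; they correspond to the broken and fixed approaches respectively, written against Python 3's `zip`.

```python
from itertools import islice, tee

def window_shared(iterable, size):
    # Every islice wraps the SAME underlying object. That is fine for a
    # sequence (each islice starts its own iteration), but a one-shot
    # iterator gets consumed by all the slices at once.
    return zip(*[islice(iterable, s, None) for s in range(size)])

def window_teed(iterable, size):
    # tee() gives each slice its own independent copy of the stream.
    return zip(*[islice(it, s, None) for s, it in enumerate(tee(iterable, size))])

expected = [(0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 5)]

print(list(window_shared([0, 1, 2, 3, 4, 5], 3)) == expected)        # True
print(list(window_shared(iter([0, 1, 2, 3, 4, 5]), 3)) == expected)  # False
print(list(window_teed(iter([0, 1, 2, 3, 4, 5]), 3)) == expected)    # True
```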

Also, one more version for the books. Instead of copying an iterator and then advancing copies many times, this version makes pairwise copies of each iterator as we move the starting position forward. Thus, iterator t provides both the "complete" iterator with starting point at t and also the basis for creating iterator t + 1:

``````
import itertools as it

def window4(iterable, size):
    complete_itr, incomplete_itr = it.tee(iterable, 2)
    iters = [complete_itr]
    for i in range(1, size):
        next(incomplete_itr)
        complete_itr, incomplete_itr = it.tee(incomplete_itr, 2)
        iters.append(complete_itr)
    return zip(*iters)
``````

Here is a one-liner. I timed it and its performance is comparable to the top answer's, and the gap narrows with larger `seq`: from 20% slower with `len(seq) = 20` to 7% slower with `len(seq) = 10000`.

``````
zip(*[seq[i:(len(seq) - n + 1 + i)] for i in range(n)])
``````

Here's my attempt: a simple, Pythonic one-liner using `islice`. It may not be optimally efficient, though.

``````
from itertools import islice

array = range(0, 10)
window_size = 4
list(map(lambda i: list(islice(array, i, i + window_size)), range(0, len(array) - window_size + 1)))
# output = [[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 6], [4, 5, 6, 7], [5, 6, 7, 8], [6, 7, 8, 9]]
``````

Explanation: create each window with an `islice` of length `window_size`, and use `map` to repeat this for every start position in the array.

why not

``````
from itertools import tee

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)
``````

It is documented in the Python docs. You can easily extend it to a wider window.

Here's a generalization that adds support for `step`, `fillvalue` parameters:

``````
from collections import deque
from itertools import islice

def sliding_window(iterable, size=2, step=1, fillvalue=None):
    if size < 0 or step < 1:
        raise ValueError
    it = iter(iterable)
    q = deque(islice(it, size), maxlen=size)
    if not q:
        return  # empty iterable or size == 0
    q.extend(fillvalue for _ in range(size - len(q)))  # pad to size
    while True:
        yield iter(q)  # iter() to avoid accidental outside modifications
        try:
            q.append(next(it))
        except StopIteration:  # Python 3.5 pep 479 support
            return
        q.extend(next(it, fillvalue) for _ in range(step - 1))
``````

It yields chunks of `size` items at a time, rolling `step` positions per iteration and padding each chunk with `fillvalue` if necessary. Example for `size=4, step=3, fillvalue='*'`:

``````
 [a b c d]e f g h i j k l m n o p q r s t u v w x y z
  a b c[d e f g]h i j k l m n o p q r s t u v w x y z
  a b c d e f[g h i j]k l m n o p q r s t u v w x y z
  a b c d e f g h i[j k l m]n o p q r s t u v w x y z
  a b c d e f g h i j k l[m n o p]q r s t u v w x y z
  a b c d e f g h i j k l m n o[p q r s]t u v w x y z
  a b c d e f g h i j k l m n o p q r[s t u v]w x y z
  a b c d e f g h i j k l m n o p q r s t u[v w x y]z
  a b c d e f g h i j k l m n o p q r s t u v w x[y z * *]
``````

For an example of use case for the `step` parameter, see Processing a large .txt file in python efficiently.

Just to show how you can combine `itertools` recipes, I'm extending the `pairwise` recipe as directly as possible back into the `window` recipe using the `consume` recipe:

``````
import collections
from itertools import islice, tee

def consume(iterator, n):
    "Advance the iterator n-steps ahead. If n is none, consume entirely."
    # Use functions that consume iterators at C speed.
    if n is None:
        # feed the entire iterator into a zero-length deque
        collections.deque(iterator, maxlen=0)
    else:
        # advance to the empty slice starting at position n
        next(islice(iterator, n, n), None)

def window(iterable, n=2):
    "s -> (s0, ...,s(n-1)), (s1, ...,sn), (s2, ..., s(n+1)), ..."
    iters = tee(iterable, n)
    # Could use enumerate(islice(iters, 1, None), 1) to avoid consume(it, 0), but that's
    # slower for larger window sizes, while saving only small fixed "noop" cost
    for i, it in enumerate(iters):
        consume(it, i)
    return zip(*iters)
``````

The `window` recipe is the same as for `pairwise`, it just replaces the single element "consume" on the second `tee`-ed iterator with progressively increasing consumes on `n - 1` iterators. Using `consume` instead of wrapping each iterator in `islice` is marginally faster (for sufficiently large iterables) since you only pay the `islice` wrapping overhead during the `consume` phase, not during the process of extracting each window-ed value (so it's bounded by `n`, not the number of items in `iterable`).

Performance-wise, compared to some other solutions, this is pretty good (and better than any of the other solutions I tested as it scales). Tested on Python 3.5.0, Linux x86-64, using `ipython` `%timeit` magic.

kindall's `deque` solution, tweaked for performance/correctness by using `islice` instead of a home-rolled generator expression and testing the resulting length so it doesn't yield results when the iterable is shorter than the window, as well as passing the `maxlen` of the `deque` positionally instead of by keyword (makes a surprising difference for smaller inputs):

``````
>>> %timeit -r5 deque(windowkindall(range(10), 3), 0)
100000 loops, best of 5: 1.87 µs per loop
>>> %timeit -r5 deque(windowkindall(range(1000), 3), 0)
10000 loops, best of 5: 72.6 µs per loop
>>> %timeit -r5 deque(windowkindall(range(1000), 30), 0)
1000 loops, best of 5: 71.6 µs per loop
``````

Same as previous adapted kindall solution, but with each `yield win` changed to `yield tuple(win)` so storing results from the generator works without all stored results really being a view of the most recent result (all other reasonable solutions are safe in this scenario), and adding `tuple=tuple` to the function definition to move use of `tuple` from the `B` in `LEGB` to the `L`:

``````
>>> %timeit -r5 deque(windowkindalltupled(range(10), 3), 0)
100000 loops, best of 5: 3.05 µs per loop
>>> %timeit -r5 deque(windowkindalltupled(range(1000), 3), 0)
10000 loops, best of 5: 207 µs per loop
>>> %timeit -r5 deque(windowkindalltupled(range(1000), 30), 0)
1000 loops, best of 5: 348 µs per loop
``````

`consume`-based solution shown above:

``````
>>> %timeit -r5 deque(windowconsume(range(10), 3), 0)
100000 loops, best of 5: 3.92 µs per loop
>>> %timeit -r5 deque(windowconsume(range(1000), 3), 0)
10000 loops, best of 5: 42.8 µs per loop
>>> %timeit -r5 deque(windowconsume(range(1000), 30), 0)
1000 loops, best of 5: 232 µs per loop
``````

Same as `consume`, but inlining `else` case of `consume` to avoid function call and `n is None` test to reduce runtime, particularly for small inputs where the setup overhead is a meaningful part of the work:

``````
>>> %timeit -r5 deque(windowinlineconsume(range(10), 3), 0)
100000 loops, best of 5: 3.57 µs per loop
>>> %timeit -r5 deque(windowinlineconsume(range(1000), 3), 0)
10000 loops, best of 5: 40.9 µs per loop
>>> %timeit -r5 deque(windowinlineconsume(range(1000), 30), 0)
1000 loops, best of 5: 211 µs per loop
``````

(Side-note: A variant on `pairwise` that uses `tee` with the default argument of 2 repeatedly to make nested `tee` objects, so any given iterator is only advanced once rather than independently consumed an increasing number of times (similar to MrDrFenner's answer), performs similarly to non-inlined `consume` and is slower than the inlined `consume` on all tests, so I've omitted those results for brevity.)

As you can see, if you don't care about the possibility of the caller needing to store results, my optimized version of kindall's solution wins most of the time, except in the "large iterable, small window size case" (where inlined `consume` wins); it degrades quickly as the iterable size increases, while not degrading at all as the window size increases (every other solution degrades more slowly for iterable size increases, but also degrades for window size increases). It can even be adapted for the "need tuples" case by wrapping in `map(tuple, ...)`, which runs ever so slightly slower than putting the tupling in the function, but it's trivial (takes 1-5% longer) and lets you keep the flexibility of running faster when you can tolerate repeatedly returning the same value.

If you need safety against returns being stored, inlined `consume` wins on all but the smallest input sizes (with non-inlined `consume` being slightly slower but scaling similarly). The `deque` & tupling based solution wins only for the smallest inputs, due to smaller setup costs, and the gain is small; it degrades badly as the iterable gets longer.

For the record, the adapted version of kindall's solution that `yield`s `tuple`s I used was:

``````
from collections import deque
from itertools import islice

def windowkindalltupled(iterable, n=2, tuple=tuple):
    it = iter(iterable)
    win = deque(islice(it, n), n)
    if len(win) < n:
        return
    append = win.append
    yield tuple(win)
    for e in it:
        append(e)
        yield tuple(win)
``````

Drop the caching of `tuple` in the function definition line and the use of `tuple` in each `yield` to get the faster but less safe version.
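To make the "less safe" part concrete, here is a minimal sketch of what goes wrong when results from the non-tupled version are stored. `window_view` is a name made up for this example; it follows the description above (same `deque` yielded every time) rather than reproducing the exact benchmarked function.

```python
from collections import deque
from itertools import islice

def window_view(iterable, n=2):
    # Fast variant: yields the SAME deque object on every iteration.
    it = iter(iterable)
    win = deque(islice(it, n), n)
    if len(win) < n:
        return
    yield win
    for e in it:
        win.append(e)
        yield win

# Storing the yielded objects keeps several references to one deque,
# so every stored "window" shows the final state.
stored = list(window_view(range(5), 3))
print([list(w) for w in stored])   # [[2, 3, 4], [2, 3, 4], [2, 3, 4]]

# Snapshotting each window as it is produced avoids the aliasing.
snapshots = list(map(tuple, window_view(range(5), 3)))
print(snapshots)                   # [(0, 1, 2), (1, 2, 3), (2, 3, 4)]
```

The `map(tuple, ...)` wrapper works because `map` is lazy: each window is copied before the generator is advanced to the next one.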

I tested a few solutions and one I came up with and found the one I came up with to be the fastest so I thought I would share it.

``````
import itertools
import sys

def windowed(l, stride):
    return zip(*[itertools.islice(l, i, sys.maxsize) for i in range(stride)])
``````

``````
def GetShiftingWindows(thelist, size):
    return [thelist[x:x+size] for x in range(len(thelist) - size + 1)]

>>> a = [1, 2, 3, 4, 5]
>>> GetShiftingWindows(a, 3)
[[1, 2, 3], [2, 3, 4], [3, 4, 5]]
``````