Implementing Iteration

Agenda

  1. Review: Iteration
  2. Details: iterables, iterators, iter, and next
  3. Implementing iterators with classes
  4. Implementing iterators with generators and yield

1. Review: Iteration

Iteration simply refers to the process of accessing — one by one — the items stored in some container. The order of the items, and whether or not the iteration is comprehensive, depends on the container.

In Python, we typically perform iteration using the for loop.

In [1]:
# e.g., iterating over a list
l = [2**x for x in range(10)]
for n in l:
    print(n)
1
2
4
8
16
32
64
128
256
512
In [2]:
# e.g., iterating over the key-value pairs in a dictionary
d = {x:2**x for x in range(10)}
for k,v in d.items():
    print(k, '=>', v)
0 => 1
1 => 2
2 => 4
3 => 8
4 => 16
5 => 32
6 => 64
7 => 128
8 => 256
9 => 512

2. Details: iterables, iterators, iter, and next

We can iterate over anything that is iterable. Intuitively, if something can be used as the source of items in a for loop, it is iterable.

But how does a for loop really work?

In [6]:
# by default, only the result of the last expression in a cell is displayed after evaluation.
# the following forces display of *all* self-standing expressions in a cell.

from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"
In [7]:
l = [2**x for x in range(10)]
l
it = l.__iter__()
it
Out[7]:
[1, 2, 4, 8, 16, 32, 64, 128, 256, 512]
Out[7]:
<list_iterator at 0x59765f8>
In [18]:
it.__next__()
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-18-a1cc91b7bc73> in <module>()
----> 1 it.__next__()

StopIteration: 
In [1]:
l = [2**x for x in range(10)]
l
it = l.__iter__()
#for i in l:
#    print(i)
for i in it:
    print(i)
1
2
4
8
16
32
64
128
256
512

3. Implementing iterators with classes

In [2]:
class MyIterator:
    def __init__(self, max):
        self.max = max
        self.curr = 0
        
    # the following methods are required for iterator objects
    
    def __next__(self):
        if self.curr<self.max:
            self.curr+=1
            return self.curr
        else:
            raise StopIteration
    
    def __iter__(self):
        return self
In [3]:
it = MyIterator(5)
In [9]:
next(it)
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-9-2cdb14c0d4d6> in <module>()
----> 1 next(it)

<ipython-input-2-09d2a38a5bd5> in __next__(self)
     11             return self.curr
     12         else:
---> 13             raise StopIteration
     14 
     15     def __iter__(self):

StopIteration: 
In [10]:
it = MyIterator(5)
while True:
    try:
        print(next(it))
    except StopIteration:
        break
1
2
3
4
5
In [11]:
it = MyIterator(5)
for i in it:
    print(i)
1
2
3
4
5

For a container type, we need to implement an __iter__ method that returns an iterator.

In [12]:
class ArrayList:
    def __init__(self):
        self.data = []
        
    def append(self, val):
        self.data.append(None)
        self.data[len(self.data)-1] = val
        
    def __iter__(self):
        class ArrayListIterator:
            def __init__(self,lst):
                self.idx=0
                self.lst=lst
        
            def __next__(self):
                if self.idx<len(self.lst):
                    self.idx+=1
                    return self.lst[self.idx-1]
                else:
                    raise StopIteration
    
            def __iter__(self):
                return self
        
        return ArrayListIterator(self.data) 
    
    
In [13]:
l = ArrayList()
for x in range(10):
    l.append(2**x)
In [14]:
it = iter(l)
In [15]:
type(it)
Out[15]:
__main__.ArrayList.__iter__.<locals>.ArrayListIterator
In [18]:
next(it)
Out[18]:
4
In [19]:
for x in l:
    print(x)
1
2
4
8
16
32
64
128
256
512

4. Implementing iterators with generators and yield

What's a "generator"?

In [25]:
g = (2**x for x in range(10))
In [21]:
g
Out[21]:
<generator object <genexpr> at 0x0000000006DE3678>
In [22]:
len(g)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-22-baf0f4fee012> in <module>()
----> 1 len(g)

TypeError: object of type 'generator' has no len()
In [29]:
for x in g:
    print(x)
In [28]:
g
Out[28]:
<generator object <genexpr> at 0x0000000006BB5F10>
In [32]:
%timeit -n 10000 [x for x in range(10**3)]
40.4 µs ± 2.67 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [31]:
%timeit -n 10000 (x for x in range(10**10))
1.18 µs ± 152 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [33]:
%timeit 1024 in [2**x for x in range(100)]
57.3 µs ± 3.34 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [34]:
%timeit 1024 in (2**x for x in range(100))
5.62 µs ± 73.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [43]:
def foo():
    print('foo')
    yield 1
    yield 3
    yield -1
In [44]:
g=foo()
In [48]:
next(g)
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-48-5f315c5de15b> in <module>()
----> 1 next(g)

StopIteration: 
In [55]:
class ArrayList:
    def __init__(self):
        self.data = []
        
    def append(self, val):
        self.data.append(None)
        self.data[len(self.data)-1] = val
        
    def __iter__(self):
#        for i in range(len(self.data)):
#            yield self.data[i] 
        for item in self.data:
            yield item
    
In [56]:
l = ArrayList()
for x in range(10):
    l.append(2**x)
In [58]:
for i in l:
    print(i)
1
2
4
8
16
32
64
128
256
512
In [59]:
it = iter(l)
it
Out[59]:
<generator object ArrayList.__iter__ at 0x0000000006DE3BF8>
In [61]:
for x in it:
    print(x)
1
2
4
8
16
32
64
128
256
512