Linked Structures

Agenda

  1. Motives
  2. Objectives
  3. Mechanisms

1. Motives

In [41]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
from timeit import timeit

def time_array_front_insert_delete(n):
    return timeit('lst.insert(0, None) ; del lst[0]',
                  'lst = list(range({}))'.format(n),
                  number=1000)

ns = np.linspace(100, 10000, 50)
plt.plot(ns, [time_array_front_insert_delete(int(n)) for n in ns], 'ro')
plt.show()

Problem: arrays are not efficient for insertions & deletions (except for at the end).

... and these types of operations can be critical for structures where, say, some internal ordering must be preserved while adding elements (e.g., a database whose records are sorted/searchable by key)

In [46]:
# consider:

def concatenate(arr1, arr2):
    """Concatenates the contents of arr1 and arr2 as efficiently (time-wise)
    as possible, so that the resulting structure can be used to index all
    combined elements (arr1's followed by arr2's)."""

    # option 1:
    for x in arr2:
        arr1.append(x)
    return arr1

    # option 2:
    arr1.extend(arr2)
    return arr1

    # option 3:
    return arr1 + arr2

All options for concatenate are = $O(n)$, where $n$ is the number of elements involved.

Problem: all elements must be contiguous (in memory) in an array, as they are indexed from a common base address. No simple way to "bridge" the contents of two disjoint arrays --- i.e., no way to make arr1[len(arr1)] somehow "jump" to arr2[0].

... this makes it so separately allocated, large structures cannot be merged or uniformly indexed efficiently (without first performing an expensive "defragmentation" operation to physically move the structures next to each other).

2. Objectives

We need a new data storage mechanism for constructing data structures that:

  • does not require monolithic, contiguous memory allocation,
  • allows individual elements to be flexibly and efficiently reorganized,
  • and preserves the ability to locate (e.g., via position) and iterate over elements

3. Mechanisms

3.1. Two-Element Lists

In [54]:
# data items
i1 = 'lions'
i2 = 'tigers'
i3 = 'bears'
i4 = 'oh, my'
In [55]:
link1 = [i1, None]
link2 = [i2, None]
link3 = [i3, None]
link4 = [i4, None]
In [56]:
link1[1] = link2
link2[1] = link3
link3[1] = link4
In [57]:
def link_iter(head):
    while head:
        yield head[0]
        head = head[1]
In [58]:
for i in link_iter(link1):
    print(i)
lions
tigers
bears
oh, my
In [59]:
link0 = ['walruses', link1]
In [60]:
for i in link_iter(link0):
    print(i)
walruses
lions
tigers
bears
oh, my
In [61]:
link2[1] = ['goats', link2[1]]
In [62]:
for i in link_iter(link0):
    print(i)
walruses
lions
tigers
goats
bears
oh, my

3.2. "Link" objects

In [125]:
class Link:
    def __init__(self, val, next=None):
        self.val = val
        self.next = next
In [68]:
head = Link(i4)
head = Link(i3, head)
head = Link(i2, head)
head = Link(i1, head)
In [69]:
def linked_list_iter(lst):
    while lst:
        yield lst.val
        lst = lst.next
In [70]:
for i in linked_list_iter(linked_lst):
    print(i)
lions
tigers
bears
oh, my
In [77]:
# recursive implementation (note: recursive call obviates need for the loop)
def linked_list_iter(lst):
    if lst:
        yield lst.val
        yield from linked_list_iter(lst.next)
In [76]:
for i in linked_list_iter(linked_lst):
    print(i)
lions
tigers
bears
oh, my
In [91]:
class LinkedList:
    def __init__(self):
        self.head = None
        
    def prepend(self, val):
        self.head = Link(val, self.head)
        
    def __iter__(self):
        cursor = self.head
        while cursor:
            yield cursor.val
            cursor = cursor.next
            
    def __repr__(self):
        return '[' + ', '.join(str(x) for x in self) + ']'
In [94]:
lst = LinkedList()
for i in range(10):
    lst.prepend(i)
In [95]:
lst
Out[95]:
[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
In [99]:
lst2 = LinkedList()
for i in range(10, 20):
    lst2.prepend(i)

p = lst2.head
while p.next: # advance p to last node in lst2
    p = p.next

p.next = lst.head
In [100]:
lst2
Out[100]:
[19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
In [101]:
class BinaryLink:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right
In [106]:
root = BinaryLink(10)
root.left = BinaryLink(20)
root.left.left = BinaryLink(15)
root.right = BinaryLink(30)
root.right.right = BinaryLink(45)
In [107]:
def tree_iter(root):
    if root:
        yield root.val
        yield from tree_iter(root.left)
        yield from tree_iter(root.right)
In [108]:
for i in tree_iter(root):
    print(i)
10
20
15
30
45
In [119]:
class NaryLink:
    def __init__(self, val, n=2):
        self.val = val
        self.children = [None] * n
        
    def __getitem__(self, idx):
        return self.children[idx]
    
    def __setitem__(self, idx, val):
        self.children[idx] = val
        
    def __iter__(self):
        for c in self.children:
            yield c
In [124]:
root = NaryLink('Kingdoms', n=5)

root[0] = NaryLink('Animalia', n=35)
root[1] = NaryLink('Plantae', n=12)
root[2] = NaryLink('Fungi', n=7)
root[3] = NaryLink('Protista', n=5)
root[4] = NaryLink('Monera', n=5)

root[2][0] = NaryLink('Chytridiomycota')
root[2][1] = NaryLink('Blastocladiomycota')
root[2][2] = NaryLink('Glomeromycota')
root[2][3] = NaryLink('Ascomycota')
root[2][4] = NaryLink('Basidiomycota')
root[2][5] = NaryLink('Microsporidia')
root[2][6] = NaryLink('Neocallimastigomycota')

def tree_iter(root):
    if root:
        yield root.val
        for c in root:
            yield from tree_iter(c)
In [123]:
for x in tree_iter(root):
    print(x)
Kingdoms
Animalia
Plantae
Fungi
Chytridiomycota
Blastocladiomycota
Glomeromycota
Ascomycota
Basidiomycota
Microsporidia
Neocallimastigomycota
Protista
Monera