Trees
Definition
- A tree is a (possibly empty) set of nodes
- Nodes are connected by parent / child relationship
- If there are any nodes, one node is designated as the root
- Any children of the root are roots of subtrees
- Recursive data structure : a tree contains smaller trees
Terminology
- Node with children: internal node
- Node without children: leaf node
- All children of a given node are siblings of each other
- A path: a sequence of nodes N_1, ..., N_k such that N_i is the parent of N_(i+1) for 1 <= i < k
- Path N_1, ..., N_k has length k-1 (the number of edges on the path)
- If a path exists from N_1 to N_2, then N_1 is an ancestor of N_2, and N_2 is a descendant of N_1
- Depth of a node: length of path from root to the node
Depth of a tree is the maximum depth of any node
- Height of a node: length of longest path from the node to any leaf
Height of a tree is the height of the root
Representing the Tree Structure
- Each node contains data
- Must also represent relationships between nodes
- If we have an n-ary tree (each node has at most n children), each node could contain links to all its children (could be null)
- If we have an arbitrary tree, this is inefficient (list of child links?)
- Instead: First-Child, Next-Sibling representation
- All relationships can be represented via two pointers in each node
- Can add parent pointers for time efficiency
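A minimal sketch of the First-Child, Next-Sibling idea, assuming Python. The names Node, add_child, and children are illustrative and not from these notes. Each node stores only two links plus an optional parent link, yet any number of children can be reached by walking the sibling chain.

```python
class Node:
    """Tree node in First-Child, Next-Sibling form (illustrative names)."""

    def __init__(self, data):
        self.data = data
        self.first_child = None   # leftmost child, or None for a leaf
        self.next_sibling = None  # next child of the same parent, or None
        self.parent = None        # optional link, kept for time efficiency


def add_child(parent, child):
    """Append child as the last child of parent."""
    child.parent = parent
    if parent.first_child is None:
        parent.first_child = child
        return
    last = parent.first_child
    while last.next_sibling is not None:
        last = last.next_sibling
    last.next_sibling = child


def children(node):
    """Yield the children of node by walking the sibling chain."""
    c = node.first_child
    while c is not None:
        yield c
        c = c.next_sibling


root, b, c, d = Node("A"), Node("B"), Node("C"), Node("D")
add_child(root, b)
add_child(root, c)
add_child(b, d)
print([n.data for n in children(root)])  # ['B', 'C']
```

Appending walks the sibling chain, so add_child costs time proportional to the number of existing children; keeping a last-child link would be one way to avoid that.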
Tree Traversals
- Breadth-First
Visit all nodes at a given depth before any at a greater depth
First root, then children of root, then grandchildren of root, etc.
Can use a (single-ended) queue here
Enqueue root
while ( queue is not empty )
    dequeue and visit node n
    enqueue n's children
Could perform any operation when each node is visited
- Depth-First
Visit all nodes on a ( rightmost? leftmost? ) path
Backtrack at path's end; find another way to finish path
Could use a stack here; we'll use call stack (recursion)
Initial call: DFS ( root )
DFS ( node n )
    for ( each child c of n )
        DFS ( c )
    visit n
Could perform any operation when each node is visited
- Runtime analysis
Regardless of the ordering, each node is visited exactly once
Constant amount of work done for each node (excluding any secondary action at visit)
Enqueue or dequeue in constant time; make function call in constant time
So a full traversal of n nodes takes O(n) time
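The two traversals above can be sketched in Python as follows. This is only an illustration: the Node class with a plain list of children and the visit callback are assumptions made for the example. The DFS visits a node after its children, matching the pseudocode.

```python
from collections import deque


class Node:
    """General tree node; children kept in a list (an assumed representation)."""

    def __init__(self, data, children=None):
        self.data = data
        self.children = children or []


def bfs(root, visit):
    """Breadth-first: visit every node at one depth before any deeper node."""
    if root is None:
        return
    queue = deque([root])
    while queue:
        n = queue.popleft()       # dequeue and visit node n
        visit(n)
        queue.extend(n.children)  # enqueue n's children


def dfs(n, visit):
    """Depth-first via the call stack; each node is visited after its children."""
    if n is None:
        return
    for c in n.children:
        dfs(c, visit)
    visit(n)


tree = Node("A", [Node("B", [Node("D"), Node("E")]), Node("C")])

order = []
bfs(tree, lambda n: order.append(n.data))
print(order)  # ['A', 'B', 'C', 'D', 'E']

order = []
dfs(tree, lambda n: order.append(n.data))
print(order)  # ['D', 'E', 'B', 'C', 'A']
```

Each node is enqueued and dequeued once in bfs and is the argument of exactly one dfs call, which is where the linear running time comes from.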
Other Tree Functions
- CountNodes
can use BFS or DFS with counter parameter; O(n) time
- Depth
can use DFS with a growing / shrinking depth, plus maximum depth
- Height
can use BFS, enqueueing a partial height with each node
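A possible Python sketch of these three functions. The Node class, the function names, and the choice to return -1 for an empty tree are assumptions for illustration. Depth uses a DFS whose depth argument grows on the way down and shrinks on return; height uses a BFS that enqueues each node together with its distance from the root.

```python
from collections import deque


class Node:
    """General tree node; children kept in a list (an assumed representation)."""

    def __init__(self, data, children=None):
        self.data = data
        self.children = children or []


def count_nodes(n):
    """Count nodes with a DFS; O(n) time."""
    if n is None:
        return 0
    return 1 + sum(count_nodes(c) for c in n.children)


def tree_depth(root):
    """Maximum depth of any node, tracked during a DFS. Empty tree: -1 (assumed)."""
    max_depth = -1

    def walk(n, depth):
        nonlocal max_depth
        max_depth = max(max_depth, depth)
        for c in n.children:
            walk(c, depth + 1)  # depth grows on descent, shrinks on return

    if root is not None:
        walk(root, 0)
    return max_depth


def tree_height(root):
    """Height of the root via BFS; each queue entry carries a partial height."""
    if root is None:
        return -1  # assumed convention for an empty tree
    height = 0
    queue = deque([(root, 0)])
    while queue:
        n, level = queue.popleft()
        height = max(height, level)
        for c in n.children:
            queue.append((c, level + 1))
    return height


tree = Node("A", [Node("B", [Node("D"), Node("E")]), Node("C")])
print(count_nodes(tree), tree_depth(tree), tree_height(tree))  # 5 2 2
```

The depth and height of a whole tree agree, since both equal the length of the longest root-to-leaf path.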
Binary Trees
- Each node has 0, 1, or 2 children
- Full binary tree: each node has 0 or 2 children
- Complete binary tree: full tree with bottom level completely filled; exactly 2^(height+1) - 1 nodes, with height counted in edges as defined under Terminology
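As a quick check of these definitions, the sketch below (Python, with hypothetical helper names) builds trees whose every level is completely filled and compares the node count against 2^(height+1) - 1, where height is counted in edges as in the Terminology section, so a single node has height 0.

```python
class BinaryNode:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def height(n):
    """Height in edges; an empty subtree is given height -1 (assumed convention)."""
    if n is None:
        return -1
    return 1 + max(height(n.left), height(n.right))


def count(n):
    """Number of nodes in the subtree rooted at n."""
    return 0 if n is None else 1 + count(n.left) + count(n.right)


def is_full(n):
    """True if every node has either 0 or 2 children."""
    if n is None:
        return True
    if (n.left is None) != (n.right is None):
        return False  # exactly one child
    return is_full(n.left) and is_full(n.right)


def build_filled(h, label=1):
    """Build a binary tree of height h with every level completely filled."""
    if h < 0:
        return None
    return BinaryNode(label,
                      build_filled(h - 1, 2 * label),
                      build_filled(h - 1, 2 * label + 1))


for h in range(4):
    t = build_filled(h)
    # height, actual node count, predicted node count, full?
    print(h, count(t), 2 ** (h + 1) - 1, is_full(t))
```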
Binary Tree Traversals
- Postorder Traversal
Traverse on a node's ( left then right ) children before visiting the node
Just like DFS
- Preorder Traversal
Visit a node, then traverse on its ( left then right ) children
Simple recursive algorithm
- Inorder Traversal
Traverse on a node's left child, visit the node, then traverse on the node's right child
Simple recursive algorithm
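The three orders differ only in where the visit falls relative to the two recursive calls. A short Python sketch, with an assumed BinaryNode class and a visit callback chosen for illustration:

```python
class BinaryNode:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def postorder(n, visit):
    """Left subtree, right subtree, then the node."""
    if n is None:
        return
    postorder(n.left, visit)
    postorder(n.right, visit)
    visit(n)


def preorder(n, visit):
    """The node, then left subtree, then right subtree."""
    if n is None:
        return
    visit(n)
    preorder(n.left, visit)
    preorder(n.right, visit)


def inorder(n, visit):
    """Left subtree, the node, then right subtree."""
    if n is None:
        return
    inorder(n.left, visit)
    visit(n)
    inorder(n.right, visit)


#       4
#     /   \
#    2     6
#   / \   /
#  1   3 5
tree = BinaryNode(4,
                  BinaryNode(2, BinaryNode(1), BinaryNode(3)),
                  BinaryNode(6, BinaryNode(5)))

for walk in (preorder, inorder, postorder):
    seen = []
    walk(tree, lambda n: seen.append(n.data))
    print(walk.__name__, seen)
# preorder  [4, 2, 1, 3, 6, 5]
# inorder   [1, 2, 3, 4, 5, 6]
# postorder [1, 3, 2, 5, 6, 4]
```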
Binary Tree Operations
- Create a new, empty tree
Set root pointer to null
- Insert a value into a tree
If previously empty, new element becomes root; set new element's pointers correctly
If tree already had values, find an existing element that can add a child; adjust pointers
Simple: traverse leftmost path until such an element is found; leads to unbalanced trees
Complex: can ensure that each insertion doesn't unbalance the tree "too much"
Could pass in a pointer to the inserted node's new parent
- Search for a value in a tree
If no scheme for ordering data in a tree, must traverse all nodes ( like sequential search )
Preorder, postorder, inorder; all equally efficient
- Remove a value from a tree
Assume pointer to ( parent of? ) node to be removed
Removal creates a "hole" if node had children; ( recursively ) promote a child to fill the gap
Can also "promote" a leaf into the hole
- Destroy a tree
Must destroy every node
Must destroy in depth-first order if no parent pointers
Otherwise, more freedom in order of destruction
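One way these operations might look in Python, as a rough sketch rather than a definitive implementation. All names are hypothetical. Insertion follows the "simple" scheme (walk the leftmost path to the first node with a free child slot), and removal promotes a child's value into the hole recursively, which is one reading of the removal notes above. Python reclaims memory automatically, so destroy only demonstrates the depth-first order.

```python
class BinaryNode:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None


def insert(root, value):
    """Simple insertion: walk the leftmost path to the first node with a free
    child slot. Returns the (possibly new) root."""
    node = BinaryNode(value)
    if root is None:
        return node                      # tree was empty: new node is the root
    cur = root
    while cur.left is not None and cur.right is not None:
        cur = cur.left
    if cur.left is None:
        cur.left = node
    else:
        cur.right = node
    return root


def search(n, value):
    """Unordered data: scan every node (preorder here). Returns node or None."""
    if n is None:
        return None
    if n.data == value:
        return n
    return search(n.left, value) or search(n.right, value)


def remove_node(n):
    """Remove n's value by promoting a child's value into the hole, recursively,
    until a leaf is unlinked. Returns what the parent's link should now hold."""
    child = n.left if n.left is not None else n.right
    if child is None:
        return None                      # leaf: parent simply drops the link
    n.data = child.data                  # promote the child into the gap
    if child is n.left:
        n.left = remove_node(child)
    else:
        n.right = remove_node(child)
    return n


def destroy(n):
    """Depth-first (postorder) teardown: children first, then the node."""
    if n is None:
        return
    destroy(n.left)
    destroy(n.right)
    n.left = n.right = None              # Python's GC reclaims the node itself


root = None                              # create a new, empty tree
for v in [1, 2, 3, 4, 5]:
    root = insert(root, v)
root = remove_node(root)                 # remove the root's value (1)
print(search(root, 1), search(root, 4).data)  # None 4
```

Because remove_node returns the replacement link, the caller needs the parent of the node being removed (or the root variable), which is the reason for the "parent of?" question in the notes.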
Binary Search Trees
- Searching a binary tree is slow; no order to the values
like searching an unordered array
- We can store smaller values on left, larger on right
like organizing an array from smallest to largest
inorder tree walk would give ordered listing of elements
- Binary Search Tree: every node is no smaller than any value in its left subtree and no larger than any value in its right subtree
- Fast search -- akin to Binary Search on an ordered array
Start at root ( similar to middle of ordered array )
If that value is the key, then found
Else if that value is too large, Search in left subtree ( if any )
Else if that value is too small, Search in right subtree ( if any )
Base case: left or right subtree does not exist ( similar to no more elements in subarray )
- Insertion -- find where node would go, insert it there
Proceed as if searching for the new value
If found, make it a child of that node ( either side )
If not found, make it a child of the last node visited during the search
- Removal -- we have to be careful to fill the "holes" correctly
Now promote either predecessor or successor to fill the gap
Makes sure that each node still correctly fulfills defining inequalities
Easy to clean up after promoted predecessor / successor
- Creating and destroying a tree -- same as for arbitrary binary trees