Welcome to Lesson 11!

Learning Objectives
By the end of today's class, you should know...
  • How is a BST different from a Binary Tree?
  • Why is the BST a recursive structure? 
  • What are the BST operations?
  • What is the Big-O run-time of these operations on a BST?
  • How do you implement insertion for a BST
  • How to trace recursive tree traversals on a BST:
    • inOrder
    • preOrder
    • postOrder
  • How do you implement the recursive BST operations:
    • search
    • minimum
    • removeData

Announcements

  • Quiz 4 one week from today
  • Lab 6 due Tuesday
    • Who needs a partner for the extra credit pair programming? Please see me at break or send me an email.
  • Submit Project Report 2


Review Activity

  • Answer the following questions about the below tree:
    • What is the root of the tree?
    • What are the leaves of the tree?
    • What is the depth of the root and what is the height of the root?
    • What is the depth and height of peach?


Computer Science Learners: Binary Search Tree

  • What is the difference between a binary tree and a binary search tree?

The Binary Search Tree ADT
  • A Binary Search Tree (BST) is a binary tree for which the following is true: for any node in the BST, the value of all nodes in its left subtree are less than or equal to the value of the node, and the value at all nodes in the right subtree are greater than the value of the node.
  • The following tree is a BST. Verify that the above property is true for this tree.
BST: A Recursive Structure
  • Remember that the root node is linked to the root nodes of its left and right subtrees.
  • These root nodes are in turned linked to the root nodes of other subtrees.
  • BST are recursive because the BST property must hold true for all nodes in the tree and their subtrees.

Image source.


  • Examine the tree above. Do all subtrees meet the requirements for a BST?
  • Due to the recursive definition of BSTs, we are able to implement most BST operations recursively, as we shall see in a moment.


BST Operations

Insertion

  • As always, the best functions to write first are your default constructor, an insertion function and a print function.
  • The functions you will be writing for the BST will take advantage of the recursive structure of the BST.
  • To get an idea of how to write the code for this function, let's look at an example
  • Imagine that we are trying to insert 4 into our BST as in the image below.
  • At each root node of a subtree, we need to make a decision. Do we go left? Do we go right? Or if the root node is null, we should insert where we are.

image

Image source.


  • Thus, our process is recursive as we are making the same decision at each node.

Group Activity:

  • With a partner, insert the following values into a BST in the given order: 5 3 6 9 0 2 7 12 4
  • How does the tree change if you swap the order of 5 and 9?  9 3 6 5 0 2 7 12 4
  •  What about 5 and 12? 12 3 6 9 0 2 7 5 4

Insertion Pseudocode

  • Our BST users do not have direct access to any of the nodes in the BST or their data or links.
  • Therefore, we need to call a private helper function inside of insert and pass the root node to this function.
  • The private helper function is the recursive function that is going to do most of the work here.
  • The private helper function takes the root and a value as parameters.
  • The public wrapper function simply takes the value to insert as a parameter
  • Here is the code for the public wrapper function:
template <class bstdata>
void BST<bstdata>::insert(bstdata data)
{
    if (root == NULL)
    {
        root = new Node(data); //if the tree is empty insert the value at the root
    }
    else
    {
        insertNode(root, data); //otherwise call the helper function, passing it the root
    }
}
  • Here is the pseudocode for the private helper function:

Pseudocode for insertNode(root, value)

1. Base Case: the value to insert is equal to the data stored at the root

If this is true, return //no duplicates

2. Check if the value to insert is smaller than the data stored at the root.

2a. If this is true

2aa. Check if root's leftchild is NULL

2aaa. If this is true, insert the value at root's leftchild

2b. If the root's leftchild is not NULL

2bb.recursively call the insertNode function passing it root's leftchild and value

3. Otherwise

3a. Check if the root's rightchild is NULL

3aa. If this is true, insert the value at root's rightchild

3b. If the root's rightchild is not NULL

3bb. recursively call the insertNode function passing it root's rightchild and value


Analyzing the Efficiency of BST Operations


Introduction to Run-time Analysis on Binary Search Tree Operations

  • The Big-O runtime of the binary tree operations is closely allied to the height of the tree.
  • The closer the tree is to a perfect binary tree, the better the run-time.
  • Why is this?
  • Consider a binary tree like this one:



  • This binary tree is no different than a linked list.
  • Is search efficient on a linked list? We should have seen in Lab 4 that searching a linked list is not efficient.
  • Because linked lists do not offer random access of their elements, binary search is not an efficient algorithm on a linked list.
  • Did you notice this inefficiency while completing your Lab 4?
    • Thus, linear search is a better choice for linked lists.
    • What is the Big-O run-time of linear search on a linked list: O(n).
  • For binary trees organized like the one above, search also takes O(n) time, as does insertion and removal.
  • We are going to see with balanced Binary Search Trees that search is possible in O (log n) time, just like binary search on an array.
  • However, to keep search (and other operations) efficient, we want to keep our trees as balanced as possible.
  • This idea should make intuitive sense to you.

Comparison of Arrays, Linked Lists and Binary Search Trees
  • Let's compare the average case Big-O run-times of Arrays and Linked Lists to those of the BST when performing 3 common ADT operations.
Array            Linked List            Binary Search Tree            
Search                O(log n)                O(n)                    O(log n)
Insertion             O(n)                     O(n)                    O(log n)
Deletion              O(n)                     O(n)                    O(log n)

  • Why are these operations O(log n) for a BST?
  • Consider search. In the image below, we are searching for the value 4 to determine if it is in the tree:

image

Image source.

  • As in binary search, we continually divide the search space in half until it is only one element:
n -> n / 2 -> n / 4 -> n/ 8... -> n/ 2k = 1
  • Multiply both sides by 2k
n = 2k
  • Then take the log of both sides:
log (n) = k
  • This leaves us with O(log n).
  • Let's walk through an example to verify this works on a balanced BST. What happens if we search for the value 12?
  • Insertion and removal have the same run-time on average as search. Let's walk through an example of inserting the value 9 in the BST above.
  • But, what about when the BST is unbalanced?
  • The Big-O run-time in this case will be equal to that of a linked list: O(n)
  • Why?


Animated BST Efficiency compared to a Linked List (although it says array)




BST Operations Continued

Tree Traversals
  • Tree traversals define an ordering for which we can "visit" each node in the tree to perform operations such as printing.
  • In your Lab, you will be asked to write preOrderPrint, inOrderPrint and postOrderPrint, which each defined a means of printing our BSTs in different orders.
  • These recursive functions are simple to write.
  • Their names come from where in the algorithm the root is visted.
    • Is it visited before the recursive calls (preorder)?
    • Is it visited after the recursive calls (postorder)?
    • Is it visited between the recursive calls (inorder)?
  • What do you think the Big-O runtime of these functions is?
    • O(n) Why?
  • Let's get an introduction to this topic by watching this video
  • Here is the pseudocode for each of the functions.
  • Note that these functions would be the "helper" versions in our implementation

Pseudocode for inOrderPrint(root)
1. If the root is not NULL
    a. Call inOrderPrint recursively on the root leftchild
    b. Print the data at the root
    c. Call inOrderPrint recursively on the root rightchild

Pseudocode for preOrderPrint(root)
1. If the root is not NULL
    a. Print the data at the root
    b. Call preOrderPrint recursively on the root leftchild
    c. Call preOrderPrint recursively on the root rightchild


Pseudocode for postOrderPrint(root)
1. If the root is not NULL
    a. Call postOrderPrint recursively on the root leftchild
    b. Call postOrderPrint recursively on the root rightchild
    c. Print the data at the root

  • Let's trace the printing order of the three algorithms on the same binary tree

  • inOrderPrint: 
Inorder Traversal in Binary Search Tree


  • preOrderPrint: 
Preorder Traversal in Binary Search Tree

  • postOrderPrint: 

Postorder Traversal in Binary Search Tree


Activity: Tree Traversals
  • Find a partner and a piece of paper
  • Using the examples above, trace inOrderPrint, preOrderPrint and postOrderPrint on the BST below.




  • Now, write the 3 recursive print functions, along with their helper functions.


Searching a BST
  • Searching a BST is an almost identical process to adding a new node to the tree.
  • Let's trace an example, using the same example as for insertion.
  • The Pseudocode is almost identical to insert as well.
  • However, search returns a boolean instead of being a void function

Pseudocode for searchNode(root, value)

1. Base Case: the value is equal to the data stored at the root

1a. return true

2. Check if the value is smaller than the data stored at the root.

2a. If this is true

2aa. If root's leftchild is NULL

2aaa. return false. The value was not found

2b. If the root's leftchild is not NULL

2bb. recursively call the search function passing it root's leftchild and value

3. Otherwise

3a. Check if the root's rightchild is NULL

3aa. If this is true, return false. The value was not found.

3b. If the root's rightchild is not NULL

3bb. recursively call the search function passing it root's rightchild and value

4. return false //optional depending on compiler


Activity: Writing the BST Search Function search

  • Here is the code for the public search function
template <class bstdata>
bool BST<bstdata>::search(bstdata data) const
{
//Add code to check for the precondition here
if (data == root->data)
        return true;
else
return searchNode(root, data); //call helper function

}
  • It is your job to write the helper function search
  • Test your function inside of BSTTest.cpp to make sure it is working properly.

Finding the Minimum Value in a BST
  • The purpose of the minimum function is to find the minimum value in the tree.
  • Where is the minimum value always located in a binary search tree?
    • If you are uncertain, look at an example of BST to see where the minimum is located.
  • To find the minimum, all we need to do is to scroll through the nodes in the tree to the proper location.
Pseudocode for private minimum (Node* root)
  • If the root's left node is not Null
    • return recursive call to minimum, passing it root's leftchild
  • Otherwise
    • Return the data at the root


BST remove function
  • Remove is perhaps the most challenging BST algorithm to implement.
  • Like search, add, and the tree traversals it is also recursive.
  • Remove works by first searching for a particular value in a BST and then freeing the memory for this Node.
  • However, complication comes from the fact that we could encounter several different cases:
    • Case 1: The node we wish to remove if a leaf node <-- easy case
    • Case 2: The node we wish to remove has only one child <-- medium case
    • Case 3: The node we wish to remove has two children <-- hard case
  • When deleting nodes in each of these cases, you must be mindful to preserve the BST property.
  • Let's look at each of these possibilities in the diagram below:
Deletion rules


  • Therefore, when writing remove, we need to handle all three cases.
  • The first 3 steps in the pseudocode below locate the value to remove inside the tree
  • Step 4 handles the three cases discussed above.
Pseudocode for removeNode(root, value)
  1. If root is null
    1. return root
  2. Otherwise, if value < the data at the root
    1. set root left equal to the recursive call of removeNode on root left
  3. Otherwise, if value > the data at the root
    1. set root right equal to the recursive call of removeNode on root right
  4. Otherwise,
    1. If root is a leaf node
      1. delete root 
      2. Set root to NULL
    2. Otherwise if root has a left child but no right child
      1. set the left child to be the root
      2. call delete on the root
    3. Otherwise if root has a right child but no left child
      1. set the right child to be the root
      2. call delete on the root
    4. Otherwise
      1. Search for the minimum value in the right subtree of the root
      2. Set the data of the root to be the data of the minimum value of the root right
      3. Set root right equal to the recursive call of removeNode, passing it root right and the minimum data of root right (i.e. delete the duplicate value in the right subtree)
  5. return the root

Activity: Writing BST removeNode

  • Note 1: You can use the minimum(root) function that you wrote above as an additional helper function.
  • Note 2: When you implement the deleteData helper function, you will need to alter it's signature outside of the class definition to be:
template <class bstdata>
typename BST<bstdata>::Node* BST<bstdata>::removeNode(Node* root, bstdata data)
{
    //implement function here
   
}
  • Otherwise, your compiler will not recognize what a Node* is being accesses outside of the BST class definition.
  • Test your function inside of BSTTest.cpp to make sure it is working properly.


Wrap Up
  • With your partner, answer the questions from today's learning objectives


Upcoming Assignments
  • Lab 6 due Tuesday