Welcome to Lesson 11!

Learning Objectives
By the end of today's class, you should know...
  • What is a binary tree?
  • How is a BST different from a Binary Tree?
  • Why is the BST a recursive structure? 
  • What are the BST operations?
  • What is the Big-O run-time of these operations on a BST?
  • How do you implement insertion for a BST
  • How to trace recursive tree traversals on a BST:
    • inOrder
    • preOrder
    • postOrder
  • How do you implement the recursive BST operations:
    • search
    • minimum
    • remove

Announcements

  • Quiz 4 next class - questions on trees, binary trees, BSTs only
  • Lab 6 due Monday
  • Submit Project Report 2 next class
  • Return midterms in last 5 minutes of class - excellent results!
    • 28 As
    • 7 Bs
    • 3 Cs
    • 3 Ds
    • 1 F


Review Activity

  • Answer the following questions about the below tree:
    • What is the root of the tree?
    • What are the leaves of the tree?
    • What is the depth of the root and what is the height of the root?
    • What is the depth and height of peach?


Computer Science Learners: Binary Search Tree


The Binary Tree ADT
  • A binary tree is a tree where each node can have at most 2 children.
  • Below is an example of a binary tree. Note that each node has 0, 1 or 2 children.

Image source
  • These two children are called the left child and the right child.
  • For example, in the BT above, 4 is the left child of its parent node 6 and 7 is the right child of its parent node 6.
  • 14 only has a left child, but does not have a right child.
  • Here is another example of a valid binary tree. Why does this tree meet the definition of a binary tree?

  • Is the below also a valid binary tree? What does it resemble?



 
Image source.
  • We can consider binary trees to be recursive structures.
    • The root node links to root nodes of its left and right subtrees.
    • These roots of the subtrees in turn link to the root nodes of their left and right subtrees.
    • We will discuss this concept more when we get to Binary Search Trees.

    

Image source.

  • Therefore, each node in the tree will contain some data as well as two pointers, one to the left child and one to the right child as depicted below:
Binary Tree
  • If a node has no right or left child, these values are set to NULL.
  • As you might have predicted, any implementation of a binary tree, we will need to define a private inner Node struct like this:
struct Node
{
    int data;
    Node* left;
    Node* right;
};
  • Therefore, a Node has three fields:
    • a field for the data
    • a field for the address of the left child (another Node).
    • a field for the address of the right child (another Node).



  • We maintain access to a special Node pointer - the root - by storing it as a private field in the class.
class BinaryTree {
private:
    struct Node {
        bstdata data;
        Node* left;
        Node* right;

        Node(bstdata newdata){
            data = newdata;
            left = NULL;
            right = NULL;
        }
    };

    Node* root;
};




Special Types of Binary Trees
  • A binary tree is considered a proper or strict binary tree if each node has either 0 or 2 children. Below is an example of a strict binary tree.
  • complete binary tree is one in which all levels are filled (except the last) and all nodes are as far left as possible. Below is an example of a complete binary tree.

  • A perfect or full binary tree is one in which all levels are completely filled.



The Binary Search Tree ADT
  • A Binary Search Tree (BST) is a binary tree for which the following is true: for any node in the BST, the value of all nodes in its left subtree are less than or equal to the value of the node, and the value at all nodes in the right subtree are greater than the value of the node.
  • The following tree is a BST. Verify that the above property is true for this tree.
BST: A Recursive Structure
  • Remember that the root node is linked to the root nodes of its left and right subtrees.
  • These root nodes are in turned linked to the root nodes of other subtrees.
  • BST are recursive because the BST property must hold true for all nodes in the tree and their subtrees.

Image source.


  • Examine the tree above. Do all subtrees meet the requirements for a BST?
  • Due to the recursive definition of BSTs, we are able to implement most BST operations recursively.


BST Header

#ifndef BST_H_
#define BST_H_

#include <cstddef>
#include <string>
#include <assert.h>

using namespace std;
template<class bstdata>
class BST {
private:
    struct Node {
        bstdata data;
        Node* left;
        Node* right;

        Node(bstdata newdata){
            data = newdata;
            left = NULL;
            right = NULL;
        }
    };

    Node* root;

    /**Private helper functions*/
    void insertHelper(Node* root, bstdata data);
    //private helper function for insert
    //recursively inserts a value into the BST

    void inOrderPrintHelper(ostream& out, Node* root) const;
    //private helper function for inOrderPrint
    //recursively prints tree values in order from smallest to largest

    void preOrderPrintHelper(ostream& out, Node* root) const;
    //private helper function for preOrderPrint
    //recursively prints tree values in pre order

    void postOrderPrintHelper(ostream& out, Node* root) const;
    //private helper function for postOrderPrint
    //recursively prints tree values in post order

    void copyHelper(Node* copy);
    //recursive helper function to the copy constructor

    void destructorHelper(Node* root);
    //private helper function for the destructor
    //recursively frees the memory in the BST

    bool searchHelper(Node* root, bstdata data) const;
    //recursive helper function to search
    //returns whether the value is found in the tree

    bstdata minimumHelper(Node* root) const;
    //recursive helper function to minimum
    //returns the minimum value in the tree

    bstdata maximumHelper(Node* root) const;
    //recursive helper function to maximum
    //returns the maximum value in the tree

    Node* removeHelper(Node* root, bstdata data);
    //recursive helper function to remove
    //removes data from the tree

    void getSizeHelper(Node* root, int& size) const;
    //recursive helper function to the size function
    //calculates the size of the tree
    //stores the result in size

    int getHeightHelper(Node* root) const;
    //recursive helper function to the height function
    //returns the height of the tree

public:

    /**constructors and destructor*/

    BST();
    //Instantiates a new BST

    BST(const BST &bst);
    //copy constructor

    ~BST();
    //deallocates the tree memory

    /**access functions*/

    bstdata minimum() const;
    //returns the minimum value in the BST
    //pre: !empty()

    bstdata maximum() const;
    //returns the maximum value in the BST
    //pre: !empty()

    bool isEmpty() const;
    //determines whether the BST is empty

    int getSize() const;
    //returns the size of the tree

    bstdata getRoot() const;
    //returns the value stored at the root of the BST
    //pre: !isEmpty()

    int getHeight() const;
    //returns the height of the tree

    bool search(bstdata data) const;
    //returns whether the data is found in the tree
    //pre: !isEmpty()

    /**manipulation procedures*/

    void insert(bstdata data);
    //inserts new data into the BST

    void remove(bstdata data);
    //removes the specified value from the tree
    //pre: data is located in the tree
    //pre: !isEmpty()

    /**additional functions*/

    void inOrderPrint(ostream& out) const;
    //calls the inOrderPrint function to print out the values
    //stored in the BST

    void preOrderPrint(ostream& out) const;
    //calls the reOrderPrint function to print out the values
    //stored in the BST

    void postOrderPrint(ostream& out) const;
    //calls the postOrderPrint function to print out the values
    //stored in the BST
};
#endif /* BST_H_ */


BST Operations

Insertion

  •  The BST operations take advantage of the recursive nature of the BST.
  • To get a better understanding of the insertion process for a BST, let's look at an example.
  • Imagine that we are trying to insert 4 into our BST as in the image below.
  • At each root node of a subtree, we need to make a decision. Do we go left? Do we go right? Or if the root node is null, we should insert where we are.

image

Image source.


  • Thus, our process is recursive, as we are making the same decision at each node (i.e. root of a subtree).


Group Activity:

  • With a partner, insert the following values into a BST in the given order: 5 3 6 9 0 2 7 12 4
  • How does the tree change if you swap the order of 5 and 9?  9 3 6 5 0 2 7 12 4
  •  What about 5 and 12? 12 3 6 9 0 2 7 5 4



Insertion Pseudocode
  • Our BST users do not have direct access to any of the nodes in the BST or their data or links.
  • Therefore, we need to call a private helper function inside of insert and pass the root node to this function.
  • The private helper function is the recursive function that is going to do most of the work.
  • The public wrapper function simply takes the value to insert as a parameter. Its primary purpose is to call the helper function.
  • Here is the code for the public wrapper function:
template <class bstdata>
void BST<bstdata>::insert(bstdata data)
{
    if (root == NULL)
    {
        root = new Node(data); //if the tree is empty insert the value at the root
    }
    else
    {
        insertHelper(root, data); //otherwise call the helper function, passing it the root
    }
}
  • Here is the pseudocode for the private helper function:

Pseudocode for insertHelper(root, value)

1. Base Case: the value to insert is equal to the data stored at the root

If this is true, return //no duplicates

2. Check if the value to insert is smaller than the data stored at the root.

2a. If this is true

2aa. Check if root's leftchild is NULL

2aaa. If this is true, insert the value at root's leftchild

2b. If the root's leftchild is not NULL

2bb.recursively call the insertHelper function passing it root's leftchild and value

3. Otherwise

3a. Check if the root's rightchild is NULL

3aa. If this is true, insert the value at root's rightchild

3b. If the root's rightchild is not NULL

3bb. recursively call the insertHelper function passing it root's rightchild and value


Analyzing the Efficiency of BST Operations


Introduction to Run-time Analysis on Binary Search Tree Operations

  • The Big-O runtime of the binary tree operations is closely allied to the height of the tree.
  • The closer the tree is to a perfect binary tree, the better the run-time.
  • Why is this?
  • Consider a binary tree like this one:



  • This binary tree is no different than a linked list.
  • Is search efficient on a linked list? We should have seen in Lab 4 that searching a linked list is not efficient.
  • Because linked lists do not offer random access of their elements, binary search is not an efficient algorithm for a linked list.
  • For binary trees organized like the one above, search also takes O(n) time, as does insertion and removal.
  • We are going to see with a more balanced Binary Search Trees, operations such as search, insert and remove are possible in O (log n) time, just like binary search on an array.
  • However, to keep search (and other operations) efficient, we want to keep our trees as balanced as possible.
  • What does it mean to be balanced?


Balanced Binary Search Tree

  • A balanced binary tree is one in which the difference between the height of the left and right subtrees is not more than 1.
| heightleft - heightright| <= 1 
  • Let's give this a try.
  • The below tree is balanced even though it might not appear so. Why?

An example binary tree


  • Consider the height of the right and left subtree of the root node A.
  • hleft = 1
  • hright = 2
  • |1 - 2| = 1 <--balanced!
  • Let's check another node, B.
  • hleft = -1 (height of an empty tree is -1)
  • hright = 0
  • |-1 - 0| = 1 <--balanced!


Partner Activity
  • With a partner answer the following question: Is the binary tree depicted below balanced? Why or why not?



Comparison of Arrays, Linked Lists and Binary Search Trees

  • Let's compare the average case Big-O run-times of Arrays and Linked Lists to those of the BST when performing 3 common ADT operations.
Array            Linked List            Binary Search Tree            
Search                O(log n)                O(n)                    O(log n)
Insertion             O(n)                     O(n)                    O(log n)
Deletion              O(n)                     O(n)                    O(log n)

  • Why are these operations O(log n) for a BST?
  • Consider search. In the image below, we are searching for the value 4 to determine if it is in the tree:

image

Image source.

  • As in binary search, we continually divide the search space in half until there is only one element:
n -> n / 2 -> n / 4 -> n/ 8... -> n/ 2k = 1
  • Multiply both sides by 2k
n = 2k
  • Then take the log of both sides:
log (n) = k
  • This leaves us with O(log n).
  • Let's walk through an example to verify this works on a balanced BST. What happens if we search for the value 12?
  • Insertion and removal have the same average runtime as search, as both take advantage of the organization of data within the tree.
  • Let's walk through an example of inserting the value 9 in the BST above.
  • But, what about when the BST is unbalanced?
  • The Big-O run-time in this case will be equal to that of a linked list: O(n)
  • Why?


Animated BST Efficiency compared to a Linked List (although it says array)




BST Operations Continued

Tree Traversals
  • Tree traversals define an ordering for which we can "visit" each node in the tree to perform operations such as printing.
  • In your Lab, you will be asked to write preOrderPrint, inOrderPrint and postOrderPrint, which each defined a means of printing our BSTs in different orders.
  • These recursive functions are simple to write.
  • Their names come from where in the algorithm the root is visted.
    • Is it visited before the recursive calls (preorder)?
    • Is it visited after the recursive calls (postorder)?
    • Is it visited between the recursive calls (inorder)?
  • What do you think the Big-O runtime of these functions is?
    • O(n) Why?
  • Let's get an introduction to this topic by watching this video
  • Here is the pseudocode for each of the functions.
  • Note that these functions would be the "helper" versions in our implementation

Pseudocode for inOrderPrint(root)
1. If the root is NULL
    a. return;
2. Otherwise
    a. Call inOrderPrint recursively on the root leftchild
    b. Print the data at the root
    c. Call inOrderPrint recursively on the root rightchild

Pseudocode for preOrderPrint(root)
1. If the root is NULL
    a. return
2. Otherwise
    a. Print the data at the root
    b. Call preOrderPrint recursively on the root leftchild
    c. Call preOrderPrint recursively on the root rightchild


Pseudocode for postOrderPrint(root)
1. If the root is NULL
    a. return
2. Otherwise
    a. Call postOrderPrint recursively on the root leftchild
    b. Call postOrderPrint recursively on the root rightchild
    c. Print the data at the root

  • Let's trace the printing order of the three algorithms on the same binary tree

  • inOrderPrint: 
Inorder Traversal in Binary Search Tree


  • preOrderPrint: 
Preorder Traversal in Binary Search Tree

  • postOrderPrint: 

Postorder Traversal in Binary Search Tree


Activity: Tree Traversals
  • Find a partner and a piece of paper
  • Using the examples above, trace inOrderPrint, preOrderPrint and postOrderPrint on the BST below.




  • Now, write the 3 recursive print functions, along with their helper functions.


Searching a BST
  • Searching a BST is an almost identical process to adding a new node to the tree.
  • Let's trace an example, using the same example as for insertion.
  • The Pseudocode is almost identical to insert as well.
  • However, search returns a boolean instead of being a void function

Pseudocode for searchHelper(root, value)

1. Base Case: the value is equal to the data stored at the root

1a. return true

2. Check if the value is smaller than the data stored at the root.

2a. If this is true

2aa. If root's leftchild is NULL

2aaa. return false. The value was not found

2b. If the root's leftchild is not NULL

2bb. recursively call the search helper function passing it root's leftchild and value

3. Otherwise

3a. Check if the root's rightchild is NULL

3aa. If this is true, return false. The value was not found.

3b. If the root's rightchild is not NULL

3bb. recursively call the search helper function passing it root's rightchild and value

4. return false //optional depending on compiler


Activity: Writing the BST Search Function search

  • Here is the code for the public search function
template <class bstdata>
bool BST<bstdata>::search(bstdata data) const
{
//Add code to check for the precondition here
if (data == root->data)
        return true;
else
return searchHelper(root, data); //call helper function

}
  • It is your job to write the helper function search
  • Test your function inside of BSTTest.cpp to make sure it is working properly.

Finding the Minimum Value in a BST
  • The purpose of the minimum function is to find the minimum value in the tree.
  • Where is the minimum value always located in a binary search tree?
    • If you are uncertain, look at an example of BST to see where the minimum is located.
  • To find the minimum, all we need to do is to scroll through the nodes in the tree to the proper location.
Pseudocode for private minimumHelper (Node* root)
  • If the root's leftchild is not Null
    • return recursive call to minimum, passing it root's leftchild
  • Otherwise
    • Return the data at the root


BST remove function
  • Remove is perhaps the most challenging BST algorithm to implement.
  • Like search, add, and the tree traversals it is also recursive.
  • Remove works by first searching for a particular value in a BST and then freeing the memory for this Node.
  • However, complication comes from the fact that we could encounter several different cases:
    • Case 1: The node we wish to remove if a leaf node <-- easy case
    • Case 2: The node we wish to remove has only one child <-- medium case
    • Case 3: The node we wish to remove has two children <-- hard case
  • When deleting nodes in each of these cases, you must be mindful to preserve the BST property.
  • Let's look at each of these possibilities in the diagram below:
Deletion rules


  • Therefore, when writing remove, we need to handle all three cases.
  • The first 3 steps in the pseudocode below locate the value to remove inside the tree
  • Step 4 handles the three cases discussed above.
Pseudocode for removeHelper(root, value)
  1. If root is null
    1. return root
  2. Otherwise, if value < the data at the root
    1. set root left equal to the recursive call of removeHelper on root left
  3. Otherwise, if value > the data at the root
    1. set root right equal to the recursive call of removeHelper on root right
  4. Otherwise,
    1. If root is a leaf node
      1. delete root 
      2. Set root to NULL
    2. Otherwise if root has a left but no right
      1. set the left to be the root
      2. call delete on the root
    3. Otherwise if root has a right but no left
      1. set the right to be the root
      2. call delete on the root
    4. Otherwise
      1. Search for the minimum value in the right subtree of the root
      2. Set the data of the root to be the data of the minimum value of the root's right subtree
      3. Set root right equal to the recursive call of removeHelper, passing it root right and the minimum data of root right (i.e. delete the duplicate value in the right subtree)
  5. return the root

Activity: Writing BST removeNode

  • Note 1: You can use the minimum(root) function that you wrote above as an additional helper function.
  • Note 2: When you implement the removeHelper function, you will need to alter it's signature outside of the class definition to be:
template <class bstdata>
typename BST<bstdata>::Node* BST<bstdata>::removeHelper(Node* root, bstdata data)
{
    //implement function here
   
}
  • Otherwise, your compiler will not recognize the Node*, as it referenced before the compiler reaches the part of the function signature specifying that this function is from the BST class.
  • Test your function inside of BSTTest.cpp to make sure it is working properly.


Wrap Up
  • With your partner, answer the questions from today's learning objectives


Upcoming Assignments
  • Lab 6 due Monday
  • Project Report 2 due Wednesday
  • Quiz 4 on Wednesday