Welcome to Lesson 18!

Learning Objectives
By the end of this class, you should know...
  • Bubble sort
  • Merge sort
  • Bucket sort
  • Quick sort (if time permits, or you can watch the video if interested)


Announcements

  • Final Exam next class
    • Cumulative
    • See Final Exam Review Guide with answer key
  • Submit Project Report 4
  • Project Report 5 and second peer evaluation due one week from today!
  • Last day to finish up any in-person business with me is this Thursday's office hour at 2:30pm
    • No Office Hours during Finals Week
  • Final Presentation Schedule
    • No time to waste on that day, so please be set up and ready to present BEFORE 11:30!



Review Activity

With a partner, trace DFS on the following graph:



  • What is the Big-O runtime of BFS and DFS?

Sorting Algorithms
  • Used to put list of values in order
    • numerical, lexicographic, or other (how do we compare two objects from the same class?)
  • Typical run-times of sorting algorithms:
    • Worst class of sorting algorithms: average O(n²) or above
    • Best class of sorting algorithms: average O(n log n)
    • Average of O(n) is not possible for comparison-based algorithms
      • Why?
    • Non-comparison based sorting algorithms can achieve better average results:
      • Radix sort: O(nk)
      • Bucket sort: O(n + k)
  • Recursive and non-recursive
  • Comparison-based vs parallel
    • Divide and conquer
  • In-place algorithms (sort the array without needing extra memory)
    • Example: heap sort
  • My experience with interviews and what I have heard:
    •  #1 to know is merge sort
    •  Should have merge sort and quick sort memorized when you interview
    • I will ask you to have merge sort (wrapper, not helper) memorized for the final


 Algorithm     | Big-O     | Comparison-Based | Parallel | Divide and Conquer | In-Place
 Bubble Sort   | n²        | Yes              | No       | No                 | Yes
 Heap Sort     | n log(n)  | Yes              | No       | No                 | Yes
 Merge Sort    | n log(n)  | Yes              | No       | Yes                | No
 Bucket Sort   | n + k     | No               | Yes      | Yes                | No




Bubble Sort

  • Repeatedly step through the list, comparing each pair of adjacent values
  • Swap any pair where the first value is larger than the second value


Bubble Sort Animation


Bubble Sort

Bubble Sort Pseudocode:

BubbleSort(A[], size)
    for i = 0 to size - 2
        for j = 0 to size - i - 2 //the last i values have already bubbled into place
            if A[j] > A[j+1]
                A[j] <---> A[j+1] //swap

Advantage:
  • Easy to write and remember
Disadvantage:
  • Very sloooowww
  • Run-times:
    • Best-case: O(n) (requires an early-exit check when a pass makes no swaps)
    • Average-case: O(n²)
    • Worst-case: O(n²)
  • Never used in real life applications
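The pseudocode above translates directly into a runnable Python sketch. The swapped flag (an addition beyond the pseudocode) is what makes the O(n) best case possible on an already-sorted list:

```python
def bubble_sort(a):
    """Sort the list a in place by repeatedly swapping adjacent out-of-order pairs."""
    n = len(a)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):  # the last i values have already bubbled into place
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]  # swap
                swapped = True
        if not swapped:  # a pass with no swaps means the list is sorted: O(n) best case
            break
    return a
```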

Merge Sort
  • Recursive "Divide and Conquer" Algorithm
  • Divide and Conquer is a common approach to problems in Computer Science
    • Divide (the problem into a small number of pieces)
    • Conquer (solve each piece, by applying divide-and-conquer recursively to it)
    • Combine (the pieces together into a global solution)
  • Merge Sort is one of the best known Divide and Conquer Algorithms
    • Divide: Split A down the middle into two subsequences, each of size roughly n/2. 
    • Conquer: Sort each subsequence (by calling MergeSort recursively on each). 
    • Combine: Merge the two sorted subsequences into a single sorted list.
  • The dividing process ends when we have split the subsequences down to a single item. 
    • A sequence of length one is trivially sorted.
  • We can visualize Merge Sort's Divide and Conquer approach through the image below:

Merge Sort Performed on an Array 



Merge Sort Animation


Merge Sort Pseudocode

MergeSort(A[], low, high)
    if (low < high)
        middle = (low + high) / 2 //midpoint formula
        MergeSort(A[], low, middle)
        MergeSort(A[], middle+1, high)
        Merge(A[], low, middle, high)

Merge(A[], low, mid, high)
    l = low
    m = mid + 1
    index = low //current index of result array
    result[low...high] //will store sorted values
    while (l <= mid AND m <= high) //while both subarrays are non empty
        if (A[l] <= A[m]) //A[l] is smaller or equal; taking from the left first keeps the sort stable
            result[index] = A[l]
            index++
            l++
        else
            result[index] = A[m]
            index++
            m++
    while (l <= mid) //right subarray is empty so copy over all of left side values
        result[index] = A[l]
        index++
        l++
    while (m <= high) //left subarray is empty so copy over all right side values
        result[index] = A[m]
        index++
        m++
    Copy(A, result) //copy contents of result into original Array A

  • Note: The use of the temporary array result here is a necessary evil, as merge sort requires the use of a temporary array at some point in the algorithm. 
    • It is one of the shortcomings of MergeSort, compared to some of the other efficient sorting algorithms.
    • In other words, it is not an "In Place Algorithm"
      • In contrast to Heap Sort, where the array is sorted without requiring extra memory
  • This version of mergeSort minimizes the creation of temporary arrays.
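A runnable Python sketch of merge sort; unlike the pseudocode, this version returns a new list rather than sorting in place, which keeps the merge step short:

```python
def merge_sort(a):
    """Return a new sorted list: divide, conquer each half recursively, then merge."""
    if len(a) <= 1:  # a sequence of length one is trivially sorted
        return list(a)
    middle = len(a) // 2
    left = merge_sort(a[:middle])   # conquer the left half
    right = merge_sort(a[middle:])  # conquer the right half
    result = []                     # combine: merge the two sorted halves
    l = r = 0
    while l < len(left) and r < len(right):
        if left[l] <= right[r]:     # <= keeps the sort stable
            result.append(left[l])
            l += 1
        else:
            result.append(right[r])
            r += 1
    result.extend(left[l:])         # at most one of these two is non-empty
    result.extend(right[r:])
    return result
```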

Time Complexity of Merge Sort
  • O(n log n) in the best, average, and worst case

Merge Sort with Cards


Bucket Sort
  • Not a comparison-based sorting algorithm
  • Instead a parallel "scatter and gather" sorting algorithm
  • Sort values into individual "buckets"
  • Re-arrange them within the bucket into a sorted order
  • Combine buckets back into array or list
  • Here we scatter (into buckets) and gather (back into the array) rather than divide and conquer
  • Mainly useful when input is uniformly distributed over a range


Visualizing Bucket Sort

Scatter elements into their respective bins

Sort elements inside of bins and return to the array

Bucket Sort In Under 40 Seconds
Bucket Sort Pseudocode

bucketSort(A[], size, k) //pass in k = number of buckets
    List buckets[k] //an array of Lists (or a 2D array) to be our buckets
    for i = 0 to size - 1
        insert A[i] into buckets[correctBucket(A[i])]
    for i = 0 to k-1 //for all the buckets
        sort(buckets[i]) //call sorting algorithm of choice
    return concat(buckets[0]...buckets[k-1])
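A runnable Python sketch of the pseudocode above, assuming numeric input. The bucket-index calculation (based on where a value falls in the min-to-max range) is one possible choice for the correctBucket mapping:

```python
def bucket_sort(a, k=10):
    """Scatter values into k buckets, sort each bucket, then gather them back."""
    if not a:
        return []
    lo, hi = min(a), max(a)
    width = (hi - lo) / k or 1          # bucket width; fall back to 1 if all values are equal
    buckets = [[] for _ in range(k)]
    for x in a:                         # scatter each value into its bucket
        idx = min(int((x - lo) / width), k - 1)
        buckets[idx].append(x)
    result = []
    for b in buckets:                   # sort inside each bucket, then gather
        result.extend(sorted(b))
    return result
```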
    

Time Complexity of Bucket Sort
  • Best Case: O(n + k) //k = number of buckets
  • Worst Case: O(n²) //if all elements land in one bucket and we sort it with a slow algorithm
  • Average Case: O(n + k)

Sorting Algorithm Race


Quick Sort
  • Another Divide and Conquer algorithm
    • Pick a "pivot point". Picking a good pivot point can greatly affect the running time.
    • Break the list into two lists: those elements less than the pivot element, and those elements greater than the pivot element.
    • Recursively sort each of the smaller lists.
    • Make one big list: the 'smallers' list, the pivot, and the 'biggers' list.


Quick Sort Pseudocode Explanation


Quick Sort Performed on an Array
(using last element as pivot)


Quick Sort with Hungarian Folk Dancers

Quick Sort Pseudocode
Quicksort(A[], low, high)
    if (low < high)
        pivot_location = Partition(A[], low, high)
        Quicksort(A[], low, pivot_location - 1)
        Quicksort(A[], pivot_location + 1, high)

Partition(A[], low, high)
    pivot = A[high] //choose the last element as the pivot
    leftwall = low - 1 //boundary of the values smaller than the pivot
    for i = low to high - 1
        if A[i] < pivot //compare each value to the pivot
            leftwall++
            A[i] <---> A[leftwall] //swap
    A[leftwall+1] <---> A[high] //swap to put the pivot in its proper location
    return leftwall + 1
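A runnable Python sketch of quick sort using the last element as the pivot (a Lomuto-style partition):

```python
def quick_sort(a, low=0, high=None):
    """Sort the list a in place by partitioning around a pivot and recursing."""
    if high is None:
        high = len(a) - 1
    if low < high:
        p = partition(a, low, high)
        quick_sort(a, low, p - 1)   # sort the values smaller than the pivot
        quick_sort(a, p + 1, high)  # sort the values larger than the pivot
    return a

def partition(a, low, high):
    """Move values smaller than the pivot left of it; return the pivot's final index."""
    pivot = a[high]                 # last element as the pivot
    leftwall = low - 1              # boundary of the values smaller than the pivot
    for i in range(low, high):
        if a[i] < pivot:
            leftwall += 1
            a[i], a[leftwall] = a[leftwall], a[i]  # swap
    a[leftwall + 1], a[high] = a[high], a[leftwall + 1]  # place the pivot
    return leftwall + 1
```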



Time Complexity of Quick Sort
  • Best Case: O(n log n) 
  • Worst Case: O(n²)
  • Average Case: O(n log n)
  • In practice, it is usually faster than Merge Sort
Worst Case Intuition: Suppose the pivot element splits the array as unequally as possible, with n-1 elements on one side and 0 on the other (i.e., the pivot is the biggest or smallest element in the array). Each level of recursion then removes only the pivot, so every element ends up compared against every other element: O(n²).

Average Case Intuition: Suppose we pick the pivot element at random in an array of n keys. Half the time, the pivot will land somewhere between positions n/4 and 3n/4:

1-----n/4--------3n/4------n

In that case, the larger remaining subarray contains at most 3n/4 elements. What is the maximum number of partition levels we need to get from n elements down to 1?

(3/4)^levels * n = 1
n = (4/3)^levels
log n = levels * log(4/3)
levels = log n / log(4/3) ≈ 2.41 log₂ n

so the expected number of levels is O(log n).
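As a quick numeric sanity check of this intuition (a sketch; the function names are my own), we can simulate shrinking n by a factor of 3/4 and compare the count against the closed form log n / log(4/3):

```python
import math

def levels_to_one(n, shrink=3 / 4):
    """Simulate the recursion depth: count how many times n shrinks by 3/4 before reaching 1."""
    levels = 0
    while n > 1:
        n *= shrink
        levels += 1
    return levels

def levels_predicted(n):
    """Closed form from the derivation: levels = log n / log(4/3), rounded up."""
    return math.ceil(math.log(n) / math.log(4 / 3))
```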

Wrap Up
  • With your partner, discuss the 4 algorithms we learned today in class.
    • How do they sort a list of values?
    • What are the Big-O run-times in the best, worst, and average case for each algorithm?

Upcoming Assignments

  • Final Exam Next Class
  • Final Project Due Next Week