Merge sort

What is a merge sort?

  • Merge sort is an example of a divide and conquer algorithm that reduces the size of the problem into smaller problems that are more easily solvable by an algorithm
  • Specifically, the merge sort divides the list in half, creating** two sub lists**. Each sub list is then divided into two until each sub list contains one item or element
  • Groups of two sub lists are then **sorted **and merged together to form new, larger sub lists until all sub lists have been merged into a single sorted list
  • The merge sort therefore contains two parts:
    • Part one: Divide the list into sub lists until each sub list contains only one element
    • Part two: Repeatedly merge groups of two and sort them, until all lists have been merged into a single sorted list
  • The merge sort has been illustrated below using a list of ten items: [15, 5, 2, 7, 4, 9, 10, 1, 8, 3]

performing-the-merge-sort-revision-notes-computer-science Figure 1: Performing the Merge sort

  • Notice that halving the list sometimes produces odd numbers of elements in sub lists. When dividing, all sub lists must contain a single element before merging can begin.
  • When merging, only two sub lists can be merged at a time. Any left over sub lists are merged in the next merging iteration
  • In order to merge two sub lists, two pointers are required, one for each sub list. The pointer keeps track of which elements are to be compared. Once an element has been merged into the new list, the corresponding pointer is incremented. The process continues until a list contains no elements to compare, at which point the remaining elements are appended to the end of the new merged sub list.

Time complexity of a merge sort

  • To determine the algorithm’s execution time as measured by the number of instructions or steps carried out Big-O notation must be used
  • The time complexity of a merge sort is dependent on its divide and conquer nature and the fact that n sub lists must be merged. This means the overall worst case time complexity of the merge sort is
  • The best case and average case scenario time complexity of the merge sort is as regardless of the number of elements or item, n must still be merged and halved

Space complexity of a merge sort

  • The merge sort requires twice the amount of space of other sorting algorithms due to holding copies of the left and right halves of the list
  • Its space complexity is therefore original space and an additional space to store the copies of the list

Tracing a merge sort

  • In the trace table below, several calls to merge sort are made which splits the list. Each call generates a stack frame which is put on the stack. Stack frames are removed when the current stack frame reaches the end of its code
  • The list is gradually broken down into sub lists, starting with the left side of each sub list. Sub lists are created until each sub list contains only one element which is the base case. When a base case is reached the stack frame is removed and the program continues execution on the line after the call to merge sort
  • Each sub list is gradually merged, two at a time and passed up the stack to be merged with another sub list in a later merge sort call
  • Pointers are incremented during the merge such that the next element to merge in each sub list is tracked and appropriately added to the new list. NewListPointer tracks the position of elements in the new list

Trace table for the merge sort

RowStack framelen(list) > 1?leftrightSplit left, right or merge?newListmidLeft PointerRight PointernewListPointer
11TRUE15, 5, 2, 7, 49, 10, 1, 8, 3LEFT 5   
22TRUE15, 52, 7, 4LEFT 2   
33TRUE155LEFT 0   
44FALSE  BASE CASE     
53TRUE155RIGHT 0   
64FALSE  BASE CASE     
73TRUE155MERGE[5, 15]0000
83TRUE27, 4LEFT 1   
94FALSE  BASE CASE     
103TRUE74LEFT 0   
114FALSE  BASE CASE     
123TRUE74RIGHT 0   
134FALSE  BASE CASE     
143TRUE74MERGE[4, 7]0000
153TRUE24, 7MERGE 1000
16     [2] 1 1
17     [2, 4]  12
18     [2, 4, 7]  23
193TRUE5, 152, 4, 7MERGE 2000
20     [2]  11
21     [2, 4]  22
22     [2, 4, 5] 123
23     [2, 4, 5, 7] 134
24     [2, 4, 5, 7, 15] 145
252TRUE9, 101, 8 3LEFT 2   

Examiner Tips and Tricks

Merge sort is a complex algorithm. You will not be asked to derive its time or space complexity in detail.

  • You will however be expected to know and justify its time and space complexity

Implementing a merge sort

Pseudocode

procedure mergesort(list)
	if len(list) > 1 then
		mid = len(list) div 2
		left = list[:mid]
		right = list[mid:]
		
		mergesort(left)
		mergesort(right)
		
		leftPointer = 0
		rightPointer = 0
		newListPointer = 0
		while leftPointer < len(left) and rightPointer < len(right)
			if left[leftPointer] < right[rightPointer] then
				list[newListPointer] = left[leftPointer]
				leftPointer = leftPointer + 1
			else
				list[newListPointer] = right[rightPointer]
				rightPointer = rightPointer + 1
			endif
		
			newListPointer = newListPointer + 1
		endwhile
		
		while leftPointer < len(left)
			list[newListPointer] = left[leftPointer]
			leftPointer = leftPointer + 1
			newListPointer = newListPointer + 1
		endwhile

		while rightPointer < len(right)
			list[newListPointer] = right[rightPointer]
			rightPointer = rightPointer + 1
			newListPointer = newListPointer + 1
		endwhile
	endif
endprocedure

list = [15, 5, 2, 7, 4, 9, 10, 1, 8, 3]
mergesort(list)
print(list)
  • It is important to note that merge sort relies on recursion to function
  • To divide the list into smaller sub lists, the merge sort algorithm must call itself repeatedly, using a base case of one element
  • Each recursive call is added to the call stack
  • Once the list is fully divided into individual elements, the stack frames are removed (popped off) and processed in reverse order to merge the sub lists back together in sorted order
  • It is also important to understand the concept of slicing in Python:
    • [:mid] means all elements up to, but not including, the mid index
    • [mid:] means all elements from the mid index to the end of the list
  • This list-splitting technique is known as slicing
  • Once the list has been split, and no more recursive calls are made, three new variables are defined per stack frame:
    • leftPointer (keeps track of the left sub list)
    • rightPointer (keeps track of the right sub list)
    • newListPointer (keeps track of the next available space in the sorted list)
  • The new list is merged in two stages:
    • Stage one: Merge all elements until one list is empty
      • The first while loop manages this process. The left and right pointers increment over the left and right lists, comparing elements one at a time. If one is smaller than the other then it is placed in the next available space in the new list
    • Stage two: Append the non-empty list’s remaining elements to the new list
      • This is managed by the final two while loops. One merges if the left half has remaining elements. The other merges if the right half has remaining elements. Only one will be called, not both

Python

def merge_sort(list):
  if len(list) > 1:
    mid = len(list) // 2
    left = list[:mid]
    right = list[mid:]
    merge_sort(left)
    merge_sort(right)
    left_pointer, right_pointer, new_list_pointer = 0, 0, 0
    new_list = [None] * len(list)
    while left_pointer < len(left) and right_pointer < len(right):
      if left[left_pointer] < right[right_pointer]:
        new_list[new_list_pointer] = left[left_pointer]
        left_pointer = left_pointer + 1
      else:
        new_list[new_list_pointer] = right[right_pointer]
        right_pointer = right_pointer + 1
      new_list_pointer = new_list_pointer + 1
    while left_pointer < len(left):
      new_list[new_list_pointer] = left[left_pointer]
      left_pointer = left_pointer + 1
      new_list_pointer = new_list_pointer + 1
    while right_pointer < len(right):
      new_list[new_list_pointer] = right[right_pointer]
      right_pointer = right_pointer + 1
      new_list_pointer = new_list_pointer + 1
    list[:] = new_list
  return list

list = [15, 5, 2, 7, 4, 9, 10, 1, 8, 3]
merge_sort(list)
print(list)  # Output: [1, 2, 3, 4, 5, 7, 8, 9, 10, 15]

Java

public class MergeSort {

   public static void mergeSort(int[] data) {
      if (data.length > 1) {
         int mid = data.length / 2;
         int[] left = new int[mid];
         int[] right = new int[data.length - mid];
         System.arraycopy(data, 0, left, 0, mid);
         System.arraycopy(data, mid, right, 0, data.length - mid);
         mergeSort(left);
         mergeSort(right);
         merge(data, left, right);
      }
   }

   private static void merge(int[] data, int[] left, int[] right) {
      int leftPointer = 0, rightPointer = 0, newListPointer = 0;
      while (leftPointer < left.length && rightPointer < right.length) {
         if (left[leftPointer] < right[rightPointer]) {
            data[newListPointer] = left[leftPointer];
            leftPointer++;
         } else {
            data[newListPointer] = right[rightPointer];
            rightPointer++;
         }
         newListPointer++;
      }
      System.arraycopy(left, leftPointer, data, newListPointer, left.length - leftPointer);
      System.arraycopy(right, rightPointer, data, newListPointer, right.length - rightPointer);
   }

   public static void main(String[] args) {
      int[] data = {15, 5, 2, 7, 4, 9, 10, 1, 8, 3};
      mergeSort(data);
      System.out.println(Arrays.toString(data));  // Output: [1, 2, 3, 4, 5, 7, 8, 9, 10, 15]
   }
}