Priority Queue in Go - Part 2 of n

August 15, 2024

In the previous installment of this series I provided an overview of what a priority queue is, how it can be implemented using an array as backing storage, and how a heap can be represented as a tree within it. Additionally, I presented the implementation of the “Add” method and how the tree is updated such that it retains the max heap property after a new item is added. In this installment I will implement the “Pull” operation and likewise show how to update the tree after the highest priority is removed. Before doing so, I will recap the internal state of queue after a few items have been added.

The Current State of an Example Queue

Recall the state of the array in which items are being stored. After adding items with the priorities 5, 25, 1, 13, 16, 23, the array was in the following state. The size of the queue is also shown, indicating the last item stored in the fixed-size array.

Also recall the following three properties.

The highest priority item is at the zeroith index.
Other than the highest priority item, the items in the array are unsorted
The tree represented by the array is a balanced binary tree

After an item is pulled, the array must be returned to a state of having the highest priority item at the zeroith index. This implies some operation must be performed after getting a reference to the zeroith item. This operation is called variously down-heap, sink-down, cascade-down, bubble-down and others. These steps will be elaborated in the following section.

The Pull Operation

The high-level sequence of steps to pull an item from the queue as follows.

Decrease the size by one.
Swap the last and the zeroith items.
Perform down heapify.

This section describes these operations in detail while showing the implementation. But before doing so, as in the previous section I will first write the test to verify the expected behavior. Here is the test using the previous example.

package hyperds

import (
	"testing"
)

func TestItemsArePulledInPriorityOrder(t *testing.T) {
	unit := NewPriorityQueue[string](100)

	unit.Add(5, "val-5")
	unit.Add(25, "val-25")
	unit.Add(1, "val-1")
	unit.Add(13, "val-13")
	unit.Add(16, "val-16")
	unit.Add(23, "val-23")

	assertPulledValue(t, unit, "val-25", 5)
	assertPulledValue(t, unit, "val-23", 4)
	assertPulledValue(t, unit, "val-16", 3)
	assertPulledValue(t, unit, "val-13", 2)
	assertPulledValue(t, unit, "val-5", 1)
	assertPulledValue(t, unit, "val-1", 0)
}

func assertPulledValue(t *testing.T, unit *PriorityQueue[string], expected string, expectedSize uint64) {
	if val, _ := unit.Pull(); *val != expected {
		t.Errorf("expected %s, was %s", expected, *val)
	}

	if size := unit.Size(); size != expectedSize {
		t.Errorf("Expected size %d, was %d", expectedSize, size)
	}
}

This test repeats the addition of the values and their corresponding priorities. It then makes assertions about what value is pulled, and the size after doing so. Naturally this test will fail to compile because Pull has not yet been implemented. Here is the outline of the method, which will be explained in detail. (For the moment, the fact the queue might be empty will be ignored.)

func (p *PriorityQueue[T]) Pull() (*T, error) {
	p.size--
	p.swap(0, p.size)
	p.downHeapify(0)
	result := p.vals[p.size]
	return &result.val, nil
}

The first step is to decrease the size of the queue. This makes it possible to work with the array as if the value has already been removed. This puts the array into the following state.

The zeroith and last item are then swapped. This, along with reducing the size of the queue by one, makes it possible to reference the value that will be returned after performing the down-heapify operation. Because the size has been reduced, it will not be re-sorted to the top of the queue. After the swap, the array will be in the following state.

The tree can now be rebalanced by implementing downHeapify, starting at the zeroith index. The procedure to retain the heap property is:

Starting at the root, set it to be the highest priority seen.
If the highest priority is less than the left node, set the highest priority to be the left node. The left node is always at 2i + 1 where i is the current index.
If the highest priority is less than the right node, set the highest priority to be the right node. The right node is always at left + 1.
If the highest priority is still the index, the current node already retains the heap property and no further operations are required.
Swap the current index with the largest node.
Set the current index to the largest node.
Return to step 2

Here is the method in full.

func (p *PriorityQueue[T]) downHeapify(i uint64) {
	idx := i
	largest := idx

	for 2*idx < p.size {
		left := 2*idx + 1
		right := left + 1

		if left < p.size && p.less(largest, left) {
			largest = left
		}

		if right < p.size && p.less(largest, right) {
			largest = right
		}

		// Implies value at idx is neither less than the left nor the right child
		if largest == idx {
			break
		}

		p.swap(idx, largest)
		idx = largest
	}
}

With this method implemented, the test will pass. We can also verify that a single pull rebalances the tree. First, let’s examine the state of the array after pulling a single item.

The state of the array can be verified with the following test.

func TestPullRebalancesTree(t *testing.T) {
	unit := NewPriorityQueue[string](100)

	unit.Add(5, "val-5")
	unit.Add(25, "val-25")
	unit.Add(1, "val-1")
	unit.Add(13, "val-13")
	unit.Add(16, "val-16")
	unit.Add(23, "val-23")

	assertPulledValue(t, unit, "val-25", 5)
	assertValue(t, unit, 0, "val-23")
	assertValue(t, unit, 1, "val-16")
	assertValue(t, unit, 2, "val-1")
	assertValue(t, unit, 3, "val-5")
	assertValue(t, unit, 4, "val-13")
}

We are nearly done implementing Pull. At the moment it has a serious flaw however: If the queue is empty, the size will be zero and it will be decreased below zero - outside the range of values for an unsigned integer. This will be fixed in the next section.

Making Pull Safer

The current implementation of Pull has no error checking. If the queue is empty and Pull is called, the index will be out of range and runtime error will occur. It already returns to values, the second of which is an error. The method does not take advantage of this yet, however. To improve the situation, I will first write a test to verify the desired behavior. In this case it will be to return an error if the queue is empty. Here is the test.

func TestPullOnEmptyQueueReturnsError(t *testing.T) {
	unit := NewPriorityQueue[string](10)

	if _, err := unit.Pull(); err != ErrEmptyQueue {
		t.Error("Pull() on empty queue should have returned an error")
	}
}

To make this compile, I will first create the error type specified in the test.

const (
	errEmptyQueue = "Queue is empty"
)

type emptyQueueError struct {
}

func (e *emptyQueueError) Error() string {
	return errEmptyQueue
}

var ErrEmptyQueue error = &emptyQueueError{}

Then Pull can be updated as follows.

func (p *PriorityQueue[T]) Pull() (*T, error) {
	if p.Empty() {
		return nil, ErrEmptyQueue
	}

	p.size--
	p.swap(0, p.size)
	p.downHeapify(0)
	result := p.vals[p.size]
	return &result.val, nil
}

Note the method Empty is missing, but it is trivial to implement.

func (p *PriorityQueue[T]) Empty() bool {
	return p.size == 0
}

The test will now pass. To ensure this method can be tested on its own, I will add the following tests. The first test ensures that when Empty is called on an empty queue it is indeed true. The second tests shows that opposite; that calling Empty on a non-empty queue returns false. This level of testing my seem like unnecessarily pedantic. However, I generally err on the side of writing tests that verify the behavior of specific methods rather than relying on tests that are primarily for testing a different operation even when they also happen to test other methods implicitly. This is helpful when tests happen to fail after refactoring, helping to narrow in on the real source of the error.

func TestEmptyIsTrueIfSizeIsZero(t *testing.T) {
	unit := NewPriorityQueue[string](10)

	if !unit.Empty() {
		t.Error("queue should have been empty")
	}
}

func TestEmptyIsFalseIfSizeIsGTZero(t *testing.T) {
	unit := NewPriorityQueue[string](10)
	unit.Add(1, "test")

	if unit.Empty() {
		t.Error("queue should not be empty")
	}
}

With Empty implemented, we have the core methods of a priority queue complete. However, I want the ability to remove specific items from a queue by value. Perhaps after a job is added it is later cancelled, for example, and must be removed. That will be the subject for the next part of this series.

Tags: