Algorithms in PHP: Priority Queues (and Heaps)

This is the beginning of a series on implementing various algorithms in idiomatic PHP and their reference implementation (usually based on C++).

Today we’re going to be talking about Priority Queues. A priority queue is where each value has an associated “priority” that determines which order things come out of the queue. The best real-life priority queue would be an emergency room, where those patients with the most life-threatening issues are seen before the rest.

In most reference implementations, a priority queue is implemented using a Heap. A Heap is a binary tree-based structure that retains a specific value at the top of the tree. If the value is the max value of all the values, it is a “Max Heap,” and if it is the least, it is a “Min Heap.” Since we have a concept of “low priority” and “high priority,” we’ll consider a “Max Heap” here.

When you add an item to a Heap, you append it to the end of the Heap. If the newly appended value is lower than the parent, nothing needs to be done, else swap it with the parent and continue the check from that point. There are plenty of excellent articles on implementing a Heap, and we only need them to implement the “reference” implementation; for the idiomatic version of a priority queue, we don’t need a stinking Heap. 🙂

Interface

We’re only going to be implementing a few public functions:

<?php

namespace Withinboredom\Algorithms\Heap;

interface Heapish {
    public function insert(int $priority, mixed $value): void;
    public function removeMax(): mixed;
    public function getSize(): int;
    public function getMax(): mixed;
}

insert obviously inserts a new value with a given priority, removeMax returns the value with the highest priority and removes it from the queue, and getMax just peeks at the highest priority value without removing it from the queue.

The reference implementation

<?php

namespace Withinboredom\Algorithms\Heap;

/**
 * A basic priority heap implemented in C++-ish semantics
 */
class Reference
{
    private int $heapSize = 0;
    private array $arr = [];

    public function __construct(public readonly int $maxSize)
    {
    }

    public function getMax(): mixed
    {
        return $this->arr[0]['value'];
    }

    public function getSize(): int
    {
        return $this->heapSize;
    }

    public function insert(int $key, mixed $value): void
    {
        if ($this->heapSize === $this->maxSize) {
            throw new \Exception("Heap is full");
        }

        $i = $this->heapSize++;
        $this->arr[$i] = ['key' => $key, 'value' => $value];

        while ($i !== 0 && $this->arr[$this->parent($i)]['key'] < $this->arr[$i]['key']) {
            $this->swap($i, $this->parent($i));
            $i = $this->parent($i);
        }
    }

    private function parent(int $i): int
    {
        return (int)floor(($i - 1) / 2);
    }

    private function swap(int $first, int $second)
    {
        $temp = $this->arr[$first];
        $this->arr[$first] = $this->arr[$second];
        $this->arr[$second] = $temp;
    }

    public function deleteKey(int $i)
    {
        $this->increaseKey($i, PHP_INT_MAX);
        $this->removeMax();
    }

    public function increaseKey(int $i, int $newKey)
    {
        $this->arr[$i]['key'] = $newKey;

        while ($i !== 0 && $this->arr[$this->parent($i)]['key'] < $this->arr[$i]['key']) {
            $this->swap($i, $this->parent($i));
            $i = $this->parent($i);
        }
    }

    public function removeMax(): mixed
    {
        if ($this->heapSize <= 0) {
            throw new \Exception("Heap is empty");
        }
        if ($this->heapSize === 1) {
            $this->heapSize--;
            return $this->arr[0];
        }

        $root = $this->arr[0];
        $this->arr[0] = $this->arr[--$this->heapSize];
        $this->maxHeapify(0);

        return $root;
    }

    public function maxHeapify(int $i): void
    {
        $l = $this->leftChild($i);
        $r = $this->rightChild($i);
        $largest = $i;

        if ($l < $this->heapSize && $this->arr[$l]['key'] > $this->arr[$i]['key']) {
            $largest = $l;
        }

        if ($r < $this->heapSize && $this->arr[$r]['key'] > $this->arr[$largest]['key']) {
            $largest = $r;
        }

        if ($largest !== $i) {
            $this->swap($i, $largest);
            $this->maxHeapify($largest);
        }
    }

    private function leftChild(int $i): int
    {
        return 2 * $i + 1;
    }

    private function rightChild(int $i): int
    {
        return 2 * $i + 2;
    }
}

As you can see, the implementation is not very “php-ish” and does some pretty standard-ish things with arrays. However, this is based on an implementation in C++. In C++, arrays are very different than in PHP! In PHP, arrays can be sparse, sorted, or unsorted, and more importantly, we can access the underlying iteration mechanics to store state!

Looking over this implementation, you may notice a few other drawbacks. One, this implementation needs to be sorted on every insert, which incurs a O(log n) cost, and is also incurred when removing the max item from the Heap. Let’s apply some PHP shenanigans to get it much faster…

Utilizing the underlying cursor

The first optimization we can make is to utilize the array’s cursor and not sort on every insert. To do that, we need to keep track of whether or not an item has been inserted or not. So we’ll create a $dirty variable, so we only need to sort right before we read from the queue. This will significantly simplify our inserts. Oh, since we don’t have to worry about array sizing, we can also toss out a potential exception:

    public function insert(int $key, mixed $value): void
    {
        $this->arr[$key][] = $value;
        $this->dirty = true;
    }

Note that we append to an array with the value. This allows us to have multiple values with the same priority and retrieve them in the same order we insert them (FIFO — first in, first out), just like with a regular queue (our reference implementation does a FILO — first in, last out — implementation which doesn’t seem right to me).

Deletion can now be simplified as well:

    public function deleteKey(int $i): void {
        unset($this->arr[$i]);
    }

We don’t need to mark the array as dirty because it is already sorted or will be sorted when it is read. For simplicity, we aren’t concerned with multiple values here, and will likely be an application-level decision on how to handle that.

We also don’t need to keep track of the size of the queue anymore; we can just use the built-in function:

    public function getSize(): int
    {
        return count($this->arr);
    }

To get the max priority value, we first need to check if the queue is dirty. If it is dirty, we have to sort it. Then return the item:

    public function getMax(): mixed
    {
        if ($this->dirty) {
            krsort($this->arr);
            $this->dirty = false;
        }

        $next = current($this->arr);
        return $next[0];
    }

It is very similar to removing the item from the queue as well; however, we have to rebuild the inner queue for a given priority if there is more than one item in the queue:

public function removeMax(): mixed
    {
        if ($this->getSize() === 0) {
            throw new \Exception("Heap is empty");
        }

        if ($this->dirty) {
            krsort($this->arr);
            $this->dirty = false;
        }

        $current = key($this->arr);
        $return = current($this->arr);
        if (count($return) === 1) {
            unset($this->arr[$current]);
            return $return[0];
        }

        $data = $return[0];
        unset($return[0]);
        $this->arr[$current] = array_values($return);
        return $data;
    }

You may be tempted to use an internal pointer for the little queues, but it will only save you a few microseconds unless you have very dense priorities (lots of values with the same priority).

    public function removeMax(): mixed
    {
        if ($this->getSize() === 0) {
            throw new \Exception("Heap is empty");
        }

        if ($this->dirty) {
            krsort($this->arr);
            $this->dirty = false;
        }

        $current = key($this->arr);
        $x = &$this->arr[$current];
        $value = current($x);
        $next = next($x);
        if($next === false) {
            unset($this->arr[$current]);
        }

        return $value;
    }

Here’s the complete listing:

<?php

namespace Withinboredom\Algorithms\Heap;

/**
 * An idiomatic PHP implementation of a heap.
 *
 * We are using a cursor here because we are utilizing the magic of sparse arrays!
 */
class Phpized
{
    private array $arr = [];
    private bool $dirty = false;

    public function __construct()
    {
    }

    public function insert(int $key, mixed $value): void
    {
        $this->arr[$key][] = $value;
        $this->dirty = true;
    }

    public function deleteKey(int $i): void {
        unset($this->arr[$i]);
    }

    public function removeMax(): mixed
    {
        if ($this->getSize() === 0) {
            throw new \Exception("Heap is empty");
        }

        if ($this->dirty) {
            krsort($this->arr);
            $this->dirty = false;
        }

        $current = key($this->arr);
        $x = &$this->arr[$current];
        $value = current($x);
        $next = next($x);
        if($next === false) {
            unset($this->arr[$current]);
        }

        return $value;
    }

    public function getSize(): int
    {
        return count($this->arr);
    }

    public function getMax(): mixed
    {
        if ($this->dirty) {
            krsort($this->arr);
            $this->dirty = false;
        }

        $next = current($this->arr);
        return $next[0];
    }
}

And a simple unit test:

<?php

use Withinboredom\Algorithms\Heap\Phpized;

it('inserts and removes items in the order they are inserted', function () {
    $heap = new Phpized();
    for ($i = 0; $i < 1000; $i++) {
        $heap->insert(12, $i);
    }

    $heap->insert(15, 10);
    $heap->insert(0, 12);
    expect($heap->removeMax())->toBe(10);

    for ($i = 0; $i < 1000; $i++) {
        expect($heap->removeMax())->toBe($i);
    }

    expect($heap->removeMax())->toBe(12);

    expect($heap->getSize())->toBe(0);
    expect($heap->removeMax(...))->toThrow(\Exception::class);
});

Comparisons

I used PHP Bench to microbenchmark the two implementations alongside PHP’s SPL implementation. Naturally, the benchmarks are useless in their raw form but valuable for comparison to each other.

+----------------+-----------+----------+-----------+-----------+-----------+
| subject        | mode      | stdev    | best      | worst     | mem_peak  |
+----------------+-----------+----------+-----------+-----------+-----------+
| benchReference | 998.044μs | 42.644μs | 950.360μs | 1.275ms   | 931.560kb |
| benchSpl       | 301.795μs | 6.961μs  | 294.900μs | 326.080μs | 574.280kb |
| benchPhpized   | 275.860μs | 7.220μs  | 266.890μs | 302.150μs | 574.280kb |
+----------------+-----------+----------+-----------+-----------+-----------+

What we can see is that this “phpized” implementation of a priority queue is faster than the C-based implementation or at least just as fast, even utilizing the same amount of memory. These implementations were run with the CLI opcache turned on, JIT enabled, and with dense priorities.

PHP can be faster than you expect

PHP arrays are downright powerful, allowing you to simplify some traditionally complex code, even totally abandoning traditional algorithms altogether in some instances (hint: there’s more to come!).

This is just the beginning of this series, and I hope you’ve found it as exciting and enlightening as I did building it! If you have any questions or suggestions, feel free to ask them in the comments!


Discover more from Somewhere Within Boredom

Subscribe to get the latest posts sent to your email.

Comments

4 responses to “Algorithms in PHP: Priority Queues (and Heaps)”

  1. Craig Avatar

    Great example, thanks!

  2. Kamran Avatar
    Kamran

    Very informative and interesting brother. Keep it up.

  3. Nick Poulos Avatar
    Nick Poulos

    Cool article, looking forward to the next.

  4. Rye Avatar
    Rye

    getMax uses index 0. This is not the current value, but always the first inserted one.

    The getSize function only gives the count of the keys.
    When you inserting two values with the same key, the count would be wrong

    removeMax does not remove the value, only when the pointer has reached the end.

    Your test does not test for any of these things.