Thoughts on PHP Fibers

During the last couple of days of my winter vacation, I wanted to really understand fibers, which arrived in PHP 8.1. To do so, I attempted to implement RAFT simply by reading the dissertation on it.

My key takeaway was that it has been a really long time since I’ve tried to implement something like this by reading plain ole’ English. The paper is really well written, but the details are scattered throughout it, so it takes actual study. I did get it working, with one edge case unfinished.

But I’m not here to talk about RAFT, as interesting as it is. Originally, I started with “pure fibers” and php-ev, using named pipes to/from PHP-FPM to communicate RPCs. That turned out to be quite cumbersome in a dockerized environment, so I turned to amphp, particularly the v3 branch.

At first, I really loved fibers. There was no Task like in C#, no async/await, and no promises. Things just magically worked how you’d expect them to, until they didn’t.

Fibers

Fibers in PHP are a very interesting concept. They let you simply pause execution with Fiber::suspend($state), passing some state out of the fiber in the process. This gives you an incredible amount of flexibility and essentially makes async transparent.
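To make the suspend/resume handshake concrete, here is a minimal sketch using only the built-in Fiber class. The variable names and the strings passed back and forth are made up for illustration.

```php
<?php
// Records each hand-off so the order of events is easy to inspect.
$log = [];

$fiber = new Fiber(function (string $greeting) use (&$log): string {
    // Pause here, handing 'first' to whoever started us.
    // The value given to resume() becomes the result of suspend().
    $reply = Fiber::suspend('first');
    $log[] = "fiber got: $reply";

    $reply = Fiber::suspend('second');
    $log[] = "fiber got: $reply";

    return "$greeting, done";
});

// start() runs the fiber up to its first suspend() and returns the suspended value.
$out = $fiber->start('hello');
$log[] = "main got: $out";

// resume() sends a value in and runs the fiber until it suspends again.
$out = $fiber->resume('a');
$log[] = "main got: $out";

// The last resume() lets the fiber run to completion.
$fiber->resume('b');
$result = $fiber->getReturn();
```

Nothing in the closure is marked as async. It is an ordinary function that happens to pause, and that is where the transparency comes from.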

Amphp is an event loop plus sugar over fibers, which makes them much saner to use in real life. PHP doesn’t have any built-in fiber-aware mutexes or semaphores, so you’ll need a library like it: that capability is a must when async code shares data structures.

The bad fibers

I eventually came to realize that most of the bugs I was experiencing were classic multi-threaded, shared-memory-access bugs. These are bugs where you start a function with a value being one thing, enter a branch due to it being that one thing, then later enter a different branch because the value changed. Consider the following example:

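<?php
class Example {
  public bool $isTrue = true;

  // doRequest() is a placeholder for any call that may suspend the current fiber.
  public function run(): void {
    if ($this->isTrue) {
      doRequest('https://example.com/true');
    }
    // Another fiber may have changed $this->isTrue while the call above was suspended.
    if (!$this->isTrue) {
      doRequest('https://example.com/false');
    }
  }
}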

As a reader of single-threaded code, it’s quite clear that only one of these branches should ever be entered during a given call. However, fibers are totally transparent: if doRequest suspends and another fiber changes the value in the meantime, either (or both!) of the requests may end up being sent without you realizing it. This makes dropping fibers into old code rather dangerous.
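Here is a small, self-contained sketch of that failure. The Flagged class and its fake doRequest method are hypothetical stand-ins: the "request" only records the URL and suspends, the way a fiber-based HTTP client would while waiting on the network. The main script plays the part of the other fiber.

```php
<?php
// Hypothetical stand-in for a class whose method performs suspending I/O.
final class Flagged
{
    public bool $isTrue = true;

    /** @var list<string> URLs that were "requested", in order */
    public array $requested = [];

    // Fake request: record the URL, then suspend as if waiting for a response.
    private function doRequest(string $url): void
    {
        $this->requested[] = $url;
        Fiber::suspend();
    }

    public function run(): void
    {
        if ($this->isTrue) {
            $this->doRequest('https://example.com/true');
        }
        if (!$this->isTrue) {
            $this->doRequest('https://example.com/false');
        }
    }
}

$state  = new Flagged();
$worker = new Fiber(fn () => $state->run());

$worker->start();        // enters the first branch and suspends inside doRequest()
$state->isTrue = false;  // someone else flips the flag while the worker is waiting
$worker->resume();       // first request "completes"; the second check now passes too
$worker->resume();       // let the second request "complete"

// $state->requested now contains both URLs.
```

Read top to bottom, run() looks like it can only ever take one branch, yet a single call took both.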

When dealing with normal web-application semantics, this probably isn’t much of a problem, though I expect it to become more and more of one if internal PHP functions start taking advantage of fibers.

In my case, I simply used amphp’s synchronization primitives to create a mutex and guarded shared state religiously.
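To show the idea without pulling in a dependency, here is a toy cooperative mutex built on nothing but Fiber. It is not amphp’s implementation or API: ToyMutex, the withdraw closure, and the naive round-robin loop are all assumptions made up for this sketch, and a real event loop would park waiting fibers instead of spinning.

```php
<?php
// Toy cooperative mutex. NOT amphp's implementation, only the general idea.
final class ToyMutex
{
    private bool $locked = false;

    public function acquire(): void
    {
        // Hand control back to the scheduler until the lock is free.
        while ($this->locked) {
            Fiber::suspend();
        }
        $this->locked = true;
    }

    public function release(): void
    {
        $this->locked = false;
    }
}

$mutex   = new ToyMutex();
$balance = 100;
$log     = [];

// A read-modify-write with a suspension in the middle (standing in for I/O).
$withdraw = function (string $name, int $amount) use ($mutex, &$balance, &$log): void {
    $mutex->acquire();
    try {
        $seen = $balance;
        Fiber::suspend();            // simulated I/O while holding the lock
        $balance = $seen - $amount;
        $log[] = "$name done";
    } finally {
        $mutex->release();
    }
};

$fibers = [
    new Fiber(fn () => $withdraw('a', 30)),
    new Fiber(fn () => $withdraw('b', 50)),
];

// Naive round-robin scheduler: keep resuming until every fiber has finished.
foreach ($fibers as $fiber) {
    $fiber->start();
}
while ($fibers = array_filter($fibers, fn (Fiber $f) => !$f->isTerminated())) {
    foreach ($fibers as $fiber) {
        if (!$fiber->isTerminated()) {
            $fiber->resume();
        }
    }
}
```

With the lock in place, the second withdrawal cannot read the balance until the first has written it back. Remove the acquire/release pair and both fibers read 100, so one of the withdrawals is silently lost.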

However, there’s literally no indication that something uses a fiber under the hood. Without manually tracing the code to figure out when, or if, a fiber will suspend, you have no idea what is running when or how the state is changing. Again, I don’t think this is as much of an issue in traditional apps, but it was for implementing something like RAFT, which requires millisecond precision, with at least one request every 10-15 milliseconds when running at full speed. I ran it at a tenth of that speed, or less, so I could follow what was going on.

Overall, I was impressed with how clean the code looked compared to async/await and/or promises.

When to use fibers?

Fibers are quite powerful and their transparency can be a boon, especially for stateful servers. If you don’t need state between requests, or an event loop, I’m not sure that fibers will create any meaningful performance gains. But holy cow, if you have a stateful service, these will be a game-changer.

Until fibers start being used in core PHP I/O, you have to use a library like amphp v3+ (or implement your own!) to really benefit from what they have to offer. Even then, you have to refactor your code to get the full benefit, by spinning off coroutines that work on things concurrently. For example, WordPress’s actions could be reimplemented to fire “all at once” instead of sequentially, so if one action does some I/O, another action can do some work while the first waits for the I/O to complete. I think we’d have to wait for some kind of “fiber scheduler” to really take advantage of that, though.
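As a rough sketch of what firing actions “all at once” could look like, the hypothetical fireAll helper below starts every callback in its own fiber and round-robins between them. It has nothing to do with WordPress’s real hook API, and the suspend call merely pretends to be I/O. A real scheduler would wake a fiber when its I/O is actually ready rather than polling like this.

```php
<?php
/**
 * Hypothetical helper: run each callback in its own fiber and
 * round-robin between them until all of them have finished.
 *
 * @param array<string, callable> $actions
 */
function fireAll(array $actions): void
{
    $fibers = [];
    foreach ($actions as $name => $action) {
        $fibers[$name] = new Fiber($action);
        $fibers[$name]->start();   // runs until the first suspend() or the end
    }

    while ($fibers !== []) {
        foreach ($fibers as $name => $fiber) {
            if ($fiber->isTerminated()) {
                unset($fibers[$name]);
            } else {
                $fiber->resume();
            }
        }
    }
}

$trace = [];

fireAll([
    'slow' => function () use (&$trace): void {
        $trace[] = 'slow: request sent';
        Fiber::suspend();                 // pretend to wait for I/O
        $trace[] = 'slow: response handled';
    },
    'fast' => function () use (&$trace): void {
        $trace[] = 'fast: done';          // runs while "slow" is still waiting
    },
]);
```

The fast action finishes in the gap where the slow one is waiting, which is exactly the win a sequential loop over the actions cannot give you.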

Until next time…