Async Middleware
Matthew Weier O'Phinney
SunshinePHP 2020
Project Lead for the Laminas Project, and a Product Manager
and Principal Engineer at Zend by Perforce.
I led the Zend Framework project for over 10 years, and
contributed from its inception. I led the effort to move it
into an open governance model.
PHP-FIG member, serving on the Core Committee, assisting with
a half-dozen specifications
Terminology
PHP-FIG (or FIG) : Framework Interop Group (standards body)
PSR-7 : HTTP Message interfaces
PSR-15 : HTTP Request Handler/Middleware interfaces
PSR-17 : HTTP Message Factory interfaces
PSR-14 : Event Dispatcher interfaces
Goals
I will demonstrate how async enables:
Application-specific servers
Better performance, generally due to...
Deferred processing.
I will cover:
What async programming is
The Swoole extension
How to integrate a PSR-15 app into Swoole
Tools in the Mezzio ecosystem for working with Swoole
Pitfalls to watch out for
Outro: Before we dive in, though, we need to talk about
something that's not PHP.
Node.js
How many know Node.js?
It's a server-side JavaScript runtime (yes, I said
server-side!) that provides the ability to create network
services, including servers.
event loop
Receives messages
Enqueues messages
Dequeues and dispatches messages, processing
each synchronously
Each dispatch is a tick of the loop
Low-level engine detail
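The receive/enqueue/dispatch cycle above can be sketched in a few lines of plain PHP. This is only an illustration of the idea, not how Node.js or Swoole implement it; the `TinyLoop` class and its methods are hypothetical names invented for this sketch.

```php
<?php
declare(strict_types=1);

// Hypothetical, minimal event loop: messages are queued, then dispatched
// one at a time. Each dispatch is one "tick".
final class TinyLoop
{
    private SplQueue $queue;
    public int $ticks = 0;

    public function __construct()
    {
        $this->queue = new SplQueue();
    }

    // Receive a message (here: any callable) and enqueue it.
    public function enqueue(callable $message): void
    {
        $this->queue->enqueue($message);
    }

    // Dequeue and dispatch messages until the queue is empty.
    public function run(): void
    {
        while (! $this->queue->isEmpty()) {
            $message = $this->queue->dequeue();
            $message($this); // processed synchronously; may enqueue more work
            $this->ticks++;
        }
    }
}

$log  = [];
$loop = new TinyLoop();
$loop->enqueue(function (TinyLoop $loop) use (&$log) {
    $log[] = 'first';
    // Work scheduled during a tick runs on a later tick, not immediately.
    $loop->enqueue(function () use (&$log) {
        $log[] = 'deferred';
    });
});
$loop->enqueue(function () use (&$log) {
    $log[] = 'second';
});
$loop->run();
```

Note that the work enqueued from inside the first message runs last: it waits its turn behind everything already in the queue.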
Why?
Why would you want an event-loop driven runtime?
PHP: Shared Nothing
Bootstraps. Every. Request.
Long-running processes delay the response.
Bootstrap each request (though with PHP 7.4,
we don't have to)
Route each request to the correct handler
Return a response
Tear it all down
GREAT for horizontal scaling
Bootstrapping every request is expensive
BAD for long-running processes
Message Queues
Application code enqueues a message
Queue worker processes the message
Two architectures to maintain
Async servers allow deferment
Parallel Processing
Aggregate results of several unrelated operations
Parallel processing > sequential processing
Summary: async servers provide better
performance. As in 4-7X better, baseline.
Async Programming
We know why we would use an async server. But
how do we program async systems?
Styles of Async Programming
Four general styles:
Callbacks
Promises
async/await
Coroutines
Deferment patterns
Callbacks
executeSomeAsyncProcess(
$withData,
$andA,
$callbackToExecuteOnCompletion
);
function ($error, $result)
{
    if ($error) {
        // handle the error, then return early
    }
    // otherwise, work with $result
}
Code marks itself as asynchronous, and then accepts a callback
to execute on completion.
Callback is executed in the next available tick
Convention for callbacks is to accept a
nullable error and a nullable result; the callback then operates
based on whether or not an error is present.
Process inversion occurs, as any processing
requiring the async process to complete now happens in the
callback.
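To make the error-first convention concrete, here is a small sketch in plain PHP. The `parseJsonAsync()` function is a hypothetical name, and since there is no event loop here the callback fires immediately rather than on a later tick; the point is only the calling convention and the way processing moves into the callback.

```php
<?php
declare(strict_types=1);

// Hypothetical async-style function: instead of returning a value, it
// reports the outcome through an error-first callback.
function parseJsonAsync(string $json, callable $done): void
{
    $data = json_decode($json, true);
    if (json_last_error() !== JSON_ERROR_NONE) {
        $done(new RuntimeException(json_last_error_msg()), null);
        return;
    }
    $done(null, $data);
}

$outcomes = [];
$callback = function (?Throwable $error, ?array $result) use (&$outcomes): void {
    if ($error) {
        // Error path: the result is null and must not be used.
        $outcomes[] = 'error';
        return;
    }
    // Success path: everything that needs the result happens in here.
    $outcomes[] = $result['name'];
};

parseJsonAsync('{"name": "swoole"}', $callback);
parseJsonAsync('{not json', $callback);
```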
Callback Hell
doFirst($payload, function ($err, $result) {
    doSecond($result, function ($err, $result) {
        doThird($result, function ($err, $result) {
            // finally got what we needed
        });
    });
});
aka the "Pyramid of Doom"
The nesting exacerbates the problems of process inversion.
Patterns for making this easier to read exist - primarily,
having named callbacks - but you still end up with the pyramid
when composing the callbacks.
Promises
$promise = someAsyncOperationReturningAPromise($with, $data);
$promise
->then($someCallbackToExecuteOnResolution)
->then($someAdditionalCallbackToExecuteOnResolutionOfPrevious)
->catch($someCallbackToExecuteOnRejection);
where:
function then($successCallback = null, $rejectionCallback = null);
and catch() is a shortcut for:
$promise->then(null, $someCallbackToExecuteOnRejection);
Async operations produce a deferred object, which
returns a Promise: a "promise" that
resolution or rejection will
happen in the future.
If the promise resolves, then...
(describe "then")
catch is a simplified form of `then()` for
handling rejections
Acts like a filter chain
Single-level nesting
Still has the process inversion problem
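For a feel of how then() chaining and rejection handling fit together, here is a deliberately tiny promise in plain PHP. It is a sketch under simplifying assumptions, not a spec-compliant implementation: `TinyPromise` is a hypothetical class, callbacks run synchronously instead of on a later tick, and a promise returned from a callback is not unwrapped.

```php
<?php
declare(strict_types=1);

final class TinyPromise
{
    private string $state = 'pending'; // pending | fulfilled | rejected
    private $value;
    private array $handlers = [];

    public function then(?callable $onFulfilled = null, ?callable $onRejected = null): self
    {
        $next = new self();
        $this->handlers[] = [$onFulfilled, $onRejected, $next];
        if ($this->state !== 'pending') {
            $this->flush();
        }
        return $next;
    }

    public function resolve($value): void
    {
        $this->settle('fulfilled', $value);
    }

    public function reject($reason): void
    {
        $this->settle('rejected', $reason);
    }

    private function settle(string $state, $value): void
    {
        if ($this->state !== 'pending') {
            return; // a promise settles only once
        }
        $this->state = $state;
        $this->value = $value;
        $this->flush();
    }

    private function flush(): void
    {
        $handlers = $this->handlers;
        $this->handlers = [];
        foreach ($handlers as [$onFulfilled, $onRejected, $next]) {
            $callback = $this->state === 'fulfilled' ? $onFulfilled : $onRejected;
            if ($callback === null) {
                // No handler for this outcome: pass it down the chain untouched.
                if ($this->state === 'fulfilled') {
                    $next->resolve($this->value);
                } else {
                    $next->reject($this->value);
                }
                continue;
            }
            try {
                $next->resolve($callback($this->value));
            } catch (Throwable $e) {
                $next->reject($e);
            }
        }
    }
}

$log     = [];
$promise = new TinyPromise();
$promise
    ->then(fn ($value) => $value + 1)
    ->then(function ($value) use (&$log) {
        $log[] = "got $value";
        throw new RuntimeException('boom');
    })
    ->then(function () use (&$log) {
        $log[] = 'never runs'; // skipped: the previous step rejected
    })
    ->then(null, function (Throwable $e) use (&$log) {
        $log[] = 'caught ' . $e->getMessage();
    });

$promise->resolve(41);
```

The rejection skips the third step and lands in the final rejection handler, which is the "filter chain" behaviour described above.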
async/await
async function ping() {
const res = await fetch('/api/ping');
return await res.json();
}
let ack = ping();
.NET, JavaScript and a few other languages have this.
Root of the pattern is a function returning an
implicit promise — a promise defined with its own
resolution or rejection callback.
An async function is any function that awaits
either another function that returns a promise, or another async
function.
Consumers write code that looks
synchronous
Useful when you have synchronous code
depending on asynchronous calls
Still requires special syntax
Coroutines
$result = $statement->execute($data);
Coroutines predate all other patterns
Solves the problem of synchronous code depending on
asynchronous calls
Outro: at this point, we've covered what an async
server does, and a few different approaches to async
programming. But what about PHP? How do we do async in
PHP?
A number of solutions have emerged in userland, primarily around using PHP's
generators in order to defer execution in combination with streams and/or
extensions such as libevent, libev, and others.
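As a rough illustration of the generator-based approach, the sketch below runs two tasks cooperatively in plain PHP. The `worker()` and `runAll()` functions are hypothetical names; real userland libraries add I/O polling, timers, and error handling on top of this basic idea.

```php
<?php
declare(strict_types=1);

// A cooperative task: each bare `yield` hands control back to the scheduler.
function worker(string $name, int $steps, ArrayObject $log): Generator
{
    for ($i = 1; $i <= $steps; $i++) {
        $log->append("$name step $i");
        yield;
    }
}

// Hypothetical round-robin scheduler: run each task to its next yield,
// then move it to the back of the queue until it finishes.
function runAll(Generator ...$tasks): void
{
    $queue = new SplQueue();
    foreach ($tasks as $task) {
        $queue->enqueue($task);
    }
    $started = [];
    while (! $queue->isEmpty()) {
        $task = $queue->dequeue();
        $id   = spl_object_id($task);
        if (! isset($started[$id])) {
            $started[$id] = true;
            $task->rewind(); // first run: executes up to the first yield
        } else {
            $task->next();   // resume after the yield it paused on
        }
        if ($task->valid()) {
            $queue->enqueue($task);
        }
    }
}

$log = new ArrayObject();
runAll(worker('A', 2, $log), worker('B', 1, $log));
$steps = $log->getArrayCopy();
```

The two tasks interleave even though each one is written as a plain sequential loop.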
However, there's another solution, a PHP extension called Swoole. Swoole is
developed primarily by Chinese developers who are working on large scale
distributed applications. Think Alibaba and Baidu.
In other words, Swoole has had some heavy production usage.
Swoole provides...
an event loop
async HTTP, network, and socket clients
network servers
Swoole provides a solution that looks a lot like Node.js
Async features are exposed via callbacks and
coroutines
Outro: which leads into the places it differs from
Node.js
Swoole features
Coroutine support for many TCP/UDP and socket operations
Spawning multiple workers per server
Spawning separate task workers
Daemonization of servers
Outro: So this is what Swoole offers. How do you work with it?
A Basic Web Server
use Swoole\Http\Server as HttpServer;

$server = new HttpServer('127.0.0.1', 9000);
$server->on('start', function ($server) {
    echo "Server started at http://127.0.0.1:9000\n";
});
$server->on('request', function ($request, $response) {
    $response->header('Content-Type', 'text/plain');
    $response->end("Hello World\n");
});
$server->start();
Create a server, providing an address and port.
Listen for it to start. This is required.
Have it listen to requests and do something
with the request and response.
Start the server
Note that the request listener has a call to the response
end() method. This signals to the server that the
response is complete and can be sent back to the client.
You'll have to press Ctrl-C to halt it.
This example sends the same page for ANY request to the
server. What else might we want to do?
Deferment
$server->defer(function () {
// work to defer
});
Callback is passed to the event loop.
Eventually, the current worker will block on
the deferred operation.
Completion can leave the worker in an indeterminate state.
Outro: is there a better way to defer operations?
Task Workers
Remember how I mentioned task workers previously? Task workers act
as a message queue, and let you defer work in a way that doesn't
block the web workers.
Prepare the server for tasks
Configure the number of task workers to use (required!)
Register a listener to handle incoming tasks.
Register a listener to execute on task completion.
Task workers are not enabled by default.
You need to provide:
how many to enable
a task listener that will process the task
a "finish" listener that will be invoked when
processing is complete.
Registering task workers
$server->set(['task_worker_num' => 4]);
$server->on('task', function ($server, $taskId, $data) {
    // Handle task
    // Finish task:
    $server->finish('');
});
$server->on('finish', function ($server, $taskId, $returnValue) {
    // Task is complete
});
The task listener must call finish()
The $returnValue passed to your "finish" listener is the value
you provide when calling finish()
Triggering a task
$server->task($someData);
The data you pass to the task() method is passed as the $data
argument to your task listener.
Outro: so now you know how to configure the server to
spawn task workers, and how to trigger a task. How should you
write your task listener, though?
A Basic TaskWorker
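class Task {
    /** @var callable */
    public $handler; // "callable" is not allowed as a property type
    public array $arguments;

    public function __construct(callable $handler, ...$arguments) {
        $this->handler = $handler;
        $this->arguments = $arguments;
    }
}
$server->on('task', function ($server, $taskId, $task) {
    if (! $task instanceof Task) {
        // Not something we know how to run; finish immediately.
        $server->finish('');
        return;
    }
    ($task->handler)(...$task->arguments);
    $server->finish('');
});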
Allows for arbitrary callbacks with arbitrary numbers
of arguments
Other things you should do: log. Which leads us into
our next topic:
Code Reloading
or lack thereof
Once the Swoole webserver is running, it only loads code it
hasn't encountered before. So if you make changes to existing
code, you can't just refresh the browser.
There are some hot code reloading features in
current versions of Swoole, but in most cases, these will only
be available for code that is not bootstrapped on application
initialization.
Debugging
Coroutine support in Swoole is incompatible with
Xdebug and XHProf!
You'll need to get good at logging!
Unit Testing
Mocking Swoole classes is difficult.
You'll do you and your team a favor if you can abstract the Swoole
code away from your business logic.
One Listener Per Event
Some will raise exceptions if you register a second listener.
Others will overwrite with the most recently registered listener.
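One way to live with the single-listener limit is to register one aggregate that fans out to several callbacks. The sketch below is plain PHP; `ListenerAggregate` is a hypothetical class, and the Swoole registration appears only in a comment so the example runs without the extension.

```php
<?php
declare(strict_types=1);

// Hypothetical aggregate: one invokable object that calls every attached
// listener in order, with whatever arguments the event provides.
final class ListenerAggregate
{
    /** @var callable[] */
    private array $listeners = [];

    public function attach(callable $listener): void
    {
        $this->listeners[] = $listener;
    }

    public function __invoke(...$arguments): void
    {
        foreach ($this->listeners as $listener) {
            $listener(...$arguments);
        }
    }
}

$calls   = [];
$onStart = new ListenerAggregate();
$onStart->attach(function ($server) use (&$calls) {
    $calls[] = 'log startup';
});
$onStart->attach(function ($server) use (&$calls) {
    $calls[] = 'write pid file';
});

// With Swoole this would be registered once: $server->on('start', $onStart);
// Here we invoke it directly with a stand-in server object.
$onStart(new stdClass());
```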
$response->end() is Problematic
If you forget to call it:
the connection will remain open until a network timeout occurs;
the current process will remain open; which means
no next tick of the event loop.
Sometimes this is useful, particularly if you
really need to wait for the async operation to finish before
returning a response.
That said: maybe it's better to determine if
this should be done synchronously or via coroutine-enabled
functionality instead.
I consider this an antipattern, and it should
be abstracted away from the user to prevent problems occurring.
Non-Standard Request/Response API
It is none of:
PSR-7
Symfony HttpKernel
laminas-http / zend-http
In fact, it's most similar to Node's API
Which means you will need to adapt it to work with
your code or framework.
Swoole Pros and Cons
Pros
Multiple web workers
Separate task workers
Coroutine support
Bootstrap elimination
No web server
Performance gains
Cons
No hot-code reloading
Unmockable classes
Single-listener events
$response->end()
Non-standard request/response API
Like any async environment, it's a low-level engine
detail. You will want to abstract its
API to provide a productive development environment,
while still making use of its strong production features.
PSR-15
And this brings us to PSR-15, because it turns out that this is a really nice
abstraction for turning an incoming request into a response.
Request Handlers and Middleware
namespace Psr\Http\Server;
use Psr\Http\Message\ResponseInterface as Response;
use Psr\Http\Message\ServerRequestInterface as Request;
interface RequestHandlerInterface {
public function handle(Request $request) : Response;
}
interface MiddlewareInterface {
public function process(
Request $request,
RequestHandlerInterface $handler
) : Response;
}
Request handlers act as entry points and as the
deepest level of the application.
The handler passed to middleware generally acts as a
queue, processing each middleware until it
encounters a handler.
Middleware Flow
This leads to a layered architecture like the one
in this onion diagram.
A request is passed to each layer, and that layer determines if
the request can advance to the next layer, or if a response can be
returned immediately.
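The layering can be sketched without any packages. To keep the example self-contained, the interfaces below are simplified stand-ins for the PSR-15 ones (requests are arrays, responses are strings), and `Layer` and `pipeline()` are hypothetical names, not Mezzio's implementation.

```php
<?php
declare(strict_types=1);

// Simplified stand-ins for the PSR-15 interfaces.
interface Handler
{
    public function handle(array $request): string;
}

interface Middleware
{
    public function process(array $request, Handler $next): string;
}

// Wraps one middleware plus "everything after it" as a handler.
final class Layer implements Handler
{
    private Middleware $middleware;
    private Handler $next;

    public function __construct(Middleware $middleware, Handler $next)
    {
        $this->middleware = $middleware;
        $this->next = $next;
    }

    public function handle(array $request): string
    {
        return $this->middleware->process($request, $this->next);
    }
}

// Builds the onion: the first middleware given is the outermost layer.
function pipeline(Handler $innermost, Middleware ...$middleware): Handler
{
    $handler = $innermost;
    foreach (array_reverse($middleware) as $layer) {
        $handler = new Layer($layer, $handler);
    }
    return $handler;
}

$auth = new class implements Middleware {
    public function process(array $request, Handler $next): string
    {
        // Short-circuit: return a response without reaching inner layers.
        if (empty($request['user'])) {
            return '401 Unauthorized';
        }
        return $next->handle($request);
    }
};

$timer = new class implements Middleware {
    public function process(array $request, Handler $next): string
    {
        // Act on the way out: decorate the response from the inner layers.
        return $next->handle($request) . ' (timed)';
    }
};

$hello = new class implements Handler {
    public function handle(array $request): string
    {
        return '200 Hello ' . $request['user'];
    }
};

$app    = pipeline($hello, $timer, $auth);
$ok     = $app->handle(['user' => 'matthew']);
$denied = $app->handle([]);
```

The denied request never reaches the innermost handler, but it still passes back out through the outer timer layer.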
Middleware + Swoole
$server->on('request', function ($request, $response) use ($app) {
$appResponse = $app->handle(transformRequest($request));
transformResponse($response, $appResponse);
});
As it turns out, middleware is an excellent way to handle an
incoming request.
Transform the Swoole request into a PSR-7 request
Transform the PSR-7 response into a Swoole response, and emit it
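The response half of that adapter might look like the sketch below. The `transformResponse()` name comes from the slide, but this body is an assumption about what such a function does, not the actual mezzio-swoole code. Both parameters are duck-typed so the example runs without Swoole or a PSR-7 package, and the anonymous classes are stand-ins for the real objects.

```php
<?php
declare(strict_types=1);

// $psrResponse needs getStatusCode(), getHeaders(), and getBody();
// $swooleResponse needs status(), header(), and end().
function transformResponse(object $swooleResponse, object $psrResponse): void
{
    $swooleResponse->status($psrResponse->getStatusCode());
    foreach ($psrResponse->getHeaders() as $name => $values) {
        // PSR-7 returns each header as a list of values.
        $swooleResponse->header($name, implode(', ', $values));
    }
    // end() sends the body and completes the response.
    $swooleResponse->end((string) $psrResponse->getBody());
}

// Stand-in for a PSR-7 response.
$fakePsr = new class {
    public function getStatusCode(): int
    {
        return 200;
    }

    public function getHeaders(): array
    {
        return ['Content-Type' => ['text/plain']];
    }

    public function getBody(): string
    {
        return "Hello World\n";
    }
};

// Stand-in for the Swoole response; it records what it was asked to send.
$fakeSwoole = new class {
    public array $sent = [];

    public function status(int $code): void
    {
        $this->sent['status'] = $code;
    }

    public function header(string $name, string $value): void
    {
        $this->sent['headers'][$name] = $value;
    }

    public function end(string $body = ''): void
    {
        $this->sent['body'] = $body;
    }
};

transformResponse($fakeSwoole, $fakePsr);
```

A production adapter would need more care than this, for example with cookies (multiple Set-Cookie values should not be joined with commas) and with large or streamed bodies.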
And this is where I finally get to talk about my pet project,
Mezzio, formerly Expressive, as it provides a PSR-15 middleware runner.
Mezzio
A middleware application runner, with...
Dependency Injection wiring abstraction
Routing abstraction
Template abstraction
Error handling abstraction
Application and per-route pipelines
The keyword here is abstraction; Mezzio is a
thin glue layer between these, allowing you to choose the solutions
you're already familiar with when developing your application.
mezzio-swoole
We decided to codify the abstraction for running a PSR-15
middleware application by creating what we call an "HTTP handler
runner". This code performs the logic of marshaling a PSR-7
request, passing it to an application, and emitting the response
returned.
mezzio-swoole is a specialized version of this
abstraction, designed to work with Swoole. What is important about
this is that it will work for any PSR-15
application.
Additionally, it provides features around configuring the Swoole HTTP server,
including setting up daemonization and task workers.
Starting your Mezzio+Swoole server
$ ./vendor/bin/mezzio-swoole start
Mezzio skeleton auto-detects the package
Package registers an HTTP handler runner
Start the server from the command line.
Write your middleware application just like you
normally would!
Configurable Server Features
Limited hot-code reloading
Static file serving
Hot code reloading requires the inotify
extension, and cannot reload any code that was necessary
to start the server itself. That includes:
configuration
middleware pipeline
routes
the HTTP server itself
Static file serving can be enabled for one or more directories
containing files you want to serve via your application. You can
even configure what types of files the server is allowed to serve,
as well as their associated MIME types.
Async Applications
Eliminating bootstrap operations
Deferred operations
Now that we have async-enabled our application, we can start
leveraging those async capabilities
We have already demonstrated bootstrap
elimination
Deferred Operations
Swoole offers us three ways to defer operations: coroutines,
event deferment, and task workers
Coroutines
$result = $mysqli->query($sql);
while ($data = $result->fetch_assoc()) {
// ...
}
As I noted earlier, Swoole implements coroutines for a swath of PHP
functionality. In particular, these include TCP/UDP
operations, including HTTP operations; socket
operations; and some PDO drivers (primarily MySQL).
This means that if your code makes use of these features, it
automatically benefits from the event loop, allowing your code to
look synchronous.
Deferment
$server->defer($callback);
All server types Swoole supports implement event loops, and allow
you to call the defer() method to defer operations to a later tick.
Task Workers
$server->task($someData);
Both deferment and task workers require access to the
HTTP server instance. And this is a no-no.
You should abstract that functionality, to make
testing and debugging easier.
phly-swoole-taskworker
So, what can you do? Well, remember that task worker example I had
earlier? I abstracted that into its own package.
phly-swoole-taskworker Usage
use Phly\Swoole\TaskWorker\Task;
$server->task(new Task(function ($event) {
// handle the event
}, $event));
Package autoregisters the task worker when
used with mezzio-swoole
Constructor takes an arbitrary number of
arguments, but the first is the callable
handler
However, you still need access to the server to create
a task.
Listener deferment
phly/phly-event-dispatcher
$listener = new DeferredListener($server, $listener);
// Where DeferredListener is equivalent to:
function (object $event) use ($server, $listener) : void {
$server->task(new Task($listener, $event));
}
In your own code:
$this->dispatcher->dispatch($someEvent);
I was part of the PSR-14 working
group, codifying Event Dispatchers
One thing we looked at was listener
deferment: having a listener enqueue itself with a
message queue or similar for later execution.
Your code then just dispatches events.
In development, use the listener itself.
In production, decorate the listener so that it defers itself.
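The development/production split can be sketched in plain PHP. Everything here is hypothetical: `deferListener()` stands in for the DeferredListener decorator, and the in-memory `$queue` stands in for the task workers, so the example runs without Swoole.

```php
<?php
declare(strict_types=1);

// Hypothetical decorator: instead of running the listener, hand it and the
// event to an $enqueue callable (in production, something that wraps
// $server->task(); here, an in-memory queue).
function deferListener(callable $enqueue, callable $listener): callable
{
    return function (object $event) use ($enqueue, $listener): void {
        $enqueue($listener, $event);
    };
}

$sent     = [];
$listener = function (object $event) use (&$sent): void {
    $sent[] = "welcome mail to {$event->email}";
};

$queue   = [];
$enqueue = function (callable $listener, object $event) use (&$queue): void {
    $queue[] = [$listener, $event];
};

$event        = new stdClass();
$event->email = 'user@example.com';

// Development: call the listener directly; the work happens right away.
$listener($event);

// Production: the decorated listener only enqueues; a worker runs it later.
$deferred = deferListener($enqueue, $listener);
$deferred($event);
$sentBeforeWorker    = count($sent);
$pendingBeforeWorker = count($queue);

// Stand-in for the task worker draining the queue.
foreach ($queue as [$queued, $queuedEvent]) {
    $queued($queuedEvent);
}
```

The code that dispatches the event is identical in both cases; only the wiring of the listener changes.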
Pitfalls
Just as with any technology, there are pitfalls.
Stateful Services
The biggest pitfall you'll run into is with stateful
services.
Let's take a step back.
One paradigm Mezzio pushes is that you should use a
dependency injection container, and that all
services should be retrieved from it. Routers are expected to fetch
handlers and middleware from this container as well.
This has huge benefits. Code is easier to test, it's easy to
determine what dependencies were used, and there's a single source
of truth for how an instance was created (generally a factory).
In fact, in a system where bootstrapping happens only once, DI
containers are even more interesting, because once a service is
created, the instance is cached, reducing the amount of time needed
on subsequent requests!
So, what's the pitfall, exactly?
It's in that last bit: the same service is used for each
subsequent request.
That means that any state changes in the service
propagate, and that can be a problem. How?
Stateful Services: Templating
$template->addDefaultParam('*', 'user', $user);
$metadata = $resourceGenerator
->getMetadataMap()
->get(Collection::class);
$metadata->setQueryStringArguments(array_merge(
$metadata->getQueryStringArguments(),
['query' => $query]
));
Our first example is with Mezzio's own
template abstraction, and the fact that it allows
you to set default parameters.
Our second example is in our
Hypertext Application Language (HAL) library, which
allows aggregating routing and query string parameters.
It becomes a security issue when any of these
values is based on things such as identity or
authorization.
Stateful Services: Validation
// During one request:
if (! $validator->isValid($value)) {
    $messages = $validator->getMessages();
}
// Later, during another request:
echo implode(" ", $validator->getMessages());
Many validation libraries retain validation state,
including error messages
The problem arises when rendering forms; an
unsubmitted form will now contain error messages from
another user's submission.
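Here is a minimal reproduction of that leak in plain PHP. The `StatefulValidator` class is a hypothetical stand-in for a validation library, and the two "requests" are simulated by sequential calls against the one shared instance a long-running worker would hold.

```php
<?php
declare(strict_types=1);

// Hypothetical validator that remembers the messages from its last run.
final class StatefulValidator
{
    private array $messages = [];

    public function isValid(string $value): bool
    {
        // State is reset only when isValid() runs again.
        $this->messages = [];
        if ($value === '') {
            $this->messages[] = 'Value is required';
            return false;
        }
        return true;
    }

    public function getMessages(): array
    {
        return $this->messages;
    }
}

// The container hands out one shared instance for the worker's lifetime.
$validator = new StatefulValidator();

// Request 1: user A submits an empty form.
$validator->isValid('');

// Request 2: user B merely views the form, so nothing is validated,
// yet rendering the messages shows A's error.
$leaked = $validator->getMessages();
```

Under the shared-nothing model, request 2 would have started with a fresh validator and an empty message list.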
Stateful Services: Auth
return $handler->handle(
    $request->withAttribute('user', $auth->getIdentity())
);
Some authentication and authorization libraries
memoize the user identity
Now unauthenticated users are treated the same as somebody who
previously authenticated.
Resolving State Issues
While these are scary, there are a number of ways to work around
them.
Decoration
class StatelessVariant implements SomeInterface
{
    private $proxy;

    public function __construct(SomeInterface $proxy)
    {
        $this->proxy = $proxy;
    }

    public function morphState($data) : void
    {
        throw new CannotChangeStateException();
    }
}
If a class implements an interface,
you can substitute an immutable variant,
making mutation methods no-ops or
having them raise exceptions
Extension
class StatelessVariant extends OriginalClass
{
public function morphState($data) : void
{
throw new CannotChangeStateException();
}
}
For classes that have no associated interface,
you can extend the original class
and override mutation methods, once again
making them either no-ops or having them raise
exceptions
Factories as Services
public function __construct(SomeClass $dependency)
becomes:
public function __construct(SomeClassFactory $factory)
and we then:
$dependency = ($this->factory)();
If you are familiar with PSR-17, this
approach will look familiar
Instead of depending on an instance of a class, depend
on a factory that will produce the instance
Create an instance when needed
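A small plain-PHP sketch shows why this helps in a long-running process. All names here (`Cart`, `CartFactory`, `CheckoutHandler`) are hypothetical; the point is that the shared, long-lived service holds only the factory, so each call works on a fresh instance.

```php
<?php
declare(strict_types=1);

// A stateful class we do not want to share between requests.
final class Cart
{
    private array $items = [];

    public function add(string $sku): void
    {
        $this->items[] = $sku;
    }

    public function count(): int
    {
        return count($this->items);
    }
}

// The factory is stateless, so it is safe to keep in the container.
final class CartFactory
{
    public function __invoke(): Cart
    {
        return new Cart();
    }
}

final class CheckoutHandler
{
    private CartFactory $factory;

    public function __construct(CartFactory $factory)
    {
        $this->factory = $factory;
    }

    public function handle(array $skus): int
    {
        $cart = ($this->factory)(); // fresh instance for this request
        foreach ($skus as $sku) {
            $cart->add($sku);
        }
        return $cart->count();
    }
}

// One handler instance serves both "requests".
$handler = new CheckoutHandler(new CartFactory());
$first   = $handler->handle(['sku-a', 'sku-b']);
$second  = $handler->handle(['sku-c']);
```

Had the handler depended on a single Cart instance instead, the second call would have reported three items.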
Stateful Messages
Pass stateful data to the service:
$result = $this->router->route(
    $request->getUri()->getPath()
);
or the request:
$result = $this->router->route(
$request
);
The previous approaches guarded against state
But sometimes we need state for calculations
Pass stateful values pulled from the request
Or pass the entire request instance
Write the method to return a result of calculations
without side effects
This is exactly the approach our router abstraction uses.
Pass State Via Request Attributes
public function process(
Request $request,
RequestHandler $handler
) : Response {
$result = $this->router->route($request);
return $handler->handle(
$request->withAttribute(RouteResult::class, $result)
);
}
What if a stateful calculation is needed elsewhere?
Pass it as a request attribute
Again, this is exactly what we do with our router
You can even use this pattern to aggregate data from
multiple middleware by passing a mutable instance.
Using Request Attributes
$routeResult = $request->getAttribute(RouteResult::class);
return new HtmlResponse($this->renderer->render(
'some::template',
[ 'routeResult' => $routeResult ]
));
The handler fetches the request attribute and
uses it
We have actually addressed the template parameter
issue this past week using an approach like this.
Outro: the takeaway is: design stateless services,
provide stateless variants of stateful services, calculate
stateful results via method calls, and pass stateful instances
via request attributes.
Sessions
Use mezzio-session
and its mezzio-session-cache adapter
Stateful services are the primary concern.
However, the other major concern is that the session
extension does not play well with Swoole
Fortunately, we provide a session
abstraction, and, better, a cache-based
session implementation as an alternative.
Async is not a silver bullet.
It is a very capable, very performant production tool.
Benefits
Eliminating bootstrap operations
Deferred operations
Crazy fast performance!
Practices
Abstract async-specific details.
Make container services stateless
Aggregate state in the request
Avoid the session extension