Running cronjobs via an Openswoole timer
Sites I build often utilize cronjobs to periodically pull in data from other sources. For example, I might want to poll an API once a day, or scrape content from another website once a month. Cronjobs are a perfect fit for this.
However, cron has a few problems:
- If the job is writing information into the file tree of your web application, you need to ensure permissions are correct, both at the filesystem level, and when writing the cronjob (e.g., running it as the same user, or changing permissions on completion).
- If you are running console tooling associated with your PHP application, you may need to worry about whether or not particular environment variables are in scope when you run the job.
- In containerized environments, usage of cron is strongly discouraged, as it means running another daemon. You can get around this with tools such as the s6-overlay, but it's another vector for issues.
Since most sites I build these days use mezzio-swoole, I started wondering if I might be able to handle these jobs another way.
Task workers
We introduced integration with Swoole's task workers in version 2 of mezzio-swoole. Task workers run as a separate pool from web workers, and allow web workers to offload heavy processing when the results are not needed for the current request. They act as a form of per-server message queue, and are great for doing things such as sending emails, processing webhook payloads, and more.
The integration in mezzio-swoole allows you to decorate PSR-14 EventDispatcher listeners in Mezzio\Swoole\Task\DeferredListener or DeferredServiceListener instances; when you do, the decorator creates a task with the Swoole server, giving it the actual listener and the event. When a task worker processes the task, it calls the listener with the event.
The upshot is that to create a task, you just dispatch an event from your code. Your code is thus agnostic about the fact that it's being handled asynchronously.
However, because tasks work in a separate pool, this means that the event instances they receive are technically copies and not references; as such, your application code cannot expect the listener to communicate event state back to you. If you choose to use this feature, only use it for fire-and-forget events.
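To see why event state does not travel back, here is a minimal sketch that simulates the hand-off to a task worker with a serialize/unserialize round trip. The ReportRequested event and the dispatchToTaskWorker() function are hypothetical names for illustration only; they are not part of mezzio-swoole, and a real task runs in a separate worker process rather than inline.

```php
<?php

declare(strict_types=1);

// Hypothetical event class, used only for this illustration.
final class ReportRequested
{
    public bool $handled = false;

    public function __construct(public string $reportName)
    {
    }
}

// Hypothetical stand-in for handing work to a task worker. The payload
// crosses a process boundary, so the listener only ever sees a copy.
function dispatchToTaskWorker(callable $listener, object $event): void
{
    $copy = unserialize(serialize($event));
    $listener($copy);
}

$event = new ReportRequested('daily');

dispatchToTaskWorker(
    function (ReportRequested $received): void {
        // This changes the copy, not the instance held by the caller.
        $received->handled = true;
    },
    $event
);

var_dump($event->handled); // bool(false)
```

The caller's event is untouched, which is why deferred listeners only suit fire-and-forget events.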
I bring all this up now because I'm going to circle back to it in a bit.
Scheduling jobs
Swoole's answer to scheduling jobs is its timer.
With a timer, you can tick: invoke functionality each time a period has elapsed.
Timers operate within event loops, which means every server type that Swoole exposes has a tick() method, including the HTTP server.
The obvious answer, then, is to register a tick:
// Intervals are measured in milliseconds.
// The following means "every 3 hours".
$server->tick(1000 * 60 * 60 * 3, $callback);
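As a rough sketch of the same idea with the arithmetic pulled out, the following names the interval and shows the shape of a tick callback, which receives the timer ID as its first argument. The hoursToMilliseconds() helper and the $pollApi callback are hypothetical names of my own, and the tick() call itself stays in a comment because it needs a running Swoole server.

```php
<?php

declare(strict_types=1);

// Hypothetical helper, not part of Swoole: convert whole hours into the
// millisecond interval that tick() expects.
function hoursToMilliseconds(int $hours): int
{
    if ($hours < 1) {
        throw new InvalidArgumentException('Interval must be at least one hour');
    }

    return $hours * 60 * 60 * 1000;
}

$interval = hoursToMilliseconds(3);

// Hypothetical callback. The real work (polling an API, scraping a site)
// would go where the array append is.
$runs    = [];
$pollApi = function (int $timerId) use (&$runs): void {
    $runs[] = $timerId;
};

// With a server instance in scope, registration would look like:
// $server->tick($interval, $pollApi);
```

Naming the interval keeps the "every 3 hours" intent readable at the call site instead of relying on a comment.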
Now I hit the problems:
- How do I get access to the server instance?
- What can I specify as a callback, and how do I get it?
With mezzio-swoole, the time to register this is when the HTTP server starts. Since Swoole only allows one listener per event, mezzio-swoole composes a PSR-14 event dispatcher and registers its own listener with each Swoole HTTP server event. Those listeners then trigger events via the PSR-14 event dispatcher, using custom event types internally that provide access to the data originally passed to the Swoole server events. This approach allows the application developer to attach listeners to events and modify how the application works.
To allow these "workflow" events to be separate from the application if desired, we register a Mezzio\Swoole\Event\EventDispatcherInterface service that returns a discrete PSR-14 event dispatcher implementation.
I generally alias this to the PSR-14 interface, so I can use the same instance for application events.
I use my own phly/phly-event-dispatcher implementation, which provides a number of different listener providers.
The easiest one is Phly\EventDispatcher\AttachableListenerProvider, which defines a single listen() method for attaching a listener to a given event class.
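To illustrate the attach-by-event-class idea, here is a minimal sketch. SketchListenerProvider is a stand-in I wrote for this example, not the phly-event-dispatcher class, and ServerStarted is a hypothetical event type; only the listen() method name and its purpose come from the description above.

```php
<?php

declare(strict_types=1);

// Hypothetical event type, used only for this illustration.
final class ServerStarted
{
}

// Stand-in provider: maps event class names to the listeners attached to them.
final class SketchListenerProvider
{
    /** @var array<string, list<callable>> */
    private array $listeners = [];

    public function listen(string $eventType, callable $listener): void
    {
        $this->listeners[$eventType][] = $listener;
    }

    /** @return iterable<callable> */
    public function getListenersForEvent(object $event): iterable
    {
        foreach ($this->listeners as $eventType => $listeners) {
            if ($event instanceof $eventType) {
                yield from $listeners;
            }
        }
    }
}

$log      = [];
$provider = new SketchListenerProvider();

$provider->listen(ServerStarted::class, function (ServerStarted $event) use (&$log): void {
    $log[] = 'register timers here';
});

// Roughly what a PSR-14 dispatcher does: ask the provider for the listeners
// that match the event, then call each one with it.
$event = new ServerStarted();
foreach ($provider->getListenersForEvent($event) as $listener) {
    $listener($event);
}
```

Because one dispatcher can call any number of attached listeners, this gets around the one-listener-per-event limit on the Swoole side.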
On top of that, Mezzio and Laminas have a concept of delegator factories. These allow you to "decorate" the creation of a service. One us