Under the Hood of an Event-Driven “Workflow As Code” Engine

Consider a simple workflow that describes and orchestrates three tasks sequentially, each of which can be processed on distributed servers:
  • download an image from a url
  • resize this image
  • upload the resized image to a server

Sequential Workflow Example

This workflow can be processed using an event-driven architecture as follows:

  • A client dispatches a RunWorkflow event (with parameters describing which workflow to run and with which input parameters).
  • A WorkflowEngine service catches this event and runs the workflow code CropImage.handle(imageUrl, size). During this execution, the ImageUtil proxy stumbles upon a call to its method download(imageUrl). As this task is not known yet, the processing of the workflow code is stopped here and a RunTask event (describing this task) is dispatched by the workflow engine.
  • An ImageUtil service catches this event, runs the task code ImageUtil.download(imageUrl), and returns the serialized result within a TaskCompleted event.
  • A WorkflowEngine service catches this event and runs the workflow code CropImage.handle(imageUrl, size) again. During this execution, the ImageUtil proxy stumbles upon a call to its method download(imageUrl). This time, the task is known from the workflow history, so the proxy can return its output (after deserialization) and the processing of the workflow code can continue. Then the ImageUtil proxy stumbles upon a call to its method resize(image, size). As this task is not known yet, the processing of the workflow code is stopped here and a RunTask event (describing this new task) is dispatched by the workflow engine.
  • And so on… up to the workflow completion.

A few points are worth noting about this process:

  • The workflow process is not owned by a long-running thread running somewhere. It’s just composed of the history of exchanged messages;
  • To be consistent, a workflow implementation needs to produce the same result when executed multiple times, so it must not use any non-deterministic functions (such as random() or now()) or any multithreading that could change the order of commands;
  • The workers processing tasks are stateless; only the engine needs to store data related to the workflow’s history;
  • Exchanged data are serialized/deserialized (using Avro in Infinitic's case);
  • Tasks can run in a different programming language than the one used for workflows.
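
The replay mechanism described above can be sketched in a few lines of Python. This is only a minimal illustration under simplifying assumptions, not Infinitic's actual implementation: the names `TaskProxy`, `TaskNotCompleted`, `run_workflow_task` and the `TASKS` stand-ins are hypothetical, tasks are matched to the history by call order only, events are plain tuples, and serialization is left out.

```python
class TaskNotCompleted(Exception):
    """Raised by the proxy to stop the workflow code at a task not yet known."""

    def __init__(self, name, args):
        super().__init__(name)
        self.name, self.args = name, args


class TaskProxy:
    """Stands in for the real task service while the workflow code runs."""

    def __init__(self, history):
        self.history = history  # outputs of completed tasks, in call order
        self.position = 0       # index of the next task call in this replay

    def __getattr__(self, name):
        def call(*args):
            if self.position < len(self.history):
                # Known task: return the recorded output and keep going.
                output = self.history[self.position]
                self.position += 1
                return output
            # Unknown task: stop here so the engine can dispatch a RunTask event.
            raise TaskNotCompleted(name, args)
        return call


def crop_image(image_util, image_url, size):
    """The workflow code: plain sequential calls on the proxy."""
    image = image_util.download(image_url)
    resized = image_util.resize(image, size)
    return image_util.upload(resized)


def run_workflow_task(history, image_url, size):
    """Replay the workflow against the history and return the next event."""
    proxy = TaskProxy(history)
    try:
        result = crop_image(proxy, image_url, size)
    except TaskNotCompleted as stop:
        return ("RunTask", stop.name, stop.args)
    return ("WorkflowCompleted", result)


# Hypothetical stand-ins for the task workers (they would normally be remote).
TASKS = {
    "download": lambda url: f"image({url})",
    "resize": lambda image, size: f"resized({image}, {size})",
    "upload": lambda image: f"uploaded({image})",
}

history, events = [], []
while True:
    event = run_workflow_task(history, "https://example.com/cat.png", 64)
    events.append(event[:2])
    if event[0] == "WorkflowCompleted":
        break
    _, name, args = event
    history.append(TASKS[name](*args))  # plays the role of a TaskCompleted event
```

Each pass through the loop re-runs the workflow code from the start. The only state carried between passes is `history`, which is why the workflow code has to be deterministic.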

WorkflowTasks

The engine above has 2 different roles:

  1. Receiving completion events from task processors, updating the workflow’s history accordingly, and triggering tasks
  2. Running the workflow code to decide what to do next, based on current workflow history

Parallel Processing

The sequential example is actually a bit too easy. Things become significantly more complex when a workflow contains parallel processing, which occurs, for example, with asynchronous tasks or asynchronous branches. In such a situation, a TaskCompleted event could trigger a workflowTask while another one is still running. This case is illustrated below:

Incorrect implementation of a workflow engine triggering parallel workflowTasks
Correct behavior of a workflow engine making sure that workflowTasks are processed sequentially
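
One way to get this sequential behavior is to buffer completion events that arrive while a workflowTask is still in flight. The sketch below assumes that strategy; the class and method names are hypothetical, and Infinitic may handle this differently.

```python
from collections import deque


class WorkflowEngine:
    """Keeps at most one workflowTask in flight for a workflow instance."""

    def __init__(self, dispatch):
        self.dispatch = dispatch  # callback sending a workflowTask to a worker
        self.history = []         # completion events already taken into account
        self.buffer = deque()     # events received while a workflowTask runs
        self.running = False      # True while a workflowTask is in flight

    def on_task_completed(self, event):
        if self.running:
            # Do not start a second workflowTask in parallel: keep the event.
            self.buffer.append(event)
            return
        self._start_workflow_task([event])

    def on_workflow_task_completed(self):
        self.running = False
        if self.buffer:
            # Handle everything received in the meantime in one new workflowTask.
            events = list(self.buffer)
            self.buffer.clear()
            self._start_workflow_task(events)

    def _start_workflow_task(self, events):
        self.history.extend(events)
        self.running = True
        self.dispatch(list(self.history))  # snapshot of the history to replay


dispatched = []
engine = WorkflowEngine(dispatched.append)
engine.on_task_completed("task A completed")  # starts a workflowTask
engine.on_task_completed("task B completed")  # buffered, engine is busy
started_while_busy = len(dispatched)
engine.on_workflow_task_completed()           # buffered event is now processed
```

With this approach, each workflowTask sees a history that includes every event received before it started, and no two workflowTasks of the same workflow ever run at the same time.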

Error Management

What happens when a task (or a workflowTask) fails? To make sure that each task is actually processed and managed properly, Infinitic adds an additional layer in charge of task management: instead of sending a task directly to the workers, the Workflow Engine will send it to a Task Engine in charge of guaranteeing that each task is managed up to its completion or its cancellation.

The Task Engine:
  • retries failed tasks based on a predefined retry strategy
  • handles manual retry requests
  • manages timeouts
Each Task / WorkflowTask is handled by a TaskEngine up to its completion or its cancellation
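
The retry responsibilities listed above can be illustrated with a small sketch. It rests on several assumptions: the retry strategy is a fixed list of delays, the `send` callback and all class and method names are hypothetical, and timeouts are not modeled (a timeout could simply be reported to the engine as a failed attempt).

```python
from dataclasses import dataclass


@dataclass
class TaskState:
    payload: str
    attempts: int = 0        # automatic retries consumed so far
    status: str = "running"  # running | failed | completed


class TaskEngine:
    """Follows each task until it completes or fails for good."""

    def __init__(self, send, retry_delays):
        self.send = send                  # send(task_id, payload, delay_seconds)
        self.retry_delays = retry_delays  # assumed strategy: one delay per retry
        self.tasks = {}

    def dispatch(self, task_id, payload):
        self.tasks[task_id] = TaskState(payload)
        self.send(task_id, payload, 0)

    def on_attempt_failed(self, task_id):
        task = self.tasks[task_id]
        if task.attempts < len(self.retry_delays):
            # Automatic retry after the delay given by the strategy.
            delay = self.retry_delays[task.attempts]
            task.attempts += 1
            self.send(task_id, task.payload, delay)
        else:
            # Strategy exhausted: the task waits for a manual retry.
            task.status = "failed"

    def retry_manually(self, task_id):
        task = self.tasks[task_id]
        task.attempts = 0
        task.status = "running"
        self.send(task_id, task.payload, 0)

    def on_attempt_completed(self, task_id):
        self.tasks[task_id].status = "completed"


sent = []
engine = TaskEngine(lambda *message: sent.append(message), retry_delays=[1, 5])
engine.dispatch("t1", "resize")
for _ in range(3):
    engine.on_attempt_failed("t1")  # two automatic retries, then "failed"
status_after_failures = engine.tasks["t1"].status
engine.retry_manually("t1")
engine.on_attempt_completed("t1")
```

Because the engine keeps the state of every task, the workers themselves can remain stateless.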

Conclusions

This article describes the general concepts and constraints behind a “workflow as code” event-driven workflow engine such as Infinitic, which is a new pattern for orchestrating distributed tasks at scale. A lot of details are still to be described, such as how to manage workflow properties. If you are interested in following this development, please subscribe here, and do not hesitate to comment below :)

Gilles Barbier

Making distributed systems and workflows easy at https://infinitic.io. Previously founder at Zenaton and director at The Family — proud dad