From Complexity to Clarity: The Power of Serverless Workflow DSL

9 min read · Apr 30, 2025

Welcome to the first article in our deep dive into the Serverless Workflow DSL — a powerful, platform-independent language for defining and orchestrating workflows. Now at version 1.0.0 and officially a CNCF project, this DSL offers a standardized way to build cloud-native workflows.

Today, I’ll demonstrate how this DSL transforms complex orchestration challenges into manageable, maintainable workflow definitions by eliminating significant infrastructure overhead.

The Hidden Complexity of Long-Running Business Processes

As backend developers, we’ve all been there: You start with a seemingly straightforward workflow requirement. “Just connect these three microservices together,” they said. “It’ll be easy,” they said. Three weeks later, you’re neck-deep in custom state persistence code, building your own retry mechanisms, and debugging race conditions that only appear in production.

What began as a quick sprint task has ballooned into an infrastructure nightmare. The actual business logic — the part that delivers real value — is buried under mountains of plumbing code that just makes the whole thing work reliably.

I’ve lived this pain firsthand. Every distributed workflow I’ve built has followed the same pattern: the visible 20% is the business logic everyone cares about, while the hidden 80% is all the scaffolding that keeps it from collapsing under real-world conditions.

As the iceberg visualization reveals, the business process code you write is just the visible tip, while a massive collection of critical infrastructure components lurks beneath the surface. These are elements you don’t initially expect to build, but absolutely need for a robust implementation.

When implementing your own orchestration solution, you inevitably end up building:

  • State persistence layer to handle crashes and restarts
  • Execution engine to process workflow steps
  • Scheduling system for delays and timeouts
  • Error handling with retry policies
  • Versioning for updating workflows without breaking in-flight instances
  • Monitoring and observability tools
  • Admin interfaces for operations
  • Infrastructure for growing loads

I’ve seen teams spend months on this infrastructure, far exceeding the time invested in the actual business logic. And that’s the fundamental problem Serverless Workflow DSL aims to solve.
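Even a single bullet from that list — retry policies — balloons once you hand-roll it. A minimal Python sketch (illustrative only, not production code) of retries with exponential backoff:

```python
import time

def call_with_retry(fn, max_attempts=4, base_delay=0.1, retriable=(ConnectionError,)):
    """Invoke fn, retrying transient failures with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retriable:
            if attempt == max_attempts:
                raise  # retries exhausted: surface the error to the caller
            time.sleep(base_delay * 2 ** (attempt - 1))  # 0.1s, 0.2s, 0.4s, ...

# And this still ignores jitter, retry budgets, idempotency, persisting
# attempt counts across crashes, and dead-letter handling.
```

Each of those ignored concerns is another chunk of the iceberg's underwater portion.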

The Serverless Workflow DSL: Cutting Through the Complexity

The Serverless Workflow DSL lets you define business processes declaratively while the runtime handles the infrastructure. Instead of explaining the theory, let’s examine a practical example: employee onboarding.

This process spans weeks, involves multiple departments, and requires numerous approvals and integrations — exactly the kind of scenario where hand-written orchestration quickly becomes unmanageable.

document:
  dsl: '1.0.0'
  namespace: hr
  name: employee-onboarding
  version: '1.0.0'
  title: Employee Onboarding Process
use:
  authentications:
    serviceAuth:
      bearer:
        token: "${ $secrets.apiKey }"
  secrets:
    - apiKey
do:
  - initiateOnboarding:
      set:
        employee: "${ .employee }"
        status: "initiated"
        startDate: "${ .startDate }"
        onboardingSteps: [
          { id: "hr-paperwork", status: "pending", deadline: "${ addDays(.startDate, -14) }" },
          { id: "it-setup", status: "pending", deadline: "${ addDays(.startDate, -7) }" },
          { id: "manager-training", status: "pending", deadline: "${ addDays(.startDate, -3) }" },
          { id: "facilities", status: "pending", deadline: "${ addDays(.startDate, -7) }" },
          { id: "first-day", status: "pending", deadline: "${ .startDate }" }
        ]

  - sendWelcomeEmail:
      call: http
      with:
        method: post
        endpoint:
          uri: https://notification-service.example.com/email
          authentication:
            use: serviceAuth
        body:
          template: "welcome-email"
          to: "${ .employee.email }"
          data:
            name: "${ .employee.firstName }"
            startDate: "${ formatDate(.startDate) }"

  - initiateHrPaperwork:
      call: http
      with:
        method: post
        endpoint: https://hr-system.example.com/documents/initiate
        body:
          employeeId: "${ .employee.id }"
          documentTypes: ["tax-forms", "benefits", "policies"]

  - waitForHrPaperworkCompletion:
      listen:
        to:
          one:
            with:
              type: "com.example.hr.documents.completed"
            correlate:
              employeeId:
                from: "${ .data.employeeId }"
                expect: "${ .employee.id }"
      timeout:
        after:
          days: 5

  - initiateItSetup:
      call: http
      with:
        method: post
        endpoint: https://it-service.example.com/provision
        body:
          employeeId: "${ .employee.id }"
          department: "${ .employee.department }"
          equipment: "${ .employee.equipment }"
          access: ["email", "vpn", "domain"]
      then: waitForItSetupCompletion

  - waitForItSetupCompletion:
      listen:
        to:
          one:
            with:
              type: "com.example.it.setup.completed"
            correlate:
              employeeId:
                from: "${ .data.employeeId }"
                expect: "${ .employee.id }"
      timeout:
        after:
          days: 7
      output:
        as: "${ .onboardingSteps | map(if .id == \"it-setup\" then .status = \"completed\" else . end) }"

  - scheduleFacilities:
      call: http
      with:
        method: post
        endpoint: https://facilities.example.com/workspace
        body:
          employeeId: "${ .employee.id }"
          startDate: "${ .startDate }"
          location: "${ .employee.location }"
      then: waitForFacilitiesConfirmation

  - waitForFacilitiesConfirmation:
      listen:
        to:
          one:
            with:
              type: "com.example.facilities.workspace.ready"
            correlate:
              employeeId:
                from: "${ .data.employeeId }"
                expect: "${ .employee.id }"
      timeout:
        after:
          days: 7
      output:
        as: "${ .onboardingSteps | map(if .id == \"facilities\" then .status = \"completed\" else . end) }"

  - scheduleManagerTraining:
      call: http
      with:
        method: post
        endpoint: https://calendar-service.example.com/meeting
        body:
          attendees: ["${ .employee.email }", "${ .employee.manager.email }"]
          subject: "New Employee Orientation"
          duration: "hours:1"
          preferredDays: ["${ subtractDays(.startDate, 3) }", "${ subtractDays(.startDate, 2) }"]
      then: waitForTrainingScheduled

  - waitForTrainingScheduled:
      listen:
        to:
          one:
            with:
              type: "com.example.calendar.meeting.scheduled"
            correlate:
              subject:
                from: "${ .data.subject }"
                expect: "New Employee Orientation"
              attendee:
                from: "${ .data.attendees[0] }"
                expect: "${ .employee.email }"
      timeout:
        after:
          days: 2
      output:
        as:
          onboardingSteps: "${ .onboardingSteps | map(if .id == \"manager-training\" then .status = \"completed\" else . end) }"
          trainingDetails: "${ .data }"

  - waitForStartDate:
      wait:
        timestamp: "${ .startDate }"

  - sendFirstDayInstructions:
      call: http
      with:
        method: post
        endpoint: https://notification-service.example.com/email
        body:
          template: "first-day-instructions"
          to: "${ .employee.email }"
          cc: ["${ .employee.manager.email }"]
          data:
            name: "${ .employee.firstName }"
            location: "${ .employee.location }"
            arrivalTime: "9:00 AM"
            contactPerson: "${ .employee.manager.name }"
            contactPhone: "${ .employee.manager.phone }"
      output:
        as: "${ .onboardingSteps | map(if .id == \"first-day\" then .status = \"completed\" else . end) }"

  - completeOnboarding:
      set:
        status: "completed"
        completedAt: "${ now() }"

What DSL Runtimes Include (and You DON’T Need to Build Yourself)

Looking at the DSL definition above, you might be thinking: “That still looks like a substantial amount of code.” But the critical difference is what you’re not seeing — all the infrastructure components from the iceberg’s underwater portion that you don’t need to build yourself. When using Serverless Workflow DSL, here’s what you can eliminate from your development backlog:

1. Durability & State Management

What you don’t need to build: Complex state persistence mechanisms, crash recovery logic, and transaction management systems.

The workflow maintains its state automatically across system restarts, crashes, or infrastructure failures. If your application or server goes down in the middle of a process that spans weeks, it will pick up exactly where it left off when the system recovers. This eliminates entire categories of bugs related to state corruption during recovery processes.

2. Long-Running Process Support Infrastructure

What you don’t need to build: Timer services, scheduling systems, and resource management for idle processes.

Notice how the employee onboarding process spans weeks with multiple waiting periods? The DSL runtime handles all the complexities of managing these long-running processes:

  • Efficient resource utilization during waiting periods (no active threads waiting)
  • Proper resumption after delays
  • Timeout management across days or weeks

The DSL excels at managing processes that span hours, days, or even months, making it practical to model business processes that match their natural timeframes.
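Isolated from the larger definition, the durable-waiting pattern is just a few declarative lines. The fragment below excerpts the onboarding example above (with the timeout placed at the task level, as in the 1.0 DSL):

```yaml
- waitForStartDate:
    wait:
      timestamp: "${ .startDate }"   # durable sleep: no thread is held while waiting

- waitForHrPaperworkCompletion:
    listen:
      to:
        one:
          with:
            type: "com.example.hr.documents.completed"
    timeout:
      after:
        days: 5   # tracked by the runtime, surviving restarts and crashes
```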

3. Event Correlation Systems

What you don’t need to build: Message-matching consumer logic and correlation identifier management.

Did you see how easily the workflow correlates events like document completion with the specific employee? You simply define:

correlate:
  employeeId:
    from: "${ .data.employeeId }"
    expect: "${ .employee.id }"

Building a reliable correlation system from scratch typically takes weeks of development time alone. With the DSL, it’s just a few declarative lines.
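For contrast, here is a minimal sketch (illustrative Python, assuming an in-memory store) of the matching logic a hand-rolled correlator needs, before you even address persistence, ordering, or duplicate delivery:

```python
# Hand-rolled event correlation: route incoming events to waiting
# workflow instances by a correlation key. Illustrative only.
waiting = {}  # (event_type, key_value) -> workflow instance id

def register_wait(instance_id, event_type, key_value):
    """Record that a workflow instance is waiting for a matching event."""
    waiting[(event_type, key_value)] = instance_id

def on_event(event_type, payload, key_field="employeeId"):
    """Deliver an event to the instance waiting on its key, if any."""
    key = payload.get(key_field)
    instance_id = waiting.pop((event_type, key), None)
    return instance_id  # None means: buffer it? drop it? dead-letter it?
```

Every question the `None` branch raises is another design decision (and bug surface) the DSL runtime has already resolved for you.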

4. Versioning Management

What you don’t need to build: Version control for in-flight processes, migration strategies, and backward compatibility layers.

When your onboarding process changes, you can deploy a new version of the workflow without disrupting employees already mid-way through the process. Old instances continue with the version they started with, while new instances use the improved process — all handled automatically by the runtime. This allows for gradual rollouts, A/B testing, and supporting legacy processes while introducing improvements.
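Concretely, a revision is shipped as a new definition with a bumped version field; the sketch below reuses the onboarding header from the example above (deployment mechanics vary by runtime):

```yaml
document:
  dsl: '1.0.0'
  namespace: hr
  name: employee-onboarding
  version: '1.1.0'   # new instances start here; in-flight ones finish on 1.0.0
```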

5. Observability Infrastructure

What you don’t need to build: Custom logging frameworks, monitoring agents, and tracking systems.

While DSL runtimes differ in maturity on this front, the specification mandates Lifecycle Events, which provide the foundation for rich observability capabilities out of the box:

  • Real-time status tracking for each employee in the onboarding process
  • Duration metrics for each step
  • Detailed execution logs
  • Visual flow representations
  • Performance metrics without writing additional monitoring code

This makes troubleshooting and performance optimization dramatically simpler than in traditional approaches, where observability must be designed and implemented as a separate concern.
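Lifecycle events are emitted as CloudEvents at each state transition. The shape below is illustrative only; the event `type` shown is a hypothetical example, and the concrete type strings and payload fields are defined by the specification:

```yaml
# Illustrative lifecycle event in a CloudEvents envelope (not the
# normative payload; consult the spec for exact type names and fields).
specversion: "1.0"
type: "io.serverlessworkflow.workflow.started.v1"   # hypothetical example
source: "/hr/employee-onboarding"
id: "0001"
time: "2025-04-30T09:00:00Z"
data:
  workflow:
    name: employee-onboarding
    version: "1.0.0"
```

Because these are plain CloudEvents, they can feed dashboards, metrics pipelines, or alerting without custom instrumentation in your business code.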

6. Scalable Architecture

What you don’t need to think about: Distributed execution engines, partition management systems, or complex queue-based scaling mechanisms.

Workflow runtimes are architected with scalability as a core design principle, not an afterthought. They typically implement partition-based execution models, load balancing, and instance distribution, scaling horizontally as your workflow volume grows without requiring you to redesign your system.

Real-World Impact: Beyond Just Another Tool

After showing you the dramatic difference between custom code and the Serverless Workflow DSL, I want to share why this matters in the real world. The example we walked through only begins to reveal the profound impact this approach can have on your development process.

Time to Value: Weeks Instead of Months

I worked with a financial services team that had invested four months in a custom payment processing system. Their codebase included thousands of lines dedicated to infrastructure concerns: retries, durability, state management, and observability. This infrastructure required constant maintenance, and when new regulatory requirements emerged, implementation typically consumed another month of engineering time.

After analyzing their system, we estimated that using Serverless Workflow DSL would have allowed them to build the same process in just two weeks by orchestrating their existing internal APIs. The built-in capabilities for state persistence and observability would have eliminated entire categories of the custom infrastructure code they were maintaining. The most compelling insight came when we realized that regulatory changes could be absorbed in a single afternoon by modifying the workflow definition.

DevOps That Actually Works

I still remember the frustration of a DevOps engineer at an e-commerce company I consulted for. Their event-driven microservices architecture was a nightmare to deploy. Each deployment risked interrupting in-flight customer orders, leading to data inconsistency or lost transactions.

With Serverless Workflow DSL, their deployment anxiety disappeared. The workflow definitions lived in version control alongside their application code, but with a critical difference: deploying a new workflow version didn’t affect orders already in progress. The versioning capabilities meant old instances completed using the version they started with, while new orders used the improved process flow. The result? Deployment frequency increased from monthly to daily without a single broken order.

Standards Make All the Difference

“Our internal workflow tool is becoming a monster,” admitted the CTO of a healthcare provider. They had built their own workflow engine with a custom UI for clinical staff to define patient journeys. As requirements grew, their proprietary system became increasingly complex and difficult to maintain. Every new feature required specialized knowledge of their custom DSL.

By adopting the standardized Serverless Workflow DSL, they gained access to a growing ecosystem of tools, documentation, and best practices. Their developers could find answers within the community instead of waiting for the one person who understood the custom system. Most importantly, they could focus on delivering healthcare features rather than maintaining workflow infrastructure.

Considerations Before Adoption

While the benefits are compelling, it’s important to understand the current state of the Serverless Workflow ecosystem:

Emerging Standard: The Serverless Workflow specification, while comprehensive, is still relatively new. It was accepted as a Cloud Native Computing Foundation (CNCF) sandbox project in 2020 and continues to evolve. This means potential for changes as the specification matures — despite having recently reached the 1.0 milestone.

Limited Runtime Options: Unlike more established workflow technologies, the number of production-ready runtimes implementing the full specification remains limited. Lemline, which I’m developing, joins a small set of options (Synapse, Apache KIE SonataFlow). Each runtime has different levels of specification compliance and performance characteristics worth evaluating.

Learning Investment: While the DSL is designed to be approachable, teams should expect some learning curve. Developers comfortable with imperative programming need time to adapt to the declarative approach to workflow definition. The time invested pays off quickly, but it’s a transition that requires deliberate effort.

Despite these considerations, the productivity gains and maintenance benefits typically outweigh the adoption challenges for teams dealing with complex orchestration problems.

What’s Your Orchestration Challenge?

Every organization faces unique challenges when it comes to service orchestration and workflow automation. What’s your toughest orchestration challenge? Share in the comments below, and let’s discuss how the Serverless Workflow DSL might help solve it!

Written by Gilles Barbier

Pioneering Workflow Orchestration with Lemline | Infinitic | Distributed Systems Architect & Consultant | AI & Microservices Workflow Solutions
