Data Orchestration

Orchestration Is Not Just Scheduling

The mental model most engineers carry for workflow orchestration — "it's cron at scale" — is twenty years out of date. It was accurate for the generation of tools that preceded Airflow: tools that ran jobs on a schedule and flagged failures. It is not accurate for what production data and ML teams actually need from orchestration today.

The core insight that Prefect and the next generation of orchestration tools are built around is that production pipelines fail. Not occasionally. Regularly. External APIs rate-limit. Source databases have schema drift. Upstream jobs take longer than expected and violate timing assumptions. Models fail to load in serving environments because a dependency changed. A pipeline that doesn't assume failure as the default is a pipeline that requires manual intervention every week.

The three components of real orchestration

The first component is observability: the ability to understand what happened once a run has finished, whether it succeeded or failed. This is not a logging problem; it's a structured state problem. You need to be able to answer questions like: which tasks ran, which were skipped, which failed, what did the failure look like, what inputs did the failing task receive, and what was the state of the upstream data at failure time. Classical log-based observability answers some of these questions; task-level state management answers all of them.
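To make "structured state" concrete, here is a minimal sketch in plain Python. It is not Prefect's API; `State`, `TaskRecord`, `run_pipeline`, and the three toy tasks are hypothetical names chosen for illustration. The point is that every task leaves behind a record with its state, its inputs, and its error, so the questions above become queries over data rather than searches through log text.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Callable, Optional


class State(Enum):
    COMPLETED = "completed"
    FAILED = "failed"
    SKIPPED = "skipped"


@dataclass
class TaskRecord:
    """One structured record per task (hypothetical schema)."""
    name: str
    state: State
    inputs: dict = field(default_factory=dict)
    error: Optional[str] = None


def run_pipeline(tasks: list) -> list:
    """Run (name, fn, inputs) tuples in order; after the first failure,
    record the remaining tasks as skipped instead of running them."""
    records = []
    failed = False
    for name, fn, inputs in tasks:
        if failed:
            records.append(TaskRecord(name, State.SKIPPED, inputs))
            continue
        try:
            fn(**inputs)
            records.append(TaskRecord(name, State.COMPLETED, inputs))
        except Exception as exc:
            failed = True
            records.append(TaskRecord(name, State.FAILED, inputs, error=repr(exc)))
    return records


# Toy tasks, assumed purely for the demonstration.
def extract(path: str) -> None:
    pass


def load(table: str, rows: int) -> None:
    if rows == 0:
        raise ValueError("no rows to load")


def publish(channel: str) -> None:
    pass


records = run_pipeline([
    ("extract", extract, {"path": "events.csv"}),
    ("load", load, {"table": "events", "rows": 0}),
    ("publish", publish, {"channel": "daily"}),
])

# "Which task failed, and what inputs did it receive?" is now a filter.
failed = [r for r in records if r.state is State.FAILED]
```

A real orchestrator would persist these records and attach timestamps and upstream data versions; the sketch keeps them in memory to stay short.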

The second component is recoverable failure. When a pipeline fails partway through, you need to be able to resume from the point of failure rather than rerun from scratch. This requires the orchestration system to maintain granular state about task completion, and to make task resumption semantically correct — which is harder than it sounds when tasks have side effects (writes to databases, calls to external APIs, model artifact creation).
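A rough sketch of resumption, again in plain Python rather than any particular orchestrator's API. The checkpoint file layout and the names `run_with_checkpoints`, `extract`, `transform`, and `load` are assumptions made for the example. Completed steps are written to a checkpoint, so a second invocation picks up at the step that failed.

```python
import json
import tempfile
from pathlib import Path


def run_with_checkpoints(steps: list, checkpoint: Path) -> list:
    """Run (name, fn) steps in order, skipping names already in the checkpoint.
    Returns the names executed during this invocation."""
    done = json.loads(checkpoint.read_text()) if checkpoint.exists() else []
    executed = []
    for name, fn in steps:
        if name in done:
            continue  # completed in an earlier run
        fn()  # if this raises, the checkpoint still lists every step completed so far
        done.append(name)
        checkpoint.write_text(json.dumps(done))
        executed.append(name)
    return executed


# Demonstration: "transform" fails on its first call only.
calls = {"extract": 0, "transform": 0, "load": 0}
fail_once = {"armed": True}


def extract() -> None:
    calls["extract"] += 1


def transform() -> None:
    calls["transform"] += 1
    if fail_once["armed"]:
        fail_once["armed"] = False
        raise RuntimeError("simulated upstream timeout")


def load() -> None:
    calls["load"] += 1


steps = [("extract", extract), ("transform", transform), ("load", load)]
checkpoint = Path(tempfile.mkdtemp()) / "run_state.json"

try:
    run_with_checkpoints(steps, checkpoint)
except RuntimeError:
    pass  # first run fails partway through

resumed = run_with_checkpoints(steps, checkpoint)  # second run resumes
```

Note that `transform` is invoked twice while `extract` runs only once. That is the side-effect problem in miniature: the step that failed gets re-executed, so anything it wrote before failing must be safe to write again.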

The third component is dynamic task graphs. Real pipelines are not static DAGs that you define once and run indefinitely. The number of tasks, their inputs, and their dependencies change based on data conditions. A pipeline that processes variable numbers of input files, that spawns model training jobs based on data availability, that conditionally routes results to different sinks based on model output — all of these require a task graph that can be parameterized or constructed dynamically at runtime. Static DAG definitions can't express this without contortions.
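As an illustration of a graph constructed at runtime, the sketch below builds the dependency map from whatever input files happen to exist and only adds a training task when there is enough data. The task names, the `train_threshold` parameter, and the `build_graph` helper are hypothetical; the ordering uses `graphlib.TopologicalSorter` from the Python standard library (3.9 and later).

```python
from graphlib import TopologicalSorter


def build_graph(input_files: list, train_threshold: int = 2) -> dict:
    """Map each task name to the set of task names it depends on.
    The shape of the graph depends on the inputs seen at runtime."""
    graph = {}
    parse_tasks = []
    for f in input_files:
        task = f"parse:{f}"  # one parse task per discovered file
        graph[task] = set()
        parse_tasks.append(task)
    graph["merge"] = set(parse_tasks)
    if len(input_files) >= train_threshold:
        # Enough data: insert a training step before reporting.
        graph["train"] = {"merge"}
        graph["report"] = {"train"}
    else:
        # Too little data: report directly on the merged output.
        graph["report"] = {"merge"}
    return graph


small = build_graph(["a.csv"])
large = build_graph(["a.csv", "b.csv", "c.csv"])

# A valid execution order for the larger graph.
order = list(TopologicalSorter(large).static_order())
```

The same function yields a three-task graph for one file and a six-task graph for three files, which is the kind of variation a DAG declared once ahead of time struggles to express.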

Why Flintrock backed Prefect

We led Prefect's seed round in 2022 for two reasons that both come back to the "failure as default" design philosophy. First, Prefect's architecture made task-level state management a first-class primitive from the beginning; it wasn't retrofitted onto a scheduler. Second, the Prefect team had lived with production pipeline failures at Airflow scale and designed the API ergonomics around the debugging and recovery workflows that actually matter, not around the happy path.

The Python-native interface also mattered. Pipelines defined as ordinary Python functions, rather than as YAML configurations or operator objects wired into a DAG, can be tested with standard Python testing tools, composed with standard Python abstractions, and debugged with standard Python debugging tools. This doesn't sound important until you've spent time debugging a complex Airflow DAG at two in the morning and realized that the gap between "this is a Python function" and "this is a configured operator" is enormous from a cognitive-load perspective.
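A small sketch of what that testability looks like in practice. The step `normalize_event` is a hypothetical pipeline function, not taken from any real pipeline, and the test uses only the standard library's `unittest`. Nothing here requires an orchestrator to be running.

```python
import unittest


def normalize_event(raw: dict) -> dict:
    """Hypothetical pipeline step: lowercase the event type and
    fall back to "anonymous" when the user is missing."""
    return {
        "type": raw["type"].strip().lower(),
        "user": raw.get("user") or "anonymous",
    }


class NormalizeEventTest(unittest.TestCase):
    def test_lowercases_type(self):
        out = normalize_event({"type": " Click ", "user": "u1"})
        self.assertEqual(out["type"], "click")

    def test_defaults_missing_user(self):
        out = normalize_event({"type": "view"})
        self.assertEqual(out["user"], "anonymous")


# Run the tests programmatically so the example is self-contained.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormalizeEventTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the step is just a function, the same code can be stepped through in a debugger or called from a notebook with a single problematic record.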

Where orchestration is going

The next frontier for orchestration is LLM-native workflows: pipelines where some tasks are model calls, where task inputs and outputs are unstructured text, and where the branching logic depends on model output rather than structured data conditions. This requires a different design than data orchestration: you need to track not just whether a task succeeded but what the model said, you need to handle retries for probabilistic failures differently than deterministic ones, and you need to version prompt configurations alongside code. The orchestration tools that get this right for AI application development will have a large market.
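One way those three requirements might fit together, sketched in plain Python under stated assumptions: the model is any callable that takes a prompt string and returns text, a "probabilistic failure" means output that does not parse as JSON, and `DeterministicError`, `Attempt`, `prompt_version`, and `run_model_task` are hypothetical names. No real model client or orchestrator API is used.

```python
import hashlib
import json
from dataclasses import dataclass


class DeterministicError(Exception):
    """Hypothetical: errors that a retry cannot fix, such as bad credentials."""


@dataclass
class Attempt:
    prompt_version: str  # which prompt produced this output
    output: str          # what the model actually said
    valid: bool          # whether the output passed validation


def prompt_version(template: str) -> str:
    """Version a prompt by content hash so each output is traceable to it."""
    return hashlib.sha256(template.encode()).hexdigest()[:12]


def run_model_task(model, template: str, max_attempts: int = 3) -> list:
    """Call the model until its output parses as JSON or attempts run out.
    Invalid output is retried (re-sampled); DeterministicError is not caught,
    so it propagates on the first occurrence."""
    version = prompt_version(template)
    attempts = []
    for _ in range(max_attempts):
        output = model(template)
        try:
            json.loads(output)
            valid = True
        except json.JSONDecodeError:
            valid = False
        attempts.append(Attempt(version, output, valid))
        if valid:
            break
    return attempts


# Stub model: first reply is chatty prose, second is valid JSON.
replies = iter(["Sure! Here is the JSON", '{"label": "spam"}'])
attempts = run_model_task(lambda prompt: next(replies), "Classify: {text}")
```

Every attempt is kept, including the invalid one, so the run history shows what the model said each time and which prompt version produced it.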