Workflows
This guide provides a comprehensive overview of Temporal Workflows.
In day-to-day conversations, the term Workflow frequently denotes either a Workflow Type, a Workflow Definition, or a Workflow Execution.
Temporal documentation aims to be explicit and differentiate between them.
Workflow Definition
A Workflow Definition is the code that defines the constraints of a Workflow Execution.
- How to develop a Workflow Definition
A Workflow Definition is often also referred to as a Workflow Function. In Temporal's documentation, a Workflow Definition refers to the source for the instance of a Workflow Execution, while a Workflow Function refers to the source for the instance of a Workflow Function Execution.
A Workflow Execution effectively executes once to completion, while a Workflow Function Execution occurs many times during the life of a Workflow Execution.
We strongly recommend that you write a Workflow Definition in a language that has a corresponding Temporal SDK.
Deterministic constraints
A critical aspect of developing Workflow Definitions is ensuring they exhibit certain deterministic traits – that is, making sure that the same Commands are emitted in the same sequence, whenever a corresponding Workflow Function Execution (instance of the Function Definition) is re-executed.
The execution semantics of a Workflow Execution include the re-execution of a Workflow Function, which is called a Replay.
The use of Workflow APIs in the function is what generates Commands.
Commands tell the Cluster which Events to create and add to the Workflow Execution's Event History.
When a Workflow Function executes, the Commands that are emitted are compared with the existing Event History.
If a corresponding Event already exists within the Event History that maps to the generation of that Command in the same sequence, and some specific metadata of that Command matches with some specific metadata of the Event, then the Function Execution progresses.
For example, using an SDK's "Execute Activity" API generates the ScheduleActivityTask Command. When this API is called upon re-execution, that Command is compared with the Event that is in the same location within the sequence. The Event in the sequence must be an ActivityTaskScheduled Event, where the Activity name is the same as what is in the Command.
If a generated Command doesn't match what it needs to in the existing Event History, then the Workflow Execution returns a non-deterministic error.
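The comparison described above can be sketched with a small toy model. This is not the Temporal SDK: the `Command`, `Event`, and `check_replay` names, the Activity name `charge_card`, and the mapping table are hypothetical, and only the Command and Event type names come from the text.

```python
from dataclasses import dataclass

# Hypothetical lookup: which Event type each Command type is expected to produce.
EXPECTED_EVENT = {
    "ScheduleActivityTask": "ActivityTaskScheduled",
    "StartTimer": "TimerStarted",
}


@dataclass(frozen=True)
class Command:
    type: str
    name: str = ""  # for example, the Activity name


@dataclass(frozen=True)
class Event:
    type: str
    name: str = ""


class NonDeterminismError(Exception):
    """Raised when a replayed Command does not match the recorded Event."""


def check_replay(commands, history):
    """Compare each replayed Command with the Event at the same position."""
    for position, (command, event) in enumerate(zip(commands, history)):
        same_type = EXPECTED_EVENT.get(command.type) == event.type
        same_name = command.name == event.name
        if not (same_type and same_name):
            raise NonDeterminismError(
                f"position {position}: {command} does not match {event}"
            )


history = [Event("ActivityTaskScheduled", "charge_card")]
# Same Command type and same Activity name: the replay progresses.
check_replay([Command("ScheduleActivityTask", "charge_card")], history)
```

In this sketch, a Command with the right type but a different Activity name is rejected as well, which mirrors the metadata check described in the example above.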
The following are the two reasons why a Command might be generated out of sequence or the wrong Command might be generated altogether:
- Code changes are made to a Workflow Definition that is in use by a running Workflow Execution.
- There is intrinsic non-deterministic logic (such as inline random branching).
Code changes can cause non-deterministic behavior
The Workflow Definition can change in very limited ways once there is a Workflow Execution depending on it. To alleviate non-deterministic issues that arise from code changes, we recommend using Workflow Versioning.
For example, let's say we have a Workflow Definition that defines the following sequence:
- Start and wait on a Timer/sleep.
- Spawn and wait on an Activity Execution.
- Complete.
We start a Worker and spawn a Workflow Execution that uses that Workflow Definition. The Worker would emit the StartTimer Command and the Workflow Execution would become suspended.
Before the Timer is up, we change the Workflow Definition to the following sequence:
- Spawn and wait on an Activity Execution.
- Start and wait on a Timer/sleep.
- Complete.
When the Timer fires, the next Workflow Task will cause the Workflow Function to re-execute. The first Command the Worker sees would be the ScheduleActivityTask Command, which wouldn't match up to the expected TimerStarted Event.
The Workflow Execution would fail and return a non-deterministic error.
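The scenario above can be reproduced with a minimal sketch. The two generator functions stand in for the two versions of the Workflow Definition, and `first_mismatch` is a hypothetical helper; none of this is SDK code.

```python
COMMAND_TO_EVENT = {
    "StartTimer": "TimerStarted",
    "ScheduleActivityTask": "ActivityTaskScheduled",
}


def definition_v1():
    # Original order: Timer first, then Activity.
    yield "StartTimer"
    yield "ScheduleActivityTask"


def definition_v2():
    # Changed order: Activity first, then Timer.
    yield "ScheduleActivityTask"
    yield "StartTimer"


def first_mismatch(definition, recorded_events):
    """Return the index of the first Command that disagrees with the
    recorded Events, or None if every compared position matches."""
    for index, (command, event) in enumerate(zip(definition(), recorded_events)):
        if COMMAND_TO_EVENT[command] != event:
            return index
    return None


# The first version emitted StartTimer before the code change was deployed.
recorded = ["TimerStarted"]
print(first_mismatch(definition_v1, recorded))  # None
print(first_mismatch(definition_v2, recorded))  # 0
```

Replaying the unchanged definition finds no mismatch, while the reordered definition disagrees with the recorded history at the very first position.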
The following are examples of minor changes that would not result in non-determinism errors when re-executing a History that already contains the Events:
- Changing the duration of a Timer (unless changing from a duration of 0).
- Changing the arguments to:
  - The Activity Options in a call to spawn an Activity Execution (local or nonlocal).
  - The Child Workflow Options in a call to spawn a Child Workflow Execution.
  - Call to Signal an External Workflow Execution.
- Adding a Signal Handler for a Signal Type that has not been sent to this Workflow Execution.
Intrinsic non-deterministic logic
Intrinsic non-determinism is when a Workflow Function Execution might emit a different sequence of Commands on re-execution, regardless of whether all the input parameters are the same.
For example, a Workflow Definition cannot have inline logic that branches (emits a different Command sequence) based on a local time setting or a random number.
In the representative pseudocode below, the local_clock() function returns the local time, rather than Temporal-defined time:
fn your_workflow() {
  if local_clock().is_before("12pm") {
    await workflow.sleep(duration_until("12pm"))
  } else {
    await your_afternoon_activity()
  }
}
Each Temporal SDK offers APIs that enable Workflow Definitions to have logic that gets and uses time, random numbers, and data from unreliable resources. When those APIs are used, the results are stored as part of the Event History, which means that a re-executed Workflow Function will issue the same sequence of Commands, even if there is branching involved.
In other words, all operations that do not purely mutate the Workflow Execution's state should occur through a Temporal SDK API.
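As a rough illustration of why recorded results keep branching deterministic, the sketch below uses a toy context that stores each result on the first execution and returns the stored value on re-execution. `RecordingContext` and its `side_effect` method are hypothetical stand-ins, not the API of any Temporal SDK.

```python
import random


class RecordingContext:
    """Toy stand-in for an SDK context: records results on the first
    execution and replays the recorded values afterwards."""

    def __init__(self, recorded=None):
        self.recorded = list(recorded or [])
        self._cursor = 0

    def side_effect(self, fn):
        if self._cursor < len(self.recorded):
            # Re-execution: reuse the value stored in the (toy) history.
            value = self.recorded[self._cursor]
        else:
            # First execution: compute the value and record it.
            value = fn()
            self.recorded.append(value)
        self._cursor += 1
        return value


def your_workflow(ctx):
    # Branch on a value obtained through the context,
    # never on random.random() called directly.
    if ctx.side_effect(random.random) < 0.5:
        return ["StartTimer"]
    return ["ScheduleActivityTask"]


first = RecordingContext()
commands = your_workflow(first)
replayed = your_workflow(RecordingContext(recorded=first.recorded))
assert replayed == commands  # same Commands, even though the branch is random
```

Calling `random.random()` directly inside `your_workflow` would break this property, because the re-execution could take the other branch.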
Workflow Versioning
The Workflow Versioning feature enables the creation of logical branching inside a Workflow Definition based on a developer-specified version identifier. This feature is useful when Workflow Definition logic needs to be updated but there are running Workflow Executions that currently depend on it. A practical way to handle different versions of Workflow Definitions, without using the versioning API, is to run the different versions on separate Task Queues.
- How to version Workflow Definitions in Go
- How to version Workflow Definitions in Java
- How to version Workflow Definitions in TypeScript
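The idea of branching on a version identifier can be sketched as follows. This is a simplified model under stated assumptions: `get_version`, `DEFAULT_VERSION`, the `history_markers` dictionary, and the change id are all hypothetical, and the real SDK versioning APIs linked above have their own signatures and semantics.

```python
DEFAULT_VERSION = -1  # hypothetical value meaning "no version was recorded"


def get_version(history_markers, change_id, max_supported, replaying):
    """Return the version recorded for change_id. On a first execution,
    record max_supported and return it."""
    if change_id in history_markers:
        return history_markers[change_id]
    if replaying:
        # The execution started before the change existed: keep old behavior.
        return DEFAULT_VERSION
    history_markers[change_id] = max_supported
    return max_supported


def your_workflow(history_markers, replaying):
    version = get_version(history_markers, "swap-timer-and-activity", 1, replaying)
    if version == DEFAULT_VERSION:
        return ["StartTimer", "ScheduleActivityTask"]  # original order
    return ["ScheduleActivityTask", "StartTimer"]  # updated order


# An execution that began before the change replays with the original order.
print(your_workflow({}, replaying=True))
# A new execution records version 1 and uses the updated order.
print(your_workflow({}, replaying=False))
```

Both code paths stay in the Workflow Definition until no running Workflow Execution depends on the original order.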
Handling unreliable Worker Processes
You do not handle Worker Process failure or restarts in a Workflow Definition.
Workflow Function Executions are completely oblivious to the Worker Process in terms of failures or downtime. The Temporal Platform ensures that the state of a Workflow Execution is recovered and progress resumes if there is an outage of either Worker Processes or the Temporal Cluster itself. The only reason a Workflow Execution might fail is due to the code throwing an error or exception, not because of underlying infrastructure outages.
Workflow Type
A Workflow Type is a name that maps to a Workflow Definition.
- A single Workflow Type can be instantiated as multiple Workflow Executions.
- A Workflow Type is scoped by a Task Queue. It is acceptable to have the same Workflow Type name map to different Workflow Definitions if they are using completely different Workers.
Workflow Type cardinality with Workflow Definitions and Workflow Executions
Workflow Execution
A Temporal Workflow Execution is a durable, reliable, and scalable function execution.
It is the main unit of execution of a Temporal Application.
Each Temporal Workflow Execution has exclusive access to its local state.
It executes concurrently to all other Workflow Executions, and communicates with other Workflow Executions through Signals and the environment through Activities.
While a single Workflow Execution has limits on size and throughput, a Temporal Application can consist of millions to billions of Workflow Executions.
Durability
Durability is the absence of an imposed time limit.
A Workflow Execution is durable because it executes a Temporal Workflow Definition (also called a Temporal Workflow Function), your application code, effectively once and to completion—whether your code executes for seconds or years.
Reliability
Reliability is responsiveness in the presence of failure.
A Workflow Execution is reliable, because it is fully recoverable after a failure. The Temporal Platform ensures the state of the Workflow Execution persists in the face of failures and outages and resumes execution from the latest state.
Scalability
Scalability is responsiveness in the presence of load.
A single Workflow Execution is limited in size and throughput but is scalable because it can Continue-As-New in response to load.
A Temporal Application is scalable because the Temporal Platform is capable of supporting millions to billions of Workflow Executions executing concurrently, which is realized by the design and nature of the Temporal Cluster and Worker Processes.
Replays
A Replay is the method by which a Workflow Execution resumes making progress. During a Replay the Commands that are generated are checked against an existing Event History. Replays are necessary and often happen to give the effect that Workflow Executions are resumable, reliable, and durable.
For more information, see Deterministic constraints.
If a failure occurs, the Workflow Execution picks up where the last recorded event occurred in the Event History.
- How to use Replay APIs to test Workflow Definitions
Commands and awaitables
A Workflow Execution does two things:
- Issue Commands.
- Wait on Awaitables (often called Futures).
Command generation and waiting
Commands are issued and Awaitables are provided by the use of Workflow APIs in the Workflow Definition.
Commands are generated whenever the Workflow Function is executed.
The Worker Process supervises the Command generation and makes sure that it maps to the current Event History.
(For more information, see Deterministic constraints.)
The Worker Process batches the Commands and then suspends progress to send the Commands to the Cluster whenever the Workflow Function reaches a place where it can no longer progress without a result from an Awaitable.
A Workflow Execution may only ever block progress on an Awaitable that is provided through a Temporal SDK API. Awaitables are provided when using APIs for the following:
- Awaiting: Progress can block using explicit "Await" APIs.
- Requesting cancellation of another Workflow Execution: Progress can block on confirmation that the other Workflow Execution is cancelled.
- Sending a Signal: Progress can block on confirmation that the Signal was sent.
- Spawning a Child Workflow Execution: Progress can block on confirmation that the Child Workflow Execution started, and on the result of the Child Workflow Execution.
- Spawning an Activity Execution: Progress can block on the result of the Activity Execution.
- Starting a Timer: Progress can block until the Timer fires.
Status
A Workflow Execution can be either Open or Closed.
Workflow Execution statuses
Open
- Running: The only Open status for a Workflow Execution. When the Workflow Execution is Running, it is either actively progressing or is waiting on something.
Closed
A Closed status means that the Workflow Execution cannot make further progress because of one of the following reasons:
- Cancelled: The Workflow Execution successfully handled a cancellation request.
- Completed: The Workflow Execution has completed successfully.
- Continued-As-New: The Workflow Execution Continued-As-New.
- Failed: The Workflow Execution returned an error and failed.
- Terminated: The Workflow Execution was terminated.
- Timed Out: The Workflow Execution reached a timeout limit.
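The statuses above can be modeled as a small enumeration. This is only an illustrative sketch: the enum and helper names are assumptions, and the string values simply repeat the status names used in this section.

```python
from enum import Enum


class WorkflowExecutionStatus(Enum):
    RUNNING = "Running"
    CANCELLED = "Cancelled"
    COMPLETED = "Completed"
    CONTINUED_AS_NEW = "Continued-As-New"
    FAILED = "Failed"
    TERMINATED = "Terminated"
    TIMED_OUT = "Timed Out"


def is_open(status):
    """Running is the only Open status."""
    return status is WorkflowExecutionStatus.RUNNING


def is_closed(status):
    """Every status other than Running is a Closed status."""
    return not is_open(status)


closed = [s.value for s in WorkflowExecutionStatus if is_closed(s)]
print(closed)
```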
Workflow Execution Chain
A Workflow Execution Chain is a sequence of Workflow Executions that share the same Workflow Id. Each link in the Chain is often called a Workflow Run. Each Workflow Run in the sequence is connected by one of the following:
- Continue-As-New
- Retries
- Temporal Cron Job
A Workflow Execution is uniquely identified by its Namespace, Workflow Id, and Run Id.
The Workflow Execution Timeout applies to a Workflow Execution Chain.
The Workflow Run Timeout applies to a single Workflow Execution (Workflow Run).
Event loop
A Workflow Execution is made up of a sequence of Events called an Event History.
Events are created by the Temporal Cluster in response to either Commands or actions requested by a Temporal Client (such as a request to spawn a Workflow Execution).
Time constraints
Is there a limit to how long Workflows can run?
No, there is no time constraint on how long a Workflow Execution can be Running.
However, Workflow Executions intended to run indefinitely should be written with some care. The Temporal Cluster stores the complete Event History for the entire lifecycle of a Workflow Execution. There is a hard limit of 50,000 Events in a Workflow Execution Event History, as well as a hard limit of 50 MB in terms of size. The Temporal Cluster logs a warning at every 10,000 Events. When the Event History reaches 50,000 Events or the size limit of 50 MB, the Workflow Execution is forcefully terminated.
To prevent runaway Workflow Executions, you can use the Workflow Execution Timeout, the Workflow Run Timeout, or both. A Workflow Execution Timeout can be used to limit the duration of a Workflow Execution Chain, and a Workflow Run Timeout can be used to limit the duration of an individual Workflow Execution (Run).
You can use the Continue-As-New feature to close the current Workflow Execution and create a new Workflow Execution in a single atomic operation.
The Workflow Execution spawned from Continue-As-New has the same Workflow Id, a new Run Id, and a fresh Event History and is passed all the appropriate parameters.
For example, it may be reasonable to use Continue-As-New once per day for a long-running Workflow Execution that is generating a large Event History.
Limits
Each pending Activity generates a metadata entry in the Workflow's mutable state. Too many entries create a large mutable state, which causes unstable persistence.
To protect the system, Temporal enforces a maximum of 50,000 pending Activities, Child Workflows, external Workflows, and Signals. These limits are set with the following dynamic configuration keys:
- NumPendingChildExecutionsLimit
- NumPendingActivitiesLimit
- NumPendingSignals
- NumPendingCancelRequestsLimit
By default, Temporal fails Workflow Task Executions that would cause the Workflow to surpass 50,000 pending Activities, Child Workflows, external Workflows, or Signals.
Similar constraints are enforced for SignalExternalWorkflowExecution, RequestCancelExternalWorkflowExecution, and StartChildWorkflowExecution Commands.
Cloud users are limited to 2,000 each of pending Activities, Child Workflows, external Workflows, and Signals.
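The check described in this section amounts to simple arithmetic, sketched below. The function and constant names are hypothetical; only the two limit values come from the text, and the real Server enforces a separate limit per kind of pending item.

```python
DEFAULT_PENDING_LIMIT = 50_000  # default described above
CLOUD_PENDING_LIMIT = 2_000  # Cloud limit described above


def would_exceed_limit(pending, newly_requested, limit=DEFAULT_PENDING_LIMIT):
    """Return True if completing the Workflow Task would push the number
    of pending items of one kind past the limit."""
    return pending + newly_requested > limit


# 49,999 pending Activities plus one more stays within the default limit.
print(would_exceed_limit(49_999, 1))  # False
# One more on top of 50,000 would surpass it, so the Workflow Task fails.
print(would_exceed_limit(50_000, 1))  # True
```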
Command
A Command is a requested action issued by a Worker to the Temporal Cluster after a Workflow Task Execution completes.
The action that the Cluster takes is recorded in the Workflow Execution's Event History as an Event.
The Workflow Execution can await on some of the Events that come as a result from some of the Commands.
Commands are generated by the use of Workflow APIs in your code. During a Workflow Task Execution there may be several Commands that are generated. The Commands are batched and sent to the Cluster as part of the Workflow Task Execution completion request, after the Workflow Task has progressed as far as it can with the Workflow function. There will always be WorkflowTaskStarted and WorkflowTaskCompleted Events in the Event History when there is a Workflow Task Execution completion request.
Commands are described in the Command reference and are defined in the Temporal gRPC API.
Event
Events are created by the Temporal Cluster in response to external occurrences and Commands generated by a Workflow Execution. Each Event corresponds to an enum that is defined in the Server API.
All Events are recorded in the Event History.
A list of all possible Events that could appear in a Workflow Execution Event History is provided in the Event reference.
Event History
An Event History is an append log of Events for your application.
- Event History is durably persisted by the Temporal service, enabling seamless recovery of your application state from crashes or failures.
- It also serves as an audit log for debugging.
Event History limits
The Temporal Cluster stores the complete Event History for the entire lifecycle of a Workflow Execution.
A Workflow Execution Event History has a hard limit of 50,000 Events, as well as a hard limit of 50 MB in terms of size. The Temporal Cluster logs a warning at every 10,000 Events.
When the Event History reaches 50,000 Events or the size limit of 50 MB, the Workflow Execution is forcefully terminated.
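The thresholds above can be expressed as a small classification function. This is a sketch, not Server code: the function name and return values are hypothetical, and it assumes 1 MB means 1024 * 1024 bytes, which the text does not specify.

```python
EVENT_COUNT_LIMIT = 50_000
SIZE_LIMIT_BYTES = 50 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes
WARN_EVERY_EVENTS = 10_000


def history_state(event_count, size_bytes):
    """Classify an Event History as "ok", "warn", or "terminate"."""
    if event_count >= EVENT_COUNT_LIMIT or size_bytes >= SIZE_LIMIT_BYTES:
        return "terminate"
    if event_count > 0 and event_count % WARN_EVERY_EVENTS == 0:
        return "warn"
    return "ok"


print(history_state(9_999, 1_000))  # ok
print(history_state(10_000, 1_000))  # warn
print(history_state(50_000, 1_000))  # terminate
```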
Continue-As-New
Continue-As-New is a mechanism by which the latest relevant state is passed to a new Workflow Execution, with a fresh Event History.
As a precautionary measure, the Temporal Platform limits the total Event History to 50,000 Events or 50 MB, and will warn you every 10,000 Events or 10 MB.
To prevent a Workflow Execution Event History from exceeding this limit and failing, use Continue-As-New to start a new Workflow Execution with a fresh Event History.
All values passed to a Workflow Execution through parameters or returned through a result value are recorded into the Event History. A Temporal Cluster stores the full Event History of a Workflow Execution for the duration of a Namespace's retention period. A Workflow Execution that periodically executes many Activities has the potential of hitting the size limit.
A very large Event History can adversely affect the performance of a Workflow Execution. For example, in the case of a Workflow Worker failure, the full Event History must be pulled from the Temporal Cluster and given to another Worker via a Workflow Task. If the Event History is very large, it may take some time to load it.
The Continue-As-New feature enables developers to complete the current Workflow Execution and start a new one atomically.
The new Workflow Execution has the same Workflow Id, but a different Run Id, and has its own Event History.
In the case of Temporal Cron Jobs, Continue-As-New is actually used internally for the same effect.
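The relationship between the old and new Workflow Execution can be pictured with a toy model. The `Run` class, the `continue_as_new` function, and the `order-42` Workflow Id are hypothetical, and unlike the real platform this sketch makes no attempt at atomicity.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Run:
    workflow_id: str
    state: dict
    run_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    history: list = field(default_factory=list)
    status: str = "Running"


def continue_as_new(current, carried_state):
    """Close the current Run and return a new Run that keeps the Workflow Id
    but has a new Run Id and an empty Event History."""
    current.status = "Continued-As-New"
    return Run(workflow_id=current.workflow_id, state=dict(carried_state))


run = Run("order-42", {"processed": 0})
run.history.extend(["ActivityTaskScheduled"] * 3)

# Only the state that is still relevant is carried over to the new Run.
next_run = continue_as_new(run, {"processed": 3})
```

Only the state explicitly passed to `continue_as_new` survives, so the new Run starts with an empty history no matter how large the previous one had grown.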
Reset
A Reset terminates a Workflow Execution, removes the progress in the Event History up to the reset point, and then creates a new Workflow Execution with the same Workflow Type and Id to continue.
Run Id
A Run Id is a globally unique, platform-level identifier for a Workflow Execution.
Temporal guarantees that only one Workflow Execution with a given Workflow Id can be in an Open state at any given time.
But when a Workflow Execution reaches a Closed state, it is possible to have another Workflow Execution in an Open state with the same Workflow Id.
For example, a Temporal Cron Job is a chain of Workflow Executions that all have the same Workflow Id.
Each Workflow Execution within the chain is considered a Run.
A Run Id uniquely identifies a Workflow Execution even if it shares a Workflow Id with other Workflow Executions.
Don't rely on storing the current Run Id or using it for any logical choices. A Workflow Retry changes the Run Id. Because the current Run Id is mutable, relying on it might produce non-determinism issues.
Workflow Id
A Workflow Id is a customizable, application-level identifier for a Workflow Execution that is unique to an Open Workflow Execution within a Namespace.
- How to set a Workflow Id
A Workflow Id is meant to be a business-process identifier, such as a customer identifier or an order identifier.
A Workflow Id Reuse Policy can be used to manage whether a Workflow Id can be re-used.
The Temporal Platform guarantees uniqueness of the Workflow Id within a Namespace based on the Workflow Id Reuse Policy.
It is not possible for a new Workflow Execution to spawn with the same Workflow Id as another Open Workflow Execution, regardless of the Workflow Id Reuse Policy.
An attempt to spawn a Workflow Execution with a Workflow Id that is the same as the Id of a currently Open Workflow Execution results in a "Workflow execution already started" error.
A Workflow Execution can be uniquely identified across all Namespaces by its Namespace, Workflow Id, and Run Id.
Workflow Id Reuse Policy
A Workflow Id Reuse Policy determines whether a Workflow Execution is allowed to spawn with a particular Workflow Id, if that Workflow Id has been used with a previous, and now Closed, Workflow Execution.
It is not possible for a new Workflow Execution to spawn with the same Workflow Id as another Open Workflow Execution.
An attempt to spawn a Workflow Execution with a Workflow Id that is the same as the Id of a currently Open Workflow Execution results in a "Workflow Execution already started" error.
A Workflow Id Reuse Policy has three possible values:
- Allow Duplicate: The Workflow Execution is allowed to exist regardless of the Closed status of a previous Workflow Execution with the same Workflow Id. This is the default policy, if one is not specified. Use this when it is OK to have a Workflow Execution with the same Workflow Id as a previous, but now Closed, Workflow Execution.
- Allow Duplicate Failed Only: The Workflow Execution is allowed to exist only if a previous Workflow Execution with the same Workflow Id does not have a Completed status. Use this policy when there is a need to re-execute a Failed, Timed Out, Terminated or Cancelled Workflow Execution and guarantee that the Completed Workflow Execution will not be re-executed.
- Reject Duplicate: The Workflow Execution cannot exist if a previous Workflow Execution has the same Workflow Id, regardless of the Closed status. Use this when there can only be one Workflow Execution per Workflow Id within a Namespace for the given retention period.
A Workflow Id Reuse Policy applies only if a Closed Workflow Execution with the same Workflow Id exists within the Retention Period of the associated Namespace. For example, if the Namespace's retention period is 30 days, a Workflow Id Reuse Policy can only compare the Workflow Id of the spawning Workflow Execution against the Closed Workflow Executions for the last 30 days.
If there is an attempt to spawn a Workflow Execution with a Workflow Id Reuse Policy that won't allow it, the Server will prevent the Workflow Execution from spawning.
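The three policies can be summarized as a decision function. This is a sketch of the rules as stated in this section, not Server code: `ReusePolicy`, `may_spawn`, and the convention of passing `None` when no previous Workflow Execution exists within the Retention Period are assumptions made for illustration.

```python
from enum import Enum


class ReusePolicy(Enum):
    ALLOW_DUPLICATE = "Allow Duplicate"
    ALLOW_DUPLICATE_FAILED_ONLY = "Allow Duplicate Failed Only"
    REJECT_DUPLICATE = "Reject Duplicate"


def may_spawn(policy, previous_status):
    """Decide whether a new Workflow Execution may spawn.

    previous_status is the status of the latest Workflow Execution with the
    same Workflow Id inside the Retention Period, or None if there is none.
    """
    if previous_status is None:
        return True
    if previous_status == "Running":
        return False  # an Open execution blocks spawning under every policy
    if policy is ReusePolicy.ALLOW_DUPLICATE:
        return True
    if policy is ReusePolicy.ALLOW_DUPLICATE_FAILED_ONLY:
        return previous_status != "Completed"
    return False  # Reject Duplicate


print(may_spawn(ReusePolicy.ALLOW_DUPLICATE_FAILED_ONLY, "Failed"))  # True
print(may_spawn(ReusePolicy.ALLOW_DUPLICATE_FAILED_ONLY, "Completed"))  # False
```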
Workflow Execution Timeout
A Workflow Execution Timeout is the maximum time that a Workflow Execution can be executing (have an Open status) including retries and any usage of Continue-As-New.
- How to set a Workflow Execution Timeout
Workflow Execution Timeout period
The default value is ∞ (infinite).
If this timeout is reached, the Workflow Execution changes to a Timed Out status.
This timeout is different from the Workflow Run Timeout.
This timeout is most commonly used for stopping the execution of a Temporal Cron Job after a certain amount of time has passed.
Workflow Run Timeout
A Workflow Run Timeout is the maximum amount of time that a single Workflow Run is restricted to.
- How to set a Workflow Run Timeout
Workflow Run Timeout period
The default is set to the same value as the Workflow Execution Timeout.
This timeout is most commonly used to limit the execution time of a single Temporal Cron Job Execution.
If the Workflow Run Timeout is reached, the Workflow Execution is Terminated.
Workflow Task Timeout
A Workflow Task Timeout is the maximum amount of time allowed for a Worker to execute a Workflow Task after the Worker has pulled that Workflow Task from the Task Queue.
Workflow Task Timeout period
The default value is 10 seconds. This timeout is primarily available to recognize whether a Worker has gone down so that the Workflow Execution can be recovered on a different Worker. The main reason for increasing the default value would be to accommodate a Workflow Execution that has a very long Workflow Execution History that could take longer than 10 seconds for the Worker to load.
Implementation guides:
- How to set a Workflow Task Timeout
Memo
A Memo is a non-indexed user-supplied set of Workflow Execution metadata that is displayed with Filtered List results.
Signal
A Signal is an asynchronous request to a Workflow Execution.
A Signal delivers data to a running Workflow Execution. It cannot return data to the caller; to do so, use a Query instead. The Workflow code that handles a Signal can mutate Workflow state. A Signal can be sent from a Temporal Client or a Workflow. When a Signal is sent, it is received by the Cluster and recorded as an Event to the Workflow Execution Event History. A successful response from the Cluster means that the Signal has been persisted and will be delivered at least once to the Workflow Execution [1]. The next scheduled Workflow Task will contain the Signal Event.
A Signal must include a destination (Namespace and Workflow Id) and name. It can include a list of arguments.
Signal handlers are Workflow functions that listen for Signals by the Signal name. Signals are delivered in the order they are received by the Cluster. If multiple deliveries of a Signal would be a problem for your Workflow, add idempotency logic to your Signal handler that checks for duplicates.
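The duplicate check mentioned above can be sketched in plain Python. This is a minimal model and does not use any Temporal SDK; the OrderState class, the handle_add_item method, and the request_id argument are hypothetical names chosen for illustration, and it assumes the sender attaches a unique id to each Signal.

```python
from dataclasses import dataclass, field


@dataclass
class OrderState:
    """Hypothetical Workflow state that a Signal handler mutates."""

    items: list = field(default_factory=list)
    seen_request_ids: set = field(default_factory=set)

    def handle_add_item(self, request_id: str, item: str) -> bool:
        """Apply the Signal once; return False for a duplicate delivery."""
        if request_id in self.seen_request_ids:
            return False  # already processed, ignore the repeat
        self.seen_request_ids.add(request_id)
        self.items.append(item)
        return True
```

Because the set of seen ids lives in the Workflow state, a second delivery of the same Signal leaves the state unchanged.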
Query
A Query is a synchronous operation that is used to get the state of a Workflow Execution. The state of a running Workflow Execution is constantly changing. You can use Queries to expose the internal Workflow Execution state to the external world. Queries are available for running or completed Workflow Executions only if the Worker is up and listening on the Task Queue.
Queries are sent from a Temporal Client to a Workflow Execution. The API call is synchronous. The Query is identified at both ends by a Query name. The Workflow must have a Query handler that is developed to handle that Query and provide data that represents the state of the Workflow Execution.
Queries are strongly consistent and are guaranteed to return the most recent state. This means that the data reflects the state of all confirmed Events that came in before the Query was sent. An Event is considered confirmed if the call creating the Event returned success. Events that are created while the Query is outstanding may or may not be reflected in the Workflow state the Query result is based on.
A Query can carry arguments to specify the data it is requesting. And each Workflow can expose data to multiple types of Queries.
A Query must never mutate the state of the Workflow Execution—that is, Queries are read-only and cannot contain any blocking code. This means, for example, that Query handling logic cannot schedule Activity Executions.
Sending Queries to completed Workflow Executions is supported, though Query reject conditions can be configured per Query.
Stack Trace Query
In many SDKs, the Temporal Client exposes a predefined __stack_trace Query that returns the stack trace of all the threads owned by that Workflow Execution.
This is a great way to troubleshoot a Workflow Execution in production.
For example, if a Workflow Execution has been stuck at a state for longer than an expected period of time, you can send a __stack_trace Query to return the current call stack.
The __stack_trace Query name does not require special handling in your Workflow code.
Stack Trace Queries are available only for running Workflow Executions.
Side Effect
A Side Effect is a way to execute a short, non-deterministic code snippet, such as generating a UUID, that executes the provided function once and records its result into the Workflow Execution Event History.
A Side Effect does not re-execute upon replay, but instead returns the recorded result.
Do not ever have a Side Effect that could fail, because failure could result in the Side Effect function executing more than once. If there is any chance that the code provided to the Side Effect could fail, use an Activity.
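The record-then-replay behavior of a Side Effect can be modeled in a few lines of plain Python. This is only an illustrative sketch, not how any Temporal SDK implements it; SideEffectRecorder and its history list are hypothetical stand-ins for the Workflow Execution Event History.

```python
import uuid


class SideEffectRecorder:
    """Toy model: run a function once, then return the recorded result on replay."""

    def __init__(self, history=None):
        # Recorded results, in call order (stand-in for the Event History).
        self.history = list(history or [])
        self._cursor = 0

    def side_effect(self, fn):
        if self._cursor < len(self.history):
            result = self.history[self._cursor]  # replay: do not call fn again
        else:
            result = fn()  # first execution: call fn and record the result
            self.history.append(result)
        self._cursor += 1
        return result


first_run = SideEffectRecorder()
generated = first_run.side_effect(lambda: str(uuid.uuid4()))

replay = SideEffectRecorder(first_run.history)
replayed = replay.side_effect(lambda: str(uuid.uuid4()))
```

Here replayed equals generated even though the function would produce a different UUID if it were called again.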
Child Workflow
A Child Workflow Execution is a Workflow Execution that is spawned from within another Workflow.
A Workflow Execution can be both a Parent and a Child Workflow Execution because any Workflow can spawn another Workflow.
Parent and Child Workflow Execution entity relationship
A Parent Workflow Execution must await on the Child Workflow Execution to spawn.
The Parent can optionally await on the result of the Child Workflow Execution.
Consider the Child's Parent Close Policy if the Parent does not await on the result of the Child, which includes any use of Continue-As-New by the Parent.
When a Parent Workflow Execution reaches a Closed status, the Cluster propagates Cancellation Requests or Terminations to Child Workflow Executions depending on the Child's Parent Close Policy.
If a Child Workflow Execution uses Continue-As-New, from the Parent Workflow Execution's perspective the entire chain of Runs is treated as a single execution.
Parent and Child Workflow Execution entity relationship with Continue As New
When to use Child Workflows
Consider Workflow Execution Event History size limits.
An individual Workflow Execution has an Event History size limit, which imposes a couple of considerations for using Child Workflows.
On one hand, because Child Workflow Executions have their own Event Histories, they are often used to partition large workloads into smaller chunks.
For example, a single Workflow Execution does not have enough space in its Event History to spawn 100,000 Activity Executions.
But a Parent Workflow Execution can spawn 1,000 Child Workflow Executions that each spawn 1,000 Activity Executions to achieve a total of 1,000,000 Activity Executions.
However, because a Parent Workflow Execution Event History contains Events that correspond to the status of the Child Workflow Execution, a single Parent should not spawn more than 1,000 Child Workflow Executions.
In general, however, Child Workflow Executions result in more overall Events recorded in Event Histories than Activities. Because each entry in an Event History is a cost in terms of compute resources, this could become a factor in very large workloads. Therefore, we recommend starting with a single Workflow implementation that uses Activities until there is a clear need for Child Workflows.
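The partitioning arithmetic above can be checked with a small helper. This is a sketch under the figures quoted in this section (1,000 Activities per Child and at most 1,000 Children per Parent); plan_fan_out and its parameters are hypothetical names, not part of any Temporal API.

```python
import math


def plan_fan_out(total_activities: int,
                 activities_per_child: int = 1000,
                 max_children: int = 1000) -> int:
    """Return how many Child Workflow Executions one Parent would need."""
    children = math.ceil(total_activities / activities_per_child)
    if children > max_children:
        # Beyond the recommended limit: another level of Parents would be needed.
        raise ValueError(
            f"{children} Children exceeds the recommended maximum of {max_children}"
        )
    return children
```

With the defaults, 1,000,000 Activity Executions map to exactly 1,000 Child Workflow Executions, which is the upper bound recommended above.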
Consider each Child Workflow Execution as a separate service.
Because a Child Workflow Execution can be processed by a completely separate set of Workers than the Parent Workflow Execution, it can act as an entirely separate service.
However, this also means that a Parent Workflow Execution and a Child Workflow Execution do not share any local state.
As with all Workflow Executions, they can communicate only via asynchronous Signals.
Consider that a single Child Workflow Execution can represent a single resource.
As with all Workflow Executions, a Child Workflow Execution can create a one-to-one mapping with a resource. For example, a Workflow that manages host upgrades could spawn a Child Workflow Execution per host.
Parent Close Policy
A Parent Close Policy determines what happens to a Child Workflow Execution if its Parent changes to a Closed status (Completed, Failed, or Timed out).
There are three possible values:
- Abandon: the Child Workflow Execution is not affected.
- Request Cancel: a Cancellation request is sent to the Child Workflow Execution.
- Terminate (default): the Child Workflow Execution is forcefully Terminated.
ParentClosePolicy proto definition.
Each Child Workflow Execution may have its own Parent Close Policy. This policy applies only to Child Workflow Executions and has no effect otherwise.
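The three values and the per-Child nature of the policy can be modeled as follows. This is an illustrative sketch only; the enum members and the on_parent_closed function are hypothetical names and do not mirror the proto definition or any SDK.

```python
from enum import Enum
from typing import Dict, Optional


class ParentClosePolicy(Enum):
    ABANDON = "abandon"
    REQUEST_CANCEL = "request_cancel"
    TERMINATE = "terminate"


# What happens to a Child when its Parent reaches a Closed status.
_ACTIONS = {
    ParentClosePolicy.ABANDON: "not affected",
    ParentClosePolicy.REQUEST_CANCEL: "cancellation requested",
    ParentClosePolicy.TERMINATE: "terminated",
}


def on_parent_closed(children: Dict[str, Optional[ParentClosePolicy]]) -> Dict[str, str]:
    """Map each Child id to the outcome of its own policy (Terminate if unset)."""
    return {
        child_id: _ACTIONS[policy or ParentClosePolicy.TERMINATE]
        for child_id, policy in children.items()
    }
```

For example, a Parent with one Abandon Child and one Child that has no explicit policy leaves the first running and terminates the second.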
Parent Close Policy entity relationship
You can set policies per child, which means you can opt out of propagating terminates / cancels on a per-child basis. This is useful for starting Child Workflows asynchronously (see relevant issue here or the corresponding SDK docs).
Temporal Cron Job
A Temporal Cron Job is the series of Workflow Executions that occur when a Cron Schedule is provided in the call to spawn a Workflow Execution.
Temporal Cron Job timeline
A Temporal Cron Job is similar to a classic Unix cron job. Just as a Unix cron job accepts a command and a schedule on which to execute that command, a Cron Schedule can be provided with the call to spawn a Workflow Execution. If a Cron Schedule is provided, the Temporal Server will spawn an execution for the associated Workflow Type per the schedule.
Each Workflow Execution within the series is considered a Run.
- Each Run receives the same input parameters as the initial Run.
- Each Run inherits the same Workflow Options as the initial Run.
The Temporal Server spawns the first Workflow Execution in the chain of Runs immediately.
However, it calculates and applies a backoff (firstWorkflowTaskBackoff) so that the first Workflow Task of the Workflow Execution does not get placed into a Task Queue until the scheduled time.
After each Run Completes, Fails, or reaches the Workflow Run Timeout, the same thing happens: the next Run will be created immediately with a new firstWorkflowTaskBackoff that is calculated based on the current Server time and the defined Cron Schedule.
The Temporal Server spawns the next Run only after the current Run has Completed, Failed, or has reached the Workflow Run Timeout.
This means that, if a Retry Policy has also been provided, and a Run Fails or reaches the Workflow Run Timeout, the Run will first be retried per the Retry Policy until the Run Completes or the Retry Policy has been exhausted.
If the next Run, per the Cron Schedule, is due to spawn while the current Run is still Open (including retries), the Server automatically starts the new Run after the current Run completes successfully.
The start time for this new Run and the Cron definitions are used to calculate the firstWorkflowTaskBackoff that is applied to the new Run.
A Workflow Execution Timeout is used to limit how long a Workflow can be executing (have an Open status), including retries and any usage of Continue As New.
The Cron Schedule runs until the Workflow Execution Timeout is reached or you terminate the Workflow.
Temporal Cron Job Run Failure with a Retry Policy
Cron Schedules
Cron Schedules are interpreted in UTC time by default.
The Cron Schedule is provided as a string and must follow one of two specifications:
Classic specification
This is what the "classic" specification looks like:
┌───────────── minute (0 - 59)
│ ┌───────────── hour (0 - 23)
│ │ ┌───────────── day of the month (1 - 31)
│ │ │ ┌───────────── month (1 - 12)
│ │ │ │ ┌───────────── day of the week (0 - 6) (Sunday to Saturday)
│ │ │ │ │
│ │ │ │ │
* * * * *
For example, 15 8 * * * causes a Workflow Execution to spawn daily at 8:15 AM UTC.
Use the crontab guru site to test your cron expressions.
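To make the five fields concrete, here is a minimal matcher for the classic specification in plain Python. It is a simplified sketch, not the cron library that Temporal uses: it handles only *, numbers, lists, ranges, and steps, and it assumes the common cron convention that when both day fields are restricted, matching either one is enough. The names _expand and cron_matches are hypothetical.

```python
from datetime import datetime, timezone


def _expand(field: str, low: int, high: int) -> set:
    """Expand one cron field (lists, ranges, steps, '*') into a set of ints."""
    values = set()
    for part in field.split(","):
        rng, _, step = part.partition("/")
        if rng == "*":
            start, end = low, high
        elif "-" in rng:
            first, last = rng.split("-")
            start, end = int(first), int(last)
        else:
            start = int(rng)
            end = high if step else start  # "5/15" means 5, 20, 35, ...
        values.update(range(start, end + 1, int(step) if step else 1))
    return values


def cron_matches(spec: str, when: datetime) -> bool:
    """Return True if a timezone-aware datetime matches a 5-field spec in UTC."""
    minute, hour, dom, month, dow = spec.split()
    when = when.astimezone(timezone.utc)
    time_ok = (
        when.minute in _expand(minute, 0, 59)
        and when.hour in _expand(hour, 0, 23)
        and when.month in _expand(month, 1, 12)
    )
    dom_ok = when.day in _expand(dom, 1, 31)
    dow_ok = when.isoweekday() % 7 in _expand(dow, 0, 6)  # Sunday == 0
    if dom != "*" and dow != "*":
        day_ok = dom_ok or dow_ok  # assumed convention when both are restricted
    else:
        day_ok = dom_ok and dow_ok
    return time_ok and day_ok
```

Calling cron_matches("15 8 * * *", ...) with any UTC datetime whose time is 08:15 returns True, which corresponds to the daily 8:15 AM UTC example above.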
robfig predefined schedules and intervals
You can also pass any of the predefined schedules or intervals described in the robfig/cron documentation.
| Schedules | Description | Equivalent To |
| ---------------------- | ------------------------------------------ | ------------- |
| @yearly (or @annually) | Run once a year, midnight, Jan. 1st | 0 0 1 1 * |
| @monthly | Run once a month, midnight, first of month | 0 0 1 * * |
| @weekly | Run once a week, midnight between Sat/Sun | 0 0 * * 0 |
| @daily (or @midnight) | Run once a day, midnight | 0 0 * * * |
| @hourly | Run once an hour, beginning of hour | 0 * * * * |
For example, "@weekly" causes a Workflow Execution to spawn once a week at midnight between Saturday and Sunday.
Intervals just take a string that can be accepted by time.ParseDuration.
@every <duration>
Time zones
This feature only applies in Temporal 1.15 and up
You can change the time zone that a Cron Schedule is interpreted in by prefixing the specification with CRON_TZ=America/New_York (or your desired time zone from tz). CRON_TZ=America/New_York 15 8 * * * therefore spawns a Workflow Execution every day at 8:15 AM New York time, subject to caveats listed below.
Consider that using time zones in production introduces a surprising amount of complexity and failure modes! If at all possible, we recommend specifying Cron Schedules in UTC (the default).
If you need to use time zones, here are a few edge cases to keep in mind:
- Beware Daylight Saving Time: If a Temporal Cron Job is scheduled around the time when daylight saving time (DST) begins or ends (for example, 30 2 * * *), it might run zero, one, or two times in a day! The Cron library that we use does not do any special handling of DST transitions. Avoid schedules that include times that fall within DST transition periods.
  - For example, in the US, DST begins at 2 AM. When you "fall back," the clock goes 1:59 … 1:00 … 1:01 … 1:59 … 2:00 … 2:01 AM and any Cron jobs that fall in that 1 AM hour are fired again. The inverse happens when clocks "spring forward" for DST, and Cron jobs that fall in the 2 AM hour are skipped.
  - In other time zones like Chile and Iran, DST "spring forward" is at midnight. 11:59 PM is followed by 1 AM, which means 00:00:00 never happens.
- Self Hosting note: If you manage your own Temporal Cluster, you are responsible for ensuring that it has access to current tzdata files. The official Docker images are built with tzdata installed (provided by Alpine Linux), but ultimately you should be aware of how tzdata is deployed and updated in your infrastructure.
- Updating Temporal: If you use the official Docker images, note that an upgrade of the Temporal Cluster may include an update to the tzdata files, which may change the meaning of your Cron Schedule. You should be aware of upcoming changes to the definitions of the time zones you use, particularly around daylight saving time start/end dates.
- Absolute Time Fixed at Start: The absolute start time of the next Run is computed and stored in the database when the previous Run completes, and is not recomputed. This means that if you have a Cron Schedule that runs very infrequently, and the definition of the time zone changes between one Run and the next, the Run might happen at the wrong time. For example, CRON_TZ=America/Los_Angeles 0 12 11 11 * means "noon in Los Angeles on November 11" (normally not in DST). If at some point the government makes any changes (for example, move the end of DST one week later, or stay on permanent DST year-round), the meaning of that specification changes. In that first year, the Run happens at the wrong time, because it was computed using the older definition.
How to stop a Temporal Cron Job
A Temporal Cron Job does not stop spawning Runs until it has been Terminated or until the Workflow Execution Timeout is reached.
A Cancellation Request affects only the current Run.
Use the Workflow Id in any requests to Cancel or Terminate.
Schedule
- Introduced in Temporal Server version 1.17.0
- Available in tctl v1.17 and Temporal CLI
- Available in Temporal Cloud via tctl and CLI
A Schedule contains instructions for starting a Workflow Execution at specific times.
Schedules provide a more flexible and user-friendly approach than Temporal Cron Jobs.
A Schedule has an identity and is independent of a Workflow Execution. This differs from a Temporal Cron Job, which relies on a cron schedule as a property of the Workflow Execution.
Action
The Action of a Schedule is where the Workflow Execution properties are established, such as Workflow Type, Task Queue, parameters, and timeouts.
Workflow Executions started by a Schedule have the following additional properties:
- The Action's timestamp is appended to the Workflow Id.
- The TemporalScheduledStartTime Search Attribute is added to the Workflow Execution. The value is the Action's timestamp.
- The TemporalScheduledById Search Attribute is added to the Workflow Execution. The value is the Schedule Id.
Spec
The Schedule Spec describes when the Action is taken. There are two kinds of Schedule Spec:
- A simple interval, like "every 30 minutes" (aligned to start at the Unix epoch, and optionally including a phase offset).
- A calendar-based expression, similar to the "cron expressions" supported by lots of software, including the older Temporal Cron feature.
These two kinds have multiple representations, depending on the interface or SDK you're using, but they all support the same features.
In tctl, for example, an interval is specified as a string like 45m to mean every 45 minutes, or 6h/5h to mean every 6 hours but at the start of the fifth hour within each period.
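The epoch alignment and phase offset of an interval spec can be sketched as simple integer arithmetic. This is a minimal model of the description above, not the Server's implementation; next_interval_action is a hypothetical name, times are Unix seconds, and the choice to return the first time strictly after "now" is an assumption.

```python
def next_interval_action(now_epoch: int, interval_s: int, phase_s: int = 0) -> int:
    """Return the first Action time strictly after now_epoch.

    Action times are phase_s, phase_s + interval_s, phase_s + 2 * interval_s, ...
    counted from the Unix epoch.
    """
    periods = (now_epoch - phase_s) // interval_s + 1
    return periods * interval_s + phase_s


HOUR = 3600
# "6h/5h": every 6 hours, offset to the fifth hour of each period.
first = next_interval_action(0, 6 * HOUR, 5 * HOUR)       # 05:00 UTC
second = next_interval_action(first, 6 * HOUR, 5 * HOUR)  # 11:00 UTC
```

Under this model, a 6h/5h interval fires at 05:00, 11:00, 17:00, and 23:00 UTC each day.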
In tctl, a calendar expression can be specified as either a traditional cron string with five (or six or seven) positional fields, or as JSON with named fields:
{
"year": "2022",
"month": "Jan,Apr,Jul,Oct",
"dayOfMonth": "1,15",
"hour": "11-14"
}
The following calendar JSON fields are available:
year
month
dayOfMonth
dayOfWeek
hour
minute
second
comment
Each field can contain a comma-separated list of ranges (or the * wildcard), and each range can include a slash followed by a skip value.
The hour, minute, and second fields default to 0 while the others default to *, so you can describe many useful specs with only a few fields.
For month, names of months may be used instead of integers (case-insensitive, abbreviations permitted).
For dayOfWeek, day-of-week names may be used.
The comment field is optional and can be used to include a free-form description of the intent of the calendar spec, useful for complicated specs.
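The field defaults described above can be illustrated by filling in a partial calendar spec. This is a sketch only; CALENDAR_DEFAULTS and with_defaults are hypothetical names, and the canonical representation that the Server actually produces may look different.

```python
# Defaults as described above: time-of-day fields are 0, the rest are wildcards.
CALENDAR_DEFAULTS = {
    "year": "*",
    "month": "*",
    "dayOfMonth": "*",
    "dayOfWeek": "*",
    "hour": "0",
    "minute": "0",
    "second": "0",
}


def with_defaults(spec: dict) -> dict:
    """Return a copy of the calendar spec with unspecified fields filled in."""
    return {**CALENDAR_DEFAULTS, **spec}


full = with_defaults({
    "year": "2022",
    "month": "Jan,Apr,Jul,Oct",
    "dayOfMonth": "1,15",
    "hour": "11-14",
})
```

For the JSON example above, the filled-in spec has minute and second set to 0 and dayOfWeek set to *, so it describes the top of each hour from 11 through 14 on the listed days.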
No matter which form you supply, calendar and interval specs are converted to canonical representations. What you see when you "describe" or "list" a Schedule might not look exactly like what you entered, but it has the same meaning.
Other Spec features:
Multiple intervals/calendar expressions: A Spec can have combinations of multiple intervals and/or calendar expressions to define a specific Schedule.
Time bounds: Provide an absolute start or end time (or both) with a Spec to ensure that no actions are taken before the start time or after the end time.
Exclusions: A Spec can contain exclusions in the form of zero or more calendar expressions. This can be used to express scheduling like "each Monday at noon except for holidays." You'll have to provide your own set of exclusions and include it in each schedule; there are no pre-defined sets. (This feature isn't currently exposed in tctl or the Temporal Web UI.)
Jitter: If given, a random offset between zero and the maximum jitter is added to each Action time (but bounded by the time until the next scheduled Action).
Time zones: By default, calendar-based expressions are interpreted in UTC. Temporal recommends using UTC to avoid various surprising properties of time zones. If you don't want to use UTC, you can provide the name of a time zone. The time zone definition is loaded on the Temporal Server Worker Service from either disk or the fallback embedded in the binary.
For more operational control, embed the contents of the time zone database file in the Schedule Spec itself. (Note: this isn't currently exposed in tctl or the web UI.)
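The bounded jitter described in the list above can be sketched as follows. This is a minimal model under the stated rule (a random offset between zero and the maximum jitter, capped by the time until the next scheduled Action); jittered_time and its parameters are hypothetical names, and times are plain seconds.

```python
import random
from typing import Optional


def jittered_time(action_time: float,
                  next_action_time: float,
                  max_jitter: float,
                  rng: Optional[random.Random] = None) -> float:
    """Add a random offset, never reaching past the next scheduled Action."""
    rng = rng or random.Random()
    bound = min(max_jitter, next_action_time - action_time)
    return action_time + rng.uniform(0.0, bound)
```

With Actions 10 seconds apart and a maximum jitter of 30 seconds, the offset is limited to 10 seconds.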
Pause
A Schedule can be Paused. When a Schedule is Paused, the Spec has no effect. However, you can still force manual actions by using the tctl schedule trigger command.
To assist communication among developers and operators, a “notes” field can be updated on pause or resume to store an explanation for the current state.
Backfill
A Schedule can be Backfilled.
When a Schedule is Backfilled, all the Actions that would have been taken over a specified time period are taken now (in parallel if the AllowAll Overlap Policy is used; sequentially if BufferAll is used).
You might use this to fill in runs from a time period when the Schedule was paused due to an external condition that's now resolved, or a period before the Schedule was created.
Limit number of Actions
A Schedule can be limited to a certain number of scheduled Actions (that is, Actions taken per the Spec, not Actions triggered manually). After that, it acts as if it were Paused.
Policies
A Schedule supports a set of Policies that enable customizing behavior.
Overlap Policy
The Overlap Policy controls what happens when it is time to start a Workflow Execution but a previously started Workflow Execution is still running. The following options are available:
- Skip: Default. Nothing happens; the Workflow Execution is not started.
- BufferOne: Starts the Workflow Execution as soon as the current one completes. The buffer is limited to one. If another Workflow Execution is supposed to start, but one is already in the buffer, only the one in the buffer eventually starts.
- BufferAll: Allows an unlimited number of Workflows to buffer. They are started sequentially.
- CancelOther: Cancels the running Workflow Execution, and then starts the new one after the old one completes cancellation.
- TerminateOther: Terminates the running Workflow Execution and starts the new one immediately.
- AllowAll: Starts any number of concurrent Workflow Executions. With this policy (and only this policy), more than one Workflow Execution, started by the Schedule, can run simultaneously.
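The options above can be summarized as a small decision function. This is a simplified model of the descriptions in this section, not the Schedule implementation; on_action_due, its parameters, and the returned strings are hypothetical.

```python
def on_action_due(policy: str, running: bool, buffered: int = 0) -> str:
    """Return what a Schedule does when an Action is due (simplified model)."""
    if not running or policy == "AllowAll":
        return "start"
    if policy == "Skip":
        return "skip"
    if policy == "BufferOne":
        # Only one start can wait; further ones are dropped.
        return "buffer" if buffered == 0 else "drop"
    if policy == "BufferAll":
        return "buffer"
    if policy == "CancelOther":
        return "cancel running, start after cancellation completes"
    if policy == "TerminateOther":
        return "terminate running, start immediately"
    raise ValueError(f"unknown Overlap Policy: {policy}")
```

Only AllowAll returns "start" while another Workflow Execution is still running, which matches the note that it is the only policy permitting concurrent runs.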
Catchup Window
The Temporal Cluster might be down or unavailable at the time when a Schedule should take an Action. When it comes back up, the Catchup Window controls which missed Actions should be taken at that point. The default is one minute, which means that the Schedule attempts to take any Actions that wouldn't be more than one minute late. An outage that lasts longer than the Catchup Window could lead to missed Actions. (But you can always Backfill.)
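The effect of the Catchup Window after an outage can be sketched as a filter over the missed Action times. This is an illustrative model only; actions_to_catch_up is a hypothetical name and times are plain seconds.

```python
def actions_to_catch_up(missed_action_times, now, catchup_window=60):
    """Keep only the missed Actions that are no more than catchup_window late."""
    return [t for t in missed_action_times if now - t <= catchup_window]
```

With the default of one minute, Actions missed at 0, 50, and 100 seconds and a recovery at 120 seconds leave only the Action from 100 seconds; the other two would need a Backfill.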
Pause-on-failure
If this policy is set, a Workflow Execution started by a Schedule that ends with a failure or timeout (but not Cancellation or Termination) causes the Schedule to automatically pause.
Note that with the AllowAll Overlap Policy, this pause might not apply to the next Workflow Execution, because the next Workflow Execution might have started before the failed one finished.
It applies only to Workflow Executions that were scheduled to start after the failed one finished.
Last completion result
A Workflow started by a Schedule can obtain the completion result from the most recent successful run. (How you do this depends on the SDK you're using.)
For overlap policies that don't allow overlap, “the most recent successful run” is straightforward to define.
For the AllowAll policy, it refers to the run that completed most recently, at the time that the run in question is started.
Consider the following overlapping runs:
time -------------------------------------------->
 A     |----------------------|
 B               |-------|
 C                          |---------------|
 D                                |--------------T
If D asks for the last completion result at time T, it gets the result of A. Not B, even though B started more recently, because A completed later. And not C, even though C completed after A, because the result for D is captured when D is started, not when it's queried.
Failures and timeouts do not affect the last completion result.
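The selection rule for overlapping runs can be written down as a short function. This is a sketch of the rule as described above, not SDK code; the Run tuple, the last_completion_result function, and the numeric times are hypothetical and were chosen only to reproduce the ordering of runs A through D.

```python
from typing import List, NamedTuple, Optional


class Run(NamedTuple):
    name: str
    start: int
    end: int
    succeeded: bool = True


def last_completion_result(runs: List[Run], starting_at: int) -> Optional[str]:
    """Return the successful run that completed most recently before starting_at."""
    done = [r for r in runs if r.succeeded and r.end <= starting_at]
    return max(done, key=lambda r: r.end).name if done else None


runs = [
    Run("A", start=6, end=29),
    Run("B", start=16, end=24),
    Run("C", start=27, end=43),
    Run("failed", start=20, end=31, succeeded=False),  # ignored by the rule
]
result_for_d = last_completion_result(runs, starting_at=33)
```

Because the result is captured when D starts (time 33 here), C is not yet complete and the failed run is ignored, so D sees the result of A.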
Last failure
A Workflow started by a Schedule can obtain the details of the failure of the most recent run that ended at the time when the Workflow in question was started. Unlike last completion result, a successful run does reset the last failure.
Limitations
The Scheduled Workflows feature is available in Temporal Server version 1.18.
Internally, a Schedule is implemented as a Workflow. If you're using Advanced Visibility (Elasticsearch), these Workflow Executions are hidden from normal views. If you're using Standard Visibility, they are visible, though there's no need to interact with them directly.
Native support for Schedules in language SDKs is coming soon.
For now, tctl and the web UI are the main interfaces to Schedules.
For advanced use, you can also use the gRPC API by getting a WorkflowServiceClient object from the SDK and calling methods such as CreateSchedule.
[1] The Cluster usually deduplicates Signals, but does not guarantee deduplication: During shard migration, two Signal Events (and therefore two deliveries to the Workflow Execution) can be recorded for a single Signal because the deduplication information is stored only in memory.