Software architecture would be easy if the world only had happy paths, but real-world applications are complex. Modern architectures have many moving pieces, paths, and services that interact with each other. They must also handle different conditions and errors.
One effective way to manage these complexities is through orchestration.
What is Orchestration?
Orchestration is a pattern used to coordinate the execution of distributed services to complete a business process.
It is typically used when multiple independent services must work together to complete a task. Some call these workflows orchestrated transactions.
Why Do We Need Orchestration?
In modern software systems, especially those built on microservices architectures, workflows often involve multiple services that must work together.
Each service may have its own data and business logic, and to complete a workflow, the system needs to coordinate interactions among these services.
Without a central mechanism to manage these interactions, the complexity and potential for errors increase significantly.
Orchestration addresses these challenges by providing a centralized control mechanism that:
- Manages the workflow
- Calls each service in the correct order
- Handles errors appropriately
This centralized control makes it easier to manage complex workflows, handle errors, and ensure consistency across services.
Key Characteristics of Orchestration
Centralized Control
Orchestration provides a single point of control for all workflow decisions.
This central entity is the orchestrator. It manages the sequence of service invocations and controls the logic for error recovery, retries, and state management.
Service Interaction
The orchestrator interacts with many services, calling them in a specific order and handling the data flow between them.
This interaction can be synchronous (waiting for a response) or asynchronous (continuing without waiting for a response). More on this below.
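To make the difference concrete, here is a minimal sketch using Python's asyncio. The `call_service` helper and the service names are made up for illustration, and `asyncio.sleep` stands in for a real network call.

```python
import asyncio
from typing import List


async def call_service(name: str, delay: float) -> str:
    """Hypothetical service call; the sleep simulates network latency."""
    await asyncio.sleep(delay)
    return f"{name}:done"


async def orchestrate() -> List[str]:
    events: List[str] = []

    # Synchronous style: the orchestrator waits for the response
    # before moving to the next step.
    events.append(await call_service("orders", 0.01))

    # Asynchronous style: the orchestrator starts the call and keeps
    # working; it collects the response later.
    shipping = asyncio.create_task(call_service("shipping", 0.05))
    events.append("orchestrator:continued")
    events.append(await shipping)

    return events


if __name__ == "__main__":
    print(asyncio.run(orchestrate()))
```

The orchestrator records its own progress before the shipping response arrives, which is the whole point of the asynchronous call.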
State Management
The orchestrator maintains the state of the workflow.
It tracks which steps have been completed, their current status, and what needs to be done next.
This state management is crucial for handling complex workflows and ensuring consistency.
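As a rough illustration, the state an orchestrator keeps for one workflow could look like the sketch below. The `WorkflowState` and `StepStatus` names and the step names are assumptions for this example; a real orchestrator would usually persist this state in a database rather than in memory.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class StepStatus(Enum):
    PENDING = "pending"
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class WorkflowState:
    """In-memory record of one workflow instance (hypothetical structure)."""

    workflow_id: str
    steps: List[str]
    status: Dict[str, StepStatus] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Every step starts as pending.
        for step in self.steps:
            self.status.setdefault(step, StepStatus.PENDING)

    def mark(self, step: str, status: StepStatus) -> None:
        """Record the outcome of a step."""
        if step not in self.status:
            raise KeyError(f"unknown step: {step}")
        self.status[step] = status

    def next_step(self) -> Optional[str]:
        """Return the first step still pending, or None if none remain."""
        for step in self.steps:
            if self.status[step] is StepStatus.PENDING:
                return step
        return None
```

With this in place, "what has been completed" and "what comes next" are both simple lookups, for example `state.next_step()` after each `state.mark(...)`.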
Extensibility
You can extend orchestrated workflows by adding new services or modifying existing ones, then updating the central orchestrator to accommodate the changes without significant disruption to the system.
Error Handling and Compensation
Orchestration allows for sophisticated error handling and compensation logic.
If something goes wrong, the orchestrator can:
- Execute retries
- Use alternative workflows to handle the failure
- Roll back operations
This is one of the areas where orchestration shines in system design.
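The retry option can be sketched as a small helper with exponential backoff. This is one possible approach, not a prescribed one: `TransientError`, `call_with_retries`, and the default values are all assumptions made for the example.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def call_with_retries(
    operation: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation`, retrying on TransientError with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                # Out of attempts: let the caller decide what to do next,
                # for example run an alternative workflow or roll back.
                raise
            # Wait base_delay, then 2x, then 4x, and so on.
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable only so the helper can be exercised without real waiting; errors other than `TransientError` propagate immediately.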
Let's explore a real-world example.
Handling Paths with Orchestration: A Practical Example
In this example, we explore how orchestration can manage the workflow of an e-commerce system involving many services.
For this example, we focus on three services:
- Orders
- Shipping
- Notifications
The Happy Path
- Client Places an Order: The process begins when the client sends a "Place Order" request to the Order Orchestrator.
- Order Creation: The Order Orchestrator receives the request and forwards it to the Orders Service to create the order. The Orders Service processes the request and creates a new order record in its database.
- Shipping Request: Once the Orders Service confirms the order creation, the orchestrator asynchronously calls the Shipping Service to start the shipping process. This is done because shipping the order takes time and doesn't need to block the workflow.
- Order Shipped: The Shipping Service processes the shipping request. After the order is shipped, it notifies the orchestrator about the successful shipment.
- Order Status Update: Upon receiving the shipping confirmation, the orchestrator updates the order's status to "Shipped" in the Orders Service.
- Notification: Finally, the orchestrator calls the Notifications Service to send a success email informing the client that their order has been shipped.
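The steps above can be sketched with in-memory stand-ins for the three services. All class and method names here are hypothetical, and the asynchronous shipping call is modeled as a request that returns immediately plus a later callback (`on_order_shipped`).

```python
from typing import Dict, List, Tuple


class OrdersService:
    def __init__(self) -> None:
        self.orders: Dict[str, dict] = {}

    def create_order(self, order_id: str, items: List[str]) -> None:
        self.orders[order_id] = {"items": items, "status": "Created"}

    def update_status(self, order_id: str, status: str) -> None:
        self.orders[order_id]["status"] = status


class ShippingService:
    def __init__(self) -> None:
        self.requests: List[str] = []

    def request_shipment(self, order_id: str) -> None:
        # Only records the request; the outcome is reported later.
        self.requests.append(order_id)


class NotificationsService:
    def __init__(self) -> None:
        self.sent: List[Tuple[str, str]] = []

    def send(self, order_id: str, message: str) -> None:
        self.sent.append((order_id, message))


class OrderOrchestrator:
    def __init__(self, orders, shipping, notifications) -> None:
        self.orders = orders
        self.shipping = shipping
        self.notifications = notifications

    def place_order(self, order_id: str, items: List[str]) -> None:
        # Create the order, then request shipping without blocking.
        self.orders.create_order(order_id, items)
        self.shipping.request_shipment(order_id)

    def on_order_shipped(self, order_id: str) -> None:
        # Invoked when the Shipping Service reports a successful shipment.
        self.orders.update_status(order_id, "Shipped")
        self.notifications.send(order_id, "Your order has been shipped.")
```

Note that the services never call each other; every transition goes through the orchestrator.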
The Not-So-Happy Path (Alternative Workflow)
- Client Places an Order: The client sends a "Place Order" request to the Order Orchestrator.
- Order Creation: The Order Orchestrator forwards the request to the Orders Service, which creates the order.
- Shipping Request: The orchestrator asynchronously calls the Shipping Service to start the shipping process.
- Backorder Notification: The Shipping Service finds that there is not enough inventory to fulfill the order and notifies the orchestrator about the backorder status.
- Order Status Update: The orchestrator updates the order's status to "BackOrdered" in the Orders Service to reflect the issue.
- Notification: The orchestrator calls the Notifications Service to send a backorder notification to the client, informing them about the issue with their order.
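One way to express this branch is to let the orchestrator map each shipping outcome to the next status and message. The sketch below is illustrative only: `ShippingResult`, `NEXT_ACTIONS`, and the message texts are assumptions, while the "Shipped" and "BackOrdered" statuses come from the two paths described above.

```python
from enum import Enum
from typing import Dict, List, Tuple


class ShippingResult(Enum):
    SHIPPED = "shipped"
    BACKORDERED = "backordered"


# Each shipping outcome maps to (order status, client message).
NEXT_ACTIONS: Dict[ShippingResult, Tuple[str, str]] = {
    ShippingResult.SHIPPED: ("Shipped", "Your order has been shipped."),
    ShippingResult.BACKORDERED: ("BackOrdered", "Your order is on backorder."),
}


def handle_shipping_result(
    order: dict, result: ShippingResult, outbox: List[Tuple[str, str]]
) -> None:
    """Update the order and queue a notification based on the outcome."""
    status, message = NEXT_ACTIONS[result]
    order["status"] = status                # stands in for the Orders Service call
    outbox.append((order["id"], message))   # stands in for the Notifications Service call
```

Because the decision lives in one table inside the orchestrator, adding another outcome means adding one entry rather than touching the services.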
The Failed Path (Compensating Transactions)
When an error occurs during a transaction, it may be necessary to undo previously completed steps to maintain system consistency.
We call these compensating transactions. For example, in a system with no support for backorders, the flow looks like this:
- Client Places an Order: The client sends a "Place Order" request to the Order Orchestrator.
- Order Creation: The orchestrator calls the Orders Service to create the order.
- Shipping Request: The orchestrator calls the Shipping Service to process the shipment.
- Inventory Issue: The Shipping Service detects insufficient inventory to fulfill the order and notifies the orchestrator.
- Compensating Transactions Initiated:
  - Cancel Order: The orchestrator calls the Orders Service to cancel the order.
  - Revert Payment: If payment was processed, the orchestrator might need to send a message to the Payment Service to refund the amount to the client.
- Order Status Update: The orchestrator updates the order's status to "Canceled" in the Orders Service.
- Notification: The orchestrator sends a cancellation notification via the Notifications Service.
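A common way to implement this is to pair every step with its undo action and, on failure, run the undo actions of the completed steps in reverse order. The following is a minimal sketch of that idea, not a full implementation: `run_with_compensation` and the step names are hypothetical, and a production orchestrator would also persist progress and handle failures inside the compensations themselves.

```python
from typing import Callable, List, Tuple

# A step is (name, action, compensation); the compensation undoes the action.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_with_compensation(steps: List[Step], log: List[str]) -> bool:
    """Run steps in order; on failure, undo completed steps in reverse."""
    completed: List[Tuple[str, Callable[[], None]]] = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception:
            # Broad catch keeps the sketch short; real code would be
            # more selective about which errors trigger compensation.
            log.append(f"failed:{name}")
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"compensated:{done_name}")
            return False
        log.append(f"done:{name}")
        completed.append((name, compensate))
    return True
```

For the flow above, the steps could be `create_order` (undone by canceling the order), `process_payment` (undone by a refund), and `ship_order`, which is the one that fails on insufficient inventory.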
Not All The Operations Are Synchronous
As mentioned, this interaction can be asynchronous.
While the concept of orchestration screams a centralized controller and steps, it's important to note that not all orchestrators operate synchronously.
Event-driven orchestration is an architecture you can use to implement Microservices Orchestration in a totally asynchronous way.
In both the direct-call and the event-driven style, the orchestrator is still responsible for workflow logic and failure handling. The only thing that changes is how the orchestrator communicates with downstream microservices.
Event-Driven Workflows Offer Several Benefits:
- You can use the same monitoring and scaling functionality for the orchestrator as other event-driven microservices.
- Events can be consumed by other services, including those outside the orchestration.
- The orchestrator and the dependent services are isolated from each other's intermittent failures.
- Failures get a built-in retry mechanism, as events can remain in the broker for retrying.
Event-driven could be your winning card for scalability, performance, and resilience.
The orchestrator can continue processing other tasks while waiting for service responses.
This approach is particularly valuable in distributed systems where service latencies can vary, and workloads can be unpredictable.
The orchestrator is responsible for materializing and tracking the status of each event.
For example, event 123 (representing an order, say) may have been processed successfully, while events 124 and 125 are still at different workflow stages.
The orchestrator can make decisions based on these results and select the next step according to the workflow logic.
Once the events are processed, the orchestrator can also take the necessary data from services A, B, and C's results and compose the final output.
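Here is a small sketch of that bookkeeping, with a `deque` standing in for a real message broker. The `EventDrivenOrchestrator` class, the `STEPS` list, and the result strings are assumptions for illustration; the event IDs and services A, B, and C follow the description above.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple

STEPS = ["A", "B", "C"]  # services to call, in workflow order


class EventDrivenOrchestrator:
    def __init__(self, broker: Deque[Tuple[str, int]]) -> None:
        self.broker = broker  # stand-in for a real message broker
        # Materialized state: event ID -> {service: result}
        self.results: Dict[int, Dict[str, str]] = {}

    def start(self, event_id: int) -> None:
        self.results[event_id] = {}
        self._send_next(event_id)

    def on_result(self, event_id: int, service: str, result: str) -> None:
        """Called when a service publishes its result for an event."""
        self.results[event_id][service] = result
        self._send_next(event_id)

    def _send_next(self, event_id: int) -> None:
        # Publish a command for the first service without a result yet.
        for service in STEPS:
            if service not in self.results[event_id]:
                self.broker.append((service, event_id))
                return

    def status(self, event_id: int) -> str:
        # Results arrive in STEPS order, so the count identifies the next step.
        done = len(self.results[event_id])
        return "completed" if done == len(STEPS) else f"waiting for {STEPS[done]}"

    def output(self, event_id: int) -> List[str]:
        """Compose the final output from every service's result."""
        return [self.results[event_id][service] for service in STEPS]
```

After `start(123)` and three `on_result` calls, `status(123)` reports `"completed"`, while an event that has only been started reports `"waiting for A"`.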
Assuming the operations in services A, B, and C are independent of one another, you can change the workflow by changing the order in which you send the events.
Also, keep in mind that you will often have the opportunity to combine direct calls and event-driven communication within the same orchestrator, and that combination is frequently the sweet spot.
Benefits and Drawbacks of Orchestration
Like everything in systems design, orchestration comes with its own trade-offs. Let's talk about them:
Orchestration Benefits
The orchestrator acts as a centralized entity where you implement all the behaviors, paths, and error-handling logic.
This centralization simplifies the management of workflows by providing a single place to manage all interactions and decisions.
The orchestrator can incorporate retry logic to take care of temporary service outages. If a service is momentarily unavailable, the orchestrator can retry the request after a specified interval, improving the system's resilience and reliability.
This retry mechanism allows the system to degrade gracefully rather than fail outright, enhancing the user experience during transient failures.
The orchestrator maintains the state of the workflow, making it easy to query the current status of an ongoing process. This state management is critical for:
- Monitoring
- Debugging
- Understanding the workflow's progress
Orchestration Drawbacks
Since all communication must go through the orchestrator, it can become a bottleneck, especially in high-throughput scenarios.
The added orchestration layer introduces some performance overhead due to the additional communication and processing required. This can impact the system's latency and throughput.
The orchestrator is a critical component of the workflow. If it fails, it will disrupt the entire workflow, making it a single point of failure.
However, you can address this with redundancy strategies, such as deploying multiple instances of the orchestrator and using load balancing to distribute the load.
Key Takeaways
- Orchestration manages workflows from one place, making it easier to handle different paths and errors.
- Orchestrators can retry tasks if a service temporarily fails, improving system reliability.
- Orchestrators keep track of the workflow's status, making it easy to check progress and debug issues.
- Asynchronous orchestration can handle multiple tasks at once without waiting for responses, making the system more scalable. Tasks can run in parallel, speeding up the overall process.
- The orchestrator can slow down the system if it becomes overloaded.
- If the orchestrator fails, it can stop the whole workflow. Redundancy is necessary to avoid this.
In simple terms, orchestration puts one component in charge: the orchestrator, which is aware of the entire workflow.
If there is any problem during the process, the orchestrator will know about it and either take action to handle it or report the failure.
Now that you have the tool, how have you handled complex workflows in your distributed applications?