Distributed Transactions - Why 'Immediate Consistency' is a Luxury

In a monolith, wrapping the work in a single SQL transaction (BEGIN ... COMMIT) is enough. In microservices, it's a constant mental battle to ensure data doesn't "float away."

The Nightmare of Inconsistency

Imagine an order flow:

Order Service: Creates an order.
Payment Service: Deducts money (calls a bank API).
Inventory Service: Deducts stock.

What if the third step (Inventory Service) fails because the item is out of stock, but the second step (Payment Service) has already deducted the customer's money? In a single database, you roll back the transaction and both changes disappear. In microservices, each step lives in a different database, so there is no shared transaction to roll back. Welcome to the world of Distributed Transactions.

1. Trying 2PC (Two-Phase Commit) - The First Mistake

I once tried to force databases to "wait" for each other. Result: The system was incredibly slow. In 2PC, every participant holds its locks from the prepare phase until the coordinator announces the outcome, so a network bottleneck in one service stalls all the other services participating in the transaction. In a Cloud environment, the network is never 100% stable. For most modern microservice architectures, 2PC is a poor fit.
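
To make the blocking problem concrete, here is a toy, in-memory sketch of the two phases. It is not a real 2PC implementation: the Participant class, the can_prepare flag, and the locked field are all hypothetical names invented for illustration, and there is no network, timeout, or crash recovery.

```python
from typing import List


class Participant:
    """Toy resource manager; can_prepare stands in for a real vote."""

    def __init__(self, name: str, can_prepare: bool = True) -> None:
        self.name = name
        self.can_prepare = can_prepare
        self.locked = False
        self.state = "idle"

    def prepare(self) -> bool:
        # Phase 1: take locks and vote. Locks stay held until phase 2.
        if not self.can_prepare:
            return False
        self.locked = True
        self.state = "prepared"
        return True

    def commit(self) -> None:
        self.state = "committed"
        self.locked = False

    def rollback(self) -> None:
        self.state = "aborted"
        self.locked = False


def two_phase_commit(participants: List[Participant]) -> bool:
    # Phase 1: collect every vote. While the slowest participant is still
    # answering, the ones that already voted yes keep their locks.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only if everyone voted yes, otherwise roll back all.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


services = [
    Participant("order"),
    Participant("payment"),
    Participant("inventory", can_prepare=False),  # out of stock
]
committed = two_phase_commit(services)
states = [p.state for p in services]
print(committed)  # False
print(states)     # ['aborted', 'aborted', 'aborted']
```

The window between prepare() and commit() is where the pain lives: in a real deployment that window includes network round trips, and every participant is locked for its full length.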

2. Saga Pattern: The Complex Savior

A Saga is a way to break a large transaction into multiple small local steps, each committed in its own service. If a step fails, you must run Compensating Transactions (undo actions) for the steps that already succeeded, so the system ends up back in a consistent state. For example: If deducting stock fails, call the Payment Service's API to refund the money.

But the hard part is:

Code becomes extremely messy because "undo" logic is everywhere.
You have to handle cases like: What if the "refund" API also fails? (You have to retry, or log it for manual intervention).
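
The ideas above can be sketched as a small orchestrated saga. This is only a minimal illustration under stated assumptions: Step, retry, run_saga, and manual_queue are hypothetical names, the "services" are in-process lambdas rather than real API calls, and a production version would also persist the saga's progress.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    name: str
    action: Callable[[], None]      # forward operation
    compensate: Callable[[], None]  # undo for a completed action


def retry(fn: Callable[[], None], attempts: int = 3, delay: float = 0.0) -> bool:
    """Run fn up to `attempts` times; return True on the first success."""
    for _ in range(attempts):
        try:
            fn()
            return True
        except Exception:
            time.sleep(delay)
    return False


def run_saga(steps: List[Step], manual_queue: List[str]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    done: List[Step] = []
    for step in steps:
        try:
            step.action()
            done.append(step)
        except Exception:
            for finished in reversed(done):
                if not retry(finished.compensate):
                    # The undo itself keeps failing: log it for a human.
                    manual_queue.append(finished.name)
            return False
    return True


log: List[str] = []


def fail_stock() -> None:
    raise RuntimeError("out of stock")


steps = [
    Step("order", lambda: log.append("order created"),
         lambda: log.append("order cancelled")),
    Step("payment", lambda: log.append("money deducted"),
         lambda: log.append("money refunded")),
    Step("inventory", fail_stock, lambda: log.append("stock restored")),
]
manual: List[str] = []
ok = run_saga(steps, manual)
print(ok)   # False
print(log)  # ['order created', 'money deducted', 'money refunded', 'order cancelled']
```

Note that compensations run in reverse order and only for steps that actually completed, and that a refund which still fails after the retries ends up in manual_queue instead of being silently dropped.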

3. Accepting Eventual Consistency

This is the biggest mindset shift. Instead of trying to keep all data correct at every single moment (Strong Consistency), I accept that the data will become correct "after a period of time."

Reality: When a customer clicks "Order," the system returns "Processing." A background worker goes through each step. If any step fails, it self-corrects or notifies an admin. Customers are actually very used to this (just look at how Grab or Shopee handle orders).
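
A rough sketch of that flow, assuming an in-memory queue in place of a real message broker. The names place_order, worker_tick, process, and admin_alerts are made up for this example.

```python
from collections import deque
from typing import Callable, Deque, Dict, List

orders: Dict[str, str] = {}    # order_id -> status
pending: Deque[str] = deque()  # work the background worker has not done yet
admin_alerts: List[str] = []


def place_order(order_id: str) -> str:
    """Request handler: record the order and answer immediately."""
    orders[order_id] = "PROCESSING"
    pending.append(order_id)
    return orders[order_id]


def worker_tick(process: Callable[[str], None]) -> None:
    """One pass of the background worker over the queued orders."""
    while pending:
        order_id = pending.popleft()
        try:
            process(order_id)  # payment, inventory, ... (hypothetical)
            orders[order_id] = "CONFIRMED"
        except Exception as exc:
            orders[order_id] = "FAILED"
            admin_alerts.append(f"{order_id}: {exc}")


def process(order_id: str) -> None:
    # Stand-in for the real steps; the second order runs out of stock.
    if order_id == "o-2":
        raise RuntimeError("out of stock")


first_reply = place_order("o-1")
place_order("o-2")
print(first_reply)  # PROCESSING
worker_tick(process)
print(orders)       # {'o-1': 'CONFIRMED', 'o-2': 'FAILED'}
```

The customer sees "PROCESSING" right away; the final status only appears once the worker has run, which is exactly the "after a period of time" part.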

Lessons Learned for Small Teams

Avoid distributed transactions at all costs: If two data entities always need to go together, they should probably live in the same service instead of being split.
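Outbox Pattern: Save events to an outbox table in the same database, and in the same local transaction, as the business data. A separate process then reads this table and pushes the events. This ensures: if the data is saved successfully, the event will eventually be sent (possibly more than once, which is why the next point matters).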
Idempotency (more in the next post): Services must be able to handle receiving the same request multiple times without causing data discrepancies.
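
The last two points fit together, so here is one combined sketch using Python's built-in sqlite3 module. Everything about the schema is an assumption for illustration: the table names (orders, outbox, processed, stock), the function names, and the two in-memory databases standing in for the Order and Inventory services. The relay is called inline here; in practice it would be a separate process or a polling job.

```python
import json
import sqlite3

order_db = sqlite3.connect(":memory:")      # owned by the Order Service
inventory_db = sqlite3.connect(":memory:")  # owned by the Inventory Service

order_db.executescript("""
    CREATE TABLE orders (id TEXT PRIMARY KEY, item TEXT NOT NULL);
    CREATE TABLE outbox (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        payload TEXT NOT NULL,
        sent INTEGER NOT NULL DEFAULT 0
    );
""")
inventory_db.executescript("""
    CREATE TABLE processed (event_id INTEGER PRIMARY KEY);
    CREATE TABLE stock (item TEXT PRIMARY KEY, qty INTEGER NOT NULL);
    INSERT INTO stock VALUES ('book', 10);
""")


def create_order(order_id: str, item: str) -> None:
    """Write the order and its event in ONE local transaction."""
    with order_db:  # commits on success, rolls back on exception
        order_db.execute("INSERT INTO orders VALUES (?, ?)", (order_id, item))
        order_db.execute(
            "INSERT INTO outbox (payload) VALUES (?)",
            (json.dumps({"order_id": order_id, "item": item}),),
        )


def relay(publish) -> None:
    """Push unsent outbox rows, marking each as sent after publishing."""
    rows = order_db.execute(
        "SELECT id, payload FROM outbox WHERE sent = 0 ORDER BY id"
    ).fetchall()
    for event_id, payload in rows:
        # A crash between publish and UPDATE means the event is sent again
        # on the next run: delivery is at-least-once, not exactly-once.
        publish(event_id, json.loads(payload))
        with order_db:
            order_db.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (event_id,))


def handle_order_created(event_id: int, event: dict) -> None:
    """Idempotent consumer: a repeated event_id changes nothing."""
    with inventory_db:
        try:
            inventory_db.execute("INSERT INTO processed VALUES (?)", (event_id,))
        except sqlite3.IntegrityError:
            return  # duplicate delivery, already handled
        inventory_db.execute(
            "UPDATE stock SET qty = qty - 1 WHERE item = ?", (event["item"],)
        )


create_order("o-1", "book")
relay(handle_order_created)
# Simulate the same event being delivered a second time.
handle_order_created(1, {"order_id": "o-1", "item": "book"})
qty = inventory_db.execute("SELECT qty FROM stock WHERE item = 'book'").fetchone()[0]
print(qty)  # 9
```

Recording the event id and changing the stock in the same local transaction is what makes the duplicate harmless: the stock drops by one, not two.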

Don't try to build a perfect system that never fails. Build a system that can recover itself when errors occur.

Conclusion

Consistency in Microservices is a spectrum, not black and white. Choosing the level of consistency that fits the actual business will help your team survive the turbulent early stages of development.
