Database Separation - A Painful but Necessary 'Divorce'
Following post 3, when the technical debt from shared databases is due. How do you separate data without crashing the running system?
In post 3, I admitted we shared a database to meet a deadline. Two months later, that "debt" started demanding repayment with interest. This is how we executed this data divorce.
Why Separate?
When Service A and Service B share a database, you start encountering "laughable yet tearful" situations:
- A small change in A's table breaks B's code (because B is also querying that table).
- The DBA cannot optimize the database for A without affecting B's performance.
- Most importantly: Business boundaries are blurred. Devs just
JOINtables from different services because it's... convenient.
We knew we had to separate—immediately.
1. Phase 1: Logical Separation (Code Level Separation)
Don't rush into creating a new DB instance right away. The first step is to ban all cross-query actions at the code level.
- Service B is not allowed to
SELECTfrom Service A's table. - If B needs data from A, B must call A's API.
The pain: Performance drops noticeably because a simple JOIN is now 2-3 API calls. But this is the boundary that must be established to move forward.
2. Phase 2: Data Migration
This is the scariest part. How do you move millions of data rows to a new database without stopping the system? We used the "Double Write" strategy:
- Enable writing to both places: Code will write data to both the old DB and the new DB.
- Migrate old data: Run a script to gradually move old data from the old DB to the new one (Background job).
- Verify consistency: Compare data on both sides to ensure no discrepancies.
- Switch Read access: Update the code so the service now reads from the new DB.
- Stop writing to the old DB: Complete the "divorce."
3. The Biggest Challenge: Reporting
When data is scattered across 5 different databases, how do you generate consolidated reports?
Now you can't just JOIN to find out "Which user has bought the most orders?".
Options for small teams:
- Simple Data Warehouse: Push all data from services to a common database dedicated for reading (Read-only) using simple ETL tools or CDC (Change Data Capture).
- Avoid running reports directly from the Production database: This is a golden rule. Microservices help you separate business logic, so let the reporting layer live separately.
Lessons Learned
- Data has gravity: Moving data is always many times harder and riskier than moving code. Prepare your Rollback plans carefully.
- Don't separate too early, but don't separate too late: Separate when business boundaries are clear. If you separate too early without understanding the domain, you'll spend forever moving data back and forth between services.
- Think in APIs, not Tables: Get into the habit of asking: "What data does this service provide?" instead of "What table does this service use?".
Microservices aren't about how many services you have, but about how independent your data is.
Conclusion
Separating databases is the most painful step toward a true microservice system. It forces you to face the reality that: Freedom always comes with the price of complexity in data management.
Series • Part 8 of 20