Blog #46: Feature Flag – The 'Lifebuoy' for Risky Releases

Analyzing the decision to implement a Feature Flag system on a financial project with a 12-person team and no room for error.

Truong Pham, Software Engineer
Published September 1, 2024

I once led a team of 12 developers building a new Core Banking system. The deadline was 3 weeks away, and we had to replace the entire "Quick Transfer" flow, an extremely sensitive feature. The system handled hundreds of thousands of transactions per day. The pressure wasn't just to finish on time, but to make sure that if anything went wrong on Production, we could roll back immediately instead of spending 30 minutes rebuilding and redeploying from scratch.

The Problem: "Holding Our Breath" When Hitting the Deploy Button

The biggest problem was the gap between the Staging and Production environments. Despite thorough testing, we couldn't predict how the system would behave under real load with millions of database records.

We needed a mechanism to:

  1. Deploy new code without making it immediately visible to all users.
  2. "Turn off" the new feature and fall back to the old one in just 1 second if a bug was detected.

Options Considered

We were weighing two options:

Option 1: Git Branching Strategy (Traditional)

  • Solution: Keep the old feature on the master branch and the new feature on the feature/new-transfer branch. If a bug occurs, revert the code and re-deploy.
  • Pros: No extra development cost for a flag management system.
  • Cons: Slow rollback (at least 15-20 minutes for CI/CD to run). Very high risk of merge conflicts when many people work on a long-lived branch.

Option 2: Feature Flag Implementation (Modern Architecture)

  • Solution: Deploy the old and new code together. Use a configuration value served by the server to decide which code path runs.
  • Pros: Instant rollback (just change the value on a Dashboard). Enables a Canary Release (opening the feature to only 1-5% of users first).
  • Cons: The code gets more cluttered because the new logic must be wrapped in if/else statements. Requires an additional Flag manager (a service like LaunchDarkly, or a self-built one).
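The Canary Release idea from Option 2 can be sketched as a deterministic percentage rollout. This is a minimal illustration, not the project's actual implementation and not the API of LaunchDarkly or any other SDK: the names `FlagConfig`, `bucketFor`, and `isEnabledFor` are hypothetical, and the FNV-1a hash is just one simple way to get a stable bucket per user.

```typescript
// Hypothetical sketch of a percentage rollout; all names here are assumptions.

/** FNV-1a 32-bit hash over UTF-16 code units: deterministic for a given string. */
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    // Multiply by the FNV prime in 32-bit space, then force an unsigned result.
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

/** Maps a (flag, user) pair to a stable bucket in the range 0..99. */
function bucketFor(flagKey: string, userId: string): number {
  return fnv1a(`${flagKey}:${userId}`) % 100;
}

interface FlagConfig {
  enabled: boolean;       // master switch: false sends everyone to the old code
  rolloutPercent: number; // 0..100, share of users who get the new code
}

/** True when this user should see the new code path. */
function isEnabledFor(flagKey: string, config: FlagConfig, userId: string): boolean {
  if (!config.enabled) return false;
  return bucketFor(flagKey, userId) < config.rolloutPercent;
}

// Usage: start with a 5% canary, then widen the rollout by raising the number.
const canary: FlagConfig = { enabled: true, rolloutPercent: 5 };
const seesNewFlow = isEnabledFor('new_transfer_flow', canary, 'user-42');
```

Because the bucket depends only on the flag key and the user ID, a user who is in the 5% group stays in the group when the rollout widens to 50%, and setting `enabled` to `false` sends everyone back to the old flow.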

Final Decision and Analysis

I chose Option 2. At that point, the safety of the financial system mattered more than the elegance of the code.

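// Pseudo-code of the Feature Flag implementation.
// useFeatureFlag, NewTransferFlow, and OldTransferFlow are defined elsewhere in the app.
const MoneyTransfer = () => {
  // Read the current flag value served by the flag manager.
  const { isEnabled } = useFeatureFlag('new_transfer_flow');

  // Flag on: render the new flow.
  if (isEnabled) {
    return <NewTransferFlow />;
  }

  // Flag off: fall back to the old flow.
  return <OldTransferFlow />;
};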

Impact on Performance: Fetching the Flag value from the server adds a small latency (a few ms). We reduced it by caching the Flag in localStorage or by embedding it directly in the initial page payload.
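The caching idea can be sketched roughly as follows. This is an illustration under assumptions, not the project's code: the `KeyValueStore` interface, the `feature_flags` cache key, and the function names are all hypothetical. The store is passed in as a parameter so that the browser's `localStorage` (which has `getItem` and `setItem`) can be used in the app while an in-memory object is used in tests.

```typescript
// Hypothetical sketch: resolve flags from the initial payload, else from a cache.

/** Minimal storage contract; the browser's localStorage satisfies it. */
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

type Flags = Record<string, boolean>;

const CACHE_KEY = 'feature_flags'; // assumed key name

/** Reads cached flags; a missing or corrupted cache yields an empty set. */
function readCachedFlags(store: KeyValueStore): Flags {
  try {
    const raw = store.getItem(CACHE_KEY);
    const parsed: unknown = raw ? JSON.parse(raw) : {};
    return typeof parsed === 'object' && parsed !== null ? (parsed as Flags) : {};
  } catch {
    return {};
  }
}

/**
 * Prefers flags embedded in the initial payload (fresh from the server) and
 * refreshes the cache with them; otherwise falls back to the cached copy.
 */
function resolveFlags(store: KeyValueStore, bootstrapped?: Flags): Flags {
  if (bootstrapped) {
    store.setItem(CACHE_KEY, JSON.stringify(bootstrapped));
    return bootstrapped;
  }
  return readCachedFlags(store);
}

/** Unknown or non-true flags are treated as off, so the old code path runs. */
function isFlagOn(flags: Flags, key: string): boolean {
  return flags[key] === true;
}
```

Defaulting to "off" when a flag is missing or the cache is unreadable keeps the failure mode on the old, known code path. One trade-off to keep in mind is that a cached value can lag behind the dashboard until the next fresh payload arrives.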

Impact on Maintainability: This is a real burden. The code accumulates dead branches. I had to set a rule: once the new feature has been stable for 1 week, the team must complete a "Cleanup" task to remove the old code and every if/else statement tied to that Flag.
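One way to make a cleanup rule like this harder to forget is to keep a small registry of flags and list the ones that are overdue. The sketch below is hypothetical: the `FlagRecord` shape, the `stableSince` field, and `flagsDueForCleanup` are assumed names, and the 7-day grace period simply mirrors the "1 week" rule above.

```typescript
// Hypothetical sketch: find flags whose cleanup task is overdue.

interface FlagRecord {
  key: string;
  owner: string;      // who is responsible for removing the flag
  stableSince?: Date; // set once the new feature is declared stable
}

const CLEANUP_GRACE_DAYS = 7; // mirrors the "1 week" rule
const MS_PER_DAY = 24 * 60 * 60 * 1000;

/** Returns flags that have been stable for at least the grace period. */
function flagsDueForCleanup(flags: FlagRecord[], now: Date): FlagRecord[] {
  const graceMs = CLEANUP_GRACE_DAYS * MS_PER_DAY;
  return flags.filter(
    (flag) =>
      flag.stableSince !== undefined &&
      now.getTime() - flag.stableSince.getTime() >= graceMs,
  );
}

// Usage: a script could print the overdue flags and their owners.
const registry: FlagRecord[] = [
  { key: 'new_transfer_flow', owner: 'payments-team', stableSince: new Date(Date.UTC(2024, 8, 1)) },
  { key: 'experimental_receipt', owner: 'payments-team' }, // still rolling out
];
const overdue = flagsDueForCleanup(registry, new Date(Date.UTC(2024, 8, 8)));
```

A check like this could run in CI or on a schedule and post the overdue list to the team, so that the "Cleanup" task does not depend on anyone's memory.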

Impact on Team: Juniors initially found it tedious to maintain code in 2 places. But after watching a logic bug get neutralized in 5 seconds with a single click that turned off the Flag, they understood the value of this "defensive" approach.

Self-Reflection: Was it Over-engineering?

Many would say: "Just write clean code and test it well. Why bother with Feature Flags?" But in reality, in an Enterprise environment, nothing is 100% certain.

If I were starting over, I would still choose Feature Flags. In fact, I would apply them earlier for every major feature. It's not just a technical tool; it's a "tranquilizer" for both leadership and technical teams during risky releases.


Notes on the day we learned how to control our own risks.

Series • Part 46 of 50

50 FRONTEND LESSONS – HARD-EARNED EXPERIENCES
