Feature flags solved a real problem for us: we could merge half-finished work to main behind a flag, keep deploying daily, and turn the feature on only when it was ready. Deploy and release became separate decisions. That is the upside, and it is large. The downside arrives quietly, three months later.
Two kinds of flag: do not confuse them
A release flag exists to hide unfinished work and has a lifespan of days or weeks. An operational flag - a kill switch for a flaky dependency, a throttle - is meant to live forever. The mistake we kept making was treating release flags as if they were permanent, leaving them in the code long after the feature was fully rolled out.
- Release flags: short-lived, scheduled for removal the moment rollout hits 100%
- Operational flags: long-lived, documented, owned by whoever runs the service
- Every release flag gets a removal ticket created at the same time it is added
- A flag with no owner and no expiry is debt with a countdown you cannot see
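The rules in this list can be checked mechanically. Below is a minimal sketch, assuming flags are declared in one registry that a CI job can audit. The `Flag` fields, the `audit` function, the flag names, and the ticket ID are all hypothetical, invented for illustration and not taken from any particular flag library.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class FlagKind(Enum):
    RELEASE = "release"          # hides unfinished work; short-lived
    OPERATIONAL = "operational"  # kill switch or throttle; long-lived


@dataclass(frozen=True)
class Flag:
    name: str
    kind: FlagKind
    owner: Optional[str] = None           # team or person responsible
    expires: Optional[date] = None        # expected for every release flag
    removal_ticket: Optional[str] = None  # hypothetical tracker ID


def audit(flags, today):
    """Return one human-readable problem per rule violation."""
    problems = []
    for f in flags:
        if f.owner is None:
            problems.append(f"{f.name}: no owner")
        if f.kind is FlagKind.RELEASE:
            if f.removal_ticket is None:
                problems.append(f"{f.name}: release flag without a removal ticket")
            if f.expires is None:
                problems.append(f"{f.name}: release flag without an expiry")
            elif f.expires < today:
                problems.append(f"{f.name}: expired on {f.expires.isoformat()}")
    return problems


# Illustrative registry; every entry here is made up.
FLAGS = [
    Flag("new_checkout", FlagKind.RELEASE, owner="payments",
         expires=date(2024, 3, 1), removal_ticket="TICKET-1"),
    Flag("search_kill_switch", FlagKind.OPERATIONAL, owner="search-oncall"),
    Flag("old_banner", FlagKind.RELEASE),
]

if __name__ == "__main__":
    for problem in audit(FLAGS, date.today()):
        print(problem)
```

Run as a build step, a non-empty result could fail the pipeline, so an expired or ownerless flag gets noticed straight away. Operational flags are deliberately exempt from the expiry check, since they are meant to stay.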
The debt is invisible until it bites
In principle, each flag doubles the number of code paths. Ten release flags left lying around mean your code can be in any of 2^10 = 1,024 states, and most of those states are never tested. We once found a bug that appeared only when two old flags were set in a combination no real account had - except one, after a migration. The fix was deleting both flags, which should have happened months earlier.
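To make the arithmetic concrete, here is a small sketch that enumerates every on/off combination for a set of boolean flags. The flag names are placeholders, and it assumes the flags are independent, which is the worst case; real code paths may overlap.

```python
from itertools import product


def flag_states(flag_names):
    """Yield every on/off combination as a dict of name -> bool."""
    for values in product([False, True], repeat=len(flag_names)):
        yield dict(zip(flag_names, values))


# Ten hypothetical leftover release flags.
names = [f"flag_{i}" for i in range(10)]
total = sum(1 for _ in flag_states(names))
print(total)  # 1024
```

A test suite that exercises only the "all off" and "all on" configurations covers 2 of those 1,024 states. Deleting a single flag halves the total.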
An old feature flag is not a safety net. It is an untested branch that nobody remembers writing.
Make removal cheaper than keeping
We now treat flag cleanup as part of finishing a feature, not a separate chore that never gets prioritized. When a release flag reaches full rollout, removing it - the flag, the config, the dead branch - is the last commit of that piece of work. It feels like overhead in the moment. It is the cheapest it will ever be, and it only gets more expensive the longer the conditional sits there pretending to matter.
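As a rough illustration of what that last commit looks like, here is a before-and-after sketch. The function names and the `new_header` flag are hypothetical, and the flag store is assumed to be a plain dict.

```python
# Before: the flag is at 100% rollout, but the conditional is still in place.
def render_header_before(user, flags):
    if flags.get("new_header", False):
        return f"Welcome back, {user}"   # the only path real traffic takes
    return f"Hello {user}"               # dead branch, no longer exercised


# After: the flag lookup and the dead branch are both gone.
def render_header_after(user):
    return f"Welcome back, {user}"
```

The same commit would also delete the `new_header` entry from wherever flag configuration lives, so that nothing can switch the old path back on.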