What are Feature Flags?

Feature flags are conditional toggles in your codebase that let you turn features on or off at runtime without redeploying. They decouple two things that used to be inseparable: shipping code to production and releasing functionality to users. Once you can deploy a feature in the off state and flip it live later, the entire risk profile of a release changes.

Below: what feature flags actually are, the five canonical patterns you'll use them for, the implementation tradeoffs that bite teams in year two, and how they fit into a modern deployment workflow.

The core idea: separate deploy from release

In a traditional release, a deploy is the release — the moment your code reaches production, users see the new behaviour. That coupling forces a choice: ship slow and safe with long-lived branches, or ship fast and accept that any bad commit is immediately user-facing.

Feature flags break that coupling. Code can ship to production every day, fully merged to main, while still hidden behind an if (flags.newCheckout) { ... } check. The deploy is just infrastructure plumbing. The release is a separate, deliberate act — flipping a config value, often without touching the codebase at all.

This is what makes feature flags a foundational pattern for trunk-based development and short-lived branches. When unfinished work can be merged safely behind a flag, you stop needing weeks-long feature branches that drift out of sync with main.

A minimal implementation

The simplest possible feature flag is a boolean read from configuration:

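<?php
// getFlag() stands in for your flag evaluator: it may read flag state from
// the environment, a database, or a flag service. $userId is the context.
$isNewCheckoutEnabled = getFlag('new_checkout', $userId);

if ($isNewCheckoutEnabled) {
    renderNewCheckout();    // new code path
} else {
    renderLegacyCheckout(); // old path stays deployed for instant rollback
}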

Three things to notice:

  1. The flag is read at request time, not at boot. That lets you flip behaviour without restarting the application.
  2. The check accepts context (here, $userId). The same flag can be true for one user and false for another — that's how you do percentage rollouts and targeted releases.
  3. Both code paths are present in the deployed code. The old behaviour isn't deleted; it's still selectable. That's what makes the rollback instant.

In a real codebase the getFlag() call usually delegates to a centralised evaluator — either an in-house library backed by a config table, or a managed service like LaunchDarkly, ConfigCat, Split, Flagsmith, or the open-source Unleash. We'll cover the build-vs-buy tradeoff later.
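To make the "in-house library backed by a config table" option concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a reference design: `FLAG_TABLE`, its `enabled` and `allowed_users` fields, and `get_flag()` are hypothetical names, and the dict stands in for whatever table or config store you actually use.

```python
from typing import Optional

# Hypothetical stand-in for a config table; all names are illustrative.
FLAG_TABLE = {
    "new_checkout": {"enabled": True, "allowed_users": {"user-42"}},
    "dark_mode": {"enabled": False, "allowed_users": set()},
}


def get_flag(name: str, user_id: Optional[str] = None) -> bool:
    """Evaluate a flag at call time; unknown or disabled flags are off."""
    rule = FLAG_TABLE.get(name)
    if rule is None or not rule["enabled"]:
        return False
    allowed = rule["allowed_users"]
    # An empty allow-list means "on for everyone".
    return not allowed or user_id in allowed
```

Because the table is read on every call, changing `FLAG_TABLE["dark_mode"]["enabled"]` takes effect on the next evaluation without a restart. Defaulting unknown flags to off is a design choice that keeps a typo in a flag name from accidentally releasing something.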

Five canonical use cases

Feature flags get used for a lot of things, but in practice almost every flag falls into one of five categories. Naming them matters — the same boolean variable means very different things depending on which job it's doing, and conflating them creates flag debt.

1. Release toggles (the off switch for unfinished work)

The most common pattern: a flag that hides in-flight work so it can be merged to main without being visible to users. The flag exists for the duration of the development cycle, gets turned on when the feature ships, and is then deleted from the code within a sprint or two. These are short-lived by design.

2. Kill switches (operational toggles)

A flag wrapped around a feature you suspect might break — payment retries, a third-party integration, an expensive query path. When something goes wrong in production at 2am, you flip the kill switch instead of rolling back a deploy. Kill switches are usually long-lived; they exist for the lifetime of the risky subsystem.
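As a rough sketch of a kill switch around payment retries, assuming a hypothetical in-memory `KILL_SWITCHES` store and a `charge` callable that raises `RuntimeError` on failure (both are assumptions made for illustration):

```python
# Hypothetical switch store; in practice this would be your flag system.
KILL_SWITCHES = {"payment_retries": False}  # True means "retries are killed"


def is_killed(name: str) -> bool:
    """Unknown switches are treated as not killed."""
    return KILL_SWITCHES.get(name, False)


def charge_with_retries(charge, max_attempts: int = 3):
    """Call charge(), retrying on failure unless the kill switch is on."""
    # With the switch flipped, fall back to a single attempt.
    attempts = 1 if is_killed("payment_retries") else max(1, max_attempts)
    last_error = None
    for _ in range(attempts):
        try:
            return charge()
        except RuntimeError as exc:
            last_error = exc
    raise last_error
```

The switch doesn't remove the feature. It degrades it to the safest known behaviour (one attempt, no retries), which is usually what you want at 2am.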

3. Canary releases and gradual rollouts

Instead of a binary on/off, the flag evaluates to true for a percentage of traffic — 1%, then 5%, then 25%, then 100%. Each step gives you real production data before the blast radius widens. This is the same idea as canary deployments at the infrastructure level, but pushed down into the application layer so it can be controlled per-feature instead of per-deploy.
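One common way to implement a percentage rollout is to hash the flag name and user ID into a stable bucket between 0 and 99. The sketch below assumes that approach; `bucket()` and `in_rollout()` are hypothetical helper names, and SHA-256 is just one reasonable choice of hash.

```python
import hashlib


def bucket(flag_name: str, user_id: str) -> int:
    """Map (flag, user) to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def in_rollout(flag_name: str, user_id: str, percentage: int) -> bool:
    """True when the user's bucket falls below the rollout percentage."""
    return bucket(flag_name, user_id) < percentage
```

Two properties matter here. The same user always lands in the same bucket, so nobody flips between old and new behaviour across requests. And raising the percentage only adds users: everyone who was in at 5% is still in at 25%. Including the flag name in the hash keeps the same users from always being the first to get every new feature.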

4. A/B experiments

The flag becomes part of an experiment: bucket users into variants, measure a metric (conversion, time-on-page, error rate), and decide which variant wins. Experiments are time-bounded — once the result is in, the losing variant gets deleted.
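A minimal sketch of the bucketing and measurement steps, assuming two variants and a simple conversion metric. `VARIANTS`, `assign_variant()`, and `conversion_rates()` are hypothetical names, and a real experiment would also need a significance test before declaring a winner.

```python
import hashlib
from collections import Counter

VARIANTS = ("control", "treatment")  # illustrative variant names


def assign_variant(experiment: str, user_id: str) -> str:
    """Deterministically assign a user to one variant of an experiment."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    return VARIANTS[int.from_bytes(digest[:8], "big") % len(VARIANTS)]


def conversion_rates(experiment: str, events) -> dict:
    """events: iterable of (user_id, converted) pairs, one per exposed user."""
    exposed, converted = Counter(), Counter()
    for user_id, did_convert in events:
        variant = assign_variant(experiment, user_id)
        exposed[variant] += 1
        converted[variant] += int(did_convert)
    return {variant: converted[variant] / exposed[variant] for variant in exposed}
```

Deterministic assignment means a returning user always sees the same variant, which keeps the measured metric attributable to one experience.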

5. Entitlement / permission flags

The flag encodes who can use a feature: paid plans, beta cohorts, internal staff, specific regions. These are long-lived business logic, not deployment artefacts. They tend to outlive everything else and are often best moved out of your flag system into an explicit entitlements service once the count gets high.
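For contrast with the short-lived flags above, an entitlement check might look like the following sketch. The plan names, feature names, and `is_entitled()` are all invented for illustration; the point is that this is plain business logic keyed on who the customer is, not on rollout state.

```python
# Illustrative plan-to-feature mapping; names are hypothetical.
PLAN_FEATURES = {
    "free": frozenset(),
    "pro": frozenset({"export_csv"}),
    "enterprise": frozenset({"export_csv", "sso"}),
}
STAFF_FEATURES = frozenset({"export_csv", "sso", "beta_dashboard"})


def is_entitled(feature: str, plan: str, is_staff: bool = False) -> bool:
    """Decide access from the customer's plan, with a staff override."""
    if is_staff and feature in STAFF_FEATURES:
        return True
    return feature in PLAN_FEATURES.get(plan, frozenset())
```

Nothing here ever gets "cleaned up after launch", which is why mixing these with release toggles makes the stale ones harder to spot.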

The trap: treating all five the same. Release toggles and entitlement flags have opposite lifecycles, opposite ownership, and opposite testing requirements. Mixing them in one bucket is how teams end up with thousands of stale flags nobody can safely delete.


The implementation tradeoffs nobody warns you about

The first 10 flags are easy. The problems start later.

Flag debt

Every release toggle is meant to be temporary. In practice, the feature ships, the team moves on, and the if (flags.newCheckout) check stays in the code forever: both branches maintained, both tested, both adding complexity. A few years in, a mature codebase can easily carry hundreds of dead flags that everyone is afraid to remove because no one is sure whether the off path is still reachable.

The mitigation is process, not tooling: every release toggle gets an owner and an expiry date when it's created, and removing it is part of the definition of done for the feature it gated.
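Process is easier to follow when something checks it. As a sketch, assuming you keep a small registry of release toggles in code or config (`ReleaseToggle`, `REGISTRY`, and `expired_toggles()` are hypothetical names, and the entries are made-up examples):

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ReleaseToggle:
    name: str
    owner: str      # team or person responsible for removing the flag
    expires: date   # date by which the flag should be gone from the code


# Illustrative entries only.
REGISTRY = [
    ReleaseToggle("new_checkout", "payments-team", date(2024, 3, 1)),
    ReleaseToggle("search_v2", "discovery-team", date(2024, 9, 1)),
]


def expired_toggles(registry, today: date):
    """Return toggles whose expiry date has passed."""
    return [toggle for toggle in registry if toggle.expires < today]
```

Running a check like this in CI and failing, or at least warning, when it returns anything turns "remember to delete the flag" into something the owner gets reminded about.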

Evaluation latency

Reading a flag has to be fast. If the lookup hits a remote service on every request, you've added a network call to the hot path of every page load. Production flag systems solve this with in-memory evaluation: the SDK receives flag rules from the service (streamed or polled in the background), caches them in memory, evaluates each check locally, and reports usage back asynchronously. If you're building your own, this is the part that's harder than it looks. Stale-but-fast almost always beats fresh-but-slow.
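A simplified sketch of the stale-but-fast idea, assuming a hypothetical `fetch` callable that returns the current rules as a dict. This version refreshes inline when the cache is older than a TTL; a production SDK would typically refresh in a background thread or over a streaming connection instead, so treat it as an illustration of the fallback behaviour only.

```python
import time


class CachedFlags:
    """Serve flags from memory; refresh at most once per TTL window."""

    def __init__(self, fetch, ttl_seconds: float = 30.0, clock=time.monotonic):
        self._fetch = fetch          # hypothetical callable returning {name: bool}
        self._ttl = ttl_seconds
        self._clock = clock
        self._rules = {}
        self._fetched_at = None

    def _refresh_if_due(self) -> None:
        now = self._clock()
        if self._fetched_at is not None and now - self._fetched_at < self._ttl:
            return
        # Record the attempt first so a failing service is not retried
        # on every single request.
        self._fetched_at = now
        try:
            self._rules = dict(self._fetch())
        except Exception:
            # Stale-but-fast: keep serving the last known rules.
            pass

    def is_enabled(self, name: str) -> bool:
        self._refresh_if_due()
        return bool(self._rules.get(name, False))
```

If the flag service is down, evaluations keep returning the last known values rather than blocking or throwing. If it has never been reachable, every flag reads as off, which is the conservative default.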

Configuration drift

Flag state usually differs across environments: on in staging, off in production, 50% rollout in canary. Without a single source of truth, a developer ends up chasing a bug that reproduces locally but not in staging, simply because the flag is in a different state. Treat flag configuration the same way you treat infrastructure: version-controlled where possible, with an audit log of who flipped what and when.
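As a sketch of what "a single source of truth with an audit log" could mean at its simplest, assuming an in-memory store (`FlagStore`, `set_flag()`, and `drift()` are hypothetical names; a real system would persist both the state and the log):

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class FlagStore:
    state: dict = field(default_factory=dict)      # {(env, flag): bool}
    audit_log: list = field(default_factory=list)  # one entry per change

    def set_flag(self, env: str, flag: str, value: bool, actor: str) -> None:
        """Change a flag and record who changed it, where, and when."""
        previous = self.state.get((env, flag), False)
        self.state[(env, flag)] = value
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "actor": actor,
            "env": env,
            "flag": flag,
            "from": previous,
            "to": value,
        })

    def drift(self, flag: str, envs) -> dict:
        """Return per-environment values if they differ, else an empty dict."""
        values = {env: self.state.get((env, flag), False) for env in envs}
        return values if len(set(values.values())) > 1 else {}
```

Drift between environments is often intentional. The aim is to make it visible, so "works on staging" can be checked against flag state before anyone starts debugging.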

Testing the matrix

In theory, every flag doubles the number of code paths: N flags give 2^N possible combinations, far too many to test exhaustively. The discipline is to test each flag's on and off paths in isolation, and to keep flag interactions explicit (one flag shouldn't silently depend on another). Long-lived flags need both paths covered by tests; otherwise the off branch quietly rots.
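The arithmetic is worth seeing once. With 20 flags, exhaustive testing means 2^20 = 1,048,576 combinations, while testing each flag on and off in isolation needs at most 2 × 20 = 40 cases. The sketch below generates both sets for three made-up flags; `FLAGS`, `BASELINE`, and the helper names are hypothetical.

```python
from itertools import product

FLAGS = ["new_checkout", "search_v2", "dark_mode"]  # illustrative names
BASELINE = {name: False for name in FLAGS}          # default state under test


def all_combinations(flags):
    """Every on/off combination: 2 ** len(flags) configurations."""
    return [
        dict(zip(flags, values))
        for values in product([False, True], repeat=len(flags))
    ]


def isolated_cases(flags, baseline):
    """Each flag off and on, with all other flags held at the baseline."""
    cases = []
    for name in flags:
        for value in (False, True):
            cases.append({**baseline, name: value})
    return cases
```

For three flags that is 8 exhaustive configurations against 6 isolated ones, several of which coincide with the baseline. The gap only becomes dramatic as the flag count grows, which is exactly when exhaustive testing stops being an option.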

Missing observability

A flag that nobody is watching is worse than no flag at all. You need to know: which flags are evaluated, how often, by which users, and what the outcomes are. When a kill switch saves you at 2am, the value isn't the boolean — it's the metric that told you to flip it. Wire flag evaluations into your existing logging and metrics pipeline from day one.
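One lightweight way to start is to wrap whatever evaluation function you already have so every call is counted and logged. This is a sketch under assumptions: `observed()` and `EVALUATIONS` are hypothetical names, and the in-process counter stands in for your real metrics client.

```python
import functools
import logging
from collections import Counter

logger = logging.getLogger("feature_flags")
EVALUATIONS = Counter()  # stand-in for a real metrics backend


def observed(evaluate):
    """Wrap a flag evaluator so each call is counted and logged."""
    @functools.wraps(evaluate)
    def wrapper(name, user_id):
        result = bool(evaluate(name, user_id))
        EVALUATIONS[(name, result)] += 1
        logger.debug("flag=%s user=%s result=%s", name, user_id, result)
        return result
    return wrapper
```

Counting by `(name, result)` answers two questions cheaply: whether a flag is still being evaluated at all, and what share of evaluations come back on. Both are useful when deciding whether a flag is safe to delete.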

Build vs buy

The honest version of this decision:

Build it yourself when your needs are small (under ~50 flags), your team has spare engineering capacity, and you don't need user-targeting or experimentation. A boolean column in a config table plus a thin wrapper gets you 80% of the value. If you want a management UI without vendor lock-in, the open-source Unleash can be self-hosted.

Buy a managed service when you need percentage rollouts, user-attribute targeting, experiment analytics, audit logs for compliance, or SDKs across multiple languages. LaunchDarkly, ConfigCat, Split, and Flagsmith all cover this space; the differences come down to pricing model, SDK ergonomics, and whether you need experimentation built in.

The common mistake is buying too early, spending months integrating a managed platform before you have enough flags to justify it, or buying too late and trying to migrate hundreds of homegrown flags into a vendor format under deadline pressure.

The same logic applies to picking between continuous delivery and continuous deployment: start with what your release cadence actually needs, not the most advanced setup you can imagine.

Where feature flags fit in a deployment workflow

Feature flags don't replace deployment automation; they sit on top of it. A modern release pipeline looks like:

  1. Code merges to main behind a flag (flag = off).
  2. CI runs and the automated deployment pipeline ships the build to production.
  3. The flag stays off for everyone except internal testers.
  4. QA and product validate in production with real data.
  5. The flag rolls out gradually — 1% → 10% → 50% → 100%.
  6. The flag is removed from the code once the feature is stable.

This is the workflow that continuous deployment was always aiming for: every commit to main is production-ready, deployments happen many times a day, and the business decision about when users see something is decoupled from the engineering decision about when code ships.
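The gradual rollout in step 5 can be driven by a simple rule: move to the next percentage only while a health metric stays within bounds, and drop to zero otherwise. The sketch below assumes the 1% → 10% → 50% → 100% schedule from the list above; the 1% error-rate threshold and the `next_percentage()` helper are invented for illustration.

```python
ROLLOUT_STEPS = [1, 10, 50, 100]  # schedule from the pipeline above


def next_percentage(current: int, error_rate: float,
                    max_error_rate: float = 0.01) -> int:
    """Advance the rollout one step, or return 0 to switch the flag off."""
    if error_rate > max_error_rate:
        return 0  # unhealthy: flag off for everyone
    for step in ROLLOUT_STEPS:
        if step > current:
            return step
    return 100  # already fully rolled out
```

Whether a person or an automated job calls this, the decision to widen the rollout is tied to data from the previous step rather than to the calendar.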

If you're still working out which pieces of that pipeline you have and which you're missing, our complete guide to CI/CD pipelines covers the deployment automation side end-to-end. Once that's in place, layering feature flags on top is a few days of work, not a quarter.

A practical starting point

If you've never used feature flags before, don't start by buying a platform. Start by wrapping a single risky feature in a boolean flag read from your existing configuration system. Ship it off, validate it in production, flip it on. Then do it again with the next risky feature.
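If your existing configuration system is environment variables, that first flag can be a few lines. This sketch assumes a `FEATURE_` prefix and a small set of truthy strings; both are conventions made up for the example, as is the `env_flag()` name.

```python
import os

TRUTHY = {"1", "true", "yes", "on"}  # assumed convention for "enabled"


def env_flag(name: str, default: bool = False, environ=os.environ) -> bool:
    """Read FEATURE_<NAME> from the environment; fall back to the default."""
    raw = environ.get(f"FEATURE_{name.upper()}")
    if raw is None:
        return default
    return raw.strip().lower() in TRUTHY
```

Used as `if env_flag("new_checkout"): ...`, the flag ships off by default and flips on when you set `FEATURE_NEW_CHECKOUT=true`. One caveat: environment variables are usually fixed when a process starts, so flipping this kind of flag generally means restarting the application. That is fine for a first flag, and it is also one of the first limits you'll notice.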

After ~10 flags, you'll know whether you need a managed platform — and you'll know which capabilities (percentage rollouts? targeting? experiments?) actually matter for your team, instead of paying for ones you don't.

The goal isn't to use feature flags everywhere. The goal is to make the riskiest changes invisible until you're confident they shouldn't be — and to give yourself an instant off switch when you're wrong.

