Deploying a new feature to 100% of users in one shot is a coin flip with your error budget. Canary releases flip that math: roll the new version to a small slice of traffic first, watch the error rate, then expand, or roll back before anyone notices.
This guide is code-centric. Instead of reaching for a service mesh or a managed canary controller, we'll show how to run progressive canary releases using percentage-based feature flags in PHP and weighted upstreams in Nginx and Apache: the kind of thing you can ship on the infrastructure you already have.
What is a Canary Release?
A canary release is a deployment pattern that exposes a new version of an application to a small subset of users while the previous version keeps serving everyone else. The name comes from the canaries miners used to detect dangerous gases: a small, expendable signal of trouble before the whole crew is in danger.
The pattern, described by Martin Fowler in his canary release write-up (linked at the end of this article), gives you three things at once:
- Reduced blast radius. A bug that ships to 5% of traffic affects 5% of users, not 100%. If your error rate doubles on the canary cohort but stays flat overall, you have a clean signal and a controlled fallout.
- Real production feedback. Staging environments don't have your traffic mix, your data volume, or your edge cases. A canary gets you real-world telemetry (latency under real load, integration behavior against real third-party APIs) without committing to a full rollout.
- Fast, partial rollback. Reverting 5% of traffic is dramatically cheaper than rolling back a release that's already in front of every user. Most code-centric canary setups can flip the dial from 5% to 0% in seconds.
Canary releases pair naturally with trunk-based development and short-lived branches: both patterns optimize for small, frequent, low-risk changes instead of big-bang merges.
When to Use Canary vs Other Deployment Strategies
Canary isn't always the right tool. Quick decision guide:
| Strategy | Best for | Cost | Rollback speed |
|---|---|---|---|
| Canary | High-traffic apps where you want gradual exposure and rich live telemetry | Low to medium (one extra pool) | Seconds (drop the canary share to 0%) |
| Blue/Green | Backwards-compatible releases where you want a clean cutover | High (full duplicate environment) | Seconds (flip the LB) |
| Rolling | Stateless services running on a cluster | Low | Minutes (drain + redeploy) |
| Big-bang | Internal tools, very low-traffic services | Lowest | Slowest (redeploy old version) |
If you want the full side-by-side, see our breakdown of Blue/Green, Canary, Rolling, and Atomic zero-downtime strategies. It covers cost trade-offs and a decision matrix for picking the right one.
Managing Canary Releases in Your Code
The most flexible way to run a canary is to put the routing logic directly into your application, typically through feature flags with percentage-based or rule-based activation, combined with smart load-balancer weights for traffic shifting.
1. Feature flags for progressive rollout
At the core of a code-centric canary release is the feature flag. Instead of a binary ON/OFF, a flag carries a rollout percentage and routes each user to either the new or old code path based on a stable hash of their ID.
If you're new to the pattern, our primer on what feature flags are and how teams use them covers the broader concept; the example below shows the canary-specific shape.
Example: PHP percentage-based feature flag
<?php
// Configuration (e.g., from a database, config service, or environment variable)
// Represents the percentage of users who should see the new feature.
$featureRolloutPercentages = [
    'new_checkout_flow' => (int)getenv('NEW_CHECKOUT_ROLLOUT_PERCENTAGE') ?: 0, // Default to 0%
];

function shouldEnableFeature($featureName, $userId = null) {
    global $featureRolloutPercentages;
    $rolloutPercentage = $featureRolloutPercentages[$featureName] ?? 0;

    if ($rolloutPercentage >= 100) {
        return true; // Fully enabled
    }
    if ($rolloutPercentage <= 0) {
        return false; // Fully disabled
    }

    // For progressive rollout, hash a stable key so the same user
    // consistently sees the same version. Anonymous visitors fall back to
    // the session ID; uniqid() would return a new value on every call and
    // make them flip between code paths.
    $stableKey = (string)($userId ?? session_id());
    $hashValue = crc32($stableKey); // Non-negative on 64-bit PHP builds
    $percentage = ($hashValue % 100) + 1; // Gives a number between 1 and 100
    return $percentage <= $rolloutPercentage;
}

// Example usage in a controller (assumes session_start() has already run):
$currentUserId = $_SESSION['user_id'] ?? null; // null for anonymous visitors
if (shouldEnableFeature('new_checkout_flow', $currentUserId)) {
    // Route to the new checkout flow (canary version)
    include 'checkout_new.php';
} else {
    // Route to the existing checkout flow (stable version)
    include 'checkout_old.php';
}
NEW_CHECKOUT_ROLLOUT_PERCENTAGE (set to 5, 25, 50, or 100) controls the cohort. The crc32($userId) hash is the key detail: without it, a single user would flip between code paths on every request, producing a jarring experience and unreliable telemetry. Sticky cohorts also mean that when an error fires for user X, you know which code path produced it.
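To make the stickiness property concrete, here is a minimal sketch of the bucketing step on its own. The helper names (rolloutBucket, inCohort) and the idea of salting the hash with the feature name are assumptions for illustration, not part of the example above; the salt simply keeps different flags from always choosing the same users as their first 5%.

```php
<?php
// Hypothetical helper: maps (feature, user) to a stable bucket from 1 to 100.
// Assumes 64-bit PHP, where crc32() always returns a non-negative integer.
function rolloutBucket(string $featureName, string $userKey): int
{
    return (crc32($featureName . ':' . $userKey) % 100) + 1;
}

// Hypothetical helper: true when the user's bucket falls inside the rollout.
function inCohort(string $featureName, string $userKey, int $rolloutPercentage): bool
{
    return rolloutBucket($featureName, $userKey) <= $rolloutPercentage;
}

// The same user always lands in the same bucket.
$first  = rolloutBucket('new_checkout_flow', 'user-42');
$second = rolloutBucket('new_checkout_flow', 'user-42');
var_dump($first === $second); // bool(true)

// Cohorts are nested: anyone enabled at 5% is still enabled at 25%.
$stillEnabled = true;
for ($i = 1; $i <= 1000; $i++) {
    $key = 'user-' . $i;
    if (inCohort('new_checkout_flow', $key, 5) && !inCohort('new_checkout_flow', $key, 25)) {
        $stillEnabled = false;
    }
}
var_dump($stillEnabled); // bool(true)
```

Because the bucket never changes for a given user, raising the percentage only adds users to the canary cohort; nobody who already saw the new path is moved back to the old one during a normal promotion.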
2. Runtime control
The routing decision lives inside the application, not in a configuration server you have to redeploy. That means you can:
- Change the rollout percentage by updating an environment variable or a row in a config table: no new build, no new release.
- Apply rule-based targeting alongside the percentage (internal users first, opt-in beta cohort, a specific region) so you can validate against a friendly audience before opening it up.
- Combine with automatic deployment from Git so the canary code itself ships continuously, and the exposure is the thing you control independently.
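As a rough sketch of how rule-based targeting can sit alongside the percentage, the example below checks explicit allow-lists before falling back to the hash bucket. The flag shape (percentage, allow_user_ids, allow_regions), the function name isEnabledFor, and the sample IDs and regions are all hypothetical; in practice the flag would be loaded from an environment variable or config table rather than hard-coded.

```php
<?php
// Hypothetical flag definition, hard-coded here only to keep the sketch runnable.
$flag = [
    'percentage'     => 5,
    'allow_user_ids' => ['1001', '1002'], // e.g. internal users
    'allow_regions'  => ['eu-west'],      // e.g. a friendly region
];

function isEnabledFor(array $flag, string $featureName, string $userId, string $region): bool
{
    // Explicit rules win first, so friendly audiences see the feature
    // regardless of where their hash bucket falls.
    if (in_array($userId, $flag['allow_user_ids'], true)) {
        return true;
    }
    if (in_array($region, $flag['allow_regions'], true)) {
        return true;
    }

    // Everyone else falls through to the sticky percentage cohort.
    $bucket = (crc32($featureName . ':' . $userId) % 100) + 1;
    return $bucket <= $flag['percentage'];
}

var_dump(isEnabledFor($flag, 'new_checkout_flow', '1001', 'us-east')); // bool(true): allow-listed user
var_dump(isEnabledFor($flag, 'new_checkout_flow', '7', 'eu-west'));    // bool(true): allow-listed region
```

Keeping the rules ahead of the percentage check means you can leave the percentage at 0 and still validate against internal users before opening the flag to anyone else.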
3. Gradual exposure
Once monitoring confirms the canary is healthy at 5%, bump it. A typical progression looks like:
0% → 1% → 5% → 25% → 50% → 100%
Watch your error rate, p95 latency, and any business metric tied to the change (conversion rate on a checkout change, message-send success on a notification change) at each step. If you don't have those signals in place, the canary tells you nothing; you've just shipped a slow release.
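The progression above can be expressed as a small step function. This is only a sketch: the function name nextRolloutPercentage and the single $healthy boolean are assumptions, standing in for whatever verdict your monitoring produces at the end of each watch window.

```php
<?php
// Hypothetical helper: given the current percentage and a health verdict,
// return the percentage to apply next. The default ladder mirrors the
// progression described in the text.
function nextRolloutPercentage(int $current, bool $healthy, array $steps = [0, 1, 5, 25, 50, 100]): int
{
    if (!$healthy) {
        return 0; // any bad signal sends everyone back to the stable path
    }
    foreach ($steps as $step) {
        if ($step > $current) {
            return $step; // first rung above the current exposure
        }
    }
    return 100; // already fully rolled out
}

echo nextRolloutPercentage(0, true), "\n";    // 1
echo nextRolloutPercentage(5, true), "\n";    // 25
echo nextRolloutPercentage(25, false), "\n";  // 0
echo nextRolloutPercentage(100, true), "\n";  // 100
```

Moving one rung at a time keeps each jump in exposure small enough that a regression shows up in the metrics before most users see it.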
4. Automated rollback
The biggest payoff of code-centric canaries is that a rollback is just SET rollout_percentage = 0. No redeploy, no DNS flip, no traffic drain: users are simply routed back to the stable path on the very next request.
For deployment-level rollbacks (config errors, broken build artifacts, infrastructure issues that the canary path can't catch), pair the flag with a tool that ships one-click deployment rollback so the underlying release itself can be reverted just as fast.
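For the flag-level rollback, a minimal sketch of an automated guard might look like the following. The function name shouldRollBack and every threshold (a 2x error-rate ratio, a 1% absolute floor, a 200-request minimum sample) are assumptions chosen for illustration; tune them against your own baselines.

```php
<?php
// Hypothetical guard: decide whether the canary's error rate is bad enough,
// relative to the stable pool, to drop the rollout percentage to 0.
function shouldRollBack(
    int $canaryRequests,
    int $canaryErrors,
    int $stableRequests,
    int $stableErrors,
    float $maxRatio = 2.0,        // canary may not exceed 2x the stable error rate
    float $absoluteFloor = 0.01,  // ignore canary error rates below 1%
    int $minCanaryRequests = 200  // below this the sample is too small to judge
): bool {
    if ($canaryRequests < $minCanaryRequests || $stableRequests === 0) {
        return false; // not enough data yet; keep watching
    }
    $canaryRate = $canaryErrors / $canaryRequests;
    $stableRate = $stableErrors / $stableRequests;

    return $canaryRate > max($stableRate * $maxRatio, $absoluteFloor);
}

$rolloutPercentage = 5;
if (shouldRollBack(500, 25, 9500, 19)) {
    // In a real setup this would update the environment variable or config row.
    $rolloutPercentage = 0;
}
echo $rolloutPercentage, "\n"; // 0
```

Here the canary is failing 5% of requests against 0.2% on the stable pool, so the guard trips. The minimum sample size is there to stop a single early error from triggering a rollback at very low traffic.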
5. Database compatibility
The canary pattern assumes both old and new code can talk to the same database. That's not free: it forces a discipline on schema changes:
- Expand-then-contract migrations. Add the new column, dual-write from both code paths, backfill, then drop the old column in a later release once 0% of traffic is on the old path.
- Avoid breaking renames. Rename via add new + write to both + read from new + drop old, not via in-place rename.
- No destructive migrations during a canary. If you have to run one, do it under a separate change window with all traffic on a single version.
We cover this in depth in our database migration strategies for zero-downtime deployments, required reading before your first canary that touches the schema.
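The expand phase of a rename can be sketched without a database at all. In the example below a plain array stands in for a row, and the column names (fullname as the old column, display_name as the new one) and function names are hypothetical.

```php
<?php
// Expand phase: both code paths write the old and the new column, so
// either version can read rows written by the other.
function writeName(array $row, string $name): array
{
    $row['fullname'] = $name;     // old column, still read by the stable path
    $row['display_name'] = $name; // new column, read by the canary path
    return $row;
}

// The new code path reads the new column and falls back to the old one
// for rows that have not been backfilled yet.
function readName(array $row): ?string
{
    return $row['display_name'] ?? $row['fullname'] ?? null;
}

$legacyRow = ['id' => 1, 'fullname' => 'Ada Lovelace']; // written before the expand step
$freshRow  = writeName(['id' => 2], 'Grace Hopper');

echo readName($legacyRow), "\n";  // Ada Lovelace
echo readName($freshRow), "\n";   // Grace Hopper
echo $freshRow['fullname'], "\n"; // Grace Hopper
```

Only once the backfill is complete and no traffic remains on the old path does the contract step (dropping fullname in this sketch) become safe.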
Shifting Traffic at the Load Balancer
Feature flags work great for application-level changes (a new checkout flow, a new pricing engine). But sometimes the change is at the service level: a new container image, a new runtime version, a rewritten endpoint. For those, you shift traffic at the load balancer.
Example: Nginx weighted upstream
http {
    upstream backend_service {
        # Start with a small percentage for canary (e.g., 5%)
        server backend_stable_v1:8080 weight=95;
        server backend_canary_v2:8080 weight=5;
    }

    server {
        listen 80;

        location / {
            proxy_pass http://backend_service;
        }
    }
}
To progress the rollout, adjust the weight values for the two upstreams (e.g., 95/5 → 75/25 → 50/50) and reload Nginx with nginx -s reload. To finish at 100%, remove the stable server from the upstream block or mark it down, because Nginx rejects weight=0. The reload is graceful: in-flight requests finish on the worker they started on, and new requests use the updated weights.
Example: Apache weighted balancer
# Load necessary modules if not already loaded.
# LoadModule is only valid in the main server config, not inside <VirtualHost>.
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so
LoadModule proxy_balancer_module modules/mod_proxy_balancer.so
LoadModule slotmem_shm_module modules/mod_slotmem_shm.so
LoadModule lbmethod_byrequests_module modules/mod_lbmethod_byrequests.so

<VirtualHost *:80>
    ServerName your-app.com

    <Proxy balancer://backend_service>
        # Start with a small percentage for canary (e.g., 5%)
        BalancerMember http://backend_stable_v1:8080 route=stable loadfactor=95
        BalancerMember http://backend_canary_v2:8080 route=canary loadfactor=5
        ProxySet lbmethod=byrequests
    </Proxy>

    ProxyPass / balancer://backend_service/
    ProxyPassReverse / balancer://backend_service/
</VirtualHost>
In Apache, loadfactor plays the role of Nginx's weight. Decrease it on the stable upstream and increase it on the canary to expand exposure, then reload Apache.
One caveat with weighted upstreams: they don't preserve user stickiness. A user can flip between the stable and canary versions on every request. For backend services where the response is deterministic for a given input, that's fine. For anything stateful (session-bound features, A/B tests where you need consistent assignment), use the in-code feature-flag approach instead, or layer sticky (Nginx Plus / nchan-helper) or hash-based load balancing on top.
A Working Canary Workflow
Putting the pieces together, a healthy canary release loop looks like this:
- Ship both code paths to production. Old and new live side by side, behind a flag set to 0%. This is just a regular deploy, with no traffic shift yet. A Git-based deployment pipeline makes this part boring, which is what you want.
- Promote to 1 to 5%. Turn the flag on for a small slice. Watch error rate, p95 latency, and the business metric tied to the change for 15 to 60 minutes (longer for low-volume services where statistical signal takes time).
- Verify telemetry, then expand. If the metrics look clean, bump to 25%. Repeat the watch window. Then 50%, then 100%.
- Roll back at the first bad signal. Drop the flag to 0% the moment an alert fires. Don't try to debug at 5%; get back to a known-good state, then investigate. This is the whole point of the pattern.
- Clean up the old code path. Once the new version is at 100% and has been stable for a release cycle, delete the old branch from the code. Long-lived dead branches behind flags are technical debt that quietly compounds.
Canary releases are one piece of the broader continuous deployment practice: the goal is small, frequent, low-risk changes flowing to production with confidence. If you're still working out the line between continuous delivery and continuous deployment, our breakdown of continuous delivery vs continuous deployment covers when each one fits.
Common Canary Pitfalls
A few things that bite teams the first time they try this:
- No baseline. If you don't know what your error rate, latency, and key business metrics look like on a normal day, a canary tells you nothing. Establish baselines before your first canary, not during it.
- Watching the wrong signal. Aggregate dashboards lie during a canary because 95% of traffic is on the stable version. Always slice metrics by version label so you can see canary-only error rate, not blended.
- Too-small canary cohorts. At 1% of a low-traffic service, you might not see any traffic on the canary for hours. Start higher (5 to 10%) if your volume is modest.
- Long-lived canaries. A canary that's been at 25% for three weeks isn't a canary; it's a fork. Either promote it or kill it.
- Database changes done backwards. A canary that needs a schema change the stable version can't read is a tripwire for an outage. Expand-then-contract every time.
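The "watching the wrong signal" pitfall is easy to see with a little arithmetic. The sketch below uses made-up request counters labelled by version; the numbers and the errorRatePercent helper are hypothetical, chosen only to show how a blended figure hides a failing canary.

```php
<?php
// Hypothetical request counters, labelled by the version that served them.
$counters = [
    'stable' => ['requests' => 9500, 'errors' => 19],
    'canary' => ['requests' => 500,  'errors' => 25],
];

// Error rate as a percentage, rounded to two decimals.
function errorRatePercent(int $errors, int $requests): float
{
    return $requests === 0 ? 0.0 : round(100 * $errors / $requests, 2);
}

$totalRequests = array_sum(array_column($counters, 'requests'));
$totalErrors   = array_sum(array_column($counters, 'errors'));

printf("blended: %.2f%%\n", errorRatePercent($totalErrors, $totalRequests));
foreach ($counters as $version => $c) {
    printf("%s: %.2f%%\n", $version, errorRatePercent($c['errors'], $c['requests']));
}
// blended: 0.44%
// stable: 0.20%
// canary: 5.00%
```

A blended 0.44% looks unremarkable on an aggregate dashboard, while the canary-only figure is 25 times the stable rate. That gap is why metrics need a version label before the first canary ships.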
Ready to Ship Canaries on Your Stack?
DeployHQ handles the deployment side of this loop: automated builds, configurable build pipelines for branching strategies, and instant rollbacks when a canary tells you to back out. The feature-flag dial sits in your application; the deploy machinery sits in DeployHQ. Try DeployHQ free and ship your first canary release with a safety net.
Further Resources
- Martin Fowler, canary release pattern: martinfowler.com/bliki/CanaryRelease.html
- Martin Fowler, feature toggles (feature flags): martinfowler.com/articles/featureToggles.html
- The New Stack, progressive delivery: thenewstack.io/progressive-delivery-the-next-evolution-of-devops
Questions about setting up canary deployments on your infrastructure? Email us at support@deployhq.com or reach out on X (@deployhq).