Python Application Servers in 2026: From WSGI to Modern ASGI Solutions

Devops & Infrastructure, Open Source, Python, and Tips & Tricks

The Python web server landscape has matured considerably over the past few years. The shift from synchronous WSGI to asynchronous ASGI is no longer experimental — it is the default for new projects built on frameworks like FastAPI, Starlette, and modern Django. But choosing the right application server still depends on your workload, your framework, and how much operational complexity you are willing to manage.

This guide compares the most widely used Python application servers in production today, with practical configuration examples and deployment patterns.

flowchart LR
    Client["Client"] --> LB["Nginx / Load Balancer"]
    LB --> AS["App Server\n(Gunicorn / Uvicorn / Granian)"]
    AS --> App["Python App\n(Django / FastAPI / Flask)"]
    App --> DB["Database"]
    App --> Cache["Redis / Memcached"]

Traditional WSGI Servers

Gunicorn

Gunicorn remains the default choice for Django, Flask, and other WSGI applications. It has been battle-tested in production for over a decade, and its pre-fork worker model is simple to reason about.

Strengths:

  • Production-proven reliability across thousands of deployments
  • Simple configuration with sensible defaults
  • Excellent process management and graceful restarts
  • Works out of the box with Django and Flask

# gunicorn.conf.py
bind = "0.0.0.0:8000"
workers = 4
worker_class = "sync"
max_requests = 1000
max_requests_jitter = 50

Limitations:

  • No native async support (though you can use uvicorn.workers.UvicornWorker as a bridge)
  • No WebSocket support in sync mode
  • Each worker holds its own memory, so RAM usage scales linearly with worker count

When to use it: If you are running a Django monolith or a Flask API, Gunicorn with sync workers is the battle-tested default. Do not switch to ASGI just because it is newer — unless you need async features, Gunicorn is the right choice.
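
Gunicorn serves any WSGI callable. For reference, a minimal application with no framework at all (the module and callable names are illustrative; they match the `gunicorn app:app` convention used above):

```python
# app.py -- a minimal WSGI application, servable with `gunicorn app:app`
def app(environ, start_response):
    # WSGI hands us the request as a dict plus a callback for the status line
    body = b"Hello from WSGI\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    # The return value is an iterable of bytes
    return [body]
```

Gunicorn imports this callable once per worker; each sync worker then handles one request at a time.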

uWSGI

uWSGI is a full-featured application server that goes far beyond serving Python — it supports multiple languages, protocols, and deployment patterns. That power comes at the cost of complexity.

# uwsgi.ini
[uwsgi]
http = :8000
processes = 4
threads = 2
master = true
vacuum = true
die-on-term = true

Strengths: Multiple protocol support, built-in caching, load balancing, process management.

Limitations: Steep learning curve, complex configuration with hundreds of options, higher memory footprint. The project has also seen reduced maintenance activity in recent years.

When to use it: If you are already running uWSGI and it works, there is no urgent reason to migrate. For new projects, Gunicorn or a modern ASGI server is a simpler starting point.

Modern ASGI Servers

Uvicorn

Uvicorn is the most popular ASGI server and the default recommendation for FastAPI and Starlette applications. It uses uvloop (a fast drop-in replacement for asyncio's event loop) and httptools for HTTP parsing.

import uvicorn

if __name__ == "__main__":
    uvicorn.run(
        "app:app",
        host="0.0.0.0",
        port=8000,
        workers=4,
        log_level="info",
    )

Strengths:

  • High throughput for async workloads
  • Native WebSocket support
  • Low memory footprint (~20MB per worker)
  • Simple configuration and excellent documentation
  • First-class FastAPI integration

Limitations:

  • Process management is basic compared to Gunicorn (many teams run Uvicorn workers under Gunicorn for production process management: gunicorn -k uvicorn.workers.UvicornWorker; note that recent Uvicorn releases move this worker class into the separate uvicorn-worker package)

When to use it: Any async Python application. If you are using FastAPI, Uvicorn is the standard choice.
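
Like Gunicorn with WSGI, Uvicorn can serve a bare ASGI callable with no framework at all. A minimal sketch (module and callable names are illustrative, matching the `uvicorn app:app` convention):

```python
# app.py -- a minimal ASGI application, servable with `uvicorn app:app`
async def app(scope, receive, send):
    # Uvicorn calls this coroutine once per request; `scope` describes it
    assert scope["type"] == "http"
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({
        "type": "http.response.body",
        "body": b"Hello from ASGI\n",
    })
```

FastAPI and Starlette applications are ultimately callables with this same signature, which is why any ASGI server can host them.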

Hypercorn

Hypercorn supports HTTP/1.1, HTTP/2, HTTP/3 (QUIC), and WebSockets. It is the most protocol-complete ASGI server available.

import asyncio

from hypercorn.config import Config
from hypercorn.asyncio import serve

from app import app  # your ASGI application

config = Config()
config.bind = ["0.0.0.0:8000"]
config.workers = 4

asyncio.run(serve(app, config))

Strengths: HTTP/3 and QUIC support, multiple worker types (asyncio, uvloop, trio), TLS configuration built-in.

Limitations: Smaller community than Uvicorn, fewer production deployment examples, slightly lower throughput for standard HTTP/1.1 workloads.

When to use it: If you need HTTP/3 or QUIC support, or if you are using the Trio async library instead of asyncio.

Performance-Focused Solutions

Granian

Granian is a Rust-based Python application server that focuses on raw performance. Its HTTP parsing and connection handling are implemented in Rust, while the Python application code runs through standard ASGI/WSGI interfaces.

# Granian is configured via CLI
granian --interface asgi --host 0.0.0.0 --port 8000 --workers 4 --threads 2 app:app

Strengths:

  • Highest throughput in synthetic benchmarks due to Rust-based I/O handling
  • Low memory footprint (~15MB per worker)
  • Supports both ASGI and RSGI (Granian's own interface for maximum performance)
  • Multi-protocol support

Limitations:

  • Newer project with a smaller community
  • Fewer production case studies available
  • Debugging can be harder when issues cross the Rust/Python boundary

When to use it: If you have benchmarked your specific workload and confirmed that the application server is the bottleneck rather than your database, network, or application code. In most real applications, it is not.

Performance: A Realistic Perspective

Synthetic benchmarks (like hello world req/sec) are widely published but rarely reflect real-world performance. In practice:

  • ASGI servers are 2-4x faster than WSGI servers for async workloads with high concurrency (many simultaneous connections, WebSocket streams, long-polling)
  • For traditional request-response APIs, the difference between Gunicorn and Uvicorn narrows significantly because your database queries and business logic dominate response time
  • Granian's Rust-based I/O shows the largest gains in connection-heavy scenarios with minimal application logic

Rather than chasing benchmark numbers, profile your actual application. Tools like py-spy, cProfile, or OpenTelemetry traces will show you where time is actually spent.
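
For a quick look without extra dependencies, the stdlib cProfile module can wrap any entry point (the `handle_request` function here is a stand-in for your own code):

```python
import cProfile
import io
import pstats

def handle_request():
    # Stand-in for real application work
    return sum(i * i for i in range(10_000))

profiler = cProfile.Profile()
profiler.enable()
handle_request()
profiler.disable()

# Print the five entries with the most cumulative time
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

py-spy goes a step further: it attaches to a running worker process without code changes, which makes it safer to use in production.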

Approximate memory per worker:

Server      Base Memory
---------   -----------
Granian     ~15MB
Uvicorn     ~20MB
Hypercorn   ~25MB
Gunicorn    ~30MB
uWSGI       ~40MB

These are baseline figures. Your application's imports, data structures, and caches will add to this.
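
To check where your own deployment lands, you can read a worker's peak resident set size from inside the process with the stdlib resource module (Unix only; note that Linux reports the figure in kilobytes while macOS reports bytes):

```python
import resource
import sys

usage = resource.getrusage(resource.RUSAGE_SELF)
# ru_maxrss is kilobytes on Linux, bytes on macOS
scale = 1024 * 1024 if sys.platform == "darwin" else 1024
print(f"Peak RSS: {usage.ru_maxrss / scale:.1f} MB")
```

Logging this at startup and after warm-up gives you a realistic per-worker figure to plan capacity around.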

Modern Features Implementation

WebSocket Support (FastAPI Example)

from fastapi import FastAPI, WebSocket, WebSocketDisconnect

app = FastAPI()

@app.websocket("/ws")
async def websocket_endpoint(websocket: WebSocket):
    await websocket.accept()
    try:
        while True:
            data = await websocket.receive_text()
            await websocket.send_text(f"Message received: {data}")
    except WebSocketDisconnect:
        pass  # client closed the connection
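
Under the hood, the ASGI server translates each frame into plain dict messages. A framework-free sketch of the same echo behaviour, driven with hand-built messages standing in for a real server:

```python
import asyncio

# Minimal ASGI websocket app: accept, echo one text frame, close
async def ws_app(scope, receive, send):
    assert scope["type"] == "websocket"
    await receive()  # {"type": "websocket.connect"}
    await send({"type": "websocket.accept"})
    msg = await receive()  # {"type": "websocket.receive", "text": ...}
    await send({"type": "websocket.send",
                "text": f"Message received: {msg['text']}"})
    await send({"type": "websocket.close"})

# Drive the app with scripted messages in place of a real server
async def drive():
    incoming = [{"type": "websocket.connect"},
                {"type": "websocket.receive", "text": "hi"}]
    sent = []

    async def receive():
        return incoming.pop(0)

    async def send(message):
        sent.append(message)

    await ws_app({"type": "websocket"}, receive, send)
    return sent

messages = asyncio.run(drive())
print(messages)
```

This is exactly the message flow Uvicorn or Hypercorn manages for you when FastAPI's `WebSocket` object is used.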

HTTP/2 Configuration

# Hypercorn HTTP/2 config
config = Config()
config.h2_enabled = True
config.alpn_protocols = ["h2", "http/1.1"]
config.certfile = "cert.pem"
config.keyfile = "key.pem"

Production Deployment

Docker Configuration

FROM python:3.13-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]

Kubernetes Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: python-app
spec:
  replicas: 3
  template:
    spec:
      containers:
      - name: python-app
        image: python-app:1.0
        resources:
          limits:
            memory: "512Mi"
            cpu: "500m"
        readinessProbe:
          httpGet:
            path: /health
            port: 8000

If you are deploying with Docker or Kubernetes, you will need a Shell Server in DeployHQ to manage the deployment pipeline. DeployHQ connects to your server via SSH and runs the commands you define — pulling new images, restarting containers, or running migrations.

Performance Optimization

Worker Configuration

# Gunicorn worker optimization (gunicorn.conf.py)
import multiprocessing

# Workers = 2 * CPU cores + 1
workers = multiprocessing.cpu_count() * 2 + 1
# Note: Gunicorn's threads setting only applies to the "gthread"
# worker class; async workers like UvicornWorker ignore it
worker_class = "uvicorn.workers.UvicornWorker"

This hybrid approach gives you Gunicorn's process management with Uvicorn's async performance — a common production pattern for FastAPI applications.
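
The 2 * cores + 1 heuristic is a starting point, not a law: I/O-bound apps tolerate more workers, while memory-constrained hosts need fewer. As a sanity check, a hypothetical helper (the 150MB-per-worker default is an illustrative assumption, not a measured figure) that caps the count by available RAM:

```python
def worker_count(cpu_cores: int, ram_mb: int, per_worker_mb: int = 150) -> int:
    """Apply the 2n+1 heuristic, capped by available memory."""
    by_cpu = 2 * cpu_cores + 1          # the classic Gunicorn rule of thumb
    by_ram = max(1, ram_mb // per_worker_mb)  # never go below one worker
    return min(by_cpu, by_ram)

print(worker_count(4, 2048))  # 9  -- CPU is the binding constraint
print(worker_count(4, 512))   # 3  -- RAM is the binding constraint
```

Measure your real per-worker memory (application imports included) before trusting any such estimate.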

Connection Pooling

# Database connection pooling with async
from databases import Database

database = Database("postgresql://user:pass@localhost/db")

async def startup():
    await database.connect()

async def shutdown():
    await database.disconnect()
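
These hooks need to be tied to the server's lifecycle. Frameworks expose this as lifespan or startup/shutdown events; the underlying pattern is just an async context manager. A self-contained sketch (the `Database` class here is a stand-in for the real `databases` client, tracking only connection state):

```python
import asyncio
from contextlib import asynccontextmanager

class Database:
    # Stand-in for databases.Database: records connection state only
    def __init__(self):
        self.connected = False

    async def connect(self):
        self.connected = True

    async def disconnect(self):
        self.connected = False

database = Database()

@asynccontextmanager
async def lifespan():
    # Open the pool before serving requests, close it on shutdown
    await database.connect()
    try:
        yield database
    finally:
        await database.disconnect()

async def main():
    async with lifespan() as db:
        assert db.connected  # requests would be handled here

asyncio.run(main())
```

The try/finally guarantees the pool is closed even if the server crashes mid-shutdown, which prevents leaked connections on hot restarts.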

Monitoring and Observability

Prometheus Metrics

from prometheus_client import Counter, Histogram
from fastapi import FastAPI
from prometheus_fastapi_instrumentator import Instrumentator

app = FastAPI()
Instrumentator().instrument(app).expose(app)

# Custom application-level counter; increment it in your own handlers.
# Use a distinct name so it cannot collide with the instrumentator's
# built-in http_requests_total metric.
REQUEST_COUNT = Counter(
    "app_requests_total",
    "Total application requests",
    ["method", "endpoint", "status"]
)

OpenTelemetry Integration

For a deeper dive into setting up metrics, traces, and logs with OpenTelemetry, see our guide on OpenTelemetry in practice.

from opentelemetry import trace
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor
from opentelemetry.sdk.trace import TracerProvider

trace.set_tracer_provider(TracerProvider())
FastAPIInstrumentor.instrument_app(app)

Deploying Python Apps with DeployHQ

DeployHQ can deploy any Python application server setup — whether you are using Gunicorn behind Nginx on a VPS or running containers on Kubernetes. Here is a typical workflow for a FastAPI + Uvicorn application:

  1. Connect your repository — DeployHQ supports GitHub, GitLab, Bitbucket, and self-hosted Git servers
  2. Configure a build pipeline — install dependencies with pip install -r requirements.txt, run tests, and build any assets
  3. Set up SSH commands — after file transfer, restart your application server (e.g., sudo systemctl restart myapp)
  4. Deploy — push to your branch and DeployHQ handles the rest, with detailed logs and rollback capability

For Django-specific deployments, see our guide on deploying Django on a budget with Hetzner and DeployHQ. If you are deploying Python ERP platforms, check out our Odoo deployment guide.

Deployment Considerations

Process Management

# Supervisor config
[program:python-app]
command=uvicorn app:app --host 0.0.0.0 --port 8000
directory=/app
user=www-data
autostart=true
autorestart=true
stopasgroup=true
killasgroup=true

For systemd-based servers (most modern Linux distributions), a unit file is often simpler:

# /etc/systemd/system/myapp.service
[Unit]
Description=Python App
After=network.target

[Service]
User=www-data
WorkingDirectory=/app
ExecStart=/app/venv/bin/uvicorn app:app --host 0.0.0.0 --port 8000
Restart=always

[Install]
WantedBy=multi-user.target

Load Balancing

# Nginx configuration
upstream python_servers {
    server 127.0.0.1:8000;
    server 127.0.0.1:8001;
    server 127.0.0.1:8002;
}

server {
    listen 80;
    location / {
        proxy_pass http://python_servers;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

Conclusion

The right Python application server depends on what you are building. For traditional Django and Flask applications, Gunicorn remains the reliable default. For async applications built on FastAPI or Starlette, Uvicorn is the standard. Granian and Hypercorn serve more specialised needs — maximum throughput and advanced protocol support respectively.

Whatever server you choose, DeployHQ makes it straightforward to set up automated deployments from your Git repository to your servers. Sign up for a free trial and have your Python application deploying in minutes.


Have questions about deploying Python applications? Reach out to us at support@deployhq.com or find us on Twitter/X.