Distributed Rate Limiting

This document describes the distributed rate limiting feature, which enables rate limits to be enforced globally across all proxy instances using Redis.

Overview

In multi-instance deployments, each proxy instance normally maintains its own rate limit counters, so the total number of accepted requests can reach up to N times the intended limit, where N is the number of instances. For example, three instances that each enforce 60 requests per minute can together admit up to 180. Distributed rate limiting avoids this by using Redis as a shared counter store.

Features

Global Rate Limits: Rate limits are enforced across all proxy instances
Sliding Window Algorithm: Accurate rate limiting using time-based windows
Atomic Operations: Uses Redis INCR so that concurrent counter updates from multiple instances are never lost
Graceful Fallback: Falls back to in-memory rate limiting when Redis is unavailable
Per-Token Configuration: Supports custom rate limits for specific tokens
Configurable: All settings can be customized via environment variables

Architecture

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Proxy #1      │     │   Proxy #2      │     │   Proxy #3      │
│ ┌─────────────┐ │     │ ┌─────────────┐ │     │ ┌─────────────┐ │
│ │ Rate Limiter│ │     │ │ Rate Limiter│ │     │ │ Rate Limiter│ │
│ └──────┬──────┘ │     │ └──────┬──────┘ │     │ └──────┬──────┘ │
└────────┼────────┘     └────────┼────────┘     └────────┼────────┘
         │                       │                       │
         └───────────────────────┼───────────────────────┘
                                 │
                          ┌──────▼──────┐
                          │    Redis    │
                          │  (Counters) │
                          └─────────────┘

Configuration

Environment Variables

Variable	Default	Description
`DISTRIBUTED_RATE_LIMIT_ENABLED`	`false`	Enable Redis-backed distributed rate limiting
`DISTRIBUTED_RATE_LIMIT_PREFIX`	`ratelimit:`	Redis key prefix for rate limit counters
`DISTRIBUTED_RATE_LIMIT_KEY_SECRET`	(empty)	HMAC secret for hashing token IDs in Redis keys (recommended for production)
`DISTRIBUTED_RATE_LIMIT_WINDOW`	`1m`	Sliding window duration (e.g., `30s`, `1m`, `5m`)
`DISTRIBUTED_RATE_LIMIT_MAX`	`60`	Maximum requests per window
`DISTRIBUTED_RATE_LIMIT_FALLBACK`	`true`	Fall back to in-memory rate limiting when Redis is unavailable
`REDIS_ADDR`	`localhost:6379`	Redis server address
`REDIS_DB`	`0`	Redis database number

Example Configuration

# Enable distributed rate limiting
export DISTRIBUTED_RATE_LIMIT_ENABLED=true
export REDIS_ADDR=redis:6379

# Security: Hash token IDs in Redis keys (recommended for production)
export DISTRIBUTED_RATE_LIMIT_KEY_SECRET=your-secret-key-here

# Custom rate limit: 100 requests per 30 seconds
export DISTRIBUTED_RATE_LIMIT_WINDOW=30s
export DISTRIBUTED_RATE_LIMIT_MAX=100

# Enable fallback for high availability
export DISTRIBUTED_RATE_LIMIT_FALLBACK=true
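
The variables above could be read into a typed configuration roughly as follows. This is a minimal sketch, not the proxy's actual loader: the `settings` struct and `loadSettings` function are hypothetical names, and only the variable names and defaults come from the table. The lookup function is injected so the parsing can be exercised without touching the process environment.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// settings is a hypothetical holder for the documented variables.
type settings struct {
	Enabled  bool
	Prefix   string
	Window   time.Duration
	Max      int
	Fallback bool
}

// loadSettings reads the documented variables through lookup (os.Getenv in
// production) and applies the documented default when a variable is empty.
func loadSettings(lookup func(string) string) (settings, error) {
	get := func(key, def string) string {
		if v := lookup(key); v != "" {
			return v
		}
		return def
	}
	var s settings
	var err error
	if s.Enabled, err = strconv.ParseBool(get("DISTRIBUTED_RATE_LIMIT_ENABLED", "false")); err != nil {
		return s, fmt.Errorf("DISTRIBUTED_RATE_LIMIT_ENABLED: %w", err)
	}
	if s.Fallback, err = strconv.ParseBool(get("DISTRIBUTED_RATE_LIMIT_FALLBACK", "true")); err != nil {
		return s, fmt.Errorf("DISTRIBUTED_RATE_LIMIT_FALLBACK: %w", err)
	}
	// time.ParseDuration accepts the documented forms: 30s, 1m, 5m.
	if s.Window, err = time.ParseDuration(get("DISTRIBUTED_RATE_LIMIT_WINDOW", "1m")); err != nil {
		return s, fmt.Errorf("DISTRIBUTED_RATE_LIMIT_WINDOW: %w", err)
	}
	if s.Max, err = strconv.Atoi(get("DISTRIBUTED_RATE_LIMIT_MAX", "60")); err != nil {
		return s, fmt.Errorf("DISTRIBUTED_RATE_LIMIT_MAX: %w", err)
	}
	if s.Window <= 0 || s.Max <= 0 {
		return s, fmt.Errorf("window and max must be positive")
	}
	s.Prefix = get("DISTRIBUTED_RATE_LIMIT_PREFIX", "ratelimit:")
	return s, nil
}

func main() {
	s, err := loadSettings(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "invalid rate limit configuration:", err)
		os.Exit(1)
	}
	fmt.Printf("enabled=%t window=%s max=%d fallback=%t\n", s.Enabled, s.Window, s.Max, s.Fallback)
}
```

Rejecting a malformed window or a non-positive maximum at startup is a design choice in this sketch; it surfaces a typo such as `DISTRIBUTED_RATE_LIMIT_WINDOW=30` (no unit) before any request is served.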

How It Works

Sliding Window Algorithm

The rate limiter uses a sliding window counter algorithm. Each request touches a single counter, keyed by the start of the window the request falls in:

When a request arrives, calculate the current window start time
Build a Redis key: {prefix}{tokenID}:{windowStart}
Atomically increment the counter using INCR
Set TTL on the key (on first increment)
Check if count exceeds the limit
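
The steps above can be sketched as follows. This is an illustration under stated assumptions rather than the limiter's source: `counterStore`, `memStore`, `windowStart`, and `allow` are hypothetical names, and the in-memory store stands in for the Redis INCR and EXPIRE commands so the example runs without a server. The fake records TTLs but does not enforce them.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// counterStore is a hypothetical stand-in for the two Redis commands used.
type counterStore interface {
	Incr(key string) (int64, error)             // behaves like INCR
	Expire(key string, ttl time.Duration) error // behaves like EXPIRE
}

// memStore is an in-memory fake of counterStore.
type memStore struct {
	mu     sync.Mutex
	counts map[string]int64
	ttls   map[string]time.Duration
}

func newMemStore() *memStore {
	return &memStore{counts: map[string]int64{}, ttls: map[string]time.Duration{}}
}

func (m *memStore) Incr(key string) (int64, error) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.counts[key]++
	return m.counts[key], nil
}

func (m *memStore) Expire(key string, ttl time.Duration) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.ttls[key] = ttl
	return nil
}

// windowStart rounds now down to the start of its window, in Unix seconds.
func windowStart(now time.Time, window time.Duration) int64 {
	w := int64(window / time.Second)
	if w < 1 {
		w = 1 // guard against sub-second windows
	}
	return now.Unix() / w * w
}

// allow follows the five steps: window start, key, INCR, TTL, limit check.
func allow(s counterStore, prefix, tokenID string, limit int64, window time.Duration, now time.Time) (bool, error) {
	key := fmt.Sprintf("%s%s:%d", prefix, tokenID, windowStart(now, window))
	n, err := s.Incr(key)
	if err != nil {
		return false, err
	}
	if n == 1 {
		// The first increment created the key: TTL is the window plus one second.
		if err := s.Expire(key, window+time.Second); err != nil {
			return false, err
		}
	}
	return n <= limit, nil
}

// Example values used by main.
var exampleNow = time.Unix(1701388805, 0)

const exampleWindow = time.Minute

func main() {
	store := newMemStore()
	for i := 1; i <= 4; i++ {
		ok, _ := allow(store, "ratelimit:", "sk-abc123", 3, exampleWindow, exampleNow)
		fmt.Println(i, ok) // the fourth call exceeds the limit of 3
	}
}
```

Note that INCR and EXPIRE are two separate commands here. If a process stopped between them, the key would be left without a TTL; wrapping both in a Lua script or a transaction is a common way to close that gap.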

Key Format

Redis keys follow this format:

ratelimit:{tokenID or hash}:{windowStartUnix}

Without key secret (default):

ratelimit:sk-abc123:1701388800

With key secret (recommended for production):

ratelimit:a1b2c3d4e5f67890:1701388800

When DISTRIBUTED_RATE_LIMIT_KEY_SECRET is configured, token IDs are hashed using HMAC-SHA256, and only the first 16 hex characters of the hash are used. This prevents raw token IDs from being exposed in Redis keys while maintaining deterministic key generation.
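
The hashing described above can be reproduced with the Go standard library. The sketch below assumes the secret and token ID are used as raw bytes, which the text does not specify; `keyID` and `redisKey` are hypothetical helper names.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// keyID returns the identifier to embed in the Redis key: the raw token ID
// when no secret is configured, otherwise the first 16 hex characters of
// HMAC-SHA256(secret, tokenID).
func keyID(secret, tokenID string) string {
	if secret == "" {
		return tokenID
	}
	mac := hmac.New(sha256.New, []byte(secret))
	mac.Write([]byte(tokenID))
	return hex.EncodeToString(mac.Sum(nil))[:16]
}

// redisKey assembles {prefix}{tokenID or hash}:{windowStartUnix}.
func redisKey(prefix, secret, tokenID string, windowStart int64) string {
	return fmt.Sprintf("%s%s:%d", prefix, keyID(secret, tokenID), windowStart)
}

func main() {
	// Without a secret the raw token ID appears in the key.
	fmt.Println(redisKey("ratelimit:", "", "sk-abc123", 1701388800))
	// With a secret only a 16-character hash prefix appears.
	fmt.Println(redisKey("ratelimit:", "example-secret", "sk-abc123", 1701388800))
}
```

Sixteen hex characters carry 64 bits of the hash, so two different tokens mapping to the same key is unlikely but not impossible.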

TTL Management

Keys automatically expire after the window duration plus one second to handle edge cases at window boundaries. For example, with a 30s window each key is given a TTL of 31 seconds.

Fallback Behavior

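When Redis is unavailable and fallback is enabled, limits are enforced per instance again until Redis recovers:

The rate limiter detects the Redis failure
It switches to an in-memory token bucket algorithm
The bucket uses the configured fallback rate and capacity (FallbackRate and FallbackCapacity in RedisRateLimiterConfig)
It continues to check Redis availability periodically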
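
For readers unfamiliar with the fallback algorithm, here is a minimal token bucket sketch. It assumes the rate is expressed in tokens per second and that the bucket starts full; neither is stated in this document, and `tokenBucket`, `newTokenBucket`, and `allowAt` are hypothetical names. The clock is passed in explicitly so the behavior is deterministic.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// tokenBucket refills at rate tokens per second, up to capacity.
type tokenBucket struct {
	mu       sync.Mutex
	rate     float64
	capacity float64
	tokens   float64
	last     time.Time
}

// newTokenBucket returns a bucket that starts full at time now.
func newTokenBucket(rate, capacity float64, now time.Time) *tokenBucket {
	return &tokenBucket{rate: rate, capacity: capacity, tokens: capacity, last: now}
}

// allowAt refills the bucket for the time elapsed since the last call,
// then spends one token if at least one is available.
func (b *tokenBucket) allowAt(now time.Time) bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	if elapsed := now.Sub(b.last).Seconds(); elapsed > 0 {
		b.tokens += elapsed * b.rate
		if b.tokens > b.capacity {
			b.tokens = b.capacity
		}
		b.last = now
	}
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// t0 is an arbitrary start time for the example.
var t0 = time.Unix(1701388800, 0)

func main() {
	// Values mirror FallbackRate: 1.0 and FallbackCapacity: 10.
	b := newTokenBucket(1.0, 10, t0)
	granted := 0
	for i := 0; i < 12; i++ {
		if b.allowAt(t0) {
			granted++
		}
	}
	fmt.Println("granted in initial burst:", granted)
	fmt.Println("allowed after 5s:", b.allowAt(t0.Add(5*time.Second)))
}
```

Under these assumptions the capacity bounds the burst a single instance will accept, and the rate bounds its sustained throughput while Redis is down.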

When fallback is disabled:

The limiter returns the ErrRedisUnavailable error
The calling code decides whether to block the request (fail closed) or allow it (fail open)
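
One way for calling code to make that decision explicit is sketched below. The sentinel `errRedisUnavailable` is a local stand-in for the package's ErrRedisUnavailable, and `failurePolicy` and `decide` are hypothetical names introduced only for this illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// errRedisUnavailable stands in for the package's ErrRedisUnavailable.
var errRedisUnavailable = errors.New("redis unavailable")

// wrapped shows that errors.Is also matches a wrapped sentinel.
var wrapped = fmt.Errorf("allow: %w", errRedisUnavailable)

// failurePolicy says what to do when the limiter cannot reach Redis.
type failurePolicy int

const (
	failClosed failurePolicy = iota // block requests
	failOpen                        // let requests through
)

// decide turns the limiter's (allowed, err) result into a final decision.
func decide(allowed bool, err error, policy failurePolicy) bool {
	switch {
	case err == nil:
		return allowed
	case errors.Is(err, errRedisUnavailable):
		return policy == failOpen
	default:
		return false // unknown errors are treated as a denial in this sketch
	}
}

func main() {
	fmt.Println(decide(false, wrapped, failOpen))   // true
	fmt.Println(decide(false, wrapped, failClosed)) // false
	fmt.Println(decide(true, nil, failClosed))      // true
}
```

Failing open favors availability at the cost of unenforced limits during an outage; failing closed does the opposite.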

API Reference

RedisRateLimiter

// Create a new distributed rate limiter
config := token.RedisRateLimiterConfig{
    KeyPrefix:             "ratelimit:",
    DefaultWindowDuration: time.Minute,
    DefaultMaxRequests:    60,
    EnableFallback:        true,
    FallbackRate:          1.0,
    FallbackCapacity:      10,
}
limiter := token.NewRedisRateLimiter(redisClient, config)

// Check if request is allowed
allowed, err := limiter.Allow(ctx, tokenID)

// Get remaining requests
remaining, err := limiter.GetRemainingRequests(ctx, tokenID)

// Set custom limit for a token
limiter.SetTokenLimit(tokenID, 100, time.Minute)

// Reset token usage
err = limiter.ResetTokenUsage(ctx, tokenID)

// Check Redis health
err = limiter.CheckRedisHealth(ctx)
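
As a usage illustration, the Allow call can sit in front of an HTTP handler. The sketch below depends only on the Allow signature shown above, captured in a hypothetical `allower` interface. The `X-Token-ID` header, the `fixedLimiter` fake, and the helper functions are assumptions made so the example runs on its own; how the proxy actually derives the token ID is not covered here.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// allower captures the Allow method shown in the API reference.
type allower interface {
	Allow(ctx context.Context, tokenID string) (bool, error)
}

// fixedLimiter is a fake that allows only the listed token IDs.
type fixedLimiter struct{ allowed map[string]bool }

func (f fixedLimiter) Allow(_ context.Context, tokenID string) (bool, error) {
	return f.allowed[tokenID], nil
}

// rateLimit rejects requests with 429 when the limiter says no, and with
// 503 when the limiter itself fails.
func rateLimit(l allower, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		tokenID := r.Header.Get("X-Token-ID") // hypothetical header
		ok, err := l.Allow(r.Context(), tokenID)
		if err != nil {
			http.Error(w, "rate limiter unavailable", http.StatusServiceUnavailable)
			return
		}
		if !ok {
			http.Error(w, "rate limit exceeded", http.StatusTooManyRequests)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// okHandler is a trivial downstream handler for the example.
func okHandler() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
}

// statusFor sends one request through h and returns the response status.
func statusFor(h http.Handler, tokenID string) int {
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	req.Header.Set("X-Token-ID", tokenID)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	h := rateLimit(fixedLimiter{allowed: map[string]bool{"good": true}}, okHandler())
	fmt.Println(statusFor(h, "good"), statusFor(h, "limited")) // 200 429
}
```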

Monitoring

Key Metrics

Monitor these Redis keys to understand rate limiting behavior:

Count of rate limit keys: SCAN 0 MATCH ratelimit:* COUNT 100, repeated with the cursor returned by each call until the cursor comes back as 0
(Use SCAN for production monitoring; KEYS is blocking and only safe for development/debugging.)
Current count for a token: GET ratelimit:{tokenID or hash}:{windowStart}
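
Counting keys with SCAN means following the cursor through a full cycle, which can be sketched as below. The `scanner` interface is hypothetical: it is shaped like a SCAN call (cursor, pattern, and count in; a batch of keys and the next cursor out) so that a Redis client can be adapted to it. `pagedFake` replaces Redis so the loop can run on its own and ignores the match pattern.

```go
package main

import "fmt"

// scanner is a hypothetical wrapper around one SCAN call.
type scanner interface {
	Scan(cursor uint64, match string, count int64) (keys []string, next uint64, err error)
}

// countKeys follows the cursor until it returns to 0 and counts distinct
// keys. SCAN may return the same key more than once, hence the set.
func countKeys(s scanner, match string) (int, error) {
	seen := make(map[string]struct{})
	var cursor uint64
	for {
		keys, next, err := s.Scan(cursor, match, 100)
		if err != nil {
			return 0, err
		}
		for _, k := range keys {
			seen[k] = struct{}{}
		}
		if next == 0 {
			return len(seen), nil
		}
		cursor = next
	}
}

// pagedFake serves fixed pages of keys; the cursor is the page index.
type pagedFake struct{ pages [][]string }

func (p pagedFake) Scan(cursor uint64, _ string, _ int64) ([]string, uint64, error) {
	i := int(cursor)
	if i >= len(p.pages) {
		return nil, 0, nil
	}
	next := uint64(i + 1)
	if i == len(p.pages)-1 {
		next = 0 // last page: the cursor wraps back to 0
	}
	return p.pages[i], next, nil
}

func main() {
	fake := pagedFake{pages: [][]string{
		{"ratelimit:a:1701388800", "ratelimit:b:1701388800"},
		{"ratelimit:b:1701388800", "ratelimit:c:1701388800"},
	}}
	n, err := countKeys(fake, "ratelimit:*")
	fmt.Println(n, err) // 3 <nil>
}
```

The COUNT argument is only a hint for how much work each call does, so an individual batch may be larger, smaller, or empty while the cursor is still non-zero.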

Health Checks

Use CheckRedisHealth() to verify Redis connectivity:

if err := limiter.CheckRedisHealth(ctx); err != nil {
    log.Warn("Redis unavailable for rate limiting", zap.Error(err))
}

Security Considerations

Token ID Hashing

By default, raw token IDs are stored in Redis keys, which could expose sensitive bearer tokens to anyone with Redis access. To prevent this, configure DISTRIBUTED_RATE_LIMIT_KEY_SECRET:

# Generate a secure secret (recommended: 32+ characters)
export DISTRIBUTED_RATE_LIMIT_KEY_SECRET=$(openssl rand -hex 32)

Important notes:

The secret should be generated once and stored securely
All proxy instances must use the same secret
If the secret is rotated, existing rate limit counters will be orphaned (acceptable since they have TTLs)
The hash is deterministic: the same token always maps to the same key

Redis Security

In addition to token ID hashing:

Use Redis authentication (requirepass)
Enable TLS for Redis connections in production
Restrict Redis network access using firewalls
Monitor Redis access logs for suspicious activity

Best Practices

Enable Fallback: Always enable fallback in production for high availability
Monitor Redis: Set up alerts for Redis availability and latency
Tune Window Size: Smaller windows limit the burst that can pass around a window boundary, but keys are created and expired more often
Key Prefix: Use unique prefixes if sharing Redis with other applications
TTL Buffer: The implementation adds 1 second to TTL to handle edge cases
Hash Token IDs: Configure DISTRIBUTED_RATE_LIMIT_KEY_SECRET in production to prevent token exposure

Troubleshooting

Common Issues

Rate limits not enforced globally

Verify DISTRIBUTED_RATE_LIMIT_ENABLED=true
Check Redis connectivity from all instances
Ensure all instances use the same Redis server, the same DISTRIBUTED_RATE_LIMIT_PREFIX, and the same DISTRIBUTED_RATE_LIMIT_KEY_SECRET

Fallback always active

Check Redis connection string
Verify Redis is running and accessible
Check network connectivity

Keys not expiring

Expiry is set automatically by the rate limiter on the first increment of each key
Inspect a key with the Redis TTL command: a result of -1 means the key exists but has no expiry set

Migration Guide

From Per-Instance Rate Limiting

Set up Redis (if not already available)
Configure environment variables
Enable distributed rate limiting: DISTRIBUTED_RATE_LIMIT_ENABLED=true
Monitor for any issues
Adjust window and max settings as needed

Rollback

To disable distributed rate limiting:

export DISTRIBUTED_RATE_LIMIT_ENABLED=false

The system will automatically use per-instance rate limiting.