Distributed Rate Limiting
This document describes the distributed rate limiting feature, which enables rate limits to be enforced globally across all proxy instances using Redis.
Overview
In multi-instance deployments, each proxy instance normally maintains its own rate limit counters, so the total number of admitted requests can reach up to N times the intended limit (where N is the number of instances). Distributed rate limiting solves this by using Redis as a shared counter store.
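The overshoot can be illustrated with a small simulation. This is a sketch only: the `counter` type and both helper functions are hypothetical and exist purely to show the arithmetic, with a plain in-process counter standing in for Redis.

```go
package main

import "fmt"

// counter is a minimal fixed-limit counter used only for this illustration.
type counter struct {
	limit int
	count int
}

// allow admits a request while the counter is below its limit.
func (c *counter) allow() bool {
	if c.count >= c.limit {
		return false
	}
	c.count++
	return true
}

// admittedPerInstance spreads requests round-robin over proxies that each
// enforce the limit with a private counter.
func admittedPerInstance(instances, limit, requests int) int {
	local := make([]counter, instances)
	for i := range local {
		local[i].limit = limit
	}
	admitted := 0
	for r := 0; r < requests; r++ {
		if local[r%instances].allow() {
			admitted++
		}
	}
	return admitted
}

// admittedShared sends the same traffic through proxies that all consult
// one shared counter (the role Redis plays in this document).
func admittedShared(limit, requests int) int {
	shared := counter{limit: limit}
	admitted := 0
	for r := 0; r < requests; r++ {
		if shared.allow() {
			admitted++
		}
	}
	return admitted
}

func main() {
	fmt.Println(admittedPerInstance(3, 60, 1000)) // 180: three times the intended limit
	fmt.Println(admittedShared(60, 1000))         // 60
}
```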
Features
- Global Rate Limits: Rate limits are enforced across all proxy instances
- Sliding Window Algorithm: Accurate rate limiting using time-based windows
- Atomic Operations: Uses Redis INCR so concurrent counter updates from multiple instances cannot race
- Graceful Fallback: Falls back to in-memory rate limiting when Redis is unavailable
- Per-Token Configuration: Supports custom rate limits for specific tokens
- Configurable: All settings can be customized via environment variables
Architecture
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│    Proxy #1     │     │    Proxy #2     │     │    Proxy #3     │
│ ┌─────────────┐ │     │ ┌─────────────┐ │     │ ┌─────────────┐ │
│ │ Rate Limiter│ │     │ │ Rate Limiter│ │     │ │ Rate Limiter│ │
│ └──────┬──────┘ │     │ └──────┬──────┘ │     │ └──────┬──────┘ │
└────────┼────────┘     └────────┼────────┘     └────────┼────────┘
         │                       │                       │
         └───────────────────────┼───────────────────────┘
                                 │
                          ┌──────▼──────┐
                          │    Redis    │
                          │ (Counters)  │
                          └─────────────┘
Configuration
Environment Variables
| Variable | Default | Description |
|---|---|---|
| DISTRIBUTED_RATE_LIMIT_ENABLED | false | Enable Redis-backed distributed rate limiting |
| DISTRIBUTED_RATE_LIMIT_PREFIX | ratelimit: | Redis key prefix for rate limit counters |
| DISTRIBUTED_RATE_LIMIT_KEY_SECRET | (empty) | HMAC secret for hashing token IDs in Redis keys (recommended for production) |
| DISTRIBUTED_RATE_LIMIT_WINDOW | 1m | Sliding window duration (e.g., 30s, 1m, 5m) |
| DISTRIBUTED_RATE_LIMIT_MAX | 60 | Maximum requests per window |
| DISTRIBUTED_RATE_LIMIT_FALLBACK | true | Enable fallback to in-memory when Redis unavailable |
| REDIS_ADDR | localhost:6379 | Redis server address |
| REDIS_DB | 0 | Redis database number |
Example Configuration
# Enable distributed rate limiting
export DISTRIBUTED_RATE_LIMIT_ENABLED=true
export REDIS_ADDR=redis:6379
# Security: Hash token IDs in Redis keys (recommended for production)
export DISTRIBUTED_RATE_LIMIT_KEY_SECRET=your-secret-key-here
# Custom rate limit: 100 requests per 30 seconds
export DISTRIBUTED_RATE_LIMIT_WINDOW=30s
export DISTRIBUTED_RATE_LIMIT_MAX=100
# Enable fallback for high availability
export DISTRIBUTED_RATE_LIMIT_FALLBACK=true
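For readers wiring these variables into their own tooling, the following is a minimal sketch of how they might be parsed with the documented defaults. The `rateLimitSettings` struct and `loadSettings` function are hypothetical names introduced for illustration; the proxy's actual configuration code may look different.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// rateLimitSettings is a hypothetical struct for this sketch; the proxy's
// real configuration type may differ.
type rateLimitSettings struct {
	Enabled   bool
	Prefix    string
	KeySecret string
	Window    time.Duration
	Max       int
	Fallback  bool
	RedisAddr string
	RedisDB   int
}

// loadSettings reads the documented variables through lookup (os.Getenv in
// production) and applies the documented defaults for unset or empty values.
func loadSettings(lookup func(string) string) (rateLimitSettings, error) {
	get := func(name, def string) string {
		if v := lookup(name); v != "" {
			return v
		}
		return def
	}
	s := rateLimitSettings{
		Prefix:    get("DISTRIBUTED_RATE_LIMIT_PREFIX", "ratelimit:"),
		KeySecret: get("DISTRIBUTED_RATE_LIMIT_KEY_SECRET", ""),
		RedisAddr: get("REDIS_ADDR", "localhost:6379"),
	}
	var err error
	if s.Enabled, err = strconv.ParseBool(get("DISTRIBUTED_RATE_LIMIT_ENABLED", "false")); err != nil {
		return s, fmt.Errorf("DISTRIBUTED_RATE_LIMIT_ENABLED: %w", err)
	}
	if s.Fallback, err = strconv.ParseBool(get("DISTRIBUTED_RATE_LIMIT_FALLBACK", "true")); err != nil {
		return s, fmt.Errorf("DISTRIBUTED_RATE_LIMIT_FALLBACK: %w", err)
	}
	if s.Window, err = time.ParseDuration(get("DISTRIBUTED_RATE_LIMIT_WINDOW", "1m")); err != nil {
		return s, fmt.Errorf("DISTRIBUTED_RATE_LIMIT_WINDOW: %w", err)
	}
	if s.Max, err = strconv.Atoi(get("DISTRIBUTED_RATE_LIMIT_MAX", "60")); err != nil {
		return s, fmt.Errorf("DISTRIBUTED_RATE_LIMIT_MAX: %w", err)
	}
	if s.RedisDB, err = strconv.Atoi(get("REDIS_DB", "0")); err != nil {
		return s, fmt.Errorf("REDIS_DB: %w", err)
	}
	if s.Window <= 0 || s.Max <= 0 {
		return s, fmt.Errorf("window and max must be positive, got %v and %d", s.Window, s.Max)
	}
	return s, nil
}

func main() {
	s, err := loadSettings(os.Getenv)
	if err != nil {
		fmt.Println("config error:", err)
		os.Exit(1)
	}
	// The key secret is deliberately left out of the printed summary.
	fmt.Printf("enabled=%t window=%v max=%d fallback=%t redis=%s db=%d\n",
		s.Enabled, s.Window, s.Max, s.Fallback, s.RedisAddr, s.RedisDB)
}
```

Passing the lookup function as a parameter keeps the parser testable without touching the process environment.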
How It Works
Sliding Window Algorithm
The rate limiter uses a sliding window counter algorithm:
- When a request arrives, calculate the current window start time
- Build a Redis key: {prefix}{tokenID}:{windowStart}
- Atomically increment the counter using INCR
- Set TTL on the key (on first increment)
- Check if count exceeds the limit
Key Format
Redis keys follow this format:
ratelimit:{tokenID or hash}:{windowStartUnix}
Without key secret (default):
ratelimit:sk-abc123:1701388800
With key secret (recommended for production):
ratelimit:a1b2c3d4e5f67890:1701388800
When DISTRIBUTED_RATE_LIMIT_KEY_SECRET is configured, token IDs are hashed using HMAC-SHA256, and only the first 16 hex characters of the hash are used. This prevents raw token IDs from being exposed in Redis keys while maintaining deterministic key generation.
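A minimal sketch of this key derivation, using only the Go standard library, is shown below. The function names `keyID` and `buildKey` are hypothetical; the hashing scheme (HMAC-SHA256, first 16 hex characters) is the one described above.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// keyID returns the identifier placed in the Redis key: the raw token ID
// when no secret is configured, otherwise the first 16 hex characters of
// HMAC-SHA256(secret, tokenID).
func keyID(secret, tokenID string) string {
	if secret == "" {
		return tokenID
	}
	mac := hmac.New(sha256.New, []byte(secret))
	mac.Write([]byte(tokenID))
	return hex.EncodeToString(mac.Sum(nil))[:16]
}

// buildKey assembles {prefix}{tokenID or hash}:{windowStartUnix}.
func buildKey(prefix, secret, tokenID string, windowStart int64) string {
	return fmt.Sprintf("%s%s:%d", prefix, keyID(secret, tokenID), windowStart)
}

func main() {
	// Without a secret the raw token ID appears in the key.
	fmt.Println(buildKey("ratelimit:", "", "sk-abc123", 1701388800))
	// With a secret only a 16-character hash appears.
	fmt.Println(buildKey("ratelimit:", "example-secret", "sk-abc123", 1701388800))
}
```

Sixteen hex characters carry 64 bits of the hash, so distinct tokens are very unlikely to share a counter, while the key stays short.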
TTL Management
Keys automatically expire after the window duration plus one second to handle edge cases at window boundaries.
Fallback Behavior
When Redis is unavailable and fallback is enabled:
- Rate limiter detects Redis failure
- Switches to in-memory token bucket algorithm
- Uses configured fallback rate and capacity
- Continues to check Redis availability periodically
When fallback is disabled:
- Returns ErrRedisUnavailable error
- Requests may be blocked or allowed based on application handling
API Reference
RedisRateLimiter
// Create a new distributed rate limiter
config := token.RedisRateLimiterConfig{
	KeyPrefix:             "ratelimit:",
	DefaultWindowDuration: time.Minute,
	DefaultMaxRequests:    60,
	EnableFallback:        true,
	FallbackRate:          1.0,
	FallbackCapacity:      10,
}
limiter := token.NewRedisRateLimiter(redisClient, config)

// Check if request is allowed
allowed, err := limiter.Allow(ctx, tokenID)

// Get remaining requests
remaining, err := limiter.GetRemainingRequests(ctx, tokenID)

// Set custom limit for a token
limiter.SetTokenLimit(tokenID, 100, time.Minute)

// Reset token usage (err is already declared above, so assign with =)
err = limiter.ResetTokenUsage(ctx, tokenID)

// Check Redis health
err = limiter.CheckRedisHealth(ctx)
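One way to apply Allow in a request path is an HTTP middleware. The sketch below is an assumption-laden illustration: the `allower` interface captures only the Allow signature shown above, while `rateLimit`, `fakeLimiter`, the `failOpen` flag, and the `X-Token-ID` header are hypothetical names introduced here. It also shows the two ways an application might handle a limiter error (block or allow).

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// allower captures only the Allow method, so the sketch does not depend on
// the real token package.
type allower interface {
	Allow(ctx context.Context, tokenID string) (bool, error)
}

// rateLimit wraps next with a rate limit check. failOpen controls what
// happens when Allow returns an error: true lets the request through,
// false answers 503.
func rateLimit(l allower, failOpen bool, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		tokenID := r.Header.Get("X-Token-ID") // hypothetical header name
		allowed, err := l.Allow(r.Context(), tokenID)
		switch {
		case err != nil && !failOpen:
			http.Error(w, "rate limiter unavailable", http.StatusServiceUnavailable)
		case err == nil && !allowed:
			http.Error(w, "rate limit exceeded", http.StatusTooManyRequests)
		default:
			next.ServeHTTP(w, r)
		}
	})
}

// fakeLimiter returns a fixed answer; it replaces the Redis-backed limiter
// so the example runs on its own.
type fakeLimiter struct {
	allowed bool
	err     error
}

func (f fakeLimiter) Allow(context.Context, string) (bool, error) {
	return f.allowed, f.err
}

var errDown = errors.New("redis down")

// status sends one request through the middleware and returns the HTTP
// status code of the response.
func status(l allower, failOpen bool) int {
	ok := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	rateLimit(l, failOpen, ok).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(status(fakeLimiter{allowed: true}, false))  // 200
	fmt.Println(status(fakeLimiter{allowed: false}, false)) // 429
	fmt.Println(status(fakeLimiter{err: errDown}, false))   // 503
	fmt.Println(status(fakeLimiter{err: errDown}, true))    // 200
}
```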
Monitoring
Key Metrics
Monitor these Redis keys to understand rate limiting behavior:
- Count of rate limit keys: SCAN 0 MATCH ratelimit:* COUNT 100, repeated with the returned cursor until the cursor is 0 (use SCAN for production monitoring; KEYS is blocking and only safe for development/debugging)
- Current count for a token: GET ratelimit:{tokenID}:{windowStart}
Health Checks
Use CheckRedisHealth() to verify Redis connectivity:
if err := limiter.CheckRedisHealth(ctx); err != nil {
	log.Warn("Redis unavailable for rate limiting", zap.Error(err))
}
Security Considerations
Token ID Hashing
By default, raw token IDs are stored in Redis keys, which could expose sensitive bearer tokens to anyone with Redis access. To prevent this, configure DISTRIBUTED_RATE_LIMIT_KEY_SECRET:
# Generate a secure secret (recommended: 32+ characters)
export DISTRIBUTED_RATE_LIMIT_KEY_SECRET=$(openssl rand -hex 32)
Important notes:
- The secret should be generated once and stored securely
- All proxy instances must use the same secret
- If the secret is rotated, existing rate limit counters will be orphaned (acceptable since they have TTLs)
- The hash is deterministic: the same token always maps to the same key
Redis Security
In addition to token ID hashing:
- Use Redis authentication (requirepass)
- Enable TLS for Redis connections in production
- Restrict Redis network access using firewalls
- Monitor Redis access logs for suspicious activity
Best Practices
- Enable Fallback: Always enable fallback in production for high availability
- Monitor Redis: Set up alerts for Redis availability and latency
- Tune Window Size: Smaller windows provide more accurate limiting but increase Redis operations
- Key Prefix: Use unique prefixes if sharing Redis with other applications
- TTL Buffer: The implementation adds 1 second to TTL to handle edge cases
- Hash Token IDs: Configure DISTRIBUTED_RATE_LIMIT_KEY_SECRET in production to prevent token exposure
Troubleshooting
Common Issues
Rate limits not enforced globally
- Verify DISTRIBUTED_RATE_LIMIT_ENABLED=true
- Check Redis connectivity from all instances
- Ensure all instances use the same Redis server
Fallback always active
- Check Redis connection string
- Verify Redis is running and accessible
- Check network connectivity
Keys not expiring
- Expiry is handled automatically: the TTL is set on the first increment of each window
- Verify that a TTL is set with TTL ratelimit:{tokenID}:{windowStart}; a reply of -1 means the key exists without an expiry
Migration Guide
From Per-Instance Rate Limiting
- Set up Redis (if not already available)
- Configure environment variables
- Enable distributed rate limiting: DISTRIBUTED_RATE_LIMIT_ENABLED=true
- Monitor for any issues
- Adjust window and max settings as needed
Rollback
To disable distributed rate limiting:
export DISTRIBUTED_RATE_LIMIT_ENABLED=false
The system will automatically use per-instance rate limiting.