Transparent Proxy Design Decisions
This document explains the key design decisions for the transparent LLM proxy architecture.
Key Design Principles
- Maximum Transparency
  - The proxy modifies only what’s absolutely necessary (the Authorization header)
  - All other request/response data passes through unchanged
  - No API-specific client/SDK dependencies
- Universal Compatibility
  - Works with any API, not just OpenAI
  - Configurable target URL and endpoint allowlist
  - Generic handling of all HTTP methods and content types
- High Performance
  - Optimized for minimal latency overhead
  - Connection pooling for HTTP clients
  - Efficient buffer management
  - Stream processing optimized for SSE
- Strong Security
  - Token validation and expiration
  - Rate limiting
  - Allowlist approach for permitted endpoints
  - Protection of API keys
Architecture Design
httputil.ReverseProxy as Foundation
We chose Go’s built-in httputil.ReverseProxy as the foundation for our proxy for several reasons:
- Production-Tested Code: Part of Go’s standard library and widely used in production
- Full HTTP Support: Handles all HTTP methods, headers, status codes
- Streaming Support: Properly handles chunked transfer encoding and streaming responses
- Customization Points: The Director, ModifyResponse, and ErrorHandler functions provide optimal customization points
Middleware-Based Processing
Rather than building an API client, we implemented a middleware chain approach:
- Separation of Concerns: Each middleware handles a specific aspect (logging, validation, etc.)
- Testability: Each middleware can be tested independently
- Flexibility: Middlewares can be added/removed based on configuration
- Consistent Error Handling: Common error format across all processing stages
Design Patterns Used
- Chain of Responsibility: Middleware chain processes requests in sequence
- Decorator: Each middleware decorates the HTTP handler with additional functionality
- Strategy: Different validation and authentication strategies can be plugged in
- Adapter: Converts external token validation to HTTP middleware
Key Components
TransparentProxy
The core proxy component that orchestrates request/response handling:
- Configures and initializes the httputil.ReverseProxy
- Registers the middleware chain
- Handles token validation and API key substitution
- Manages connection pooling and timeout settings
Middleware Stack
Middleware functions that process requests before they reach the proxy:
- LoggingMiddleware: Logs request details with timing information
- ValidateRequestMiddleware: Ensures requests target allowed endpoints with allowed methods
- TimeoutMiddleware: Adds a context timeout to limit request duration
- MetricsMiddleware: Collects performance metrics on requests
Request Processing Flow
1. Middleware Chain: Request passes through middleware for logging, validation, etc.
2. Director Function: Updates request URL, performs token validation, replaces authorization header
3. HTTP Transport: Forwards request to target API
4. ModifyResponse Function: Processes response (adds headers, extracts metadata)
5. Response Return: Returns response to client
Streaming Support
Special consideration was given to supporting Server-Sent Events (SSE) correctly:
- FlushInterval Configuration: Short interval ensures chunks are sent promptly
- No Buffering: Streaming responses bypass in-memory buffers
- Content-Type Detection: Automatically detects text/event-stream
- Transfer-Encoding Support: Properly handles chunked transfer encoding
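A small sketch, under assumed setup, of how a ReverseProxy forwards an SSE stream byte-for-byte. The `sseRoundTrip` helper and the event payloads are illustrative; note that a negative FlushInterval flushes after every write, and recent Go versions also flush text/event-stream responses immediately on their own.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
)

// sseRoundTrip proxies a two-event SSE stream through a ReverseProxy and
// returns the content type and body seen by the client.
func sseRoundTrip() (contentType, body string) {
	// Backend that emits two SSE events with the standard content type.
	backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "text/event-stream")
		f := w.(http.Flusher)
		fmt.Fprint(w, "data: one\n\n")
		f.Flush()
		fmt.Fprint(w, "data: two\n\n")
		f.Flush()
	}))
	defer backend.Close()

	target, _ := url.Parse(backend.URL)
	proxy := httputil.NewSingleHostReverseProxy(target)
	// Negative FlushInterval: flush every chunk to the client immediately.
	proxy.FlushInterval = -1

	front := httptest.NewServer(proxy)
	defer front.Close()

	resp, err := http.Get(front.URL)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	b, _ := io.ReadAll(resp.Body)
	return resp.Header.Get("Content-Type"), string(b)
}

func main() {
	ct, body := sseRoundTrip()
	fmt.Println(ct)
	fmt.Printf("%q\n", body)
}
```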
Error Handling
Centralized error handling with consistent response format:
- Validation Errors: Token validation failures return appropriate status codes and error messages
- Proxy Errors: Network issues, timeouts, etc. are mapped to appropriate HTTP status codes
- JSON Error Format: Consistent error response structure with error code and description
Why Not Use an API Client?
We explicitly chose not to build an API client library for several reasons:
- Transparency: A client library would need to understand API-specific details and formats
- Flexibility: Supporting all possible API parameters would be difficult and maintenance-heavy
- Performance: Direct proxying avoids extra parsing/serialization overhead
- Future-Proofing: APIs evolve; a transparent proxy automatically supports new endpoints/parameters
- Streaming Support: Direct proxy streaming is more efficient than client library handling
Configuration Flexibility
The proxy is designed to be highly configurable:
- Target API: Change base URL in configuration to proxy to any API
- Allowed Endpoints/Methods: Configure security restrictions
- Timeout Settings: Control request timeouts, response header timeouts, and streaming flush intervals
- Connection Pooling: Fine-tune connection settings for optimal performance
Testing Strategy
The proxy is thoroughly tested with several types of tests:
- Unit Tests: For individual components and middlewares
- Integration Tests: Testing the full request/response flow
- Streaming Tests: Special tests for SSE handling
- Performance Tests: Benchmarks for latency and throughput
- Error Handling Tests: Tests for various error conditions
Conclusion
Our transparent proxy design prioritizes maximum transparency, minimal overhead, and generic applicability. By building directly on the httputil.ReverseProxy with a middleware chain, we achieve high performance, strong security, and API flexibility.