Instrumentation Middleware: Usage & Extension Guide
Overview
The async instrumentation middleware provides non-blocking, streaming-capable instrumentation for all API calls handled by the LLM Proxy. It captures request/response metadata and emits events to a pluggable event bus for downstream processing (e.g., file, cloud, analytics).
Note: Instrumentation middleware and audit logging serve different purposes:
- Instrumentation: Captures API request/response metadata for observability and analytics
- Audit Logging: Records security-sensitive operations for compliance and investigations (see Audit Events)
Both systems operate independently and can be configured separately.
Event Bus & Dispatcher Architecture
- The async event bus is now always enabled and handles all API instrumentation events.
- The event bus supports multiple subscribers (fan-out), batching, retry logic, and graceful shutdown.
- Both in-memory and Redis backends are available for local and distributed event delivery.
- Persistent event logging is handled by a dispatcher CLI or the --file-event-log flag on the server, which writes events to a JSONL file.
- Middleware captures and restores the request body for all events, and events carry richer context for diagnostics and debugging.
Relationship to Audit Logging
The instrumentation event bus is separate from the audit logging system:
- Event Bus: Captures API request/response data for observability (instrumentation middleware)
- Audit Logger: Records security events directly to file/database (audit middleware)
Both systems can run simultaneously:
- Instrumentation events flow through the event bus to dispatchers
- Audit events are written directly to audit logs (file and/or database)
- No overlap in captured data - instrumentation focuses on API performance, audit focuses on security events
Audit Events
The proxy emits audit events for security-sensitive operations:
Proxy Request Audit Events
- Project Inactive (403): When a request is denied due to inactive project status
  - Action: proxy_request, Result: denied, Reason: project_inactive
  - Includes: project ID, token ID, client IP, user agent, HTTP method, endpoint
- Service Unavailable (503): When the project status check fails due to database errors
  - Action: proxy_request, Result: error, Reason: service_unavailable
  - Includes: error details, project ID, request metadata
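For illustration only, an audit record carrying the fields listed above could be modeled roughly as follows. The struct and field names are assumptions for the sake of the example, not the proxy's actual audit schema (see the audit middleware for that):

// Illustrative shape of a proxy_request audit record; field names are
// assumptions based on the fields listed above, not the proxy's real schema.
package audit

import "time"

type ProxyRequestAuditEvent struct {
	Action    string    `json:"action"`     // e.g. "proxy_request"
	Result    string    `json:"result"`     // "denied" or "error"
	Reason    string    `json:"reason"`     // e.g. "project_inactive", "service_unavailable"
	ProjectID string    `json:"project_id"`
	TokenID   string    `json:"token_id"`
	ClientIP  string    `json:"client_ip"`
	UserAgent string    `json:"user_agent"`
	Method    string    `json:"method"`
	Endpoint  string    `json:"endpoint"`
	Timestamp time.Time `json:"timestamp"` // illustrative addition
}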
Management API Audit Events
- Project Lifecycle: Create, update (including is_active changes), and delete operations
- Token Management: Create, update, and revoke (single and batch operations)
- All events include actor identification, request IDs, and operation metadata
Audit events are stored in the database and written to audit log files for compliance and security investigations.
For complete system observability, both should be enabled in production environments.
Persistent Event Logging
- To persist all events to a file, use the --file-event-log flag when running the server:
llm-proxy server --file-event-log ./data/events.jsonl
- Alternatively, use the standalone dispatcher CLI to subscribe to the event bus and write events to a file or other backends:
llm-proxy dispatcher --service file --endpoint ./data/events.jsonl
Configuration Reference
- OBSERVABILITY_ENABLED: Deprecated; the async event bus is always enabled.
- OBSERVABILITY_BUFFER_SIZE (int): Buffer size for the event bus (default: 1000)
- OBSERVABILITY_MAX_REQUEST_BODY_BYTES (int64): Max bytes of request body captured into observability events (default: 65536). Does not affect proxying.
- OBSERVABILITY_MAX_RESPONSE_BODY_BYTES (int64): Max bytes of response body captured into observability events (default: 262144). Does not affect proxying.
- FILE_EVENT_LOG: Path to the persistent event log file (enables file event logging via the dispatcher)
Hot-Path Performance Tuning (Non-Observability)
These settings primarily affect hot-path performance characteristics rather than core observability semantics:
- LLM_PROXY_API_KEY_CACHE_TTL (duration): TTL for the per-project upstream API key cache (default: 30s)
- LLM_PROXY_API_KEY_CACHE_MAX (int): Max entries in the per-project upstream API key cache (default: 10000)
- OBSERVABILITY_MAX_RESPONSE_BODY_BYTES (int64): Caps bytes captured from response bodies for observability events (also bounds OpenAI metadata extraction). Default: 262144.
How It Works
- The middleware wraps all proxy requests and responses.
- Captures request ID, method, path, status, duration, headers, and full (streamed) response body.
- Emits an event to the async event bus (in-memory or Redis).
- Event delivery is fully async, non-blocking, batched, and resilient to failures.
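To make the flow concrete, here is a minimal sketch of such a middleware. The Event and Bus types, the X-Request-ID header, and the wiring are illustrative stand-ins rather than the proxy's actual implementation (see internal/middleware/instrumentation.go for that):

// Minimal sketch only: Event and Bus are simplified stand-ins for the
// eventbus types; the real middleware lives in internal/middleware.
package instrumentation

import (
	"bytes"
	"context"
	"net/http"
	"time"
)

type Event struct {
	RequestID string
	Method    string
	Path      string
	Status    int
	Duration  time.Duration
	Body      []byte
}

type Bus interface {
	Publish(ctx context.Context, evt Event)
}

// capturingWriter streams the response to the client while keeping a copy.
type capturingWriter struct {
	http.ResponseWriter
	status int
	body   bytes.Buffer
}

func (w *capturingWriter) WriteHeader(code int) {
	w.status = code
	w.ResponseWriter.WriteHeader(code)
}

func (w *capturingWriter) Write(p []byte) (int, error) {
	w.body.Write(p) // capture a copy (the real middleware caps this per OBSERVABILITY_MAX_RESPONSE_BODY_BYTES)
	return w.ResponseWriter.Write(p)
}

// Instrument wraps a handler and emits one event per request without blocking it.
func Instrument(bus Bus, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		start := time.Now()
		cw := &capturingWriter{ResponseWriter: w, status: http.StatusOK}
		next.ServeHTTP(cw, r)

		go bus.Publish(context.Background(), Event{ // async, off the hot path
			RequestID: r.Header.Get("X-Request-ID"),
			Method:    r.Method,
			Path:      r.URL.Path,
			Status:    cw.status,
			Duration:  time.Since(start),
			Body:      cw.body.Bytes(),
		})
	})
}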
Event Bus Backends
- Redis Streams (redis-streams): Recommended for production. Provides consumer groups, acknowledgment, at-least-once delivery, and crash recovery. See Redis Streams Backend.
- In-Memory (in-memory): Fast and simple, for local/dev use. Single process only. No durability or delivery guarantees.
- Custom: Implement the EventBus interface for other backends (Kafka, HTTP, etc.).
Event Schema Example
// eventbus.Event
type Event struct {
	RequestID       string
	Method          string
	Path            string
	Status          int
	Duration        time.Duration
	ResponseHeaders http.Header
	ResponseBody    []byte
}
Example: Enabling Persistent Logging in Docker
docker run -d \
-e FILE_EVENT_LOG=./data/events.jsonl \
...
Extending the Middleware
- Custom Event Schema: Extend eventbus.Event or create your own struct. Update the middleware to emit your custom event type.
- New Event Bus Backends: Implement the EventBus interface (see internal/eventbus/eventbus.go). Plug in your backend (e.g., Redis, Kafka, HTTP).
- New Consumers/Dispatchers: Write a dispatcher that subscribes to the event bus and delivers events to your backend (file, cloud, analytics, etc.); a minimal consumer sketch follows the example below.
Example: Custom EventBus Backend
// MyEventBus plugs a custom transport (Kafka, HTTP, etc.) into the proxy.
type MyEventBus struct { /* ... */ }
// Publish sends one event to the transport; it should not block the hot path.
func (b *MyEventBus) Publish(ctx context.Context, evt eventbus.Event) { /* ... */ }
// Subscribe returns a channel of events for consumers (dispatchers) to range over.
func (b *MyEventBus) Subscribe() <-chan eventbus.Event { /* ... */ }
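To pair with a custom backend, a dispatcher is essentially a loop over the subscription channel. A minimal sketch, assuming a Subscribe-style channel like the one above and a JSONL file target (batching, retries, and graceful shutdown are omitted; the function is generic so it works with any event type):

// Minimal dispatcher sketch: drains an event channel and appends JSON lines.
// Real dispatchers add batching, retries, and graceful shutdown.
package dispatcher

import (
	"encoding/json"
	"os"
)

// RunFileDispatcher works with any event type, e.g. eventbus.Event.
func RunFileDispatcher[E any](events <-chan E, path string) error {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
	if err != nil {
		return err
	}
	defer f.Close()

	enc := json.NewEncoder(f) // one JSON object per line (JSONL)
	for evt := range events { // loop ends when the bus closes the channel
		if err := enc.Encode(evt); err != nil {
			return err
		}
	}
	return nil
}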
Dispatcher CLI Commands
The LLM Proxy includes a pluggable dispatcher system for sending observability events to external services. The dispatcher supports multiple backends and can run as a separate service.
Available Backends
- file: Write events to JSONL file
- lunary: Send events to Lunary.ai platform
- helicone: Send events to Helicone platform
Basic Usage
# File output
llm-proxy dispatcher --service file --endpoint events.jsonl
# Lunary integration
llm-proxy dispatcher --service lunary --api-key $LUNARY_API_KEY
# Helicone integration
llm-proxy dispatcher --service helicone --api-key $HELICONE_API_KEY
# Custom endpoint for Lunary
llm-proxy dispatcher --service lunary --api-key $LUNARY_API_KEY --endpoint https://custom.lunary.ai/v1/runs/ingest
Configuration Options
| Flag | Default | Description |
|---|---|---|
| --service | file | Backend service (file, lunary, helicone) |
| --endpoint | service-specific | API endpoint or file path |
| --api-key | - | API key for external services |
| --buffer | 1000 | Event bus buffer size |
| --batch-size | 100 | Batch size for sending events |
| --detach | false | Run in background (daemon mode) |
Environment Variables
- LLM_PROXY_API_KEY: API key for the selected service
- LLM_PROXY_ENDPOINT: Default endpoint URL
Event Format
The dispatcher transforms internal events into a rich format suitable for external services:
{
"type": "llm",
"event": "start",
"runId": "550e8400-e29b-41d4-a716-446655440000",
"timestamp": "2023-12-01T10:00:00Z",
"input": {"model": "gpt-4", "messages": [...]},
"output": {"choices": [...]},
"metadata": {
"method": "POST",
"path": "/v1/chat/completions",
"status": 200,
"duration_ms": 1234,
"request_id": "req-123"
}
}
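Expressed as a Go type, the payload above maps onto roughly the following struct. The names here are illustrative mirrors of the JSON example, not the proxy's actual types; the authoritative transformation lives in internal/dispatcher/:

// Illustrative mirror of the dispatcher payload shown above.
package dispatcher

import (
	"encoding/json"
	"time"
)

type EventMetadata struct {
	Method     string `json:"method"`
	Path       string `json:"path"`
	Status     int    `json:"status"`
	DurationMS int64  `json:"duration_ms"`
	RequestID  string `json:"request_id"`
}

type DispatcherEvent struct {
	Type      string          `json:"type"`  // e.g. "llm"
	Event     string          `json:"event"` // e.g. "start"
	RunID     string          `json:"runId"`
	Timestamp time.Time       `json:"timestamp"`
	Input     json.RawMessage `json:"input"`  // original request body
	Output    json.RawMessage `json:"output"` // upstream response body
	Metadata  EventMetadata   `json:"metadata"`
}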
Advanced Features
- Automatic Retry: Exponential backoff for failed requests
- Batching: Configurable batch sizes for efficiency
- Graceful Shutdown: SIGINT/SIGTERM handling
- Extensible: Easy to add new backends
Helicone Manual Logger Integration
The Helicone dispatcher plugin transforms LLM Proxy events into Helicone’s Manual Logger format. This enables detailed cost tracking, analytics, and monitoring of custom model endpoints through Helicone.
Payload Mapping Details
Our implementation maps LLM Proxy events to the Helicone Manual Logger format as follows:
{
"providerRequest": {
"url": "/v1/chat/completions",
"json": { "model": "gpt-4", "messages": [...] },
"meta": {
"Helicone-Provider": "openai",
"Helicone-User-Id": "user-123",
"request_id": "req-456",
"provider": "openai"
}
},
"providerResponse": {
"status": 200,
"headers": {},
"json": { "choices": [...], "usage": {...} },
"base64": "..." // for non-JSON responses
},
"timing": {
"startTime": { "seconds": 1640995200, "milliseconds": 0 },
"endTime": { "seconds": 1640995201, "milliseconds": 250 }
}
}
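One detail worth calling out is the timing format, which splits a timestamp into whole seconds plus milliseconds. A small illustrative conversion (the helper and package names are ours; the real mapping lives in the heliconePayloadFromEvent function):

// Converts a time.Time into Helicone's {seconds, milliseconds} timing shape.
// Helper and package names are illustrative.
package helicone

import "time"

type timing struct {
	Seconds      int64 `json:"seconds"`
	Milliseconds int64 `json:"milliseconds"`
}

func toTiming(t time.Time) timing {
	return timing{
		Seconds:      t.Unix(),                                        // whole seconds since the epoch
		Milliseconds: int64(t.Nanosecond()) / int64(time.Millisecond), // milliseconds within the second
	}
}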
Key Features
Provider Detection: Automatically sets Helicone-Provider header to prevent categorization as “CUSTOM” model, enabling proper cost calculation.
Usage Injection: Injects computed token usage into response JSON when available:
{
"usage": {
"prompt_tokens": 10,
"completion_tokens": 25,
"total_tokens": 35
}
}
Request ID Propagation: Preserves request_id from middleware context for correlation.
Non-JSON Response Handling: For binary or non-JSON responses:
- Sets providerResponse.json to an empty object with an explanatory note
- Includes a base64 field for binary data when available
Metadata Enrichment: Forwards relevant metadata fields and user properties to Helicone headers.
Configuration
# Basic usage
llm-proxy dispatcher --service helicone --api-key $HELICONE_API_KEY
# Custom endpoint (e.g., for EU region)
llm-proxy dispatcher --service helicone \
--api-key $HELICONE_API_KEY \
--endpoint https://eu.api.helicone.ai/custom/v1/log
References
- Helicone Manual Logger Documentation
- Implementation: heliconePayloadFromEvent function
- Tests: Payload transformation examples
HTTP Response Caching Integration
The proxy includes HTTP response caching that integrates with the instrumentation and observability system. Caching behavior affects both response headers and event publishing.
Cache Response Headers
When caching is enabled (HTTP_CACHE_ENABLED=true), the proxy adds observability headers to all responses:
- X-PROXY-CACHE: Indicates the cache result
  - hit: Response served from cache
  - miss: Response not in cache, fetched from upstream
- X-PROXY-CACHE-KEY: The cache key used for the request (useful for debugging cache behavior)
- Cache-Status: Standard HTTP cache status header
  - hit: Cache hit, response served from cache
  - miss: Cache miss, response fetched from upstream
  - bypass: Caching bypassed (e.g., due to Cache-Control: no-store)
  - stored: Response was stored in cache after fetch
  - conditional-hit: Conditional request (e.g., If-None-Match) resulted in 304
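A quick way to check these headers programmatically is a plain HTTP client; the snippet below is illustrative, with the base URL, path, and PROXY_TOKEN environment variable standing in for your setup:

// Prints the proxy's cache headers for one request.
// Base URL, path, and the PROXY_TOKEN env var are placeholders.
package main

import (
	"fmt"
	"net/http"
	"os"
)

func main() {
	req, _ := http.NewRequest("GET", "http://localhost:8080/v1/models", nil)
	req.Header.Set("Authorization", "Bearer "+os.Getenv("PROXY_TOKEN"))

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	fmt.Println("X-PROXY-CACHE:    ", resp.Header.Get("X-PROXY-CACHE"))
	fmt.Println("X-PROXY-CACHE-KEY:", resp.Header.Get("X-PROXY-CACHE-KEY"))
	fmt.Println("Cache-Status:     ", resp.Header.Get("Cache-Status"))
}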
Cache Metrics
The proxy keeps lightweight, provider-agnostic counters to assess cache effectiveness:
- cache_hits_total: Number of requests served from cache (including conditional hits)
- cache_misses_total: Number of requests that missed the cache
- cache_bypass_total: Number of requests where caching was bypassed (e.g., no-store)
- cache_store_total: Number of responses stored in cache after an upstream fetch
Notes:
- Counters are in-memory and surfaced via the existing metrics endpoint when enabled.
- No external metrics provider is required; Prometheus export is optional and not a core dependency.
Event Bus Behavior with Caching
The caching system integrates with the instrumentation middleware to optimize performance:
- Cache Hits: Events are not published to the event bus for cache hits (including conditional hits). This prevents duplicate instrumentation data and reduces event bus load.
- Cache Misses and Stores: Events are published normally when responses are fetched from upstream, whether they get cached or not.
This behavior ensures that:
- Each unique API call is instrumented exactly once (when first fetched)
- Cache performance doesn’t impact event bus throughput
- Downstream analytics systems receive clean, non-duplicated data
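In code this is just a guard before the publish call. A schematic sketch, reusing the illustrative Bus and Event types from the middleware sketch earlier in this guide (not the proxy's actual logic):

// Schematic only: events are published only when the response was not served
// from cache, so each upstream fetch is instrumented exactly once.
func maybePublish(bus Bus, evt Event, cacheStatus string) {
	if cacheStatus == "hit" || cacheStatus == "conditional-hit" {
		return // cache hit: no event, avoids duplicate instrumentation data
	}
	go bus.Publish(context.Background(), evt) // miss / stored / bypass: publish as usual
}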
Example Headers
# Cache hit response
HTTP/1.1 200 OK
X-PROXY-CACHE: hit
X-PROXY-CACHE-KEY: llmproxy:cache:proj123:GET:/v1/models:accept-application/json
Cache-Status: hit
Content-Type: application/json
# Cache miss response
HTTP/1.1 200 OK
X-PROXY-CACHE: miss
X-PROXY-CACHE-KEY: llmproxy:cache:proj123:POST:/v1/chat/completions:accept-application/json:body-hash-abc123
Cache-Status: stored
Content-Type: application/json
Debugging Cache Behavior
Use the benchmark tool with --debug flag to inspect cache headers:
llm-proxy benchmark \
--base-url "http://localhost:8080" \
--endpoint "/v1/chat/completions" \
--token "$PROXY_TOKEN" \
--requests 10 --concurrency 1 \
--cache \
--debug \
--json '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"test"}]}'
This will show sample responses with all headers, making it easy to verify cache behavior.
Prometheus Metrics Endpoint
The proxy provides an additional Prometheus-compatible metrics endpoint for monitoring and alerting. This endpoint complements the existing JSON metrics endpoint without replacing it.
Endpoints
- /metrics: Provider-agnostic JSON metrics (default format)
- /metrics/prometheus: Prometheus text exposition format
Both endpoints are enabled when ENABLE_METRICS=true (default).
Available Metrics
The Prometheus endpoint exposes the following metrics:
Application Metrics
| Metric | Type | Description |
|---|---|---|
| llm_proxy_uptime_seconds | gauge | Time since the server started |
| llm_proxy_requests_total | counter | Total number of proxy requests |
| llm_proxy_errors_total | counter | Total number of proxy errors |
| llm_proxy_cache_hits_total | counter | Total number of cache hits |
| llm_proxy_cache_misses_total | counter | Total number of cache misses |
| llm_proxy_cache_bypass_total | counter | Total number of cache bypasses |
| llm_proxy_cache_stores_total | counter | Total number of cache stores |
Go Runtime Metrics
| Metric | Type | Description |
|---|---|---|
| llm_proxy_goroutines | gauge | Number of goroutines currently running |
| llm_proxy_memory_heap_alloc_bytes | gauge | Number of heap bytes allocated and currently in use |
| llm_proxy_memory_heap_sys_bytes | gauge | Number of heap bytes obtained from the OS |
| llm_proxy_memory_heap_idle_bytes | gauge | Number of heap bytes waiting to be used |
| llm_proxy_memory_heap_inuse_bytes | gauge | Number of heap bytes that are in use |
| llm_proxy_memory_heap_released_bytes | gauge | Number of heap bytes released to the OS |
| llm_proxy_memory_total_alloc_bytes | counter | Total number of bytes allocated (cumulative) |
| llm_proxy_memory_sys_bytes | gauge | Total number of bytes obtained from the OS |
| llm_proxy_memory_mallocs_total | counter | Total number of malloc operations |
| llm_proxy_memory_frees_total | counter | Total number of free operations |
| llm_proxy_gc_runs_total | counter | Total number of GC runs |
| llm_proxy_gc_pause_total_seconds | counter | Total GC pause time in seconds |
| llm_proxy_gc_next_bytes | gauge | Target heap size for next GC cycle |
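Values like these can be emitted with nothing beyond the standard library. The handler below is a minimal illustrative sketch of the text exposition format, covering only a few of the metrics above; it is not the proxy's actual handler:

// Minimal Prometheus text-exposition sketch using only the standard library.
// Metric names match the tables above; the handler itself is illustrative.
package metrics

import (
	"fmt"
	"net/http"
	"runtime"
	"time"
)

var startTime = time.Now()

func PrometheusHandler(w http.ResponseWriter, r *http.Request) {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)

	w.Header().Set("Content-Type", "text/plain; version=0.0.4")

	fmt.Fprintf(w, "# HELP llm_proxy_uptime_seconds Time since the server started\n")
	fmt.Fprintf(w, "# TYPE llm_proxy_uptime_seconds gauge\n")
	fmt.Fprintf(w, "llm_proxy_uptime_seconds %.2f\n", time.Since(startTime).Seconds())

	fmt.Fprintf(w, "# HELP llm_proxy_goroutines Number of goroutines currently running\n")
	fmt.Fprintf(w, "# TYPE llm_proxy_goroutines gauge\n")
	fmt.Fprintf(w, "llm_proxy_goroutines %d\n", runtime.NumGoroutine())

	fmt.Fprintf(w, "# HELP llm_proxy_memory_heap_alloc_bytes Number of heap bytes allocated and currently in use\n")
	fmt.Fprintf(w, "# TYPE llm_proxy_memory_heap_alloc_bytes gauge\n")
	fmt.Fprintf(w, "llm_proxy_memory_heap_alloc_bytes %d\n", ms.HeapAlloc)
}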
Example Output
# HELP llm_proxy_uptime_seconds Time since the server started
# TYPE llm_proxy_uptime_seconds gauge
llm_proxy_uptime_seconds 3542.12
# HELP llm_proxy_requests_total Total number of proxy requests
# TYPE llm_proxy_requests_total counter
llm_proxy_requests_total 1523
# HELP llm_proxy_errors_total Total number of proxy errors
# TYPE llm_proxy_errors_total counter
llm_proxy_errors_total 12
# HELP llm_proxy_cache_hits_total Total number of cache hits
# TYPE llm_proxy_cache_hits_total counter
llm_proxy_cache_hits_total 842
# HELP llm_proxy_cache_misses_total Total number of cache misses
# TYPE llm_proxy_cache_misses_total counter
llm_proxy_cache_misses_total 681
# HELP llm_proxy_cache_bypass_total Total number of cache bypasses
# TYPE llm_proxy_cache_bypass_total counter
llm_proxy_cache_bypass_total 0
# HELP llm_proxy_cache_stores_total Total number of cache stores
# TYPE llm_proxy_cache_stores_total counter
llm_proxy_cache_stores_total 681
# HELP llm_proxy_goroutines Number of goroutines currently running
# TYPE llm_proxy_goroutines gauge
llm_proxy_goroutines 12
# HELP llm_proxy_memory_heap_alloc_bytes Number of heap bytes allocated and currently in use
# TYPE llm_proxy_memory_heap_alloc_bytes gauge
llm_proxy_memory_heap_alloc_bytes 2097152
# HELP llm_proxy_memory_total_alloc_bytes Total number of bytes allocated (cumulative)
# TYPE llm_proxy_memory_total_alloc_bytes counter
llm_proxy_memory_total_alloc_bytes 104857600
# HELP llm_proxy_gc_runs_total Total number of GC runs
# TYPE llm_proxy_gc_runs_total counter
llm_proxy_gc_runs_total 42
Prometheus Scrape Configuration
Add the following to your Prometheus configuration:
scrape_configs:
- job_name: 'llm-proxy'
static_configs:
- targets: ['localhost:8080']
metrics_path: '/metrics/prometheus'
scrape_interval: 15s
Example Queries
# Request rate (per second)
rate(llm_proxy_requests_total[5m])
# Error rate
rate(llm_proxy_errors_total[5m]) / rate(llm_proxy_requests_total[5m])
# Cache hit ratio
llm_proxy_cache_hits_total / (llm_proxy_cache_hits_total + llm_proxy_cache_misses_total)
# Total uptime in hours
llm_proxy_uptime_seconds / 3600
# Memory usage trend
rate(llm_proxy_memory_total_alloc_bytes[5m])
# Heap allocation
llm_proxy_memory_heap_alloc_bytes
# GC frequency
rate(llm_proxy_gc_runs_total[5m])
# Active goroutines
llm_proxy_goroutines
Testing
# Check Prometheus metrics
curl http://localhost:8080/metrics/prometheus
# Compare with JSON format
curl http://localhost:8080/metrics | jq .
Grafana Dashboard
A ready-to-import Grafana dashboard is available for visualizing LLM Proxy metrics:
- Dashboard JSON: deploy/helm/llm-proxy/dashboards/llm-proxy.json
- Documentation: See the dashboards README for import instructions
The dashboard includes panels for:
- Request rate, error rate, and uptime
- Cache performance (hits, misses, bypass, stores)
- Memory usage and Go runtime metrics
- Garbage collection statistics
Import the dashboard into Grafana and configure it to use your Prometheus datasource.
Notes
- The Prometheus endpoint is lightweight and has no external dependencies
- Metrics are in-memory and reset on server restart
- Both JSON and Prometheus endpoints can be used simultaneously
- No secrets are exposed in metrics output
Important: In-Memory vs. Redis Event Bus
- The in-memory event bus only works within a single process. If you run the proxy and dispatcher as separate processes or containers, they will not share events.
- For distributed, multi-process, or containerized setups, Redis is required as the event bus backend.
Local Redis Setup for Manual Testing
Add the following to your docker-compose.yml to run Redis locally:
redis:
image: redis:7
container_name: llm-proxy-redis
ports:
- "6379:6379"
restart: unless-stopped
Configure both the proxy and dispatcher to use Redis Streams:
LLM_PROXY_EVENT_BUS=redis-streams llm-proxy server ...
LLM_PROXY_EVENT_BUS=redis-streams llm-proxy dispatcher ...
This enables full async event delivery and observability pipeline testing across processes.
Redis Streams Backend (Recommended for Production)
For production deployments requiring guaranteed delivery and at-least-once semantics, use the Redis Streams backend. It provides:
- Consumer Groups: Multiple dispatcher instances can share the workload
- Acknowledgment: Messages are only removed after successful processing
- Crash Recovery: Pending messages from crashed consumers are automatically claimed
- Durable Storage: Messages persist until acknowledged, surviving restarts
Enabling Redis Streams
Set the event bus backend to redis-streams:
LLM_PROXY_EVENT_BUS=redis-streams llm-proxy server ...
Configuration Options
| Environment Variable | Description | Default |
|---|---|---|
| LLM_PROXY_EVENT_BUS | Event bus backend | redis-streams |
| REDIS_ADDR | Redis server address | localhost:6379 |
| REDIS_DB | Redis database number | 0 |
| REDIS_STREAM_KEY | Stream key name | llm-proxy-events |
| REDIS_CONSUMER_GROUP | Consumer group name | llm-proxy-dispatchers |
| REDIS_CONSUMER_NAME | Consumer name (unique per instance) | Auto-generated |
| REDIS_STREAM_MAX_LEN | Max stream length (0 = unlimited) | 10000 |
| REDIS_STREAM_BLOCK_TIME | Block timeout for reading | 5s |
| REDIS_STREAM_CLAIM_TIME | Min idle time before claiming pending messages | 30s |
| REDIS_STREAM_BATCH_SIZE | Batch size for reading messages | 100 |
Example Configuration
# Full Redis Streams configuration
export LLM_PROXY_EVENT_BUS=redis-streams
export REDIS_ADDR=redis.example.com:6379
export REDIS_DB=0
export REDIS_STREAM_KEY=llm-proxy-events
export REDIS_CONSUMER_GROUP=dispatchers
export REDIS_CONSUMER_NAME=dispatcher-1 # Set unique name per instance
export REDIS_STREAM_MAX_LEN=50000
export REDIS_STREAM_BLOCK_TIME=5s
export REDIS_STREAM_CLAIM_TIME=30s
export REDIS_STREAM_BATCH_SIZE=100
llm-proxy server
How It Works
- Publishing: Events are added to the stream via XADD with automatic ID generation
- Consumer Groups: Dispatchers join a consumer group and read via XREADGROUP
- Acknowledgment: After successful processing, messages are acknowledged via XACK
- Recovery: If a consumer crashes, its pending messages are claimed by other consumers after REDIS_STREAM_CLAIM_TIME
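For reference, the same flow expressed with the go-redis client. This is a minimal sketch assuming github.com/redis/go-redis/v9 and the default stream/group names documented above; error handling and claiming of pending messages are omitted, and it is not the proxy's implementation:

// Publish/consume/acknowledge sketch for Redis Streams using go-redis v9.
package main

import (
	"context"
	"time"

	"github.com/redis/go-redis/v9"
)

func main() {
	ctx := context.Background()
	rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})

	// Consumer group setup (a BUSYGROUP error means it already exists).
	rdb.XGroupCreateMkStream(ctx, "llm-proxy-events", "llm-proxy-dispatchers", "$")

	// Publishing: XADD with automatic ID generation and approximate trimming.
	rdb.XAdd(ctx, &redis.XAddArgs{
		Stream: "llm-proxy-events",
		MaxLen: 10000,
		Approx: true,
		Values: map[string]interface{}{"payload": `{"method":"POST","path":"/v1/chat/completions"}`},
	})

	// Consuming: XREADGROUP delivers each message to exactly one consumer in the group.
	res, err := rdb.XReadGroup(ctx, &redis.XReadGroupArgs{
		Group:    "llm-proxy-dispatchers",
		Consumer: "dispatcher-1",
		Streams:  []string{"llm-proxy-events", ">"},
		Count:    100,
		Block:    5 * time.Second,
	}).Result()
	if err != nil {
		return
	}

	// Acknowledgment: XACK after successful processing.
	for _, stream := range res {
		for _, msg := range stream.Messages {
			// ... deliver msg.Values to the backend ...
			rdb.XAck(ctx, "llm-proxy-events", "llm-proxy-dispatchers", msg.ID)
		}
	}
}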
Multiple Dispatcher Instances
Redis Streams supports running multiple dispatcher instances that share the workload:
# Instance 1
REDIS_CONSUMER_NAME=dispatcher-1 llm-proxy dispatcher --service lunary
# Instance 2 (on another host or container)
REDIS_CONSUMER_NAME=dispatcher-2 llm-proxy dispatcher --service lunary
Each message is delivered to exactly one consumer in the group. If a consumer fails, its pending messages are automatically reassigned.
Multiple Dispatcher Services (Fan-out)
If you want multiple backends (e.g. file and helicone) to each receive 100% of events, do not run them in the same consumer group.
- Same REDIS_CONSUMER_GROUP across multiple dispatcher services = load balancing (each event goes to only one service)
- Different REDIS_CONSUMER_GROUP per service = fan-out (each service reads the full stream independently)
Example:
# File logger consumes all events
REDIS_CONSUMER_GROUP=llm-proxy-dispatchers-file \
llm-proxy dispatcher --service file --endpoint events.jsonl
# Helicone logger also consumes all events
REDIS_CONSUMER_GROUP=llm-proxy-dispatchers-helicone \
llm-proxy dispatcher --service helicone --api-key $HELICONE_API_KEY
Redis Streams vs In-Memory
| Feature | In-Memory | Redis Streams |
|---|---|---|
| Delivery guarantee | None (buffer overflow drops events) | At-least-once |
| Processes | Single process only | Distributed across multiple processes/hosts |
| Consumer groups | No | Yes |
| Multiple dispatchers | No | Yes (events distributed via consumer groups) |
| Crash recovery | No | Yes (pending message claiming) |
| Acknowledgment | No | Yes |
| Recommended for | Development, local testing | Production, high reliability |
Redis Streams Rollout Checklist
Use this checklist when enabling Redis Streams in new environments:
Prerequisites
- Redis server accessible from all proxy and dispatcher instances
- MANAGEMENT_TOKEN configured for admin operations
Configuration
- Set LLM_PROXY_EVENT_BUS=redis-streams on proxy and dispatcher
- Set REDIS_ADDR to your Redis server address
- Set REDIS_STREAM_KEY (default: llm-proxy-events)
- Set REDIS_CONSUMER_GROUP (default: llm-proxy-dispatchers)
- Configure REDIS_STREAM_MAX_LEN based on expected throughput (default: 10000)
Verification
- Verify the consumer group exists: redis-cli XINFO GROUPS llm-proxy-events
- Check stream length: redis-cli XLEN llm-proxy-events
- Monitor pending count: redis-cli XPENDING llm-proxy-events llm-proxy-dispatchers
- Verify the dispatcher is consuming: check logs for “Using Redis Streams event bus”
- Confirm events are being acknowledged: pending count should remain stable or decrease
Monitoring
- Set up alerts for pending count > 1000 (indicates dispatcher lag)
- Monitor stream length to ensure it stays below max length
- Track dispatcher health endpoint for lag warnings
- Monitor dispatcher logs for claim/recovery messages
Troubleshooting
High Pending Count:
- Increase REDIS_STREAM_BATCH_SIZE (default: 100)
- Reduce REDIS_STREAM_CLAIM_TIME to claim stuck messages faster (default: 30s)
- Scale horizontally: add more dispatcher instances (they share the workload via the consumer group)
- Check dispatcher logs for errors or slow backend API calls
Stream Length Growing:
- Increase REDIS_STREAM_MAX_LEN if events are being lost due to trimming
- Ensure dispatchers are running and healthy
- Check that dispatchers are acknowledging messages (XACK)
References
- See internal/middleware/instrumentation.go for the middleware implementation.
- See internal/eventbus/eventbus.go for the event bus interface and in-memory backend.
- See internal/dispatcher/ for the pluggable dispatcher architecture.
- See docs/issues/done/phase-5-generic-async-middleware.md for the original design issue.
- See docs/issues/done/phase-5-event-dispatcher-service.md for the dispatcher design.
For questions or advanced integration, open an issue or see the code comments for extension points.