The Grafana dashboard expects redis_key_size{key="mlx:pool:prod"}, but the standard redis_exporter does not expose the lengths of arbitrary keys by default. This recipe adds a **lightweight sidecar** that polls Redis and exposes custom gauges for queue depth, pool depth, DLQ depth, and active leases.
Metrics to export
| Redis key/pattern | Metric | Grafana use |
|---|---|---|
| LLEN mlx:jobs | mlx_queue_depth | Backpressure alert |
| LLEN mlx:dlq | mlx_dlq_depth | DLQ alert |
| SCARD mlx:pool:prod | mlx_pool_depth{tier="prod"} | Capacity planning |
| SCARD mlx:pool:burn | mlx_pool_depth{tier="burn"} | Ban cluster detection |
| KEYS mlx:lease:* | mlx_active_leases | Orphan lease debug |
Avoid KEYS in production at scale: it blocks Redis while it walks the whole keyspace. Use SCAN with the match pattern mlx:lease:* instead.
Python sidecar (prometheus_client)
```python
import os
import time

import redis
from prometheus_client import Gauge, start_http_server

# REDIS_URL is set in the Compose file; fall back to a local Redis otherwise.
r = redis.Redis.from_url(
    os.environ.get("REDIS_URL", "redis://localhost:6379/0"),
    decode_responses=True,
)

QUEUE = Gauge("mlx_queue_depth", "Jobs waiting in mlx:jobs")
DLQ = Gauge("mlx_dlq_depth", "Jobs in dead letter queue")
POOL = Gauge("mlx_pool_depth", "Profiles in pool set", ["tier"])
LEASES = Gauge("mlx_active_leases", "Active profile leases")


def poll():
    QUEUE.set(r.llen("mlx:jobs"))
    DLQ.set(r.llen("mlx:dlq"))
    for tier in ("prod", "warm", "burn"):
        POOL.labels(tier=tier).set(r.scard(f"mlx:pool:{tier}"))

    # Count leases with cursor-based SCAN instead of the blocking KEYS.
    count = 0
    cursor = 0
    while True:
        cursor, keys = r.scan(cursor, match="mlx:lease:*", count=100)
        count += len(keys)
        if cursor == 0:
            break
    LEASES.set(count)


if __name__ == "__main__":
    start_http_server(9101)  # sidecar port, workers stay on 9100
    while True:
        poll()
        time.sleep(30)
```
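In the loop above, an exception raised inside poll() (for example while Redis restarts) would propagate and end the sidecar process. The following is a minimal sketch of one way to harden that loop, not part of the original recipe: `poll_forever`, its parameters, and the logger name are hypothetical, and the broad `except Exception` is a deliberate simplification.

```python
import logging
import time

log = logging.getLogger("mlx_sidecar")  # hypothetical logger name


def poll_forever(poll_fn, interval=30.0, sleep=time.sleep, max_iterations=None):
    """Call poll_fn repeatedly, logging failures instead of exiting.

    Returns the number of failed polls; only reachable when max_iterations
    is set, which is mainly useful for testing.
    """
    failures = 0
    iterations = 0
    while max_iterations is None or iterations < max_iterations:
        try:
            poll_fn()
        except Exception:
            # Broad on purpose: a Redis or network error should not stop the exporter.
            failures += 1
            log.exception("poll failed; retrying in %ss", interval)
        iterations += 1
        sleep(interval)
    return failures
```

With this helper, the main block would call `poll_forever(poll)` after `start_http_server(9101)`. While polls fail, the gauges keep their last values, so stale readings are still served until Redis is reachable again.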
Docker Compose sidecar
```yaml
services:
  redis:
    image: redis:7-alpine
    ports: ["6379:6379"]
  mlx-redis-exporter:
    build: ./sidecar
    environment:
      REDIS_URL: redis://redis:6379/0
    ports: ["9101:9101"]
  prometheus:
    image: prom/prometheus
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    ports: ["9090:9090"]
```
prometheus.yml scrape config
```yaml
scrape_configs:
  - job_name: mlx_worker
    static_configs:
      - targets: ["worker:9100"]
  - job_name: mlx_redis_sidecar
    static_configs:
      - targets: ["mlx-redis-exporter:9101"]
```
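Before relying on the scrape job, it can help to confirm that the sidecar endpoint actually serves the four gauges. The sketch below parses Prometheus text-format output with the standard library only. The helper names are hypothetical, and the default URL assumes the check runs on the Docker host, where the Compose file publishes port 9101.

```python
import urllib.request

EXPECTED = {"mlx_queue_depth", "mlx_dlq_depth", "mlx_pool_depth", "mlx_active_leases"}


def metric_names(exposition_text):
    """Return the set of sample names found in Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # A sample line looks like: name{label="v"} 12.0  or  name 12.0
        names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return names


def missing_metrics(exposition_text, expected=EXPECTED):
    """Return the expected metric names that do not appear in the output."""
    return set(expected) - metric_names(exposition_text)


def fetch_metrics(url="http://localhost:9101/metrics", timeout=5):
    """Fetch the sidecar's /metrics page (URL is an assumption, see above)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Running `print(missing_metrics(fetch_metrics()))` against a healthy sidecar should print an empty set once the first poll has completed.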
Pushgateway fallback (cron)
```bash
# If the sidecar is not deployed: run from cron every minute
python -c "
import redis
from prometheus_client import CollectorRegistry, Gauge, push_to_gateway
r = redis.Redis()
reg = CollectorRegistry()
g = Gauge('mlx_queue_depth', 'queue', registry=reg)
g.set(r.llen('mlx:jobs'))
push_to_gateway('pushgateway:9091', job='mlx_redis', registry=reg)
"
```
Alert wiring
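Wire these conditions into Grafana alerts: mlx_dlq_depth > 10, mlx_pool_depth{tier="prod"} < 5, and mlx_active_leases > max_concurrent. Here max_concurrent stands for your concurrency limit; this sidecar does not export it, so substitute a literal value in the alert rule.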
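To sanity-check the thresholds before creating the Grafana rules, the three conditions can be written as a small pure function. This is an illustrative sketch only: the alert names, the `samples` dictionary layout, and the function name are hypothetical, and the comparisons are strict, exactly as written above.

```python
def firing_alerts(samples, max_concurrent):
    """Return the names of the alert conditions that hold for one set of readings.

    samples: dict with keys "mlx_dlq_depth", "mlx_pool_depth_prod",
    and "mlx_active_leases" (hypothetical layout for this sketch).
    """
    alerts = []
    if samples["mlx_dlq_depth"] > 10:
        alerts.append("dlq_backlog")
    if samples["mlx_pool_depth_prod"] < 5:
        alerts.append("prod_pool_low")
    if samples["mlx_active_leases"] > max_concurrent:
        alerts.append("lease_overrun")
    return alerts
```

Because the comparisons are strict, a prod pool of exactly 5 profiles or a lease count equal to max_concurrent does not fire.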
Disclosure: MLX-MMO is affiliated with Multilogin.