Learn how to scale Laddr agents horizontally and deploy to production environments.
Horizontal Scaling
Scale Workers
Scale agent workers to handle increased load:
# Scale a single agent type
laddr scale researcher 5
# Scale multiple agents
laddr scale researcher 5
laddr scale coordinator 3
laddr scale writer 2
Docker Compose Scaling
Scale workers using Docker Compose:
# Scale coordinator workers to 3 instances
docker compose up -d --scale coordinator_worker=3
# Scale all workers
docker compose up -d \
  --scale coordinator_worker=3 \
  --scale researcher_worker=3 \
  --scale analyzer_worker=2 \
  --scale writer_worker=2
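After scaling, confirm that the expected number of replicas is actually running:
# List running services and their replica containers
docker compose ps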
Queue Backends
Redis (Development)
Fast, lightweight queue backend for development:
# .env
QUEUE_BACKEND=redis
REDIS_URL=redis://localhost:6379/0
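Before starting workers, a quick connectivity check is worthwhile; redis-cli ships with Redis and accepts the same URL:
# Should print PONG if Redis is reachable
redis-cli -u redis://localhost:6379/0 ping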
Kafka (Production)
Durable, scalable queue backend for production:
# .env
QUEUE_BACKEND=kafka
KAFKA_BOOTSTRAP=kafka:9092
Kafka provides better message persistence and horizontal scaling capabilities for production workloads.
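To confirm the broker at KAFKA_BOOTSTRAP is reachable, list its topics with the standard Kafka CLI:
# Lists topics if the broker is reachable
kafka-topics --bootstrap-server kafka:9092 --list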
Memory (Testing)
In-memory queue for local testing:
# .env
QUEUE_BACKEND=memory
Memory backend only works within a single process. Use Redis or Kafka for multi-worker deployments.
Database Configuration
PostgreSQL (Production)
Use PostgreSQL for production deployments:
# .env
DB_BACKEND=postgresql
DATABASE_URL=postgresql://user:password@localhost:5432/laddr
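If the database does not exist yet, create it with the standard PostgreSQL client tools (the names here match the placeholder URL above):
# Create the database referenced by DATABASE_URL
createdb -h localhost -U user laddr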
SQLite (Development)
Use SQLite for local development:
# .env
DB_BACKEND=sqlite
DATABASE_URL=sqlite:///./laddr.db
Monitoring
Dashboard
Access the dashboard for real-time monitoring:
# Start dashboard
laddr run dev -d
# Access at http://localhost:5173
Metrics
Monitor key metrics:
- Queue Depth - Number of pending tasks (see the spot-check sketch after this list)
- Worker Utilization - Ratio of active to idle workers
- Throughput - Tasks processed per second
- Error Rate - Percentage of failed tasks
- Latency - Average task completion time
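On the Redis backend you can spot-check queue depth from the shell. The key pattern below is only an assumption about how Laddr names its per-agent queues, so scan first to find the actual keys:
# Discover queue keys (pattern is an assumption)
redis-cli -u "$REDIS_URL" --scan --pattern 'laddr*'
# Approximate depth of one queue (key name is an assumption)
redis-cli -u "$REDIS_URL" llen laddr.tasks.researcher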
Logs
View and follow logs:
# Follow logs for an agent
laddr logs researcher --follow
# Show last 100 lines
laddr logs researcher --tail 100
# View all service logs
docker compose logs -f
Production Deployment
Environment Variables
Configure production environment:
# .env.production
# Queue
QUEUE_BACKEND=kafka
KAFKA_BOOTSTRAP=kafka-cluster:9092
# Database
DB_BACKEND=postgresql
DATABASE_URL=postgresql://user:pass@db-host:5432/laddr
# Storage
STORAGE_BACKEND=s3
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
AWS_REGION=us-east-1
# LLM
LLM_PROVIDER=openai
OPENAI_API_KEY=...
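Keep this file out of version control and pass it explicitly at deploy time:
# Load production settings when starting the stack
docker compose --env-file .env.production up -d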
Health Checks
Implement health checks:
# Check system health
laddr check
# API health endpoint
curl http://localhost:8000/api/health
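For automated probes (for example a container HEALTHCHECK), make curl exit non-zero on HTTP errors so failures actually propagate:
# Exits non-zero if the API is down or returns an error status
curl --fail --silent --max-time 5 http://localhost:8000/api/health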
Resource Limits
Set appropriate resource limits:
# docker-compose.yml
services:
  researcher_worker:
    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 2G
        reservations:
          cpus: '1'
          memory: 1G
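Measure real usage before committing to limits:
# Stream live CPU and memory usage per container
docker stats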
Load Balancing
Worker Distribution
Kafka distributes tasks across workers automatically: each worker joins a consumer group and is assigned a subset of the topic's partitions, so every task is processed by exactly one worker.
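You can inspect partition assignments and consumer lag with the standard Kafka tooling; list the groups first rather than guessing the name Laddr registers:
# List consumer groups, then describe one to see assignments and lag
kafka-consumer-groups --bootstrap-server localhost:9092 --list
kafka-consumer-groups --bootstrap-server localhost:9092 --describe --group <group-name>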
Partition Strategy
Configure Kafka partitions for better parallelism:
# More partitions = more parallelism
# Create topic with 10 partitions
kafka-topics --create \
  --bootstrap-server localhost:9092 \
  --topic laddr.tasks.researcher \
  --partitions 10 \
  --replication-factor 1
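Partitions can also be added to an existing topic. The count can never be reduced, and adding partitions changes the key-to-partition mapping for keyed messages:
# Raise an existing topic to 10 partitions
kafka-topics --alter \
  --bootstrap-server localhost:9092 \
  --topic laddr.tasks.researcher \
  --partitions 10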
Worker Configuration
Optimize worker settings:
# .env
MAX_CONCURRENT_TASKS=5   # Concurrent tasks per worker
WORKER_PREFETCH=10       # Prefetch count
WORKER_TIMEOUT=300       # Per-task timeout
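As a rough sizing check: 5 workers with MAX_CONCURRENT_TASKS=5 allow at most 25 tasks in flight, so sustained throughput tops out at 25 divided by the average task duration. Raise worker count or per-worker concurrency until queue depth stays flat under peak load.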
Database Connection Pooling
Configure connection pooling:
# .env
DB_POOL_SIZE=20
DB_MAX_OVERFLOW=10
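Assuming SQLAlchemy-style pool semantics, each worker process may open up to DB_POOL_SIZE + DB_MAX_OVERFLOW connections (30 with the values above), so 10 workers can demand up to 300. Make sure the server-side limit covers that:
# Compare against the PostgreSQL connection limit
psql "$DATABASE_URL" -c "SHOW max_connections;"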
Troubleshooting
High Queue Depth
If queue depth is growing:
- Scale up workers:
laddr scale researcher 10
- Check worker logs for errors
- Verify database/storage connectivity
- Check for slow tools or LLM calls
Worker Failures
If workers are failing:
- Check logs:
laddr logs researcher --tail 100
- Verify API keys and credentials
- Check resource limits (CPU/memory)
- Review error messages in dashboard
Slow Performance
If performance is slow:
- Monitor dashboard metrics
- Check database query performance
- Review LLM response times
- Optimize tool implementations
- Consider caching strategies