# 🚀 Deployment Guide

This guide covers deploying the Agentic RAG system in production environments, including Docker containerization, cloud deployment, and infrastructure requirements.

## Production Architecture

```
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│  Load Balancer  │    │   Application    │    │    Database     │
│   (nginx/ALB)   │◄──►│    Containers    │◄──►│  (PostgreSQL)   │
│                 │    │                  │    │                 │
└─────────────────┘    └──────────────────┘    └─────────────────┘
         │                      │                       │
         ▼                      ▼                       ▼
  SSL Termination       FastAPI + Next.js        Session Storage
  Domain Routing        Auto-scaling             Managed Service
  Rate Limiting         Health Monitoring        Backup & Recovery
```

## Infrastructure Requirements

### Minimum Requirements
- **CPU**: 2 vCPU cores
- **Memory**: 4 GB RAM
- **Storage**: 20 GB SSD
- **Network**: 1 Gbps bandwidth

### Recommended Production
- **CPU**: 4+ vCPU cores
- **Memory**: 8+ GB RAM
- **Storage**: 50+ GB SSD (with backup)
- **Network**: 10+ Gbps bandwidth
- **Auto-scaling**: 2-10 instances

### Database Requirements
- **PostgreSQL 13+**
- **Storage**: 10+ GB (depends on retention policy)
- **Connections**: 100+ concurrent connections
- **Backup**: Daily automated backups
- **SSL**: Required for production

## Docker Deployment

### 1. Dockerfile for Backend

Create `Dockerfile` in the project root:

```dockerfile
# Multi-stage build for Python backend
FROM python:3.12-slim AS backend-builder

# Install system dependencies
RUN apt-get update && apt-get install -y \
    build-essential \
    libpq-dev \
    && rm -rf /var/lib/apt/lists/*

# Install uv
RUN pip install uv

# Set working directory
WORKDIR /app

# Copy dependency files
COPY pyproject.toml uv.lock ./

# Install dependencies
RUN uv sync --no-dev --no-editable

# Production stage
FROM python:3.12-slim AS backend

# Install runtime dependencies
RUN apt-get update && apt-get install -y \
    libpq5 \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Create non-root user
RUN useradd --create-home --shell /bin/bash app

# Set working directory
WORKDIR /app

# Copy installed dependencies from builder
COPY --from=backend-builder /app/.venv /app/.venv

# Copy application code
COPY service/ service/
COPY config.yaml .
COPY scripts/ scripts/

# Set permissions
RUN chown -R app:app /app

# Switch to non-root user
USER app

# Add .venv to PATH
ENV PATH="/app/.venv/bin:$PATH"

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1

# Expose port
EXPOSE 8000

# Start command
CMD ["uvicorn", "service.main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "4"]
```

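The `HEALTHCHECK` above relies on `curl` being installed in the runtime image. As one possible alternative, the probe could be a few lines of standard-library Python. The sketch below assumes only that `/health` answers with a 2xx status when the service is up; the file name `healthcheck.py` and the functions `probe` and `main` are illustrative and not part of the project.

```python
"""healthcheck.py: a curl-free container health probe (illustrative sketch)."""
import urllib.error
import urllib.request


def probe(url: str, timeout: float = 5.0) -> bool:
    """Return True if `url` answers with a 2xx status within `timeout` seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, DNS failure, 4xx/5xx response, or malformed URL.
        return False


def main(argv: list) -> int:
    """Return a process exit code: 0 when healthy, 1 otherwise."""
    target = argv[1] if len(argv) > 1 else "http://localhost:8000/health"
    return 0 if probe(target) else 1
```

A script built on this would end with `sys.exit(main(sys.argv))` and could be referenced from the Dockerfile as `HEALTHCHECK CMD python healthcheck.py`. Whether dropping `curl` is worthwhile depends on whether it is also used for debugging inside the container.
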
### 2. Dockerfile for Frontend

Create `web/Dockerfile`:

```dockerfile
# Frontend build stage
FROM node:18-alpine AS frontend-builder

WORKDIR /app

# Copy package files
COPY package*.json ./
COPY pnpm-lock.yaml ./

# Install dependencies
RUN npm install -g pnpm
RUN pnpm install --frozen-lockfile

# Copy source code
COPY . .

# Build application (the standalone copy below requires output: 'standalone' in the Next.js config)
RUN pnpm run build

# Production stage
FROM node:18-alpine AS frontend

WORKDIR /app

# Create non-root user
RUN addgroup -g 1001 -S nodejs
RUN adduser -S nextjs -u 1001

# Copy built application
COPY --from=frontend-builder /app/public ./public
COPY --from=frontend-builder /app/.next/standalone ./
COPY --from=frontend-builder /app/.next/static ./.next/static

# Set permissions
RUN chown -R nextjs:nodejs /app

USER nextjs

EXPOSE 3000

ENV PORT=3000
ENV HOSTNAME="0.0.0.0"

CMD ["node", "server.js"]
```

### 3. Docker Compose for Local Production

Create `docker-compose.prod.yml`:

```yaml
version: '3.8'

services:
  postgres:
    image: postgres:15-alpine
    environment:
      POSTGRES_DB: agent_memory
      POSTGRES_USER: ${POSTGRES_USER:-agent}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql
    ports:
      - "5432:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER:-agent}"]
      interval: 30s
      timeout: 10s
      retries: 5

  backend:
    build:
      context: .
      dockerfile: Dockerfile
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - RETRIEVAL_API_KEY=${RETRIEVAL_API_KEY}
      - DATABASE_URL=postgresql://${POSTGRES_USER:-agent}:${POSTGRES_PASSWORD}@postgres:5432/agent_memory
    depends_on:
      postgres:
        condition: service_healthy
    ports:
      - "8000:8000"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3

  frontend:
    build:
      context: ./web
      dockerfile: Dockerfile
    environment:
      # NEXT_PUBLIC_* values are inlined into the client bundle at build time,
      # so this must also be set when the image is built
      - NEXT_PUBLIC_LANGGRAPH_API_URL=http://backend:8000/api
    depends_on:
      - backend
    ports:
      - "3000:3000"

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
      - ./ssl:/etc/nginx/ssl
    depends_on:
      - frontend
      - backend

volumes:
  postgres_data:
```

### 4. Environment Configuration

Create `.env.prod`:

```bash
# Database
POSTGRES_USER=agent
POSTGRES_PASSWORD=your-secure-password
DATABASE_URL=postgresql://agent:your-secure-password@postgres:5432/agent_memory

# LLM API
OPENAI_API_KEY=your-openai-key
AZURE_OPENAI_API_KEY=your-azure-key
RETRIEVAL_API_KEY=your-retrieval-key

# Application
LOG_LEVEL=INFO
CORS_ORIGINS=["https://yourdomain.com"]
MAX_TOOL_LOOPS=5
MEMORY_TTL_DAYS=7

# Next.js
NEXT_PUBLIC_LANGGRAPH_API_URL=https://yourdomain.com/api
NODE_ENV=production
```

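Two details of `.env.prod` are easy to get wrong: a password containing characters such as `@`, `:` or `/` breaks `DATABASE_URL` unless it is percent-encoded, and `CORS_ORIGINS` is written as a JSON list rather than a plain string. The following is a minimal sketch of a startup check, assuming the variable names from the file above. The helper names and the choice of which variables count as required are illustrative, not taken from the project.

```python
import json
from urllib.parse import quote

# Assumption: these are the variables the service cannot start without.
REQUIRED_VARS = ("POSTGRES_USER", "POSTGRES_PASSWORD", "OPENAI_API_KEY")


def missing_vars(env: dict) -> list:
    """Return the required variable names that are absent or empty in `env`."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


def build_database_url(env: dict, host: str = "postgres", port: int = 5432,
                       db: str = "agent_memory") -> str:
    """Assemble a PostgreSQL URL, percent-encoding the credentials."""
    user = quote(env["POSTGRES_USER"], safe="")
    password = quote(env["POSTGRES_PASSWORD"], safe="")
    return f"postgresql://{user}:{password}@{host}:{port}/{db}"


def parse_cors_origins(raw: str) -> list:
    """Parse the JSON-list form used for CORS_ORIGINS in .env.prod."""
    origins = json.loads(raw)
    if not isinstance(origins, list) or not all(isinstance(o, str) for o in origins):
        raise ValueError("CORS_ORIGINS must be a JSON list of strings")
    return origins
```

Calling `missing_vars(dict(os.environ))` at startup and failing fast on a non-empty result tends to surface configuration mistakes earlier than a failed database connection would.
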
## Cloud Deployment

### Azure Container Instances

```bash
# Create resource group
az group create --name agentic-rag-rg --location eastus

# Create container registry (admin user enabled so the credentials below can be read)
az acr create --resource-group agentic-rag-rg \
  --name agenticragacr --sku Basic --admin-enabled true

# Build and push images
az acr build --registry agenticragacr \
  --image agentic-rag-backend:latest .

# Create PostgreSQL database
az postgres flexible-server create \
  --resource-group agentic-rag-rg \
  --name agentic-rag-db \
  --admin-user agentadmin \
  --admin-password 'YourSecurePassword123!' \
  --sku-name Standard_B1ms \
  --tier Burstable \
  --public-access 0.0.0.0 \
  --storage-size 32

# Deploy container instance (secure variables are hidden from the container group's properties)
az container create \
  --resource-group agentic-rag-rg \
  --name agentic-rag-backend \
  --image agenticragacr.azurecr.io/agentic-rag-backend:latest \
  --registry-login-server agenticragacr.azurecr.io \
  --registry-username agenticragacr \
  --registry-password "$(az acr credential show --name agenticragacr --query "passwords[0].value" -o tsv)" \
  --dns-name-label agentic-rag-api \
  --ports 8000 \
  --secure-environment-variables \
    OPENAI_API_KEY="$OPENAI_API_KEY" \
    DATABASE_URL="$DATABASE_URL"
```

### AWS ECS Deployment

```json
{
  "family": "agentic-rag-backend",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "1024",
  "memory": "2048",
  "executionRoleArn": "arn:aws:iam::account:role/ecsTaskExecutionRole",
  "taskRoleArn": "arn:aws:iam::account:role/ecsTaskRole",
  "containerDefinitions": [
    {
      "name": "backend",
      "image": "your-account.dkr.ecr.region.amazonaws.com/agentic-rag-backend:latest",
      "portMappings": [
        {
          "containerPort": 8000,
          "protocol": "tcp"
        }
      ],
      "environment": [
        {
          "name": "DATABASE_URL",
          "value": "postgresql://user:pass@rds-endpoint:5432/dbname"
        }
      ],
      "secrets": [
        {
          "name": "OPENAI_API_KEY",
          "valueFrom": "arn:aws:secretsmanager:region:account:secret:openai-key"
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/agentic-rag",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "backend"
        }
      },
      "healthCheck": {
        "command": ["CMD-SHELL", "curl -f http://localhost:8000/health || exit 1"],
        "interval": 30,
        "timeout": 10,
        "retries": 3,
        "startPeriod": 60
      }
    }
  ]
}
```

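The task definition above keeps `OPENAI_API_KEY` under `secrets` but places `DATABASE_URL`, which embeds a password, under plain `environment`, where it is stored in clear text. A small pre-deployment lint can catch this kind of slip. The sketch below is a heuristic only: the function name and the list of name hints are assumptions made for illustration, and it will miss secrets that do not match them.

```python
from urllib.parse import urlsplit

# Assumption: substrings that usually indicate a credential-bearing variable name.
SENSITIVE_HINTS = ("KEY", "PASSWORD", "SECRET", "TOKEN")


def plaintext_secret_names(task_def: dict) -> list:
    """List `environment` variable names that look like they carry credentials."""
    flagged = []
    for container in task_def.get("containerDefinitions", []):
        for entry in container.get("environment", []):
            name = entry.get("name", "")
            value = entry.get("value", "")
            looks_sensitive = any(hint in name.upper() for hint in SENSITIVE_HINTS)
            try:
                has_url_password = bool(urlsplit(value).password)
            except ValueError:
                # Not parseable as a URL, so it cannot embed URL credentials.
                has_url_password = False
            if looks_sensitive or has_url_password:
                flagged.append(name)
    return flagged
```

Run against the JSON above (loaded with `json.load`), this flags `DATABASE_URL`; moving that entry to `secrets` with a `valueFrom` reference would clear the warning.
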
## Load Balancer Configuration

### Nginx Configuration

Create `nginx.conf`:

```nginx
events {
    worker_connections 1024;
}

http {
    upstream backend {
        server backend:8000;
    }

    upstream frontend {
        server frontend:3000;
    }

    # Rate limiting
    limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;
    limit_req_zone $binary_remote_addr zone=chat:10m rate=5r/s;

    server {
        listen 80;
        server_name yourdomain.com;
        return 301 https://$server_name$request_uri;
    }

    server {
        listen 443 ssl http2;
        server_name yourdomain.com;

        ssl_certificate /etc/nginx/ssl/cert.pem;
        ssl_certificate_key /etc/nginx/ssl/key.pem;
        ssl_protocols TLSv1.2 TLSv1.3;
        ssl_ciphers HIGH:!aNULL:!MD5;

        # Frontend
        location / {
            proxy_pass http://frontend;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
        }

        # API endpoints
        location /api/ {
            limit_req zone=api burst=20 nodelay;

            proxy_pass http://backend;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;

            # SSE specific settings
            proxy_buffering off;
            proxy_cache off;
            proxy_set_header Connection '';
            proxy_http_version 1.1;
            chunked_transfer_encoding off;
        }

        # Chat endpoint with stricter rate limiting
        location /api/chat {
            limit_req zone=chat burst=10 nodelay;

            proxy_pass http://backend;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;

            # SSE specific settings
            proxy_buffering off;
            proxy_cache off;
            proxy_read_timeout 300s;
            proxy_set_header Connection '';
            proxy_http_version 1.1;
            chunked_transfer_encoding off;
        }
    }
}
```

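The interaction of `rate`, `burst` and `nodelay` in the `limit_req` directives above is easier to reason about with a small model. The sketch below is a simplified leaky-bucket simulation written for this guide, not nginx's actual implementation (nginx tracks state per client key in shared memory with millisecond resolution). The class name and the floating-point timestamps are illustrative assumptions.

```python
class LeakyBucket:
    """Simplified model of `limit_req zone=... burst=N nodelay` for one client."""

    def __init__(self, rate: float, burst: int):
        self.rate = rate      # requests drained per second (the zone's rate)
        self.burst = burst    # extra requests tolerated above the rate
        self.excess = 0.0     # requests currently queued above the rate
        self.last = None      # time of the last accepted request

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` is accepted."""
        if self.last is None:
            # The first request from a client is always accepted.
            self.last = now
            return True
        drained = self.rate * (now - self.last)
        excess = max(self.excess - drained, 0.0) + 1.0
        if excess > self.burst:
            # Rejected requests do not change the bucket state.
            return False
        self.excess = excess
        self.last = now
        return True


# 30 simultaneous requests against the `api` zone settings (10 r/s, burst 20).
api = LeakyBucket(rate=10, burst=20)
accepted = sum(api.allow(0.0) for _ in range(30))
```

In this model, `accepted` ends up as 21: the first request plus the burst of 20, with the remaining 9 rejected. With the `chat` settings (5 r/s, burst 10) the same experiment accepts 11 requests, which gives a rough feel for how many parallel requests a single client IP can open before receiving errors.
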
## Monitoring and Observability

### Health Checks

Configure comprehensive health checks:

```python
from datetime import datetime, timezone

# `app`, `get_memory_manager` and `get_config` are provided by the application code

# Enhanced health check endpoint
@app.get("/health/detailed")
async def detailed_health():
    health_status = {
        "status": "healthy",
        "service": "agentic-rag",
        "version": "0.8.0",
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "components": {}
    }

    # Database connectivity
    try:
        memory_manager = get_memory_manager()
        db_healthy = memory_manager.test_connection()
        health_status["components"]["database"] = {
            "status": "healthy" if db_healthy else "unhealthy",
            "type": "postgresql"
        }
    except Exception as e:
        health_status["components"]["database"] = {
            "status": "unhealthy",
            "error": str(e)
        }

    # LLM API connectivity
    try:
        config = get_config()
        # Only confirms that the LLM configuration loads; no request is sent to the provider
        health_status["components"]["llm"] = {
            "status": "healthy",
            "provider": config.provider
        }
    except Exception as e:
        health_status["components"]["llm"] = {
            "status": "unhealthy",
            "error": str(e)
        }

    # Overall status
    all_healthy = all(
        comp.get("status") == "healthy"
        for comp in health_status["components"].values()
    )
    health_status["status"] = "healthy" if all_healthy else "degraded"

    return health_status
```

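The endpoint above reports problems only in the response body, so it answers with HTTP 200 even when a component is down. Load balancers and orchestrators usually act on the status code alone. One possible refinement, sketched below, maps component states to both an overall status and a status code. The `critical` parameter, the extra `"unhealthy"` overall state, and the choice of 503 are assumptions made for illustration, not behavior of the existing service.

```python
def summarize(components: dict, critical: tuple = ("database",)) -> tuple:
    """Return (overall_status, http_status_code) for a components mapping.

    A failed critical component yields ("unhealthy", 503) so that traffic is
    routed away; a failed non-critical component yields ("degraded", 200).
    """
    failed = [
        name for name, info in components.items()
        if info.get("status") != "healthy"
    ]
    if any(name in critical for name in failed):
        return "unhealthy", 503
    if failed:
        return "degraded", 200
    return "healthy", 200
```

In a FastAPI handler, the code could then be applied with `JSONResponse(content=health_status, status_code=code)`. Whether the LLM provider should count as critical is a judgment call that depends on how the service behaves without it.
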
### Logging Configuration

```yaml
# logging.yaml
version: 1
disable_existing_loggers: false

formatters:
  standard:
    format: '%(asctime)s [%(levelname)s] %(name)s: %(message)s'
  json:
    format: '{"timestamp": "%(asctime)s", "level": "%(levelname)s", "logger": "%(name)s", "message": "%(message)s", "module": "%(module)s", "function": "%(funcName)s", "line": %(lineno)d}'

handlers:
  console:
    class: logging.StreamHandler
    level: INFO
    formatter: standard
    stream: ext://sys.stdout

  file:
    class: logging.handlers.RotatingFileHandler
    level: INFO
    formatter: json
    filename: /app/logs/app.log # directory must exist and be writable by the app user
    maxBytes: 10485760 # 10MB
    backupCount: 5

loggers:
  service:
    level: INFO
    handlers: [console, file]
    propagate: false

  uvicorn:
    level: INFO
    handlers: [console]
    propagate: false

root:
  level: INFO
  handlers: [console, file]
```

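One caveat with the `json` formatter above: it builds JSON through string interpolation, so a log message containing a double quote or a newline produces a line that is not valid JSON. A formatter class that serializes with `json.dumps` avoids this. The sketch below uses only the standard library; the class name `JsonFormatter` is illustrative and the field names simply mirror the format string above.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Serialize log records as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "module": record.module,
            "function": record.funcName,
            "line": record.lineno,
        }
        if record.exc_info:
            # Include the traceback as a string when an exception was logged.
            payload["exception"] = self.formatException(record.exc_info)
        # json.dumps escapes quotes and newlines inside the message.
        return json.dumps(payload)
```

In a `dictConfig`-style file, a custom formatter is referenced through the special `()` key, for example `'()': some.module.JsonFormatter`, where the module path is whatever location the class is placed in.
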
### Metrics Collection

```python
# metrics.py
import time

from fastapi import FastAPI, Request, Response
from prometheus_client import CONTENT_TYPE_LATEST, Counter, Histogram, Gauge, generate_latest

app = FastAPI()  # placeholder: register these on the service's existing app instead

# Metrics
REQUEST_COUNT = Counter('http_requests_total', 'Total HTTP requests', ['method', 'endpoint'])
REQUEST_DURATION = Histogram('http_request_duration_seconds', 'HTTP request duration')
ACTIVE_SESSIONS = Gauge('active_sessions_total', 'Number of active chat sessions')
TOOL_CALLS = Counter('tool_calls_total', 'Total tool calls', ['tool_name', 'status'])

@app.middleware("http")
async def metrics_middleware(request: Request, call_next):
    start_time = time.time()
    response = await call_next(request)
    duration = time.time() - start_time

    REQUEST_COUNT.labels(
        method=request.method,
        endpoint=request.url.path
    ).inc()
    REQUEST_DURATION.observe(duration)

    return response

@app.get("/metrics")
async def get_metrics():
    # Serve the Prometheus text exposition format with its proper content type
    return Response(generate_latest(), media_type=CONTENT_TYPE_LATEST)
```

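Using the raw `request.url.path` as the `endpoint` label creates a separate time series for every distinct URL, which can grow without bound if paths contain identifiers. A common mitigation is to collapse variable segments before labeling. The sketch below is one way to do that with the standard library. The function name and the example paths are hypothetical, since this guide does not list the service's routes.

```python
import re

# Matches canonical UUIDs such as 123e4567-e89b-12d3-a456-426614174000.
_UUID = re.compile(r"^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$")


def normalize_path(path: str) -> str:
    """Replace numeric and UUID path segments with a fixed `{id}` placeholder."""
    segments = []
    for segment in path.split("/"):
        if segment.isdigit() or _UUID.match(segment):
            segments.append("{id}")
        else:
            segments.append(segment)
    return "/".join(segments)
```

In the middleware, this would be applied as `endpoint=normalize_path(request.url.path)`. Labeling with the matched route template, where the framework exposes it, is another option that avoids pattern guessing.
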
## Security Configuration

### Environment Variables Security

```bash
# Use a secrets management service in production
export OPENAI_API_KEY=$(aws secretsmanager get-secret-value --secret-id openai-key --query SecretString --output text)
export DATABASE_PASSWORD=$(az keyvault secret show --vault-name MyKeyVault --name db-password --query value -o tsv)
```

### Network Security

```yaml
# docker-compose.prod.yml security additions
services:
  backend:
    networks:
      - backend-network
      # Internal networks have no outbound route; the default network keeps
      # LLM API calls and access from nginx working
      - default
    deploy:
      resources:
        limits:
          memory: 2G
          cpus: '1.0'
        reservations:
          memory: 1G
          cpus: '0.5'

  postgres:
    networks:
      - backend-network
    # Only accessible from backend, not exposed publicly

networks:
  backend-network:
    driver: bridge
    internal: true # Internal network only
```

### SSL/TLS Configuration

```bash
# Generate SSL certificates with Let's Encrypt
certbot certonly --webroot -w /var/www/html -d yourdomain.com

# Or use existing certificates
cp /path/to/your/cert.pem /etc/nginx/ssl/
cp /path/to/your/key.pem /etc/nginx/ssl/
```

## Deployment Checklist

### Pre-deployment
- [ ] **Environment Variables**: All secrets configured in secure storage
- [ ] **Database**: PostgreSQL instance created and accessible
- [ ] **SSL Certificates**: Valid certificates for HTTPS
- [ ] **Resource Limits**: CPU/memory limits configured
- [ ] **Backup Strategy**: Database backup schedule configured

### Deployment
- [ ] **Docker Images**: Built and pushed to registry
- [ ] **Load Balancer**: Configured with health checks
- [ ] **Database Migration**: Schema initialized
- [ ] **Configuration**: Production config.yaml deployed
- [ ] **Monitoring**: Health checks and metrics collection active

### Post-deployment
- [ ] **Health Check**: All endpoints responding correctly
- [ ] **Load Testing**: System performance under load verified
- [ ] **Log Monitoring**: Error rates and performance logs reviewed
- [ ] **Security Scan**: Vulnerability assessment completed
- [ ] **Backup Verification**: Database backup/restore tested

## Troubleshooting Production Issues

### Common Deployment Issues

**1. Database Connection Failures**
```bash
# Check PostgreSQL connectivity
psql -h your-db-host -U username -d database_name -c "SELECT 1;"

# Verify connection string format (note: this prints the password to the terminal)
echo "$DATABASE_URL"
```

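Printing `DATABASE_URL` during troubleshooting exposes the password in terminal scrollback and in any captured logs. When the goal is only to confirm the host, port and database name, a redacted form is enough. The helper below is a small standard-library sketch; the name `redact_url` is illustrative, and IPv6 hosts are not handled.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_url(url: str) -> str:
    """Return `url` with any embedded password replaced by `***`."""
    parts = urlsplit(url)
    if parts.password is None:
        return url
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    netloc = f"{parts.username or ''}:***@{host}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

For example, `redact_url(os.environ["DATABASE_URL"])` keeps the user, host, port and database visible while hiding the secret.
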
**2. Container Health Check Failures**
```bash
# Check container logs
docker logs container-name

# Test health endpoint manually
curl -f http://localhost:8000/health
```

**3. SSL Certificate Issues**
```bash
# Verify certificate validity
openssl x509 -in /etc/nginx/ssl/cert.pem -text -noout

# Check certificate expiration
openssl x509 -in /etc/nginx/ssl/cert.pem -noout -dates
```

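The `-dates` output includes a line such as `notAfter=Jun  1 12:00:00 2030 GMT`. To turn that into an alertable number, the date can be parsed with the standard library's `ssl.cert_time_to_seconds`. The sketch below assumes that output format; the function name and the optional `now` parameter, included to make the calculation testable, are illustrative.

```python
import ssl
import time
from typing import Optional


def days_until_expiry(not_after_line: str, now: Optional[float] = None) -> float:
    """Return the days remaining before the certificate's notAfter date.

    `not_after_line` is the `notAfter=...` line printed by
    `openssl x509 -noout -dates`; a negative result means it has expired.
    """
    cert_time = not_after_line.strip()
    if cert_time.startswith("notAfter="):
        cert_time = cert_time[len("notAfter="):]
    expires_at = ssl.cert_time_to_seconds(cert_time)
    current = time.time() if now is None else now
    return (expires_at - current) / 86400
```

A scheduled job could call this and raise an alert when the result drops below a chosen threshold, for example 14 days.
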
**4. High Memory Usage**
```bash
# Monitor memory usage
docker stats

# Inspect per-process memory inside the container
docker exec -it container-name top
```

### Performance Optimization

```yaml
# Production optimizations in config.yaml
app:
  memory_ttl_days: 3 # Reduce memory usage
  max_tool_loops: 3 # Limit computation

postgresql:
  pool_size: 20 # Connection pooling
  max_overflow: 0 # Cap connections per pool at pool_size

llm:
  rag:
    max_context_length: 32000 # Reduce context window if needed
    temperature: 0.1 # More deterministic responses
```

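Pool settings interact with the worker count and the number of instances. As a rough check, the sketch below multiplies them out, assuming each uvicorn worker process keeps its own pool. That is an assumption about how the service creates its pool, which this guide does not state. The function names are illustrative; the figures in the usage lines come from this guide (`--workers 4`, `pool_size: 20`, `max_overflow: 0`, 2-10 instances, 100+ database connections).

```python
def peak_connections(instances: int, workers: int, pool_size: int,
                     max_overflow: int = 0) -> int:
    """Upper bound on open database connections across all instances."""
    return instances * workers * (pool_size + max_overflow)


def max_instances(db_limit: int, workers: int, pool_size: int,
                  max_overflow: int = 0, reserved: int = 0) -> int:
    """Largest instance count whose pools fit within the database limit.

    `reserved` sets aside connections for migrations, backups or admin access.
    """
    per_instance = workers * (pool_size + max_overflow)
    return max((db_limit - reserved) // per_instance, 0)


# Figures from this guide: 4 workers per container, pool_size 20, 2-10 instances.
low = peak_connections(instances=2, workers=4, pool_size=20)
high = peak_connections(instances=10, workers=4, pool_size=20)
fits = max_instances(db_limit=100, workers=4, pool_size=20)
```

Under that assumption, `low` is 160 and `high` is 800, so a database limited to 100 connections would fit only one instance (`fits` is 1). Lowering `pool_size`, raising the database's connection limit, or placing a pooler in front of PostgreSQL are the usual ways to reconcile the numbers.
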
---

This deployment guide covers the essential aspects of running the Agentic RAG system in production. For specific cloud providers or deployment scenarios not covered here, consult the provider's documentation and adapt these configurations accordingly.