# 🚀 Deployment Guide

This guide covers deploying the Agentic RAG system to production: Docker containerization, cloud deployment, and infrastructure requirements.

## Production Architecture

```
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│  Load Balancer  │    │   Application    │    │    Database     │
│  (nginx/ALB)    │◄──►│   Containers     │◄──►│  (PostgreSQL)   │
│                 │    │                  │    │                 │
└─────────────────┘    └──────────────────┘    └─────────────────┘
         │                       │                       │
         ▼                       ▼                       ▼
  SSL Termination        FastAPI + Next.js        Session Storage
  Domain Routing         Auto-scaling             Managed Service
  Rate Limiting          Health Monitoring        Backup & Recovery
```

## Infrastructure Requirements

### Minimum Requirements

- **CPU**: 2 vCPU cores
- **Memory**: 4 GB RAM
- **Storage**: 20 GB SSD
- **Network**: 1 Gbps bandwidth

### Recommended Production

- **CPU**: 4+ vCPU cores
- **Memory**: 8+ GB RAM
- **Storage**: 50+ GB SSD (with backup)
- **Network**: 10+ Gbps bandwidth
- **Auto-scaling**: 2-10 instances

### Database Requirements

- **PostgreSQL 13+**
- **Storage**: 10+ GB (depends on retention policy)
- **Connections**: 100+ concurrent connections
- **Backup**: Daily automated backups
- **SSL**: Required for production

## Docker Deployment

### 1. Dockerfile for Backend

Create `Dockerfile` in the project root:

```dockerfile
# Multi-stage build for Python backend
FROM python:3.12-slim AS backend-builder

# Install system dependencies needed to build wheels
RUN apt-get update && apt-get install -y \
    build-essential \
    libpq-dev \
    && rm -rf /var/lib/apt/lists/*

# Install uv
RUN pip install uv

# Set working directory
WORKDIR /app

# Copy dependency files
COPY pyproject.toml uv.lock ./

# Install dependencies exactly as pinned in uv.lock
RUN uv sync --frozen --no-dev --no-editable

# Production stage
FROM python:3.12-slim AS backend

# Install runtime dependencies (curl is used by the health check)
RUN apt-get update && apt-get install -y \
    libpq5 \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Create non-root user
RUN useradd --create-home --shell /bin/bash app

# Set working directory
WORKDIR /app

# Copy installed dependencies from builder
COPY --from=backend-builder /app/.venv /app/.venv

# Copy application code
COPY service/ service/
COPY config.yaml .
COPY scripts/ scripts/

# Set permissions
RUN chown -R app:app /app

# Switch to non-root user
USER app

# Add .venv to PATH
ENV PATH="/app/.venv/bin:$PATH"

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1

# Expose port
EXPOSE 8000

# Start command
CMD ["uvicorn", "service.main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "4"]
```

### 2. Dockerfile for Frontend

Create `web/Dockerfile`. The production stage copies `.next/standalone`, which Next.js only produces when `output: 'standalone'` is set in the Next.js config.

```dockerfile
# Frontend build stage
FROM node:18-alpine AS frontend-builder

WORKDIR /app

# Copy package files
COPY package*.json ./
COPY pnpm-lock.yaml ./

# Install dependencies
RUN npm install -g pnpm
RUN pnpm install --frozen-lockfile

# Copy source code
COPY . .

# Build application
RUN pnpm run build

# Production stage
FROM node:18-alpine AS frontend

WORKDIR /app

# Create non-root user
RUN addgroup -g 1001 -S nodejs
RUN adduser -S nextjs -u 1001

# Copy built application
COPY --from=frontend-builder /app/public ./public
COPY --from=frontend-builder /app/.next/standalone ./
COPY --from=frontend-builder /app/.next/static ./.next/static

# Set permissions
RUN chown -R nextjs:nodejs /app

USER nextjs

EXPOSE 3000

ENV PORT=3000
ENV HOSTNAME="0.0.0.0"

CMD ["node", "server.js"]
```

### 3. Docker Compose for Local Production

Create `docker-compose.prod.yml`:

```yaml
version: '3.8'

services:
  postgres:
    image: postgres:15-alpine
    environment:
      POSTGRES_DB: agent_memory
      POSTGRES_USER: ${POSTGRES_USER:-agent}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql
    ports:
      # Publishes the database on the host; drop this mapping if only the backend needs access
      - "5432:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER:-agent}"]
      interval: 30s
      timeout: 10s
      retries: 5

  backend:
    build:
      context: .
      dockerfile: Dockerfile
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - RETRIEVAL_API_KEY=${RETRIEVAL_API_KEY}
      - DATABASE_URL=postgresql://${POSTGRES_USER:-agent}:${POSTGRES_PASSWORD}@postgres:5432/agent_memory
    depends_on:
      postgres:
        condition: service_healthy
    ports:
      - "8000:8000"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3

  frontend:
    build:
      context: ./web
      dockerfile: Dockerfile
    environment:
      - NEXT_PUBLIC_LANGGRAPH_API_URL=http://backend:8000/api
    depends_on:
      - backend
    ports:
      - "3000:3000"

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
      - ./ssl:/etc/nginx/ssl
    depends_on:
      - frontend
      - backend

volumes:
  postgres_data:
```

Next.js inlines `NEXT_PUBLIC_*` variables into the client bundle at build time, so `NEXT_PUBLIC_LANGGRAPH_API_URL` must be available during `pnpm run build` and must be a URL the browser can reach (the public domain rather than the internal `backend` hostname) for any code that runs client-side.

Once the environment file from the next step exists, start the stack with `docker compose --env-file .env.prod -f docker-compose.prod.yml up -d`.

### 4. Environment Configuration

Create `.env.prod`:

```bash
# Database
POSTGRES_USER=agent
POSTGRES_PASSWORD=your-secure-password
DATABASE_URL=postgresql://agent:your-secure-password@postgres:5432/agent_memory

# LLM API
OPENAI_API_KEY=your-openai-key
AZURE_OPENAI_API_KEY=your-azure-key
RETRIEVAL_API_KEY=your-retrieval-key

# Application
LOG_LEVEL=INFO
CORS_ORIGINS=["https://yourdomain.com"]
MAX_TOOL_LOOPS=5
MEMORY_TTL_DAYS=7

# Next.js
NEXT_PUBLIC_LANGGRAPH_API_URL=https://yourdomain.com/api
NODE_ENV=production
```

## Cloud Deployment

### Azure Container Instances

```bash
# Create resource group
az group create --name agentic-rag-rg --location eastus

# Create container registry (admin user enabled so the credentials below can be read)
az acr create --resource-group agentic-rag-rg \
  --name agenticragacr --sku Basic --admin-enabled true

# Build and push images
az acr build --registry agenticragacr \
  --image agentic-rag-backend:latest .

# Create PostgreSQL database
# (the password is single-quoted so the shell does not interpret "!")
az postgres flexible-server create \
  --resource-group agentic-rag-rg \
  --name agentic-rag-db \
  --admin-user agentadmin \
  --admin-password 'YourSecurePassword123!' \
  --sku-name Standard_B1ms \
  --tier Burstable \
  --public-access 0.0.0.0 \
  --storage-size 32

# Deploy container instance
# (secrets are passed as secure variables so they are not shown in container properties)
az container create \
  --resource-group agentic-rag-rg \
  --name agentic-rag-backend \
  --image agenticragacr.azurecr.io/agentic-rag-backend:latest \
  --registry-login-server agenticragacr.azurecr.io \
  --registry-username agenticragacr \
  --registry-password "$(az acr credential show --name agenticragacr --query "passwords[0].value" -o tsv)" \
  --dns-name-label agentic-rag-api \
  --ports 8000 \
  --secure-environment-variables \
    OPENAI_API_KEY="$OPENAI_API_KEY" \
    DATABASE_URL="$DATABASE_URL"
```

### AWS ECS Deployment

A Fargate task definition for the backend. Replace the `account`, `region`, and endpoint placeholders with your own values. Because `DATABASE_URL` contains a password, in a real deployment it belongs under `secrets` alongside `OPENAI_API_KEY` rather than in plain `environment`.

```json
{
  "family": "agentic-rag-backend",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "1024",
  "memory": "2048",
  "executionRoleArn": "arn:aws:iam::account:role/ecsTaskExecutionRole",
  "taskRoleArn": "arn:aws:iam::account:role/ecsTaskRole",
  "containerDefinitions": [
    {
      "name": "backend",
      "image": "your-account.dkr.ecr.region.amazonaws.com/agentic-rag-backend:latest",
      "portMappings": [
        {
          "containerPort": 8000,
          "protocol": "tcp"
        }
      ],
      "environment": [
        {
          "name": "DATABASE_URL",
          "value": "postgresql://user:pass@rds-endpoint:5432/dbname"
        }
      ],
      "secrets": [
        {
          "name": "OPENAI_API_KEY",
          "valueFrom": "arn:aws:secretsmanager:region:account:secret:openai-key"
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/agentic-rag",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "backend"
        }
      },
      "healthCheck": {
        "command": ["CMD-SHELL", "curl -f http://localhost:8000/health || exit 1"],
        "interval": 30,
        "timeout": 10,
        "retries": 3,
        "startPeriod": 60
      }
    }
  ]
}
```

## Load Balancer Configuration

### Nginx Configuration

Create `nginx.conf`:

```nginx
events {
    worker_connections 1024;
}

http {
    upstream backend {
        server backend:8000;
    }

    upstream frontend {
        server frontend:3000;
    }

    # Rate limiting
    limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;
    limit_req_zone $binary_remote_addr zone=chat:10m rate=5r/s;

    server {
        listen 80;
        server_name yourdomain.com;
        return 301 https://$server_name$request_uri;
    }

    server {
        listen 443 ssl http2;
        server_name yourdomain.com;

        ssl_certificate /etc/nginx/ssl/cert.pem;
        ssl_certificate_key /etc/nginx/ssl/key.pem;
        ssl_protocols TLSv1.2 TLSv1.3;
        ssl_ciphers HIGH:!aNULL:!MD5;

        # Frontend
        location / {
            proxy_pass http://frontend;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
        }

        # API endpoints
        location /api/ {
            limit_req zone=api burst=20 nodelay;

            proxy_pass http://backend;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;

            # SSE specific settings
            proxy_buffering off;
            proxy_cache off;
            proxy_set_header Connection '';
            proxy_http_version 1.1;
            chunked_transfer_encoding off;
        }

        # Chat endpoint with stricter rate limiting
        location /api/chat {
            limit_req zone=chat burst=10 nodelay;

            proxy_pass http://backend;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;

            # SSE specific settings
            proxy_buffering off;
            proxy_cache off;
            proxy_read_timeout 300s;
            proxy_set_header Connection '';
            proxy_http_version 1.1;
            chunked_transfer_encoding off;
        }
    }
}
```

## Monitoring and Observability

### Health Checks

In addition to the basic `/health` endpoint used by the container health checks, expose a detailed endpoint that reports per-component status:

```python
# Enhanced health check endpoint.
# Assumes `app` (the FastAPI instance), `get_memory_manager`, and `get_config`
# are already defined or imported in this module.
from datetime import datetime, timezone


@app.get("/health/detailed")
async def detailed_health():
    health_status = {
        "status": "healthy",
        "service": "agentic-rag",
        "version": "0.8.0",
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "components": {},
    }

    # Database connectivity
    try:
        memory_manager = get_memory_manager()
        db_healthy = memory_manager.test_connection()
        health_status["components"]["database"] = {
            "status": "healthy" if db_healthy else "unhealthy",
            "type": "postgresql",
        }
    except Exception as e:
        health_status["components"]["database"] = {
            "status": "unhealthy",
            "error": str(e),
        }

    # LLM configuration. This only verifies that the configuration loads;
    # it does not send a request to the provider.
    try:
        config = get_config()
        health_status["components"]["llm"] = {
            "status": "healthy",
            "provider": config.provider,
        }
    except Exception as e:
        health_status["components"]["llm"] = {
            "status": "unhealthy",
            "error": str(e),
        }

    # Overall status: "degraded" if any component is unhealthy
    all_healthy = all(
        comp.get("status") == "healthy"
        for comp in health_status["components"].values()
    )
    health_status["status"] = "healthy" if all_healthy else "degraded"

    return health_status
```

### Logging Configuration

The `file` handler below writes to `/app/logs/app.log`, so that directory must exist and be writable by the container user.

```yaml
# logging.yaml
version: 1
disable_existing_loggers: false

formatters:
  standard:
    format: '%(asctime)s [%(levelname)s] %(name)s: %(message)s'
  json:
    format: '{"timestamp": "%(asctime)s", "level": "%(levelname)s", "logger": "%(name)s", "message": "%(message)s", "module": "%(module)s", "function": "%(funcName)s", "line": %(lineno)d}'

handlers:
  console:
    class: logging.StreamHandler
    level: INFO
    formatter: standard
    stream: ext://sys.stdout

  file:
    class: logging.handlers.RotatingFileHandler
    level: INFO
    formatter: json
    filename: /app/logs/app.log
    maxBytes: 10485760  # 10 MiB
    backupCount: 5

loggers:
  service:
    level: INFO
    handlers: [console, file]
    propagate: false

  uvicorn:
    level: INFO
    handlers: [console]
    propagate: false

root:
  level: INFO
  handlers: [console, file]
```

### Metrics Collection

```python
# metrics.py
# Assumes `app` is the FastAPI instance defined elsewhere in the service.
import time

from fastapi import Request, Response
from prometheus_client import (
    CONTENT_TYPE_LATEST,
    Counter,
    Gauge,
    Histogram,
    generate_latest,
)

# Metrics
REQUEST_COUNT = Counter('http_requests_total', 'Total HTTP requests', ['method', 'endpoint'])
REQUEST_DURATION = Histogram('http_request_duration_seconds', 'HTTP request duration')
ACTIVE_SESSIONS = Gauge('active_sessions_total', 'Number of active chat sessions')
TOOL_CALLS = Counter('tool_calls_total', 'Total tool calls', ['tool_name', 'status'])


@app.middleware("http")
async def metrics_middleware(request: Request, call_next):
    # Time the request and record it once the response is ready
    start_time = time.time()
    response = await call_next(request)
    duration = time.time() - start_time

    # Note: raw paths with IDs in them create one label value per distinct path
    REQUEST_COUNT.labels(
        method=request.method,
        endpoint=request.url.path
    ).inc()
    REQUEST_DURATION.observe(duration)

    return response


@app.get("/metrics")
async def get_metrics():
    # Serve metrics in the Prometheus text exposition format
    return Response(generate_latest(), media_type=CONTENT_TYPE_LATEST)
```

## Security Configuration

### Environment Variables Security

```bash
# Use a secrets management service in production
export OPENAI_API_KEY=$(aws secretsmanager get-secret-value --secret-id openai-key --query SecretString --output text)
export DATABASE_PASSWORD=$(az keyvault secret show --vault-name MyKeyVault --name db-password --query value -o tsv)
```

### Network Security

A network marked `internal: true` has no external connectivity. The database can sit on it alone, but the backend also needs a second, non-internal network so it can reach the LLM APIs and be reached by nginx.

```yaml
# docker-compose.prod.yml security additions
services:
  backend:
    networks:
      - backend-network
      - public-network  # example name; provides outbound access and the route from nginx
    deploy:
      resources:
        limits:
          memory: 2G
          cpus: '1.0'
        reservations:
          memory: 1G
          cpus: '0.5'

  postgres:
    networks:
      - backend-network
    # Only accessible from backend, not exposed publicly

networks:
  backend-network:
    driver: bridge
    internal: true  # Internal network only
  public-network:
    driver: bridge
```

### SSL/TLS Configuration

```bash
# Generate SSL certificates with Let's Encrypt
# (certbot writes fullchain.pem and privkey.pem under /etc/letsencrypt/live/yourdomain.com/)
certbot certonly --webroot -w /var/www/html -d yourdomain.com

# Or use existing certificates
cp /path/to/your/cert.pem /etc/nginx/ssl/
cp /path/to/your/key.pem /etc/nginx/ssl/
```

## Deployment Checklist

### Pre-deployment

- [ ] **Environment Variables**: All secrets configured in secure storage
- [ ] **Database**: PostgreSQL instance created and accessible
- [ ] **SSL Certificates**: Valid certificates for HTTPS
- [ ] **Resource Limits**: CPU/memory limits configured
- [ ] **Backup Strategy**: Database backup schedule configured

### Deployment

- [ ] **Docker Images**: Built and pushed to registry
- [ ] **Load Balancer**: Configured with health checks
- [ ] **Database Migration**: Schema initialized
- [ ] **Configuration**: Production config.yaml deployed
- [ ] **Monitoring**: Health checks and metrics collection active

### Post-deployment

- [ ] **Health Check**: All endpoints responding correctly
- [ ] **Load Testing**: System performance under load verified
- [ ] **Log Monitoring**: Error rates and performance logs reviewed
- [ ] **Security Scan**: Vulnerability assessment completed
- [ ] **Backup Verification**: Database backup/restore tested

## Troubleshooting Production Issues

### Common Deployment Issues

**1. Database Connection Failures**

```bash
# Check PostgreSQL connectivity
psql -h your-db-host -U username -d database_name -c "SELECT 1;"

# Verify connection string format
echo $DATABASE_URL
```

**2. Container Health Check Failures**

```bash
# Check container logs
docker logs container-name

# Test health endpoint manually
curl -f http://localhost:8000/health
```

**3. SSL Certificate Issues**

```bash
# Verify certificate validity
openssl x509 -in /etc/nginx/ssl/cert.pem -text -noout

# Check certificate expiration
openssl x509 -in /etc/nginx/ssl/cert.pem -noout -dates
```

**4. High Memory Usage**

```bash
# Monitor memory usage
docker stats

# Check for memory leaks
docker exec -it container-name top
```

### Performance Optimization

```yaml
# Production optimizations in config.yaml
app:
  memory_ttl_days: 3  # Shorter retention keeps less session data
  max_tool_loops: 3   # Limit computation

postgresql:
  pool_size: 20    # Connection pooling
  max_overflow: 0  # Never open connections beyond pool_size

llm:
  rag:
    max_context_length: 32000  # Reduce context window if needed
    temperature: 0.1           # More deterministic responses
```

---

This deployment guide covers the essentials of running the Agentic RAG system in production. For cloud providers or deployment scenarios not covered here, consult the provider's documentation and adapt these configurations accordingly.
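
The "Verify connection string format" step under Database Connection Failures relies on reading the raw `DATABASE_URL`, which also prints the password to the terminal. As a complement, here is a minimal sketch of a checker that parses the URL with Python's standard library, lists missing parts, and masks the password before printing. The function names `check_database_url` and `redact` are illustrative and not part of the Agentic RAG codebase, and the check only covers the `postgresql://user:password@host:port/dbname` shape used in this guide.

```python
import os
from urllib.parse import urlsplit


def check_database_url(url: str) -> list[str]:
    """Return a list of problems found in a PostgreSQL connection URL."""
    problems = []
    parts = urlsplit(url)
    if parts.scheme not in ("postgresql", "postgres"):
        problems.append(f"unexpected scheme: {parts.scheme!r}")
    if not parts.username:
        problems.append("missing user")
    if not parts.password:
        problems.append("missing password")
    if not parts.hostname:
        problems.append("missing host")
    if not parts.path.strip("/"):
        problems.append("missing database name")
    return problems


def redact(url: str) -> str:
    """Mask the password so the URL can be printed or logged."""
    parts = urlsplit(url)
    if not parts.password:
        return url
    netloc = parts.netloc.replace(f":{parts.password}@", ":***@", 1)
    return parts._replace(netloc=netloc).geturl()


if __name__ == "__main__":
    # Reads the same variable the backend container receives
    url = os.environ.get("DATABASE_URL", "")
    print(redact(url))
    for problem in check_database_url(url) or ["looks well-formed"]:
        print("-", problem)
```

A password containing an unescaped `/` is a common cause of reports such as "missing user" here, because the URL is cut at the first slash; percent-encoding the password resolves it. A well-formed URL can still fail to connect, so this supplements the `psql` connectivity test rather than replacing it.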