# Agentic RAG for Manufacturing Standards & Regulations

An Agentic RAG (Retrieval-Augmented Generation) application that helps enterprises answer questions about manufacturing standards and regulations. The system combines LangGraph orchestration, streaming responses, and authoritative document retrieval to provide grounded answers with citations.

## Overview

This project provides an AI assistant for manufacturing standards and regulatory compliance queries. An autonomous agent workflow retrieves relevant information from multiple sources, synthesizes an answer, and streams it back in real time with citations.

The system consists of a FastAPI backend that uses LangGraph for agent orchestration, PostgreSQL for persistent session memory, and a Next.js frontend built on assistant-ui components.

## ✨ Features

### Core Capabilities
- **🤖 Multi-Intent Agentic Workflow**: LangGraph v0.6-powered system with intelligent intent recognition and routing
- **🧠 Dual Agent System**: Specialized agents for standards/regulations and user manual queries
- **📡 Real-time Streaming**: Server-Sent Events (SSE) with token-by-token streaming and live tool execution updates
- **🔍 Advanced Retrieval System**: Two-phase search strategy with metadata and content chunk retrieval
- **📚 Smart Citation Management**: Automatic superscript citations [1] with dynamic source document mapping
- **💾 Persistent Memory**: PostgreSQL-based session storage with 7-day TTL and intelligent conversation trimming
- **🎨 Modern Web UI**: Next.js + assistant-ui components with responsive design and multi-language support

### Intelligence Features
- **🎯 Intent Classification**: Automatic routing between different knowledge domains (standards vs. user manuals)
- **🔄 Multi-Round Tool Execution**: Autonomous multi-step reasoning with parallel tool execution
- **🔗 Context-Aware Retrieval**: Query rewriting and enhancement based on conversation history
- **📊 Tool Progress Tracking**: Real-time visual feedback for ongoing retrieval operations
- **🌍 Multi-Language Support**: Browser language detection with URL parameter override

### Technical Features
- **🔌 AI SDK Compatibility**: Full support for AI SDK Data Stream Protocol and assistant-ui integration
- **🌐 Framework Agnostic**: RESTful API design compatible with any frontend framework
- **🔒 Production Ready**: Structured logging, comprehensive error handling, CORS support
- **🧪 Comprehensive Testing**: Unit tests, integration tests, and streaming response validation
- **🚀 Easy Deployment**: Docker support, environment-based configuration, health monitoring
- **⚡ Performance Optimized**: Efficient PostgreSQL connection pooling and memory management

## 🏗️ Architecture

### System Architecture

```
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Next.js Web   │    │     FastAPI      │    │   PostgreSQL    │
│ (assistant-ui)  │◄──►│   + LangGraph    │◄──►│  Session Store  │
│                 │    │     Backend      │    │                 │
└─────────────────┘    └──────────────────┘    └─────────────────┘
         │                       │                       │
         ▼                       ▼                       ▼
   User Interface        AI Agent Workflow       Persistent Memory
   - Thread Component    - Intent Recognition    - Conversation History
   - Tool UI Display     - Dual Agent System     - 7-day TTL
   - Streaming Updates   - Tool Orchestration    - Session Management
   - Citation Links      - Citation Generation   - Connection Pooling
```

### Multi-Intent Agent Workflow

```
[User Query] → [Intent Recognition] → [Route Decision]
              │                            │
              ▼                            ▼
 [Standards/Regulation RAG]        [User Manual RAG]
              │                            │
              ▼                            ▼
   [Multi-Phase Retrieval]      [Manual Content Search]
              │                            │
              ▼                            ▼
    [Citation Generation]           [Direct Answer]
              │                            │
              └─────► [Post Process] ◄─────┘
                            │
                            ▼
                  [Streaming Response]
```

### Enhanced Agent Workflow

The multi-intent workflow is built from four nodes:

1. **Intent Recognition Node**: Classifies user queries into appropriate domains
2. **Standard/Regulation RAG Agent**: Handles compliance and standards queries with two-phase retrieval
3. **User Manual RAG Agent**: Processes system usage and documentation queries
4. **Post Processing Node**: Formats final outputs with citations and tool summaries
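
The four nodes above can be pictured as a small dispatch pipeline. The sketch below is plain Python, not the project's LangGraph graph: the function names and the keyword-based classifier are hypothetical stand-ins (the real intent node presumably calls an LLM), and only the two intent labels, `Standard_Regulation_RAG` and `User_Manual_RAG`, come from this README.

```python
from typing import Callable, Dict

State = Dict[str, str]


def recognize_intent(state: State) -> State:
    # Hypothetical stand-in for the LLM-based classifier: a keyword check.
    query = state["query"].lower()
    intent = "User_Manual_RAG" if "catonline" in query else "Standard_Regulation_RAG"
    return {**state, "intent": intent}


def standard_regulation_agent(state: State) -> State:
    # Placeholder for the standards/regulations agent.
    return {**state, "answer": f"[standards] {state['query']} "}


def user_manual_agent(state: State) -> State:
    # Placeholder for the user manual agent.
    return {**state, "answer": f"[manual] {state['query']} "}


def post_process(state: State) -> State:
    # Placeholder for final formatting; here it only trims whitespace.
    return {**state, "answer": state["answer"].strip()}


# Route table: intent label -> agent node.
AGENTS: Dict[str, Callable[[State], State]] = {
    "Standard_Regulation_RAG": standard_regulation_agent,
    "User_Manual_RAG": user_manual_agent,
}


def run_workflow(query: str) -> State:
    state = recognize_intent({"query": query})
    state = AGENTS[state["intent"]](state)  # route decision
    return post_process(state)
```

Both branches converge on `post_process`, mirroring the diagram in which the two agents feed a single Post Process step.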

### Configuration Management
- **Dual Configuration**:
  - `config.yaml`: Core application settings (database, API, logging, retrieval endpoints)
  - `llm_prompt.yaml`: LLM parameters and specialized prompt templates for each agent
- **Environment Variables**: Sensitive settings loaded from environment with fallback defaults
- **Type Safety**: Pydantic models for configuration validation and runtime checks
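
As an illustration of loading sensitive settings from the environment with a fallback, the sketch below expands `${VAR}` placeholders of the kind used in `config.yaml` (for example `"${OPENAI_API_KEY}"`). The helper name `expand_env` and its behaviour are assumptions made for this example; the project's own loader in `service/config.py` may work differently.

```python
import os
import re
from typing import Any, Mapping

# Matches ${NAME} placeholders such as "${OPENAI_API_KEY}".
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: Any, env: Mapping[str, str] = os.environ, default: str = "") -> Any:
    """Recursively replace ${NAME} in strings with env[NAME], or `default` if unset."""
    if isinstance(value, dict):
        return {k: expand_env(v, env, default) for k, v in value.items()}
    if isinstance(value, list):
        return [expand_env(v, env, default) for v in value]
    if isinstance(value, str):
        return _PLACEHOLDER.sub(lambda m: env.get(m.group(1), default), value)
    return value  # numbers, booleans, None pass through unchanged
```

Passing `env` explicitly keeps the function easy to test; in normal use it falls back to `os.environ`.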

### Tool System Architecture
- **Modular Design**: Tool definitions in `service/graph/tools.py` and `service/graph/user_manual_tools.py`
- **Parallel Execution**: Multiple tools execute concurrently via `asyncio.gather` for optimal performance
- **Schema Generation**: Automatic tool schema generation for LLM function calling
- **Error Handling**: Robust error handling with detailed logging and graceful degradation
- **Context Injection**: Tools receive conversation context for enhanced query understanding
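
A minimal sketch of concurrent tool execution with `asyncio.gather`, combined with graceful degradation when a single tool fails. The function name `run_tools_in_parallel` and the shape of the `calls` list are assumptions for illustration; the result keys loosely follow the `tool_result` and `tool_error` events described in the API reference.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

ToolFn = Callable[..., Awaitable[Any]]


async def run_tools_in_parallel(
    tools: Dict[str, ToolFn], calls: List[Dict[str, Any]]
) -> List[Dict[str, Any]]:
    """Run every requested tool call concurrently; a failing tool does not abort the rest."""
    coros = [tools[call["name"]](**call["args"]) for call in calls]
    # return_exceptions=True turns raised exceptions into return values,
    # so one failing tool cannot cancel the others.
    outcomes = await asyncio.gather(*coros, return_exceptions=True)

    results: List[Dict[str, Any]] = []
    for call, outcome in zip(calls, outcomes):
        base = {"id": call["id"], "name": call["name"]}
        if isinstance(outcome, Exception):
            results.append({**base, "error": str(outcome)})
        else:
            results.append({**base, "results": outcome})
    return results
```

`asyncio.gather` preserves input order, so each outcome can be zipped back to the call that produced it.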

### Key Components

- **🎯 Intent Recognition Node**: Intelligent classification of user queries into appropriate knowledge domains
- **🤖 Standards/Regulation Agent**: Autonomous agent with two-phase retrieval strategy and citation generation
- **📖 User Manual Agent**: Specialized agent for system documentation and usage guidance queries
- **🔧 Advanced Retrieval Tools**: HTTP wrappers for multiple search APIs with conversation context injection
- **📝 Post Processing Node**: Formats final outputs with citations, tool summaries, and system disclaimers
- **💽 PostgreSQL Memory**: Persistent session storage with connection pooling and automatic cleanup
- **📊 Streaming Response**: AI SDK compatible SSE events with comprehensive tool progress tracking
- **🌍 Multi-Language UI**: Browser language detection with URL parameter override and localized content

## 📁 Codebase Structure

```
agentic-rag-4/
├── 📋 config.yaml                  # Main application configuration
├── 🎯 llm_prompt.yaml              # LLM parameters and prompt templates
├── 🐍 pyproject.toml               # Python dependencies and project metadata
├── ⚙️ Makefile                     # Build automation and development commands
└── 📜 scripts/                     # Service management scripts
    ├── start_service.sh            # Service startup script
    ├── stop_service.sh             # Service shutdown script
    └── port_manager.sh             # Port management utilities

Backend (Python/FastAPI/LangGraph):
├── 🔧 service/                     # Main backend service
    ├── main.py                     # FastAPI application entry point
    ├── config.py                   # Configuration management
    ├── ai_sdk_chat.py              # AI SDK compatible chat endpoint
    ├── ai_sdk_adapter.py           # Data Stream Protocol adapter
    ├── llm_client.py               # LLM provider abstractions
    ├── sse.py                      # Server-Sent Events utilities
    ├── 🧠 graph/                   # LangGraph agent workflow
    │   ├── graph.py                # Multi-intent agent workflow definition
    │   ├── state.py                # Agent state management
    │   ├── intent_recognition.py   # Query intent classification
    │   ├── tools.py                # Standard/regulation retrieval tools
    │   ├── user_manual_rag.py      # User manual agent workflow
    │   ├── user_manual_tools.py    # User manual retrieval tools
    │   └── message_trimmer.py      # Conversation context management
    ├── 💾 memory/                  # Session memory implementations
    │   ├── postgresql_memory.py    # PostgreSQL session persistence
    │   └── store.py                # Memory store abstractions
    ├── 🔍 retrieval/               # Information retrieval tools
    │   └── agentic_retrieval.py    # Enhanced search tools with context
    ├── 📋 schemas/                 # Data models and validation
    │   └── messages.py             # Chat message schemas
    └── 🛠️ utils/                   # Shared utilities
        ├── logging.py              # Structured logging
        ├── templates.py            # Prompt templates
        └── error_handler.py        # Error handling utilities

Frontend (Next.js/React/assistant-ui):
├── 🌐 web/                         # Next.js web application
    ├── src/app/                    # App router structure
    │   ├── page.tsx                # Main chat interface with multi-language support
    │   ├── layout.tsx              # Application layout and metadata
    │   ├── globals.css             # Global styles + assistant-ui theming
    │   └── api/                    # API routes (Server-side)
    │       ├── chat/route.ts       # Chat API proxy to backend
    │       └── langgraph/          # LangGraph API proxy for assistant-ui
    ├── public/                     # Static assets
    │   ├── legal-document.png      # Standard/regulation tool icon
    │   ├── search.png              # Content search tool icon
    │   └── user-guide.png          # User manual tool icon
    ├── package.json                # Frontend dependencies
    ├── tailwind.config.ts          # Tailwind + assistant-ui configuration
    └── next.config.ts              # Next.js configuration

Testing & Documentation:
├── 🧪 tests/                       # Test suite
    ├── unit/                       # Unit tests
    └── integration/                # Integration and E2E tests
└── 📚 docs/                        # Documentation
    ├── CHANGELOG.md                # Version history and changes
    ├── deployment.md               # Deployment guide
    ├── development.md              # Development setup
    └── testing.md                  # Testing guide
```

## 🚀 Quick Start

### Prerequisites

- **Python 3.12+** - Required for backend service
- **Node.js 18+** - Required for frontend development
- **uv** - Rust-based Python package manager ([Install uv](https://github.com/astral-sh/uv))
- **npm/pnpm** - Node.js package manager
- **PostgreSQL** - Database for session persistence (Azure Database for PostgreSQL recommended)
- **LLM API Access** - OpenAI API key or Azure OpenAI credentials
- **Retrieval API Access** - Access to the manufacturing standards retrieval service

### 1. Installation

```bash
# Clone the repository
git clone <repository-url>
cd agentic-rag-4

# Install all dependencies (backend + frontend)
make install

# Alternative: Install manually
uv sync                    # Backend dependencies
cd web && npm install      # Frontend dependencies
```

### 2. Configuration

The application uses two main configuration files:

```bash
# Copy and edit configuration files
cp config.yaml config.local.yaml          # Main app configuration
cp llm_prompt.yaml llm_prompt.local.yaml  # LLM settings and prompts

# Required environment variables
export OPENAI_API_KEY="your-openai-api-key"
export RETRIEVAL_API_KEY="your-retrieval-api-key"

# For Azure OpenAI (optional)
export AZURE_OPENAI_API_KEY="your-azure-key"
```

**Edit `config.yaml` (Application Configuration)**:
```yaml
app:
  name: agentic-rag
  max_tool_rounds: 3
  memory_ttl_days: 7
  port: 8000

provider: openai  # or "azure"

openai:
  api_key: "${OPENAI_API_KEY}"
  base_url: "https://api.openai.com/v1"
  model: "gpt-4o"

retrieval:
  endpoint: "your-retrieval-endpoint"
  api_key: "${RETRIEVAL_API_KEY}"

search:
  standard_regulation_index: "index-standards"
  chunk_index: "index-chunks"
  chunk_user_manual_index: "index-manuals"

postgresql:
  host: "localhost"
  database: "agent_memory"
  username: "your-username"
  password: "your-password"
  ttl_days: 7

citation:
  base_url: "https://your-citation-base-url"
```
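
To make the `memory_ttl_days` / `ttl_days` settings concrete, here is a minimal sketch of how a 7-day expiry check could be evaluated. The function `is_expired` and the idea of comparing against a session's last-activity timestamp are assumptions for illustration; the actual cleanup logic lives in the PostgreSQL memory layer and is not shown in this README.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_expired(last_active: datetime, ttl_days: int = 7, now: Optional[datetime] = None) -> bool:
    """Return True when a session has been idle for longer than ttl_days."""
    current = now if now is not None else datetime.now(timezone.utc)
    return current - last_active > timedelta(days=ttl_days)
```

Using timezone-aware UTC timestamps avoids off-by-hours errors when the database and the service run in different time zones.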

**Edit `llm_prompt.yaml` (LLM Parameters & Prompts)**:
```yaml
parameters:
  temperature: 0
  max_context_length: 100000

prompts:
  # Custom agent prompt for standards/regulations
  agent_system_prompt: |
    You are an Agentic RAG assistant for the CATOnline system...

  # Intent classification prompt
  intent_recognition_system_prompt: |
    You are an intent classifier for the CATOnline system...

  # User manual assistant prompt
  user_manual_system_prompt: |
    You are a specialized assistant for CATOnline user manual queries...
```
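
The `max_context_length` parameter bounds how much conversation history is sent to the model. The sketch below shows one way such trimming could work: keep system messages, then keep the most recent turns that fit. It is an assumption-laden illustration, not the logic of `message_trimmer.py`; in particular it counts characters, whereas the real limit may well be measured in tokens.

```python
from typing import Dict, List

Message = Dict[str, str]


def trim_messages(messages: List[Message], max_context_length: int) -> List[Message]:
    """Keep all system messages plus the newest non-system messages within the budget."""
    system = [m for m in messages if m["role"] == "system"]
    budget = max_context_length - sum(len(m["content"]) for m in system)

    kept: List[Message] = []
    # Walk from newest to oldest and stop at the first message that does not fit.
    for msg in reversed([m for m in messages if m["role"] != "system"]):
        cost = len(msg["content"])  # character count as a stand-in for tokens
        if cost > budget:
            break
        kept.append(msg)
        budget -= cost
    return system + list(reversed(kept))
```

Stopping at the first message that does not fit keeps the retained history contiguous, so the model never sees a reply without the turn that preceded it.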

### 3. Development Mode (Recommended)

```bash
# Option 1: Start both services simultaneously
make dev

# Option 2: Start services separately
make dev-backend    # Backend with auto-reload
make dev-web        # Frontend development server

# Check service status
make status
make health
```

**Service URLs:**
- **Backend API**: http://localhost:8000
- **Frontend**: http://localhost:3000
- **API Docs**: http://localhost:8000/docs

### 4. Production Mode

```bash
# Start backend service
make start          # Foreground mode
make start-bg       # Background mode

# Stop service
make stop

# Restart service
make restart

# Build and serve frontend
cd web
npm run build
npm start
```

### 5. Testing & Validation

```bash
# Run all tests
make test

# Run specific test suites
make test-unit          # Unit tests
make test-integration   # Integration tests
make test-e2e           # End-to-end tests

# Check service health
make health

# View service logs
make logs
```

## 📡 API Reference

### Chat Endpoints

#### Primary Chat API (SSE Format)
**POST** `/api/chat`

Traditional Server-Sent Events format for custom integrations:

```json
{
  "session_id": "session_abc123_1640995200000",
  "messages": [
    {"role": "user", "content": "What are the vehicle safety testing standards for electric vehicles?"}
  ],
  "client_hints": {}
}
```
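
A minimal sketch of preparing this request from Python with only the standard library. The helper names are hypothetical, and the `Accept: text/event-stream` header is an assumption about what the backend expects; the payload fields mirror the JSON body above.

```python
import json
import urllib.request
from typing import Any, Dict


def build_chat_payload(question: str, session_id: str) -> Dict[str, Any]:
    """Build the JSON body for POST /api/chat with a single user message."""
    return {
        "session_id": session_id,
        "messages": [{"role": "user", "content": question}],
        "client_hints": {},
    }


def make_chat_request(base_url: str, payload: Dict[str, Any]) -> urllib.request.Request:
    """Create (but do not send) the HTTP request for the chat endpoint."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "text/event-stream"},
        method="POST",
    )
```

The returned request can be passed to `urllib.request.urlopen`, and the response body read line by line to consume the event stream.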

#### AI SDK Compatible API (Data Stream Protocol)
**POST** `/api/ai-sdk/chat`

Compatible with AI SDK and assistant-ui frontend:

```json
{
  "messages": [
    {"role": "user", "content": "What are the vehicle safety testing standards for electric vehicles?"}
  ],
  "session_id": "session_abc123_1640995200000",
  "metadata": {
    "source": "assistant-ui",
    "version": "0.11.0",
    "timestamp": "2025-01-01T12:00:00Z"
  }
}
```

### Response Format

**SSE Events (`/api/chat`)**:
```
event: tool_start
data: {"id":"tool_123","name":"retrieve_standard_regulation","args":{"query":"vehicle safety testing standards electric vehicles"}}

event: tokens
data: {"delta":"Based on the retrieved standards","tool_call_id":null}

event: tool_result
data: {"id":"tool_123","name":"retrieve_standard_regulation","results":[...],"took_ms":234}

event: agent_done
data: {"answer_done":true}

event: post_append_1
data: {"answer":"Vehicle safety testing for electric vehicles [1] involves...","citations_mapping_csv":"1,SRC-ISO26262\n2,SRC-UN38.3"}
```
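
For custom integrations, the stream above can be consumed with a small parser. The sketch below handles the `event:` / `data:` framing shown in the example and assumes every `data:` payload is JSON, as it is in the sample; `parse_sse` is a hypothetical helper, not part of the project.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (event_name, decoded_json) pairs from an iterable of SSE text lines."""
    event, data = "message", []  # "message" is the SSE default event name
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].lstrip(" "))
        # Other lines (comments starting with ":", id:, retry:) are ignored here.
    if data:
        # Flush a final event that was not followed by a blank line.
        yield event, json.loads("\n".join(data))


def collect_answer(events: Iterable[Tuple[str, Dict[str, Any]]]) -> str:
    """Concatenate the `delta` fields of all `tokens` events."""
    parts: List[str] = [d.get("delta", "") for name, d in events if name == "tokens"]
    return "".join(parts)
```

`collect_answer(parse_sse(lines))` rebuilds the streamed text, while other event names can drive tool-progress UI.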

**Data Stream Protocol (`/api/ai-sdk/chat`)**:
```
0:{"id":"msg_001","role":"assistant","content":[{"type":"text","text":"Based on the retrieved standards"}]}
1:{"type":"tool_call","tool_call_id":"tool_123","name":"retrieve_standard_regulation","args":{"query":"vehicle safety testing"}}
2:{"type":"tool_result","tool_call_id":"tool_123","result":{"results":[...],"took_ms":234}}
```
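
Each line in this format is a short prefix, a colon, and a JSON payload. The sketch below splits a line on its first colon and decodes the remainder; it is based only on the sample above, so the meaning of each prefix is left to the caller, and `parse_stream_line` is a hypothetical helper.

```python
import json
from typing import Any, Tuple


def parse_stream_line(line: str) -> Tuple[str, Any]:
    """Split '<prefix>:<json>' into (prefix, decoded_payload)."""
    prefix, sep, payload = line.strip().partition(":")
    if not sep or not prefix:
        raise ValueError(f"not a data stream line: {line!r}")
    return prefix, json.loads(payload)
```

Using `partition` rather than `split` matters because the JSON payload itself contains colons.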

### Utility Endpoints

#### Health Check
**GET** `/health`
```json
{
  "status": "healthy",
  "service": "agentic-rag"
}
```

#### API Information
**GET** `/`
```json
{
  "message": "Agentic RAG API for Manufacturing Standards & Regulations"
}
```

### Available Tools

The system provides specialized tools for different knowledge domains:

#### Standards & Regulations Tools
1. **`retrieve_standard_regulation`** - Search standard/regulation metadata and attributes
2. **`retrieve_doc_chunk_standard_regulation`** - Search document content chunks

#### User Manual Tools
3. **`retrieve_system_usermanual`** - Search CATOnline system documentation and user guides

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `query` | string | ✅ | Search query text |
| `conversation_history` | string | ❌ | Previous conversation context |
| `top_k` | integer | ❌ | Maximum results (default: 10) |
| `score_threshold` | float | ❌ | Minimum relevance score |
| `gen_rerank` | boolean | ❌ | Enable reranking (default: true) |
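
The parameter table can be mirrored in a small typed container. The sketch below uses a standard-library dataclass for illustration (the project itself mentions Pydantic for validation); the class name `RetrievalArgs` and the validation rules are assumptions, while field names and defaults follow the table.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class RetrievalArgs:
    query: str                                   # required
    conversation_history: Optional[str] = None   # optional context
    top_k: int = 10                              # default from the table
    score_threshold: Optional[float] = None      # optional minimum score
    gen_rerank: bool = True                      # default from the table

    def __post_init__(self) -> None:
        if not self.query.strip():
            raise ValueError("query is required")
        if self.top_k < 1:
            raise ValueError("top_k must be a positive integer")

    def to_payload(self) -> Dict[str, Any]:
        """Serialize to a dict, omitting optional fields that were not set."""
        return {k: v for k, v in asdict(self).items() if v is not None}
```

Omitting unset optional fields keeps the request minimal and lets the retrieval service apply its own defaults.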

### Event Types Reference

| Event Type | Data Fields | Description |
|------------|-------------|-------------|
| `tokens` | `delta`, `tool_call_id` | LLM token stream |
| `tool_start` | `id`, `name`, `args` | Tool execution begins |
| `tool_result` | `id`, `name`, `results`, `took_ms` | Tool execution complete |
| `tool_error` | `id`, `name`, `error` | Tool execution failed |
| `agent_done` | `answer_done` | Agent processing complete |
| `intent_classification` | `intent`, `confidence` | Query intent classification result |
| `citations` | `citations_list` | Final formatted citation list |
| `tool_summary` | `summary` | Tool execution summary |
| `error` | `error`, `details` | System error occurred |
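
The `post_append_1` sample in the Response Format section carries a `citations_mapping_csv` field that maps citation numbers to source identifiers. A minimal sketch of resolving the `[n]` markers in an answer against that mapping follows; both helper names are hypothetical, and the two-column `number,source_id` layout is inferred from the sample only.

```python
import csv
import io
import re
from typing import Dict, List


def parse_citation_mapping(mapping_csv: str) -> Dict[int, str]:
    """Parse 'number,source_id' rows into {number: source_id}."""
    rows = csv.reader(io.StringIO(mapping_csv))
    return {int(row[0]): row[1] for row in rows if len(row) >= 2}


def cited_sources(answer: str, mapping: Dict[int, str]) -> List[str]:
    """Return source ids for the [n] markers that appear in the answer, in numeric order."""
    numbers = sorted({int(n) for n in re.findall(r"\[(\d+)\]", answer)})
    return [mapping[n] for n in numbers if n in mapping]
```

Markers without a mapping entry are skipped rather than raising, so a partially mapped answer still renders.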

### Multi-Intent Workflow Events

Intent-based routing produces a different event stream for each path:

- **Standards/Regulation Queries**: Full tool execution with citation generation
- **User Manual Queries**: Streamlined documentation search with direct answers
- **Intent Classification**: Real-time feedback on query routing decisions

## 🧠 Multi-Intent System

The application features an intelligent intent recognition system that automatically routes user queries to specialized agents.

### Intent Classification

The system analyzes user queries and conversation context to determine the appropriate processing path:

1. **Standard_Regulation_RAG**: For compliance, standards, and regulatory queries
   - Two-phase retrieval strategy (metadata → content chunks)
   - Enhanced citation generation with document linking
   - Multi-round tool execution for comprehensive answers

2. **User_Manual_RAG**: For system documentation and usage questions
   - Direct documentation search and retrieval
   - Streamlined processing for faster responses
   - Context-aware help and guidance
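
One possible reading of the two-phase strategy (metadata → content chunks) is sketched below: a metadata search first narrows the candidate documents, then a chunk search is restricted to those documents. The function, the `id` / `doc_id` field names, and the filtering step are all assumptions for illustration; the real agent decides tool calls dynamically through the LLM.

```python
from typing import Callable, Dict, List

Search = Callable[[str], List[Dict[str, str]]]


def two_phase_retrieve(
    query: str, search_metadata: Search, search_chunks: Search, max_docs: int = 3
) -> Dict[str, List[Dict[str, str]]]:
    """Phase 1: find candidate documents. Phase 2: keep only chunks from those documents."""
    documents = search_metadata(query)[:max_docs]
    doc_ids = {doc["id"] for doc in documents}
    chunks = [c for c in search_chunks(query) if c["doc_id"] in doc_ids]
    return {"documents": documents, "chunks": chunks}
```

Passing the two search functions as arguments keeps the sketch independent of any particular retrieval backend.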

### Query Examples

**Standards/Regulation Queries:**
- "最新的电动汽车锂电池标准?" (Latest lithium battery standards for electric vehicles?)
- "如何测试电动汽车的充电性能?" (How to test electric vehicle charging performance?)
- "提供关于车辆通讯安全的法规" (Provide vehicle communication security regulations)

**User Manual Queries:**
- "How do I use CATOnline system?"
- "What are the search features available?"
- "How to export search results?"

### Enhanced Features

- **Context Preservation**: Session memory maintained across intent switches
- **Language Detection**: Automatic language handling for Chinese/English queries
- **Visual Feedback**: Real-time UI updates showing intent classification and tool progress
- **Error Recovery**: Graceful handling of classification uncertainties

---

## 📚 Documentation

For detailed information, see the documentation in the `docs/` directory:

- **[📋 Deployment Guide](docs/deployment.md)** - Production deployment instructions
- **[💻 Development Guide](docs/development.md)** - Development setup and guidelines
- **[🧪 Testing Guide](docs/testing.md)** - Testing procedures and best practices
- **[📝 Changelog](docs/CHANGELOG.md)** - Version history and release notes

## 🤝 Contributing

We welcome contributions! Please see our [Development Guide](docs/development.md) for details on:

- Setting up the development environment
- Code style and formatting guidelines
- Running tests and quality checks
- Submitting pull requests

### Quick Contribution Setup

```bash
# Fork the repository and clone your fork
git clone https://github.com/your-username/agentic-rag-4.git
cd agentic-rag-4

# Install development dependencies
make install
uv sync --dev

# Run tests to ensure everything works
make test

# Create a feature branch
git checkout -b feature/amazing-feature

# Make your changes and test
make test
make lint

# Commit and push
git commit -m "Add amazing feature"
git push origin feature/amazing-feature
```

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## 🙋‍♀️ Support

- **📖 Documentation**: Check this README and the `docs/` directory
- **🐛 Issues**: [Open a GitHub issue](https://github.com/your-repo/issues) for bugs or feature requests
- **💬 Discussions**: Use [GitHub Discussions](https://github.com/your-repo/discussions) for questions

---

**Built with ❤️ using FastAPI, LangGraph, Next.js, and assistant-ui**