Ye Shijie db0e5965ec init

2025-09-26 17:15:54 +08:00

21 KiB

Agentic RAG for Manufacturing Standards & Regulations

An advanced Agentic RAG (Retrieval-Augmented Generation) application that helps enterprises answer questions about manufacturing standards and regulations. The system combines LangGraph orchestration, streaming responses, and authoritative document retrieval to provide grounded answers with proper citations.

Overview

This project provides an AI assistant for manufacturing standards and regulatory compliance queries. An autonomous agent workflow retrieves relevant information from multiple sources, synthesizes an answer, and streams it back in real time with citations.

The system consists of a FastAPI backend powered by LangGraph for agent orchestration, PostgreSQL for persistent session memory, and a modern Next.js frontend using assistant-ui components for an optimal user experience.

✨ Features

Core Capabilities

🤖 Multi-Intent Agentic Workflow: LangGraph v0.6-powered system with intelligent intent recognition and routing
🧠 Dual Agent System: Specialized agents for standards/regulations and user manual queries
📡 Real-time Streaming: Server-Sent Events (SSE) with token-by-token streaming and live tool execution updates
🔍 Advanced Retrieval System: Two-phase search strategy with metadata and content chunk retrieval
📚 Smart Citation Management: Automatic superscript citations [1] with dynamic source document mapping
💾 Persistent Memory: PostgreSQL-based session storage with 7-day TTL and intelligent conversation trimming
🎨 Modern Web UI: Next.js + assistant-ui components with responsive design and multi-language support

Intelligence Features

🎯 Intent Classification: Automatic routing between different knowledge domains (standards vs. user manuals)
🔄 Multi-Round Tool Execution: Autonomous multi-step reasoning with parallel tool execution
🔗 Context-Aware Retrieval: Query rewriting and enhancement based on conversation history
📊 Tool Progress Tracking: Real-time visual feedback for ongoing retrieval operations
🌍 Multi-Language Support: Browser language detection with URL parameter override

Technical Features

🔌 AI SDK Compatibility: Full support for AI SDK Data Stream Protocol and assistant-ui integration
🌐 Framework Agnostic: RESTful API design compatible with any frontend framework
🔒 Production Ready: Structured logging, comprehensive error handling, CORS support
🧪 Comprehensive Testing: Unit tests, integration tests, and streaming response validation
🚀 Easy Deployment: Docker support, environment-based configuration, health monitoring
⚡ Performance Optimized: Efficient PostgreSQL connection pooling and memory management

🏗️ Architecture

System Architecture

┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Next.js Web   │    │   FastAPI        │    │  PostgreSQL     │
│  (assistant-ui)  │◄──►│   + LangGraph    │◄──►│  Session Store  │
│                 │    │   Backend        │    │                 │
└─────────────────┘    └──────────────────┘    └─────────────────┘
       │                        │                        │
       ▼                        ▼                        ▼
   User Interface         AI Agent Workflow         Persistent Memory
 - Thread Component     - Intent Recognition       - Conversation History
 - Tool UI Display      - Dual Agent System        - 7-day TTL
 - Streaming Updates    - Tool Orchestration       - Session Management
 - Citation Links       - Citation Generation      - Connection Pooling

Multi-Intent Agent Workflow

[User Query] → [Intent Recognition] → [Route Decision]
                        │                    │
                        ▼                    ▼
            [Standards/Regulation RAG]  [User Manual RAG]
                        │                    │
                        ▼                    ▼
         [Multi-Phase Retrieval]     [Manual Content Search]
                        │                    │
                        ▼                    ▼
              [Citation Generation]  [Direct Answer]
                        │                    │
                        └─────► [Post Process] ◄─────┘
                                     │
                                     ▼
                             [Streaming Response]

Enhanced Agent Workflow

The multi-intent workflow is built from four components:

Intent Recognition Node: Classifies user queries into appropriate domains
Standard/Regulation RAG Agent: Handles compliance and standards queries with two-phase retrieval
User Manual RAG Agent: Processes system usage and documentation queries
Post Processing Node: Formats final outputs with citations and tool summaries
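
The flow through these four components can be sketched in plain Python. This is a minimal illustration, not the project's LangGraph graph: the keyword-based classifier, the stub agents, and every function name here are assumptions made for the example. Only the two intent labels (`Standard_Regulation_RAG`, `User_Manual_RAG`) come from this README.

```python
from typing import Any, Callable, Dict

State = Dict[str, Any]


def recognize_intent(state: State) -> State:
    """Toy classifier: a keyword check standing in for the LLM-based node."""
    query = str(state["query"]).lower()
    is_manual = "catonline" in query or "how do i" in query
    intent = "User_Manual_RAG" if is_manual else "Standard_Regulation_RAG"
    return {**state, "intent": intent}


def standard_regulation_agent(state: State) -> State:
    # Stub: a real agent would call retrieval tools and cite sources.
    return {**state, "answer": "standards answer [1]", "citations": {1: "SRC-EXAMPLE"}}


def user_manual_agent(state: State) -> State:
    # Stub: a real agent would search the user manual index.
    return {**state, "answer": "manual answer", "citations": {}}


def post_process(state: State) -> State:
    """Append a numbered source list when the answer carries citations."""
    sources = [f"[{n}] {src}" for n, src in sorted(state["citations"].items())]
    final = state["answer"]
    if sources:
        final += "\n\n" + "\n".join(sources)
    return {**state, "final": final}


# Route table: intent label -> agent function.
AGENTS: Dict[str, Callable[[State], State]] = {
    "Standard_Regulation_RAG": standard_regulation_agent,
    "User_Manual_RAG": user_manual_agent,
}


def run_workflow(query: str) -> State:
    state = recognize_intent({"query": query})
    state = AGENTS[state["intent"]](state)
    return post_process(state)
```

In LangGraph the same shape would typically be expressed as nodes plus a conditional edge after the intent node; the dictionary lookup above plays that role here.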

Configuration Management

Dual Configuration:
- config.yaml: Core application settings (database, API, logging, retrieval endpoints)
- llm_prompt.yaml: LLM parameters and specialized prompt templates for each agent
Environment Variables: Sensitive settings loaded from environment with fallback defaults
Type Safety: Pydantic models for configuration validation and runtime checks
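
As a rough illustration of the two ideas above (environment placeholders with fallback defaults, and validated settings), the sketch below uses only the standard library. It is not the project's `service/config.py`: the project uses Pydantic, whereas this example substitutes a dataclass, and the names `expand_env` and `AppSettings` are hypothetical. The default values mirror the sample `config.yaml` in this README.

```python
import os
import re
from dataclasses import dataclass
from typing import Any, Mapping

# Matches ${VAR_NAME} placeholders such as "${OPENAI_API_KEY}".
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: Any, env: Mapping[str, str] = os.environ, default: str = "") -> Any:
    """Recursively replace ${VAR} placeholders in strings, dicts, and lists."""
    if isinstance(value, str):
        return _PLACEHOLDER.sub(lambda m: env.get(m.group(1), default), value)
    if isinstance(value, dict):
        return {k: expand_env(v, env, default) for k, v in value.items()}
    if isinstance(value, list):
        return [expand_env(v, env, default) for v in value]
    return value


@dataclass(frozen=True)
class AppSettings:
    """Dataclass stand-in for a Pydantic model of the `app:` section."""
    name: str
    port: int = 8000
    max_tool_rounds: int = 3
    memory_ttl_days: int = 7

    def __post_init__(self) -> None:
        if not 0 < self.port < 65536:
            raise ValueError(f"port out of range: {self.port}")
        if self.max_tool_rounds < 1:
            raise ValueError("max_tool_rounds must be at least 1")
```

For example, `expand_env({"api_key": "${OPENAI_API_KEY}"})` returns the dictionary with the placeholder replaced by the current value of that environment variable, or by `default` when it is unset.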

Tool System Architecture

Modular Design: Tool definitions in service/graph/tools.py and service/graph/user_manual_tools.py
Parallel Execution: Multiple tools execute concurrently via asyncio.gather for optimal performance
Schema Generation: Automatic tool schema generation for LLM function calling
Error Handling: Robust error handling with detailed logging and graceful degradation
Context Injection: Tools receive conversation context for enhanced query understanding
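
The parallel execution and graceful degradation described above might look roughly like the following. The two tool coroutines are stand-ins invented for this example (one succeeds, one fails on purpose), and `run_tools` is a hypothetical helper, not a function from `service/graph/tools.py`. The only real API relied on is `asyncio.gather` with `return_exceptions=True`.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

ToolFn = Callable[[str], Awaitable[List[str]]]


async def search_metadata(query: str) -> List[str]:
    # Stand-in for an HTTP call to a retrieval API.
    await asyncio.sleep(0)
    return [f"metadata hit for {query!r}"]


async def search_chunks(query: str) -> List[str]:
    # Simulates a failing backend so the error path is visible.
    raise RuntimeError("retrieval backend unavailable")


async def run_tools(calls: Dict[str, ToolFn], query: str) -> Dict[str, Dict[str, Any]]:
    """Run all tools concurrently; a failing tool does not cancel the others."""
    names = list(calls)
    outcomes = await asyncio.gather(
        *(calls[name](query) for name in names),
        return_exceptions=True,  # exceptions are returned as values, not raised
    )
    report: Dict[str, Dict[str, Any]] = {}
    for name, outcome in zip(names, outcomes):
        if isinstance(outcome, BaseException):
            report[name] = {"ok": False, "error": str(outcome)}
        else:
            report[name] = {"ok": True, "results": outcome}
    return report
```

With `return_exceptions=True`, a failed tool is reported alongside the successful ones, so the agent can still answer from partial results.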

Key Components

🎯 Intent Recognition Node: Intelligent classification of user queries into appropriate knowledge domains
🤖 Standards/Regulation Agent: Autonomous agent with two-phase retrieval strategy and citation generation
📖 User Manual Agent: Specialized agent for system documentation and usage guidance queries
🔧 Advanced Retrieval Tools: HTTP wrappers for multiple search APIs with conversation context injection
📝 Post Processing Node: Formats final outputs with citations, tool summaries, and system disclaimers
💽 PostgreSQL Memory: Persistent session storage with connection pooling and automatic cleanup
📊 Streaming Response: AI SDK compatible SSE events with comprehensive tool progress tracking
🌍 Multi-Language UI: Browser language detection with URL parameter override and localized content

📁 Codebase Structure

agentic-rag-4/
├── 📋 config.yaml              # Main application configuration
├── 🎯 llm_prompt.yaml          # LLM parameters and prompt templates
├── 🐍 pyproject.toml           # Python dependencies and project metadata
├── ⚙️ Makefile                 # Build automation and development commands
└── 📜 scripts/                 # Service management scripts
    ├── start_service.sh        # Service startup script
    ├── stop_service.sh         # Service shutdown script
    └── port_manager.sh         # Port management utilities

Backend (Python/FastAPI/LangGraph):
├── 🔧 service/                 # Main backend service
    ├── main.py                 # FastAPI application entry point
    ├── config.py               # Configuration management
    ├── ai_sdk_chat.py          # AI SDK compatible chat endpoint
    ├── ai_sdk_adapter.py       # Data Stream Protocol adapter
    ├── llm_client.py           # LLM provider abstractions
    ├── sse.py                  # Server-Sent Events utilities
    ├── 🧠 graph/               # LangGraph agent workflow
    │   ├── graph.py            # Multi-intent agent workflow definition
    │   ├── state.py            # Agent state management
    │   ├── intent_recognition.py # Query intent classification
    │   ├── tools.py            # Standard/regulation retrieval tools
    │   ├── user_manual_rag.py  # User manual agent workflow
    │   ├── user_manual_tools.py # User manual retrieval tools
    │   └── message_trimmer.py  # Conversation context management
    ├── 💾 memory/              # Session memory implementations
    │   ├── postgresql_memory.py # PostgreSQL session persistence
    │   └── store.py            # Memory store abstractions
    ├── 🔍 retrieval/           # Information retrieval tools
    │   └── agentic_retrieval.py # Enhanced search tools with context
    ├── 📋 schemas/             # Data models and validation
    │   └── messages.py         # Chat message schemas
    └── 🛠️ utils/               # Shared utilities
        ├── logging.py          # Structured logging
        ├── templates.py        # Prompt templates
        └── error_handler.py    # Error handling utilities

Frontend (Next.js/React/assistant-ui):
├── 🌐 web/                     # Next.js web application
    ├── src/app/                # App router structure
    │   ├── page.tsx            # Main chat interface with multi-language support
    │   ├── layout.tsx          # Application layout and metadata
    │   ├── globals.css         # Global styles + assistant-ui theming
    │   └── api/                # API routes (Server-side)
    │       ├── chat/route.ts   # Chat API proxy to backend
    │       └── langgraph/      # LangGraph API proxy for assistant-ui
    ├── public/                 # Static assets
    │   ├── legal-document.png  # Standard/regulation tool icon
    │   ├── search.png          # Content search tool icon
    │   └── user-guide.png      # User manual tool icon
    ├── package.json            # Frontend dependencies
    ├── tailwind.config.ts      # Tailwind + assistant-ui configuration
    └── next.config.ts          # Next.js configuration

Testing & Documentation:
├── 🧪 tests/                   # Test suite
    ├── unit/                   # Unit tests
    └── integration/            # Integration and E2E tests
└── 📚 docs/                    # Documentation
    ├── CHANGELOG.md            # Version history and changes
    ├── deployment.md           # Deployment guide
    ├── development.md          # Development setup
    └── testing.md              # Testing guide

🚀 Quick Start

Prerequisites

Python 3.12+ - Required for backend service
Node.js 18+ - Required for frontend development
uv - Rust-based Python package manager (Install uv)
npm/pnpm - Node.js package manager
PostgreSQL - Database for session persistence (Azure Database for PostgreSQL recommended)
LLM API Access - OpenAI API key or Azure OpenAI credentials
Retrieval API Access - Access to the manufacturing standards retrieval service

1. Installation

# Clone the repository
git clone <repository-url>
cd agentic-rag-4

# Install all dependencies (backend + frontend)
make install

# Alternative: Install manually
uv sync              # Backend dependencies
cd web && npm install # Frontend dependencies

2. Configuration

The application uses two main configuration files:

# Copy and edit configuration files
cp config.yaml config.local.yaml          # Main app configuration
cp llm_prompt.yaml llm_prompt.local.yaml  # LLM settings and prompts

# Required environment variables
export OPENAI_API_KEY="your-openai-api-key"
export RETRIEVAL_API_KEY="your-retrieval-api-key"

# For Azure OpenAI (optional)
export AZURE_OPENAI_API_KEY="your-azure-key"

Edit config.yaml (Application Configuration):

app:
  name: agentic-rag
  max_tool_rounds: 3
  memory_ttl_days: 7
  port: 8000

provider: openai  # or "azure"

openai:
  api_key: "${OPENAI_API_KEY}"
  base_url: "https://api.openai.com/v1"
  model: "gpt-4o"

retrieval:
  endpoint: "your-retrieval-endpoint"
  api_key: "${RETRIEVAL_API_KEY}"

search:
  standard_regulation_index: "index-standards"
  chunk_index: "index-chunks"
  chunk_user_manual_index: "index-manuals"

postgresql:
  host: "localhost"
  database: "agent_memory"
  username: "your-username"
  password: "your-password"
  ttl_days: 7

citation:
  base_url: "https://your-citation-base-url"

Edit llm_prompt.yaml (LLM Parameters & Prompts):

parameters:
  temperature: 0
  max_context_length: 100000

prompts:
  agent_system_prompt: |
    You are an Agentic RAG assistant for the CATOnline system...
    # Custom agent prompt for standards/regulations
    
  intent_recognition_system_prompt: |
    You are an intent classifier for the CATOnline system...
    # Intent classification prompt
    
  user_manual_system_prompt: |  
    You are a specialized assistant for CATOnline user manual queries...
    # User manual assistant prompt

3. Development Mode (Recommended)

# Option 1: Start both services simultaneously
make dev

# Option 2: Start services separately
make dev-backend     # Backend with auto-reload
make dev-web         # Frontend development server

# Check service status
make status
make health

Service URLs:

Backend API: http://localhost:8000
Frontend: http://localhost:3000
API Docs: http://localhost:8000/docs

4. Production Mode

# Start backend service
make start          # Foreground mode
make start-bg       # Background mode

# Stop service
make stop

# Restart service
make restart

# Build and serve frontend
cd web
npm run build
npm start

5. Testing & Validation

# Run all tests
make test

# Run specific test suites
make test-unit           # Unit tests
make test-integration    # Integration tests
make test-e2e           # End-to-end tests

# Check service health
make health

# View service logs
make logs

📡 API Reference

Chat Endpoints

Primary Chat API (SSE Format)

POST /api/chat

Traditional Server-Sent Events format for custom integrations:

{
  "session_id": "session_abc123_1640995200000",
  "messages": [
    {"role": "user", "content": "What are the vehicle safety testing standards for electric vehicles?"}
  ],
  "client_hints": {}
}
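
A client could assemble this request body along the following lines. This is a sketch under assumptions: the `session_<random>_<epoch-ms>` shape is inferred from the sample value above rather than documented, the `Accept: text/event-stream` header is a guess at what an SSE endpoint expects, and all three helper names are hypothetical. The request object is only built here, not sent.

```python
import json
import time
import uuid
from typing import Any, Dict, Optional
from urllib.request import Request


def new_session_id(now_ms: Optional[int] = None) -> str:
    """Build an ID shaped like the sample: session_<random>_<epoch milliseconds>."""
    ms = int(time.time() * 1000) if now_ms is None else now_ms
    return f"session_{uuid.uuid4().hex[:6]}_{ms}"


def build_chat_request(question: str, session_id: str) -> Dict[str, Any]:
    """Mirror the JSON body shown above for POST /api/chat."""
    return {
        "session_id": session_id,
        "messages": [{"role": "user", "content": question}],
        "client_hints": {},
    }


def to_http_request(base_url: str, body: Dict[str, Any]) -> Request:
    """Wrap the body in a urllib Request without sending it."""
    return Request(
        url=base_url.rstrip("/") + "/api/chat",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "text/event-stream"},
        method="POST",
    )
```

Passing the resulting object to `urllib.request.urlopen` would send it; the response is then a stream of SSE lines.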

AI SDK Compatible API (Data Stream Protocol)

POST /api/ai-sdk/chat

Compatible with AI SDK and assistant-ui frontend:

{
  "messages": [
    {"role": "user", "content": "What are the vehicle safety testing standards for electric vehicles?"}
  ],
  "session_id": "session_abc123_1640995200000",
  "metadata": {
    "source": "assistant-ui",
    "version": "0.11.0",
    "timestamp": "2025-01-01T12:00:00Z"
  }
}

Response Format

SSE Events (/api/chat):

event: tool_start
data: {"id":"tool_123","name":"retrieve_standard_regulation","args":{"query":"vehicle safety testing standards electric vehicles"}}

event: tokens  
data: {"delta":"Based on the retrieved standards","tool_call_id":null}

event: tool_result
data: {"id":"tool_123","name":"retrieve_standard_regulation","results":[...],"took_ms":234}

event: agent_done
data: {"answer_done":true}

event: post_append_1
data: {"answer":"Vehicle safety testing for electric vehicles [1] involves...","citations_mapping_csv":"1,SRC-ISO26262\n2,SRC-UN38.3"}
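
The `citations_mapping_csv` field in the last event pairs each citation number with a source ID. A client might turn it into a source list roughly as follows. Both helper names are hypothetical, and joining `citation.base_url` with the source ID to form a link is an assumption about the URL scheme, not documented behavior.

```python
import csv
import io
import re


def parse_citation_mapping(mapping_csv: str) -> dict[int, str]:
    """Parse rows like "1,SRC-ISO26262" into {1: "SRC-ISO26262"}."""
    mapping: dict[int, str] = {}
    for row in csv.reader(io.StringIO(mapping_csv)):
        if len(row) >= 2:  # skip blank or malformed rows
            mapping[int(row[0])] = row[1]
    return mapping


def render_citations(answer: str, mapping: dict[int, str], base_url: str) -> list[str]:
    """List a link for each [n] marker that appears in the answer and is mapped."""
    base = base_url.rstrip("/")
    used = sorted({int(n) for n in re.findall(r"\[(\d+)\]", answer)})
    return [f"[{n}] {base}/{mapping[n]}" for n in used if n in mapping]
```

Only markers that actually occur in the answer text are listed, so unused entries in the mapping do not produce dangling references.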

Data Stream Protocol (/api/ai-sdk/chat):

0:{"id":"msg_001","role":"assistant","content":[{"type":"text","text":"Based on the retrieved standards"}]}
1:{"type":"tool_call","tool_call_id":"tool_123","name":"retrieve_standard_regulation","args":{"query":"vehicle safety testing"}}
2:{"type":"tool_result","tool_call_id":"tool_123","result":{"results":[...],"took_ms":234}}
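
Each line in the sample above has the shape `<prefix>:<JSON>`. A minimal reader for that shape is sketched below; `parse_stream_line` is a hypothetical helper, and this README does not say what each prefix means, so the function returns the prefix untouched rather than interpreting it. Check the AI SDK protocol documentation before relying on specific prefix values.

```python
import json
from typing import Any


def parse_stream_line(line: str) -> tuple[str, Any]:
    """Split "<prefix>:<json>" at the first colon and decode the JSON payload."""
    prefix, sep, payload = line.partition(":")
    if not sep or not prefix:
        raise ValueError(f"not a stream part: {line!r}")
    return prefix, json.loads(payload)
```

Splitting at the first colon only is what keeps colons inside the JSON payload intact.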

Utility Endpoints

Health Check

GET /health

{
  "status": "healthy", 
  "service": "agentic-rag"
}

API Information

GET /

{
  "message": "Agentic RAG API for Manufacturing Standards & Regulations"
}

Available Tools

The system provides specialized tools for different knowledge domains:

Standards & Regulations Tools

retrieve_standard_regulation - Search standard/regulation metadata and attributes
retrieve_doc_chunk_standard_regulation - Search document content chunks

User Manual Tools

retrieve_system_usermanual - Search CATOnline system documentation and user guides

| Parameter | Type | Required | Description |
|---|---|---|---|
| `query` | string | ✅ | Search query text |
| `conversation_history` | string | ❌ | Previous conversation context |
| `top_k` | integer | ❌ | Maximum results (default: 10) |
| `score_threshold` | float | ❌ | Minimum relevance score |
| `gen_rerank` | boolean | ❌ | Enable reranking (default: true) |
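
The parameter table can be mirrored in a small typed container, which is handy for validating arguments before a tool call. The class name `RetrievalArgs` and the choice to omit unset optional fields from the payload are assumptions for this sketch; the field names, types, and defaults are taken from the table.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class RetrievalArgs:
    query: str                                   # required
    conversation_history: Optional[str] = None   # optional, no default in the table
    top_k: int = 10                              # table default
    score_threshold: Optional[float] = None      # optional, no default in the table
    gen_rerank: bool = True                      # table default

    def __post_init__(self) -> None:
        if not self.query.strip():
            raise ValueError("query must not be empty")
        if self.top_k < 1:
            raise ValueError("top_k must be at least 1")

    def to_payload(self) -> Dict[str, Any]:
        """Drop optional fields that were left unset."""
        return {k: v for k, v in asdict(self).items() if v is not None}
```

For instance, `RetrievalArgs(query="EV charging performance").to_payload()` carries only `query`, `top_k`, and `gen_rerank`.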

Event Types Reference

| Event Type | Data Fields | Description |
|---|---|---|
| `tokens` | `delta`, `tool_call_id` | LLM token stream |
| `tool_start` | `id`, `name`, `args` | Tool execution begins |
| `tool_result` | `id`, `name`, `results`, `took_ms` | Tool execution complete |
| `tool_error` | `id`, `name`, `error` | Tool execution failed |
| `agent_done` | `answer_done` | Agent processing complete |
| `intent_classification` | `intent`, `confidence` | Query intent classification result |
| `citations` | `citations_list` | Final formatted citation list |
| `tool_summary` | `summary` | Tool execution summary |
| `error` | `error`, `details` | System error occurred |
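
A client consuming `/api/chat` needs to split the SSE stream into events and fold them into UI state. The sketch below shows one way to do that with the standard library. Both function names are hypothetical, the state layout is invented for the example, and only a subset of the event types from the table is handled.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List, Optional, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Group "event:" / "data:" lines into (event_name, decoded_data) pairs.

    A blank line ends an event; a trailing event without one is still emitted.
    """
    event: Optional[str] = None
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
        elif line == "" and data:
            yield event or "message", json.loads("\n".join(data))
            event, data = None, []
    if data:
        yield event or "message", json.loads("\n".join(data))


def collect_answer(events: Iterable[Tuple[str, Dict[str, Any]]]) -> Dict[str, Any]:
    """Concatenate token deltas and track each tool's status by its id."""
    text: List[str] = []
    tools: Dict[str, str] = {}
    done = False
    for name, payload in events:
        if name == "tokens":
            text.append(payload["delta"])
        elif name == "tool_start":
            tools[payload["id"]] = "running"
        elif name == "tool_result":
            tools[payload["id"]] = "done"
        elif name == "tool_error":
            tools[payload["id"]] = "failed"
        elif name == "agent_done":
            done = bool(payload.get("answer_done", False))
    return {"text": "".join(text), "tools": tools, "done": done}
```

Unknown event names are ignored, so the reader keeps working if the backend adds event types later.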

Multi-Intent Workflow Events

The system now supports intent-based routing with specialized event streams:

Standards/Regulation Queries: Full tool execution with citation generation
User Manual Queries: Streamlined documentation search with direct answers
Intent Classification: Real-time feedback on query routing decisions

🧠 Multi-Intent System

The application features an intelligent intent recognition system that automatically routes user queries to specialized agents:

Intent Classification

The system analyzes user queries and conversation context to determine the appropriate processing path:

Standard_Regulation_RAG: For compliance, standards, and regulatory queries
- Two-phase retrieval strategy (metadata → content chunks)
- Enhanced citation generation with document linking
- Multi-round tool execution for comprehensive answers
User_Manual_RAG: For system documentation and usage questions
- Direct documentation search and retrieval
- Streamlined processing for faster responses
- Context-aware help and guidance
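
The two-phase strategy (metadata first, then content chunks) can be illustrated with stub search functions. This is one plausible reading, not the project's implementation: restricting phase-2 chunks to documents found in phase 1 is an assumption, as are the `doc_id` field, the stub data, and every function name below.

```python
from typing import Callable, Dict, List

Hit = Dict[str, str]
SearchFn = Callable[[str, int], List[Hit]]


def stub_search_metadata(query: str, top_k: int) -> List[Hit]:
    # Stand-in for the metadata index; returns fixed sample data.
    return [{"doc_id": "D1", "title": "Sample standard A"}][:top_k]


def stub_search_chunks(query: str, top_k: int) -> List[Hit]:
    # Stand-in for the chunk index; one chunk belongs to an unrelated document.
    hits = [
        {"doc_id": "D1", "text": "clause 4 text"},
        {"doc_id": "D9", "text": "unrelated text"},
    ]
    return hits[:top_k]


def two_phase_retrieve(
    query: str,
    search_metadata: SearchFn,
    search_chunks: SearchFn,
    top_k: int = 10,
) -> Dict[str, List[Hit]]:
    """Phase 1 finds candidate documents; phase 2 keeps chunks from those documents."""
    documents = search_metadata(query, top_k)
    doc_ids = {doc["doc_id"] for doc in documents}
    chunks = [c for c in search_chunks(query, top_k) if c["doc_id"] in doc_ids]
    return {"documents": documents, "chunks": chunks}
```

In the real agent the LLM decides when to call each retrieval tool, so the fixed ordering here is a simplification.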

Query Examples

Standards/Regulation Queries:

"最新的电动汽车锂电池标准？" (Latest lithium battery standards for electric vehicles?)
"如何测试电动汽车的充电性能？" (How to test electric vehicle charging performance?)
"提供关于车辆通讯安全的法规" (Provide vehicle communication security regulations)

User Manual Queries:

"How do I use CATOnline system?"
"What are the search features available?"
"How to export search results?"

Enhanced Features

Context Preservation: Session memory maintained across intent switches
Language Detection: Automatic language handling for Chinese/English queries
Visual Feedback: Real-time UI updates showing intent classification and tool progress
Error Recovery: Graceful handling of classification uncertainties
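
For the Chinese/English handling mentioned above, a very simple heuristic is to check for CJK ideographs in the query. The sketch below is an assumption about how such detection could work, not the project's method; `detect_query_language` is a hypothetical name, and the check covers only the basic CJK Unified Ideographs block.

```python
def detect_query_language(text: str) -> str:
    """Return "zh" if any character is a CJK Unified Ideograph, else "en"."""
    for ch in text:
        if "\u4e00" <= ch <= "\u9fff":
            return "zh"
    return "en"
```

Mixed-language queries containing any Chinese character are treated as Chinese by this rule, which may or may not match the behavior wanted in practice.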

📚 Documentation

For detailed information, see the documentation in the docs/ directory:

📋 Deployment Guide - Production deployment instructions
💻 Development Guide - Development setup and guidelines
🧪 Testing Guide - Testing procedures and best practices
📝 Changelog - Version history and release notes

🤝 Contributing

We welcome contributions! Please see our Development Guide for details on:

Setting up the development environment
Code style and formatting guidelines
Running tests and quality checks
Submitting pull requests

Quick Contribution Setup

# Fork the repository and clone your fork
git clone https://github.com/your-username/agentic-rag-4.git
cd agentic-rag-4

# Install development dependencies
make install
uv sync --dev

# Run tests to ensure everything works
make test

# Create a feature branch
git checkout -b feature/amazing-feature

# Make your changes and test
make test
make lint

# Commit and push
git commit -m "Add amazing feature"
git push origin feature/amazing-feature

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙋‍♀️ Support

📖 Documentation: Check this README and the docs/ directory
🐛 Issues: Open a GitHub issue for bugs or feature requests
💬 Discussions: Use GitHub Discussions for questions

Built with ❤️ using FastAPI, LangGraph, Next.js, and assistant-ui