# Changelog

## v1.2.8 - Enhanced Agentic Workflow and Citation Management Documentation - Thu Sep 12 2025

### 📋 **Documentation** *(Design Document Enhancement)*

**Enhanced the system design documentation with detailed coverage of Agentic Workflow features and advanced citation management capabilities.**

#### Changes Made:

**1. Agentic Workflow Features Enhancement**:
- **Enhanced**: Agentic Workflow Features Demonstrated section with comprehensive query rewriting/decomposition coverage
- **Added**: Detailed "Query Rewriting/Decomposition in Agentic Workflow" section highlighting core intelligence features
- **Added**: "Citation Management in Agentic Workflow" section documenting advanced citation capabilities
- **Updated**: Workflow diagrams to explicitly show query rewriting and citation processing flows

**2. Citation Management Documentation**:
- **Enhanced**: Citation tracking and management documentation with controllable citation lists and links
- **Added**: Detailed citation processing workflow with real-time capture and quality validation
- **Updated**: Tool system architecture to show query processing pipeline integration
- **Added**: Multi-round citation coherence and cross-tool citation integration documentation

**3. Technical Architecture Updates**:
- **Updated**: Sequence diagrams to show query rewriter components and parallel execution
- **Enhanced**: Tool system architecture with query processing strategies
- **Added**: Domain-specific intelligence documentation for different query types
- **Updated**: Cross-agent learning documentation with advanced agentic intelligence features

**4. Design Principles Refinement**:
- **Updated**: Core feature list to highlight controllable citation management
- **Enhanced**: Query processing integration documentation
- **Added**: Strategic citation assignment and post-processing enhancement details
- **Updated**: System benefits documentation to reflect enhanced capabilities

---

## v1.2.7 - Comprehensive System Design Documentation - Tue Sep 10 2025

### 📋 **Documentation** *(System Architecture & Design Documentation)*

**Created comprehensive system design documentation with detailed architectural diagrams and design explanations.**

#### Changes Made:

**1. System Design Document Creation**:
- **Created**: `docs/design.md` - Complete architectural design documentation
- **Architecture Diagrams**: 15+ Mermaid diagrams covering all system aspects
- **Design Explanations**: Detailed design principles and implementation rationale
- **Comprehensive Coverage**: All system layers from frontend to infrastructure

**2. Architecture Documentation**:
- **High-Level Architecture**: Multi-layer system overview with component relationships
- **Component Architecture**: Detailed breakdown of frontend, backend, and agent components
- **Workflow Design**: Multi-intent agent workflows and two-phase retrieval strategy
- **Data Flow Architecture**: Request-response flows and streaming data patterns

**3. Feature & System Documentation**:
- **Feature Architecture**: Core capabilities and tool system design
- **Memory Management**: PostgreSQL-based session persistence architecture
- **Configuration Architecture**: Layered configuration management approach
- **Security Architecture**: Multi-layered security implementation

**4. Deployment & Performance Documentation**:
- **Deployment Architecture**: Production deployment patterns and container architecture
- **Performance Architecture**: Optimization strategies across all system layers
- **Technology Stack**: Complete technology selection rationale and integration
- **Future Enhancements**: Roadmap and enhancement strategy

#### Documentation Features:

**Visual Architecture**:
- **15+ Mermaid Diagrams**: Comprehensive visual representation of system architecture
- **Component Relationships**: Clear visualization of component interactions
- **Data Flow Patterns**: Detailed request-response and streaming flow diagrams
- **Deployment Topology**: Production deployment and scaling architecture

**Design Explanations**:
- **Design Philosophy**: Core principles driving architectural decisions
- **Implementation Rationale**: Detailed explanation of design choices
- **Best Practices**: Production-ready patterns and recommendations
- **Performance Considerations**: Optimization strategies and trade-offs

**Comprehensive Coverage**:
- **Frontend Architecture**: Next.js, React, and assistant-ui integration
- **Backend Architecture**: FastAPI, LangGraph, and agent orchestration
- **Data Architecture**: PostgreSQL memory, Azure AI Search, and LLM integration
- **Infrastructure Architecture**: Cloud deployment, security, and monitoring

#### Technical Documentation:

**System Layers Documented**:
```
- Frontend Layer: Next.js Web UI, Thread Components, Tool UIs
- API Gateway Layer: Next.js API Routes, Data Stream Protocol
- Backend Service Layer: FastAPI Server, AI SDK Adapter, SSE Controller
- Agent Orchestration Layer: LangGraph Workflow, Intent Recognition, Agents
- Memory Layer: PostgreSQL Session Store, Checkpointer, Memory Manager
- Retrieval Layer: Azure AI Search, Embedding Service, Search Indices
- LLM Layer: LLM Provider, Configuration Management
```

**Key Architectural Patterns**:
- **Multi-Intent Agent System**: Intent recognition and specialized agent routing
- **Two-Phase Retrieval**: Metadata discovery followed by content retrieval
- **Streaming Architecture**: Real-time SSE with tool progress tracking
- **Session Memory**: PostgreSQL-based persistent conversation history
- **Tool System**: Modular, composable retrieval and analysis tools

#### Benefits:

**For Development Team**:
- **Clear Architecture Understanding**: Complete system overview for new team members
- **Design Rationale**: Understanding of architectural decisions and trade-offs
- **Implementation Guidance**: Best practices and patterns for future development
- **Maintenance Support**: Clear documentation for troubleshooting and updates

**For System Architecture**:
- **Documentation Standards**: Establishes pattern for future architectural documentation
- **Design Consistency**: Ensures architectural decisions align with documented principles
- **Knowledge Preservation**: Captures institutional knowledge about system design
- **Future Planning**: Provides foundation for system evolution and enhancement

**For Operations**:
- **Deployment Understanding**: Clear view of production architecture and dependencies
- **Troubleshooting Guide**: Architectural context for debugging and issue resolution
- **Scaling Guidance**: Understanding of system scaling patterns and limitations
- **Security Overview**: Complete security architecture and implementation details

#### File Structure:
```
docs/
├── design.md       # Comprehensive system design document (NEW)
├── CHANGELOG.md    # This changelog with design documentation entry
├── deployment.md   # Deployment-specific guidance
├── development.md  # Development setup and guidelines
└── testing.md      # Testing strategies and procedures
```

#### Next Steps:
- **Living Documentation**: Keep design document updated with system changes
- **Architecture Reviews**: Use document as reference for architectural decisions
- **Onboarding**: Include design document in new developer onboarding process
- **Documentation Standards**: Apply similar documentation patterns to other system aspects

---

## v1.2.6 - GPT-5 Model Integration and Prompt Template Refinement - Mon Sep 9 2025

### 🚀 **Major Update** *(Model Integration & Enhanced Agent Capabilities)*

**Integrated GPT-5 Chat model with refined prompt templates for improved reasoning and tool coordination.**

#### Changes Made:

**1. GPT-5 Model Integration**:
- **Model Upgrade**: Switched from GPT-4o to `gpt-5-chat` deployment
- **Azure Endpoint**: Updated to `aihubeus21512504059.cognitiveservices.azure.com`
- **API Version**: Upgraded to `2024-12-01-preview` for latest capabilities
- **Enhanced Reasoning**: Leveraging GPT-5's improved reasoning for complex multi-step retrieval

**2. Prompt Template Optimization for GPT-5**:
- **Tool Coordination**: Enhanced instructions for better parallel tool execution
- **Context Management**: Optimized for GPT-5's extended context handling capabilities
- **Reasoning Chain**: Improved workflow instructions leveraging advanced reasoning abilities

**3. Agent System Refinements**:
- **Phase Detection**: Better triggering conditions for Phase 2 document content retrieval
- **Query Rewriting**: Enhanced sub-query generation strategies optimized for GPT-5
- **Citation Accuracy**: Improved metadata tracking and source verification

#### Technical Implementation:

**Updated [`config.yaml`](config.yaml)**:
```yaml
azure:
  base_url: https://aihubeus21512504059.cognitiveservices.azure.com/
  api_key: 277a2631cf224647b2a56f311bd57741
  api_version: 2024-12-01-preview
  deployment: gpt-5-chat
```

**Enhanced [`llm_prompt.yaml`](llm_prompt.yaml)** - Phase 2 Triggers:
```yaml
# Phase 2: Document Content Detailed Retrieval
- **When to execute**: execute Phase 2 if the user asks about:
  - "How to..." / "如何..." (procedures, methods, steps)
  - Testing methods / 测试方法
  - Requirements / 要求
  - Technical details / 技术细节
  - Implementation guidance / 实施指导
  - Specific content within standards/regulations
```

**Tool Coordination Instructions**:
```yaml
# Parallel Retrieval Tool Call:
- Use each rewritten sub-query to call retrieval tools **in parallel**
- This maximizes coverage and ensures comprehensive information gathering
```

#### Key Features:

**GPT-5 Enhanced Capabilities**:
- **Advanced Reasoning**: Better understanding of complex technical queries
- **Improved Tool Coordination**: More efficient parallel tool execution planning
- **Enhanced Context Synthesis**: Better integration of multi-source information
- **Precise Citation Generation**: More accurate source tracking and reference mapping

**Optimized Retrieval Strategy**:
- **Smart Phase Detection**: GPT-5 better determines when detailed content retrieval is needed
- **Context-Aware Queries**: More sophisticated query rewriting based on conversation context
- **Cross-Reference Validation**: Enhanced ability to verify information across multiple sources

**Enhanced User Experience**:
- **Faster Response**: More efficient tool coordination reduces overall response time
- **Higher Accuracy**: Improved reasoning leads to more precise answers
- **Better Coverage**: Enhanced query strategies maximize information discovery

#### Performance Improvements:
- **Tool Efficiency**: Better parallel execution planning reduces redundant calls
- **Context Utilization**: Enhanced ability to maintain context across tool rounds
- **Quality Assurance**: Improved verification and synthesis of retrieved information

#### Migration Notes:
- **Seamless Upgrade**: No breaking changes to existing API or user interfaces
- **Backward Compatibility**: Existing conversation histories remain compatible
- **Enhanced Responses**: Users will notice improved response quality and accuracy
- **Tool Round Optimization**: GPT-5's reasoning works optimally with configured tool round limits

---

## v1.2.5 - Enhanced Multi-Phase Retrieval and Tool Round Optimization - Thu Sep 5 2025

### 🔧 **Enhancement** *(Agent System Prompt & Retrieval Strategy)*

**Optimized retrieval workflow with explicit parallel tool calling strategy and enhanced multi-language query coverage.**

#### Changes Made:

**1. Enhanced Multi-Phase Retrieval Strategy**:
- **Phase 1 - Metadata Discovery**: Added explicit "2-3 parallel rewritten queries" strategy for standards/regulations metadata discovery
- **Phase 2 - Document Content**: Refined detailed retrieval with "2-3 parallel rewritten queries with different content focus"
- **Cross-Language Coverage**: Mandatory inclusion of both Chinese and English query variants for comprehensive search coverage

**2. Parallel Tool Calling Optimization**:
- **Query Strategy Specification**: Clear guidance on generating 2-3 distinct parallel sub-queries per retrieval phase
- **Azure AI Search Optimization**: Enhanced for Hybrid Search (keyword + vector search) with specific terminology and synonyms
- **Tool Calling Efficiency**: Explicit instruction to execute rewritten sub-queries in parallel for maximum coverage

**3. Intent Classification Improvements**:
- **Standard_Regulation_RAG**: Enhanced examples covering content, scope, testing methods, and technical details
- **User_Manual_RAG**: Comprehensive coverage of CATOnline system usage, TRRC processes, and administrative functions
- **Clearer Boundaries**: Better distinction between technical content queries vs system usage queries

**4. User Manual Prompt Refinement**:
- **Evidence-Based Only**: Strengthened directive for 100% grounded responses from user manual content
- **Visual Integration**: Enhanced screenshot embedding requirements with strict formatting templates
- **Context Disambiguation**: Added role-based function differentiation (User vs Administrator)

#### Technical Implementation:

**Updated [`llm_prompt.yaml`](llm_prompt.yaml)** - Agent System Prompt:
```yaml
# Query Optimization & Parallel Retrieval Tool Calling
* Sub-queries Rewriting:
  - Generate 2-3(mostly 2) distinct rewritten sub-queries
  - If user's query is in Chinese, include 1 rewritten sub-query in English
  - If user's query is in English, include 1 rewritten sub-query in Chinese

* Parallel Retrieval Tool Call:
  - Use each rewritten sub-query to call retrieval tools **in parallel**
  - This maximizes coverage and ensures comprehensive information gathering
```

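As a rough illustration of the parallel strategy in the prompt above, the sketch below runs one retrieval call per rewritten sub-query with `asyncio.gather`. It is not the project's implementation: the `retrieve` coroutine, the `retrieve_in_parallel` helper, and the example sub-queries are hypothetical stand-ins for whatever retrieval tools the agent actually calls.

```python
import asyncio


async def retrieve(sub_query: str) -> list[dict]:
    """Hypothetical stand-in for a single retrieval tool call."""
    await asyncio.sleep(0)  # placeholder for network I/O
    return [{"query": sub_query, "title": f"result for {sub_query}"}]


async def retrieve_in_parallel(sub_queries: list[str]) -> list[dict]:
    """Run one retrieval call per sub-query concurrently and merge the results."""
    batches = await asyncio.gather(*(retrieve(q) for q in sub_queries))
    # gather returns results in input order, so hits stay grouped by sub-query.
    return [doc for batch in batches for doc in batch]


# One sub-query per language, following the cross-language rule in the prompt.
sub_queries = ["电动汽车安全测试方法", "electric vehicle safety test methods"]
results = asyncio.run(retrieve_in_parallel(sub_queries))
```

Because the calls are awaited together, the total wait is roughly that of the slowest sub-query rather than the sum of all of them.
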
**Enhanced Intent Classification**:
```yaml
# Standard_Regulation_RAG Examples:
- "What regulations relate to intelligent driving?"
- "How do you test the safety of electric vehicles?"
- "What are the main points of GB/T 34567-2023?"

# User_Manual_RAG Examples:
- What is CATOnline (the system)/TRRC/TRRC processes
- How to search for standards, regulations, TRRC news and deliverables
- User management, system configuration, administrative functionalities
```

**User Manual Prompt Template**:
```yaml
Step Template:
  Step N: <Action / Instruction from manual>
  (Optional short clarification from manual)

  Notes: <business rules / warnings from manual>
```

#### Key Features:

**Multi-Phase Retrieval Workflow**:
- **Round 1**: Parallel metadata discovery with 2-3 optimized queries
- **Round 2**: Focused document content retrieval based on Round 1 insights
- **Round 3+**: Additional targeted retrieval for remaining gaps

**Cross-Language Query Strategy**:
- **Automatic Translation**: Chinese queries include English variants, English queries include Chinese variants
- **Terminology Optimization**: Technical terms, acronyms, and domain-specific language inclusion
- **Azure AI Search Enhancement**: Optimized for hybrid keyword + vector search capabilities

**Enhanced Citation System**:
- **Metadata Tracking**: Precise `@tool_call_id` and `@order_num` mapping
- **CSV Format**: Structured citations mapping in HTML comments
- **Source Verification**: Cross-referencing across multiple retrieval results

#### Benefits:
- **Coverage**: Parallel queries with cross-language variants maximize information discovery
- **Efficiency**: Strategic tool calling reduces unnecessary rounds while ensuring thoroughness
- **Accuracy**: Enhanced intent classification improves routing to appropriate RAG systems
- **User Experience**: Better visual integration in user manual responses with mandatory screenshots
- **Consistency**: Standardized formatting templates across all response types

#### Migration Notes:
- Enhanced prompt templates automatically improve response quality
- No breaking changes to existing API or user interfaces
- Cross-language query strategy improves search coverage for multilingual content
- Tool round limits (`max_tool_rounds: 4`, `max_tool_rounds_user_manual: 2`) work optimally with new parallel strategy

---

## v1.2.4 - Intent Classification Reference Consolidation - Wed Sep 4 2025

### 🔧 **Enhancement** *(Intent Classification Documentation)*

**Consolidated and enhanced UserManual intent classification examples by merging reference files.**

#### Changes Made:
- **Reference File Consolidation**: Merged UserManual examples from `intent-ref-1.txt` into `intent-ref-2.txt`
- **Enhanced Coverage**: Added more comprehensive use cases for UserManual intent classification
- **Improved Clarity**: Better organized examples to help with accurate intent recognition

#### Technical Implementation:

**Updated `.vibe/ref/intent-ref-2.txt`**:
- **Added from intent-ref-1.txt**:
  - What is CATOnline (the system), TRRC, TRRC processes
  - How to search for standards, regulations, TRRC news and deliverables in the system
  - How to create and update standards, regulations and their documents
  - How to download or export data
  - How to do administrative functionalities
  - Other questions about this (CatOnline) system's functions, or user guide

- **Preserved existing examples**:
  - Questions directly about CatOnline functions or features
  - TRRC-related processes/standards/regulations as implemented in CatOnline
  - How to manage/search/download documents in the system
  - User management or system configuration within CatOnline
  - Use of admin features or data export in CatOnline

#### Categories Covered:
1. **System Introduction**: CATOnline system, TRRC concepts
2. **Search Functions**: Standards, regulations, TRRC news and deliverables search
3. **Document Management**: Create, update, manage, download documents
4. **System Configuration**: User management, system settings
5. **Administrative Functions**: Admin features, data export
6. **General Help**: System functions, user guides

#### Benefits:
- **Accuracy**: More comprehensive examples improve intent classification precision
- **Coverage**: Better coverage of UserManual use cases
- **Consistency**: Unified reference documentation for intent classification
- **Maintainability**: Single consolidated reference file easier to maintain

## v1.2.3 - User Manual Screenshot Format Clarification - Tue Sep 3 2025

### 🔧 **Enhancement** *(User Manual Prompt Refinement)*

**Added explicit clarification about UI screenshot embedding format in user manual responses.**

#### Changes Made:
- **Screenshot Format Guidance**: Added specific instruction about how UI screenshots should be embedded
- **Format Specification**: Clarified that operational UI screenshots are typically embedded in explanatory text using markdown image format

#### Technical Implementation:

**Updated `llm_prompt.yaml` - User Manual Prompt**:
```yaml
- **Visuals First**: ALWAYS include screenshots for explaining features or procedures. Every instructional step must be immediately followed by its screenshot on a new line.
- **Screenshot Format**: 操作步骤的相关UI截图通常会以markdown图片格式嵌入到说明文字中
  # English: UI screenshots for operating steps are usually embedded in the explanatory text in markdown image format
```

#### Benefits:
- **Clarity**: AI assistant now has explicit guidance on screenshot embedding format
- **Consistency**: Ensures uniform approach to including UI screenshots in responses
- **User Experience**: Improves the formatting and presentation of instructional content

## v1.2.2 - Prompt Enhancement for Knowledge Boundary Control - Tue Sep 3 2025

### 🔧 **Enhancement** *(LLM Prompt Optimization)*

**Enhanced LLM prompts to strictly prevent the model from outputting general knowledge when retrieval yields insufficient results.**

#### Problem Addressed:
- The AI assistant output the model's built-in general knowledge about a topic when retrieval found no specific information
- Users received generic information about systems/concepts instead of clear "information not available" responses
- Example: when asked about the "CATOnline system", the AI would provide general CAT (Computer-Assisted Testing) information from its training data

#### Solution Implemented:
- **Enhanced Agent System Prompt**: Added explicit "NO GENERAL KNOWLEDGE" directive
- **Enhanced User Manual Prompt**: Added similar strict knowledge boundary controls
- **Improved Fallback Messages**: Standardized response template for insufficient information scenarios
- **Multiple Reinforcement**: Added the restriction in multiple sections for emphasis

#### Technical Changes:

**Enhanced `llm_prompt.yaml`**:
- Added **"Critical: NO GENERAL KNOWLEDGE"** instruction in agent system prompt
- Enhanced fallback response template: "The system does not contain specific information about [specific topic/feature searched for]."
- Added similar controls in user manual prompt with template: "The user manual does not contain specific information about [specific topic/feature you searched for]."
- Reinforced the restriction in multiple workflow sections

#### Key Prompt Updates:

**Agent System Prompt**:
```yaml
* **Critical: NO GENERAL KNOWLEDGE**: If retrieval yields insufficient or no relevant results, **do not provide any general knowledge or assumptions**. Instead, clearly state "The system does not contain specific information about [specific topic/feature searched for]." and suggest how the user might reformulate their query.
```

**User Manual Prompt**:
```yaml
- **NO GENERAL KNOWLEDGE**: When retrieved content is insufficient, do NOT provide any general knowledge about systems, software, or common practices. State clearly: "The user manual does not contain specific information about [specific topic/feature you searched for]."
```

#### Benefits:
- **Accuracy**: Eliminates confusion from generic information
- **Transparency**: Users clearly understand when information is not available in the system
- **Trust**: Builds user confidence in the system's knowledge boundaries
- **Guidance**: Provides clear direction for reformulating queries

#### Testing:
- Verified all prompt sections contain the new "NO GENERAL KNOWLEDGE" instructions
- Confirmed fallback message templates are properly implemented
- Tested that both agent and user manual prompts include the restrictions

## v1.2.1 - Retrieval Module Refactoring and Optimization - Mon Sep 2 2025

### 🔧 **Refactoring** *(Retrieval Module Structure Optimization)*

**Refactored the retrieval module structure and optimized the `normalize_search_result` function for better maintainability and performance.**

#### Key Changes:
- **File Renaming**: `service/retrieval/agentic_retrieval.py` → `service/retrieval/retrieval.py` for clearer naming
- **Function Optimization**: Simplified `normalize_search_result` by removing unnecessary `include_content` parameter
- **Logic Consolidation**: Moved result normalization to `search_azure_ai` method to eliminate redundancy
- **Import Updates**: Updated all references across the codebase to use the new module name

#### Technical Implementation:
- **Simplified `normalize_search_result`**:
  - Removed `include_content` parameter (content is now always preserved)
  - Function now focuses solely on cleaning search results and removing empty fields
  - Eliminates the need for conditional content handling

- **Optimized Result Processing**:
  - `normalize_search_result` is now called directly in `search_azure_ai` method
  - Removed duplicate field removal logic between `search_azure_ai` and `normalize_search_result`
  - Cleaner separation of concerns

- **Updated File References**:
  - `service/graph/tools.py`
  - `service/graph/user_manual_tools.py`
  - `tests/unit/test_retrieval.py`
  - `tests/unit/test_user_manual_tool.py`
  - `tests/conftest.py`
  - `scripts/debug_user_manual_retrieval.py`
  - `scripts/final_verification.py`

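To make the simplified behavior concrete, here is a minimal sketch of what a single-purpose `normalize_search_result` could look like. Only the function name and its stated job (drop empty fields, always keep content) come from the notes above; the body, the definition of "empty", and the sample hit are assumptions, not the project's code.

```python
from typing import Any


def normalize_search_result(result: dict[str, Any]) -> dict[str, Any]:
    """Drop empty fields from one search hit; every other field is kept as-is."""
    # What counts as "empty" is an assumption of this sketch.
    empty_values = (None, "", [], {})
    return {key: value for key, value in result.items() if value not in empty_values}


# Hypothetical hit: "publisher" and "tags" are empty and should be removed.
hit = {"title": "GB/T 34567-2023", "content": "Scope and definitions", "publisher": "", "tags": []}
normalized = normalize_search_result(hit)
```

With no `include_content` flag, there is only one code path, so callers such as `search_azure_ai` get the same shape of result every time.
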
#### Benefits:
- **Cleaner Code**: Eliminated redundant logic and simplified function signatures
- **Better Performance**: Single point of result normalization reduces processing overhead
- **Improved Maintainability**: Clearer module naming and consolidated logic
- **Consistent Behavior**: Content is always preserved, eliminating conditional handling complexity

#### Testing:
- Updated all test cases to match new function signatures
- Verified that all retrieval functionality works correctly
- Confirmed that result normalization properly removes unwanted fields while preserving content

## v1.2.0 - Azure AI Search Direct Integration - Wed Sep 2 2025

### ⚡ **Major Enhancement** *(Direct Azure AI Search Integration)*

**Replaced intermediate retrieval service with direct Azure AI Search REST API calls for improved performance and better control.**

#### Key Changes:
- **Direct Azure AI Search Integration**: Eliminated dependency on intermediate retrieval service, now calling Azure AI Search REST API directly
- **Hybrid Search with Semantic Ranking**: Implemented proper hybrid search combining text search + vector search with semantic ranking
- **Enhanced Result Processing**: Added automatic filtering by `@search.rerankerScore` threshold and `@order_num` field injection
- **Improved Configuration**: Extended config structure to support embedding service, API versions, and semantic configuration

#### Technical Implementation:
- **New Config Structure**: Added `EmbeddingConfig`, `IndexConfig` to support embedding generation and Azure Search parameters
- **Vector Query Support**: Implemented proper vector queries with field-specific targeting:
  - `retrieve_standard_regulation`: `full_metadata_vector`
  - `retrieve_doc_chunk_standard_regulation`: `contentVector,full_metadata_vector`
  - `retrieve_doc_chunk_user_manual`: `contentVector`
- **Result Filtering**: Automatic removal of Azure Search metadata fields (`@search.score`, `@search.rerankerScore`, `@search.captions`)
- **Order Numbering**: Added `@order_num` field to track result ranking order
- **Score Threshold Filtering**: Filter results by reranker score threshold for quality control

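The three result-processing steps listed above (score threshold, order numbering, metadata removal) can be sketched together as follows. This is an illustration under assumptions rather than the actual code: the `postprocess_results` name, the 1-based numbering applied after filtering, the handling of hits with no reranker score, and the threshold value in the example are all hypothetical.

```python
from typing import Any

# Azure AI Search metadata fields named in the list above.
SEARCH_METADATA_FIELDS = ("@search.score", "@search.rerankerScore", "@search.captions")


def postprocess_results(hits: list[dict[str, Any]], score_threshold: float) -> list[dict[str, Any]]:
    """Filter hits by reranker score, number the survivors, and strip search metadata."""
    processed: list[dict[str, Any]] = []
    for hit in hits:
        # Assumption: a hit without a reranker score counts as below the threshold.
        if (hit.get("@search.rerankerScore") or 0.0) < score_threshold:
            continue
        cleaned = {k: v for k, v in hit.items() if k not in SEARCH_METADATA_FIELDS}
        cleaned["@order_num"] = len(processed) + 1  # 1-based rank among kept hits
        processed.append(cleaned)
    return processed


hits = [
    {"id": "a", "@search.score": 0.03, "@search.rerankerScore": 2.4},
    {"id": "b", "@search.rerankerScore": 0.9},
    {"id": "c", "@search.rerankerScore": 1.8},
]
kept = postprocess_results(hits, score_threshold=1.5)
```
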
#### Configuration Updates:
```yaml
retrieval:
  endpoint: "https://search-endpoint.search.azure.cn"
  api_key: "search-api-key"
  api_version: "2024-11-01-preview"
  semantic_configuration: "default"
  embedding:
    base_url: "http://embedding-service/v1-openai"
    api_key: "embedding-api-key"
    model: "qwen3-embedding-8b"
    dimension: 4096
  index:
    standard_regulation_index: "index-name-1"
    chunk_index: "index-name-2"
    chunk_user_manual_index: "index-name-3"
```

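For orientation, the sketch below shows how values like those in the configuration above might be turned into one hybrid query against the Azure AI Search "Search Documents" REST endpoint. The helper name `build_hybrid_search_request`, the `top` default, and the sample query are hypothetical; the request the project actually sends may contain different or additional fields. The function only builds the URL and JSON body, so it makes no network call.

```python
from typing import Any


def build_hybrid_search_request(
    endpoint: str,
    index_name: str,
    api_version: str,
    query_text: str,
    query_vector: list[float],
    vector_fields: str,
    semantic_configuration: str = "default",
    top: int = 10,
) -> tuple[str, dict[str, Any]]:
    """Return the URL and JSON body for one hybrid (text + vector) semantic query."""
    url = f"{endpoint.rstrip('/')}/indexes/{index_name}/docs/search?api-version={api_version}"
    body = {
        "search": query_text,  # keyword part of the hybrid query
        "vectorQueries": [
            {"kind": "vector", "vector": query_vector, "fields": vector_fields, "k": top}
        ],
        "queryType": "semantic",  # enables semantic ranking of the merged results
        "semanticConfiguration": semantic_configuration,
        "top": top,
    }
    return url, body


url, body = build_hybrid_search_request(
    endpoint="https://search-endpoint.search.azure.cn",
    index_name="index-name-3",
    api_version="2024-11-01-preview",
    query_text="how to export data",
    query_vector=[0.0] * 4096,  # placeholder; a real vector comes from the embedding service
    vector_fields="contentVector",
)
```

The body would be sent as a POST with the search key in the `api-key` header; the vector length must match the `dimension` configured for the embedding model.
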
#### Benefits:
- **Performance**: Eliminated intermediate service latency
- **Control**: Direct control over search parameters and result processing
- **Reliability**: Reduced dependencies and potential points of failure
- **Feature Support**: Full access to Azure AI Search capabilities including semantic ranking

#### Testing:
- Updated unit tests to work with new Azure AI Search implementation
- Verified hybrid search functionality with real Azure AI Search endpoints
- Confirmed proper result filtering and ordering

## v1.1.9 - Intent Recognition Structured Output Compatibility Fix - Mon Sep 2 2025

### 🔧 **Bug Fix** *(Intent Recognition Compatibility)*

**Fixed intent recognition error for models that don't support OpenAI's structured output format (json_schema).**

#### Problem Addressed:
- Intent recognition failed with error: "Invalid parameter: 'response_format' of type 'json_schema' is not supported with this model"
- DeepSeek and other non-OpenAI models don't support OpenAI's structured output feature
- System would default to Standard_Regulation_RAG but log errors continuously

#### Root Cause:
- `intent_recognition_node` used `llm_client.llm.with_structured_output(Intent)` which automatically adds `json_schema` response_format
- This feature is specific to OpenAI GPT models and not supported by DeepSeek, Claude, or other model providers

#### Solution:
- **Removed structured output dependency**: Replaced `with_structured_output()` with standard LLM calls
- **Enhanced text parsing**: Added robust response parsing to extract intent labels from text responses
- **Improved prompt engineering**: Added explicit output format instructions to system prompt
- **Enhanced error handling**: Better handling of different response content types (string/list)

#### Technical Changes:

**Modified**: `service/graph/intent_recognition.py`
```python
# Before (broken with non-OpenAI models):
intent_llm = llm_client.llm.with_structured_output(Intent)
intent_result = await intent_llm.ainvoke([SystemMessage(content=system_prompt)])

# After (compatible with all models):
# The parentheses let the concatenation span several lines.
system_prompt = (
    intent_prompt_template.format(...)
    + "\n\nIMPORTANT: You must respond with ONLY one of these two exact labels: "
    + "'Standard_Regulation_RAG' or 'User_Manual_RAG'. Do not include any other text."
)

intent_result = await llm_client.llm.ainvoke([SystemMessage(content=system_prompt)])

# Enhanced response parsing: content may be a plain string or a list of parts
if isinstance(intent_result.content, str):
    response_text = intent_result.content.strip()
elif isinstance(intent_result.content, list):
    response_text = " ".join([str(item) for item in intent_result.content
                              if isinstance(item, str)]).strip()
```

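The excerpt above stops once `response_text` is extracted. One way the label could then be pulled out of that text is sketched below; the `parse_intent` helper and its matching rule are hypothetical, and the fallback to `Standard_Regulation_RAG` simply mirrors the default behavior described under "Problem Addressed", so the real fallback may differ.

```python
VALID_INTENTS = ("Standard_Regulation_RAG", "User_Manual_RAG")
DEFAULT_INTENT = "Standard_Regulation_RAG"  # assumed fallback for unparseable replies


def parse_intent(response_text: str) -> str:
    """Return the known intent label that appears first in the text, else the default."""
    positions = {label: response_text.find(label) for label in VALID_INTENTS}
    found = {label: pos for label, pos in positions.items() if pos >= 0}
    if not found:
        return DEFAULT_INTENT
    # If the model echoes both labels, keep the one it mentioned first.
    return min(found, key=found.get)


label = parse_intent("Intent: User_Manual_RAG")
```

Searching for the label as a substring tolerates models that add punctuation or a short preamble despite the "ONLY one of these two exact labels" instruction.
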
#### Key Improvements:
|
||
|
||
**Model Compatibility**:
|
||
- Works with all LLM providers (OpenAI, Azure OpenAI, DeepSeek, Claude, etc.)
|
||
- No dependency on provider-specific features
|
||
- Maintains accuracy through enhanced prompt engineering
|
||
|
||
**Error Resolution**:
|
||
- Eliminated "json_schema not supported" errors
|
||
- Improved system reliability and user experience
|
||
- Maintained intent classification accuracy
|
||
|
||
**Robustness**:
|
||
- Better handling of different response formats
|
||
- Fallback mechanisms for unparseable responses
|
||
- Enhanced logging for debugging
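
The response parsing and fallback behavior listed above can be sketched as a small helper. This is a minimal illustration, not the code in `service/graph/intent_recognition.py`: the name `parse_intent_label`, the substring matching, and the choice of `Standard_Regulation_RAG` as the fallback label are all assumptions.

```python
from typing import List, Union

# The two labels come from this changelog; everything else is hypothetical.
VALID_INTENTS = ("Standard_Regulation_RAG", "User_Manual_RAG")


def parse_intent_label(
    content: Union[str, List[object], None],
    default: str = "Standard_Regulation_RAG",  # assumed fallback label
) -> str:
    """Extract an intent label from a free-text LLM response."""
    if isinstance(content, str):
        text = content
    elif isinstance(content, list):
        # Some providers return a list of parts; keep only the string parts.
        text = " ".join(item for item in content if isinstance(item, str))
    else:
        text = ""

    lowered = text.strip().lower()
    for label in VALID_INTENTS:
        # Substring match tolerates quotes or extra words around the label.
        if label.lower() in lowered:
            return label
    # Unparseable response: fall back instead of raising.
    return default
```

If a response happens to contain both labels, this sketch returns the first one in `VALID_INTENTS`; a stricter parser could treat that case as unparseable instead.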
|
||
|
||
#### Testing:
|
||
- ✅ Standard regulation queries correctly classified as `Standard_Regulation_RAG`
|
||
- ✅ User manual queries correctly classified as `User_Manual_RAG`
|
||
- ✅ Compatible with DeepSeek, Azure OpenAI, and other model providers
|
||
- ✅ No more structured output errors in logs
|
||
|
||
---
|
||
|
||
## v1.1.8 - User Manual Prompt Anti-Hallucination Enhancement - Sun Sep 1 2025
|
||
|
||
### 🧠 **Prompt Engineering Enhancement** *(User Manual Anti-Hallucination)*
|
||
|
||
**Enhanced `user_manual_prompt` to reduce hallucinations by adopting the grounded-response principles of `agent_system_prompt`.**
|
||
|
||
#### Problem Addressed:
|
||
- User manual assistant could speculate about undocumented system features
|
||
- Inconsistent handling of missing information compared to main agent prompt
|
||
- Less structured approach to failing gracefully when manual information was insufficient
|
||
- Potential for inferring functionality not explicitly documented in user manuals
|
||
|
||
#### Solution:
|
||
- **Grounded Response Principles**: Adopted evidence-based response requirements from agent_system_prompt
|
||
- **Enhanced Fail-Safe Mechanisms**: Implemented comprehensive "No-Answer with Suggestions" framework
|
||
- **Explicit Anti-Speculation**: Added clear prohibitions against guessing or inferring undocumented features
|
||
- **Consistent Evidence Requirements**: Aligned with main agent prompt's evidence standards
|
||
|
||
#### Technical Changes:
|
||
|
||
**Modified**: `llm_prompt.yaml` - `user_manual_prompt`
|
||
```yaml
# Enhanced Core Directives
- **Answer with evidence** from retrieved user manual sources; avoid speculation.
  Never guess or infer functionality not explicitly documented.
- **Fail gracefully**: if retrieval yields insufficient or no relevant results,
  **do not guess**—produce a clear *No-Answer with Suggestions* section.

# Enhanced Workflow - Verify & Synthesize
- Cross-check all retrieved information for consistency.
- Only include information supported by retrieved user manual evidence.
- If evidence is insufficient, follow the *No-Answer with Suggestions* approach.

# Added No-Answer Framework
When retrieved user manual content is insufficient:
- State clearly what specific information is missing
- Do not guess or provide information not explicitly found
- Provide constructive next steps and alternative approaches
```
|
||
|
||
#### Key Improvements:
|
||
|
||
**Evidence Requirements**:
|
||
- Enhanced from basic "Evidence-Based Only" to comprehensive evidence validation
|
||
- Added explicit prohibition against speculation and inference
|
||
- Aligned with agent_system_prompt's grounded response standards
|
||
|
||
**Graceful Failure Handling**:
|
||
- Upgraded from simple "state it clearly" to structured "No-Answer with Suggestions"
|
||
- Provides specific guidance for reformulating queries
|
||
- Offers constructive next steps when information is missing
|
||
|
||
**Anti-Hallucination Measures**:
|
||
- ✅ Grounded responses principle
|
||
- ✅ No speculation directive
|
||
- ✅ Explicit no-guessing rule
|
||
- ✅ Evidence-only responses
|
||
- ✅ Constructive suggestions framework
|
||
|
||
#### Consistency Achievement:
|
||
- **Unified Approach**: Same evidence standards across agent_system_prompt and user_manual_prompt
|
||
- **Standardized Failure Handling**: Consistent "No-Answer with Suggestions" methodology
|
||
- **Preserved Specialization**: Maintained user manual specific features (screenshots, step-by-step format)
|
||
|
||
#### Files Added:
|
||
- `docs/topics/USER_MANUAL_PROMPT_ANTI_HALLUCINATION.md` - Detailed technical documentation
|
||
- `scripts/test_user_manual_prompt_improvements.py` - Comprehensive validation test suite
|
||
|
||
#### Expected Benefits:
|
||
- **Reduced Hallucinations**: No speculation about undocumented CATOnline features
|
||
- **Improved Reliability**: More accurate step-by-step instructions based only on manual content
|
||
- **Better User Guidance**: Structured suggestions when manual information is incomplete
|
||
- **System Consistency**: Unified anti-hallucination approach across all prompt types
|
||
|
||
---
|
||
|
||
## v1.1.7 - GPT-5 Mini Temperature Parameter Fix - Sun Sep 1 2025
|
||
|
||
### 🔧 **LLM Compatibility Fix** *(GPT-5 Mini Temperature Support)*
|
||
|
||
**Fixed temperature parameter handling to support the GPT-5 mini model, which accepts only the default temperature value.**
|
||
|
||
#### Problem Solved:
|
||
- The GPT-5 mini model rejected requests with an explicit `temperature` parameter (e.g., 0.0, 0.2)
|
||
- Error: "Unsupported value: 'temperature' does not support 0.0 with this model. Only the default (1) value is supported."
|
||
- The system always passed `temperature`, even when it was commented out in the configuration
|
||
|
||
#### Solution:
|
||
- **Conditional parameter passing**: Only include `temperature` in LLM requests when explicitly set in configuration
|
||
- **Optional configuration**: Changed temperature from required to optional in both new and legacy config classes
|
||
- **Model default usage**: When `temperature` is not specified, the model uses its own default value
|
||
|
||
#### Technical Changes:
|
||
|
||
**Modified**: `service/config.py`
|
||
```python
# Changed temperature from required to optional
class LLMParametersConfig(BaseModel):
    temperature: Optional[float] = None  # Was: float = 0

class LLMRagConfig(BaseModel):
    temperature: Optional[float] = None  # Was: float = 0.2

# Only include temperature in config when explicitly set
def get_llm_config(self) -> Dict[str, Any]:
    if self.llm_prompt.parameters.temperature is not None:
        base_config["temperature"] = self.llm_prompt.parameters.temperature
```
|
||
|
||
**Modified**: `service/llm_client.py`
|
||
```python
# Only pass temperature parameter when present in config
def _create_llm(self):
    params = {
        "base_url": llm_config["base_url"],
        "api_key": llm_config["api_key"],
        "model": llm_config["model"],
        "streaming": True,
    }
    # Only add temperature if explicitly set
    if "temperature" in llm_config:
        params["temperature"] = llm_config["temperature"]
    return ChatOpenAI(**params)
```
|
||
|
||
#### Configuration Examples:
|
||
|
||
**No Temperature (Uses Model Default)**:
|
||
```yaml
# llm_prompt.yaml
parameters:
  # temperature: 0  # Commented out - model uses default
  max_context_length: 100000
```
|
||
|
||
**Explicit Temperature**:
|
||
```yaml
# llm_prompt.yaml
parameters:
  temperature: 0.7  # Will be passed to model
  max_context_length: 100000
```
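
The effect of the two configurations above can be illustrated with a self-contained sketch. The helper `build_llm_params` and the model name `example-model` are hypothetical; the only behavior taken from this release is that `temperature` is forwarded solely when it was set.

```python
from typing import Any, Dict, Optional


def build_llm_params(model: str, temperature: Optional[float] = None) -> Dict[str, Any]:
    """Build request parameters, omitting temperature unless it was set."""
    params: Dict[str, Any] = {"model": model, "streaming": True}
    # Compare against None so that an explicit 0.0 is still forwarded.
    if temperature is not None:
        params["temperature"] = temperature
    return params


# Commented-out temperature: the key is absent, so the model default applies.
no_temperature = build_llm_params("example-model")

# Explicit temperature: the value is passed through unchanged.
explicit_temperature = build_llm_params("example-model", temperature=0.7)
```

Checking `is not None` rather than truthiness matters here: a plain `if temperature:` test would silently drop an explicit `0.0`.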
|
||
|
||
#### Backward Compatibility:
|
||
- ✅ Existing configurations continue to work
|
||
- ✅ Legacy `config.yaml` LLM settings still supported
|
||
- ✅ No breaking changes when temperature is explicitly set
|
||
|
||
#### Files Added:
|
||
- `docs/topics/GPT5_MINI_TEMPERATURE_FIX.md` - Detailed technical documentation
|
||
- `scripts/test_temperature_fix.py` - Comprehensive test suite
|
||
|
||
---
|
||
|
||
## v1.1.6 - Enhanced I18n Multi-Language Support - Sat Aug 31 2025
|
||
|
||
### 🌐 **Internationalization Enhancement** *(I18n Multi-Language Support)*
|
||
|
||
**Added comprehensive internationalization (i18n) support for Chinese and English languages across the web interface.**
|
||
|
||
---
|
||
|
||
## v1.1.5 - Aggressive Tool Call History Trimming - Sat Aug 31 2025
|
||
|
||
### 🚀 **Enhanced Token Optimization** *(Aggressive Trimming Strategy)*
|
||
|
||
**Modified the trimming strategy to proactively remove historical tool call results regardless of token count, while protecting the current conversation turn's tool calls.**
|
||
|
||
#### New Behavior:
|
||
- **Always trim when multiple tool rounds exist** - regardless of total token count
|
||
- **Preserve current conversation turn's tool calls** - never trim active tool execution results
|
||
- **Remove historical tool call results** - from previous conversation turns to minimize context pollution
|
||
|
||
#### Why This Change:
|
||
- Historical tool call results accumulate quickly in conversation history
|
||
- Large retrieval results consume significant tokens even when total context is manageable
|
||
- Proactive trimming prevents context bloat before hitting token limits
|
||
- Current tool calls must remain intact for proper agent workflow
|
||
|
||
#### Technical Implementation:
|
||
|
||
**Modified**: `service/graph/message_trimmer.py`
|
||
- **Enhanced `should_trim()`**: Now triggers when detecting multiple tool rounds (>1), not just on token limit
|
||
- **Preserved Strategy**: `_optimize_multi_round_tool_calls()` continues to keep only the most recent tool round
|
||
- **Current Turn Protection**: Agent workflow ensures current turn's tool calls are never trimmed during execution
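
The new trigger condition can be sketched as follows. This is an illustration under stated assumptions, not the code in `service/graph/message_trimmer.py`: messages are modeled as plain dictionaries, and `count_tool_rounds`, `estimate_tokens`, and the four-characters-per-token estimate are hypothetical.

```python
from typing import Any, Dict, List

# Hypothetical message shape: {"role": ..., "content": ..., "tool_calls": [...]}
Message = Dict[str, Any]


def count_tool_rounds(messages: List[Message]) -> int:
    """Count AI messages that requested at least one tool call."""
    return sum(1 for m in messages if m.get("role") == "ai" and m.get("tool_calls"))


def estimate_tokens(messages: List[Message]) -> int:
    """Very rough estimate: about four characters per token (assumption)."""
    return sum(len(str(m.get("content", ""))) for m in messages) // 4


def should_trim(messages: List[Message], max_context_length: int) -> bool:
    """Trim when several tool rounds exist, or when the token budget is exceeded."""
    if count_tool_rounds(messages) > 1:
        return True  # proactive trimming, independent of token count
    return estimate_tokens(messages) > max_context_length
```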
|
||
|
||
#### Impact:
|
||
- **Proactive Cleanup**: Tool call history cleaned before reaching token limits
|
||
- **Context Quality**: Conversation stays focused on recent, relevant context
|
||
- **Workflow Protection**: Current tool execution results always preserved
|
||
- **Token Efficiency**: Maintains optimal token usage across conversation lifetime
|
||
|
||
---
|
||
|
||
## v1.1.4 - Multi-Round Tool Call Token Optimization - Sat Aug 31 2025
|
||
|
||
### 🚀 **Performance Enhancement** *(Token Optimization)*
|
||
|
||
**Implemented intelligent token optimization for multi-round tool calling scenarios to significantly reduce LLM context usage.**
|
||
|
||
#### Problem Solved:
|
||
- In multi-round tool calling scenarios, previous rounds' tool call results (ToolMessage) were consuming excessive tokens
|
||
- Large JSON responses from retrieval tools accumulated in conversation history
|
||
- Token usage could exceed LLM context limits, causing API failures
|
||
|
||
#### Key Features:
|
||
|
||
1. **Multi-Round Tool Call Detection**:
|
||
- Automatically identifies tool calling rounds in conversation history
|
||
- Recognizes patterns of AI messages with tool_calls followed by ToolMessage responses
|
||
|
||
2. **Intelligent Message Optimization**:
|
||
- Preserves system messages and original user queries
|
||
- Keeps only the most recent tool calling round for context continuity
|
||
- Removes older ToolMessage content that typically contains large response data
|
||
|
||
3. **Token Usage Reduction**:
|
||
- Achieves 60-80% reduction in token usage for multi-round scenarios
|
||
- Maintains conversation quality while respecting LLM context constraints
|
||
- Prevents API failures due to context length overflow
|
||
|
||
#### Technical Implementation:
|
||
|
||
- **File**: `service/graph/message_trimmer.py`
|
||
- **New Methods**:
|
||
- `_optimize_multi_round_tool_calls()` - Core optimization logic
|
||
- `_identify_tool_rounds()` - Tool round pattern recognition
|
||
- Enhanced `trim_conversation_history()` - Integrated optimization workflow
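
The round detection and "keep only the most recent round" strategy can be sketched as below. This is a simplified illustration, not the implementation of `_identify_tool_rounds()` or `_optimize_multi_round_tool_calls()`: messages are plain dictionaries, the function names are hypothetical, and dropping the older AI tool-call message together with its tool results is an assumption.

```python
from typing import Any, Dict, List, Tuple

Message = Dict[str, Any]  # hypothetical shape: {"role": ..., "content": ..., "tool_calls": [...]}


def identify_tool_rounds(messages: List[Message]) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, one per tool round.

    A round is an AI message carrying tool_calls plus the tool messages
    that immediately follow it.
    """
    rounds = []
    i = 0
    while i < len(messages):
        msg = messages[i]
        if msg.get("role") == "ai" and msg.get("tool_calls"):
            j = i + 1
            while j < len(messages) and messages[j].get("role") == "tool":
                j += 1
            rounds.append((i, j))
            i = j
        else:
            i += 1
    return rounds


def keep_latest_tool_round(messages: List[Message]) -> List[Message]:
    """Drop every tool round except the most recent one."""
    rounds = identify_tool_rounds(messages)
    if len(rounds) <= 1:
        return list(messages)
    dropped = set()
    for start, end in rounds[:-1]:
        dropped.update(range(start, end))
    # System messages, user queries, and plain AI replies are never in a round.
    return [m for idx, m in enumerate(messages) if idx not in dropped]
```

Removing a whole round at once keeps every remaining tool-call request paired with its results, which chat APIs that validate tool-call pairing generally require.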
|
||
|
||
#### Test Results:
|
||
- **Message Reduction**: 60% fewer messages in multi-round scenarios
|
||
- **Token Savings**: 70-80% reduction in token consumption
|
||
- **Context Preservation**: Maintains conversation flow and quality
|
||
|
||
#### Configuration:
|
||
```yaml
parameters:
  max_context_length: 96000  # Configurable context length
  # Optimization automatically applies when multiple tool rounds detected
```
|
||
|
||
#### Benefits:
|
||
- **Cost Efficiency**: Significant reduction in LLM API costs
|
||
- **Reliability**: Prevents context overflow errors
|
||
- **Performance**: Faster processing with smaller context windows
|
||
- **Scalability**: Supports longer multi-round conversations
|
||
|
||
#### Files Modified:
|
||
- `service/graph/message_trimmer.py`
|
||
- `tests/unit/test_message_trimmer.py`
|
||
- `docs/topics/MULTI_ROUND_TOKEN_OPTIMIZATION.md`
|
||
- `docs/CHANGELOG.md`
|
||
|
||
---
|
||
|
||
## v1.1.3 - UI Text Update - Fri Aug 30 2025
|
||
|
||
### ✏️ **Content Update** *(UI Improvement)*
|
||
|
||
**Updated the example questions in the frontend UI.**
|
||
|
||
#### Changes Made:
|
||
|
||
- Modified the third and fourth example questions in both Chinese and English in `web/src/utils/i18n.ts` to be more relevant to user needs.
|
||
- **Chinese**:
|
||
- `根据标准,如何测试电动汽车充电功能的兼容性`
|
||
- `如何注册申请CATOnline权限?`
|
||
- **English**:
|
||
- `According to the standard, how to test the compatibility of electric vehicle charging function?`
|
||
- `How to register for CATOnline access?`
|
||
|
||
#### Benefits:
|
||
|
||
- Provides users with more practical and common question examples.
|
||
- Improves user experience by guiding them to ask more effective questions.
|
||
|
||
#### Files Modified:
|
||
- `web/src/utils/i18n.ts`
|
||
- `docs/CHANGELOG.md`
|
||
|
||
## v1.1.2 - Prompt Optimization - Fri Aug 30 2025
|
||
|
||
### 🚀 **Prompt Optimization** *(Prompt Engineering)*
|
||
|
||
**Optimized and compressed `intent_recognition_prompt` and `user_manual_prompt` in `llm_prompt.yaml`.**
|
||
|
||
#### Changes Made:
|
||
|
||
1. **`intent_recognition_prompt`**:
|
||
* Condensed background information into key bullet points.
|
||
* Refined classification descriptions for clarity.
|
||
* Simplified classification guidelines with keyword hints for better decision-making.
|
||
|
||
2. **`user_manual_prompt`**:
|
||
* Elevated key instructions to **Core Directives** for emphasis.
|
||
* Streamlined the workflow description.
|
||
* Made the **Response Formatting** rules more stringent, especially regarding screenshots.
|
||
* Retained the crucial **Context Disambiguation** section.
|
||
|
||
#### Benefits:
|
||
|
||
- **Efficiency**: More compact prompts for faster processing.
|
||
- **Reliability**: Clearer and more direct instructions reduce the likelihood of incorrect outputs.
|
||
- **Maintainability**: Improved structure makes the prompts easier to read and update.
|
||
|
||
#### Files Modified:
|
||
- `llm_prompt.yaml`
|
||
- `docs/CHANGELOG.md`
|
||
|
||
## v1.1.1 - User Manual Tool Rounds Configuration - Fri Aug 29 2025
|
||
|
||
### 🔧 **Configuration Enhancement** *(Configuration Update)*
|
||
|
||
**Added Independent Tool Rounds Configuration for User Manual RAG**
|
||
|
||
#### Changes Made:
|
||
|
||
1. **Configuration Structure**
|
||
- Added `max_tool_rounds_user_manual: 3` to `config.yaml`
|
||
- Separated user manual agent tool rounds from main agent configuration
|
||
- Maintained backward compatibility with existing configuration
|
||
|
||
2. **Code Updates**
|
||
- Updated `AppConfig` class in `service/config.py` to include `max_tool_rounds_user_manual` field
|
||
- Added `max_tool_rounds_user_manual` to `AgentState` in `service/graph/state.py`
|
||
- Modified `service/graph/user_manual_rag.py` to use separate configuration
|
||
- Updated graph initialization in `service/graph/graph.py` to include new config
|
||
|
||
3. **Prompt System Updates**
|
||
- Updated `user_manual_prompt` in `llm_prompt.yaml`:
|
||
- Removed citation-related instructions (no [1] citations or citation mapping)
|
||
- Set all rewritten queries to use English language
|
||
- Streamlined response format without citation requirements
|
||
|
||
#### Technical Details:
|
||
|
||
- **Configuration Priority**: State-level config takes precedence over file config
|
||
- **Independent Configuration**: User manual agent now has its own `max_tool_rounds_user_manual` setting
|
||
- **Default Values**: Both the main agent and the user manual agent default to 3 rounds
|
||
- **Validation**: All syntax checks and configuration loading tests passed
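
The precedence rule described above (state first, then file configuration, then the default of 3) might be resolved along these lines. The helper name and the use of plain mappings for the state and file configuration are assumptions made for illustration.

```python
from typing import Any, Mapping

DEFAULT_MAX_TOOL_ROUNDS = 3  # default stated in this changelog entry


def resolve_user_manual_tool_rounds(
    state: Mapping[str, Any], file_config: Mapping[str, Any]
) -> int:
    """Pick the round limit: state-level value first, then config.yaml, then default."""
    state_value = state.get("max_tool_rounds_user_manual")
    if state_value is not None:
        return int(state_value)
    return int(file_config.get("max_tool_rounds_user_manual", DEFAULT_MAX_TOOL_ROUNDS))
```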
|
||
|
||
#### Benefits:
|
||
|
||
- **Flexibility**: Different tool round limits for different agent types
|
||
- **Maintainability**: Clear separation of concerns between agent configurations
|
||
- **Consistency**: Follows same configuration pattern as main agent
|
||
- **Customization**: Allows fine-tuning user manual agent behavior independently
|
||
|
||
#### Files Modified:
|
||
- `config.yaml`
|
||
- `service/config.py`
|
||
- `service/graph/state.py`
|
||
- `service/graph/graph.py`
|
||
- `service/graph/user_manual_rag.py`
|
||
- `llm_prompt.yaml`
|
||
|
||
## v1.1.0 - User Manual Agent Update Summary - Fri Aug 29 22:20:20 HKT 2025
|
||
|
||
### ✅ Successfully Completed
|
||
|
||
1. **Prompt Configuration Update**
|
||
- Updated `user_manual_prompt` in `llm_prompt.yaml`
|
||
- Integrated query optimization, parallel retrieval, and evidence-based answering from `agent_system_prompt`
|
||
- Verified prompt loading with test script (6566 chars)
|
||
|
||
2. **Agent Node Logic**
|
||
- User manual agent node is autonomous with multi-round tool calls (3 rounds max)
|
||
- Intent classification correctly routes to User_Manual_RAG
|
||
- Agent node redirects to user_manual_agent_node correctly
|
||
|
||
3. **Multi-Round Tool Execution**
|
||
- Successfully executes multiple tool rounds
|
||
- Tool calls increment properly (1/3, 2/3, 3/3)
|
||
- Max rounds protection works (forces final synthesis)
|
||
|
||
### 🚨 Issues Discovered
|
||
|
||
1. **Citation Number Error**:
|
||
- Error: "AgentWorkflow error: 'citation number'"
|
||
- Occurring during user manual agent execution
|
||
|
||
2. **SSE Streaming Issue**:
|
||
- TypeError: 'coroutine' object is not iterable
|
||
- Affecting streaming response delivery
|
||
- StreamingResponse configuration needs fixing
|
||
|
||
### 📊 Test Results
|
||
|
||
- ✅ Prompt configuration test: PASSED
|
||
- ✅ Intent recognition: PASSED
|
||
- ✅ Agent routing: PASSED
|
||
- ✅ Multi-round tool calls: PASSED
|
||
- ❌ Citation processing: FAILED
|
||
- ❌ SSE streaming: FAILED
|
||
|
||
### 🔍 Next Steps
|
||
|
||
1. Fix citation number error in user manual agent
|
||
2. Fix SSE streaming response format
|
||
3. Complete end-to-end validation
|
||
|
||
## v1.0.9 - 2025-08-29 🤖
|
||
|
||
### 🤖 **User Manual Agent Transformation** *(Major Feature Enhancement)*
|
||
|
||
#### **🔄 Autonomous User Manual Agent Implementation** *(Architecture Upgrade)*
|
||
- **Agent Node Conversion**: Transformed `service/graph/user_manual_rag.py` from simple RAG to autonomous agent
|
||
- **Detect-First-Then-Stream Strategy**: Implemented optimal multi-round behavior with tool detection and streaming synthesis
|
||
- **Tool Round Management**: Added intelligent tool calling with configurable round limits and state tracking
|
||
- **Conversation Trimming**: Integrated automatic context length management for long conversations
|
||
- **Streaming Support**: Enhanced real-time response generation with HTML comment filtering
|
||
- **User Manual Tool Integration**: Specialized tool ecosystem for user manual operations
|
||
- **Tool Schema Generation**: Automatic schema generation from `service/graph/user_manual_tools.py`
|
||
- **Force Tool Choice**: Enabled autonomous tool selection for optimal response generation
|
||
- **Tool Execution Pipeline**: Parallel-capable tool execution with streaming events and error handling
|
||
- **Routing Logic Enhancement**: Sophisticated routing system for multi-round workflows
|
||
- **Smart Routing**: Routes between `user_manual_tools`, `user_manual_agent`, and `post_process`
|
||
- **State-Aware Decisions**: Context-aware routing based on tool calls and conversation state
|
||
- **Final Synthesis Detection**: Automatic transition to synthesis mode when appropriate
|
||
- **Error Handling & Recovery**: Comprehensive error management system
|
||
- **Graceful Degradation**: User-friendly error messages with proper error categorization
|
||
- **Stream Error Events**: Real-time error notification through streaming interface
|
||
- **Tool Error Recovery**: Resilient tool execution with fallback mechanisms
|
||
|
||
#### **🔧 Technical Implementation Details** *(System Architecture)*
|
||
- **Function Signatures**: New agent functions following established patterns from main agent
|
||
- `user_manual_agent_node()`: Main autonomous agent function
|
||
- `user_manual_should_continue()`: Intelligent routing logic
|
||
- `run_user_manual_tools_with_streaming()`: Enhanced tool execution
|
||
- **Configuration Integration**: Seamless integration with existing configuration system
|
||
- **Prompt Template Usage**: Uses existing `user_manual_prompt` from `llm_prompt.yaml`
|
||
- **Dynamic Prompt Formatting**: Contextual prompt generation with conversation history and retrieved content
|
||
- **Tool Configuration**: Automatic tool binding and schema management
|
||
- **Backward Compatibility**: Maintained legacy function for seamless transition
|
||
- **Legacy Wrapper**: `user_manual_rag_node()` redirects to new agent implementation
|
||
- **API Consistency**: No breaking changes to existing interfaces
|
||
- **Migration Path**: Smooth upgrade path for existing implementations
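
The routing among `user_manual_tools`, `user_manual_agent`, and `post_process` can be pictured with a short sketch. It is not the actual `user_manual_should_continue()`: the function name `route_user_manual`, the dictionary-based state, the `tool_rounds` and `max_tool_rounds` fields, and the decision to send an over-limit request back to the agent for final synthesis are all assumptions.

```python
from typing import Any, Dict


def route_user_manual(state: Dict[str, Any]) -> str:
    """Decide the next node from the last message and the round counter."""
    messages = state.get("messages", [])
    last = messages[-1] if messages else {}
    wants_tools = bool(last.get("tool_calls"))
    rounds_used = state.get("tool_rounds", 0)      # hypothetical counter
    max_rounds = state.get("max_tool_rounds", 3)   # hypothetical limit field

    if wants_tools and rounds_used < max_rounds:
        return "user_manual_tools"   # run the requested tools
    if wants_tools:
        return "user_manual_agent"   # limit reached: force final synthesis
    return "post_process"            # plain answer: finish the turn
```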
|
||
|
||
#### **✅ Testing & Validation** *(Quality Assurance)*
|
||
- **Comprehensive Test Suite**: New test script `scripts/test_user_manual_agent.py`
|
||
- **Basic Agent Testing**: Tool detection, calling, and routing validation
|
||
- **Integration Workflow Testing**: Complete multi-round conversation scenarios
|
||
- **Error Handling Testing**: Graceful error recovery and user feedback
|
||
- **Performance Validation**: Streaming response and tool execution timing
|
||
- **Functionality Validation**: All core features tested and validated
|
||
- ✅ Tool detection and autonomous calling
|
||
- ✅ Multi-round workflow execution
|
||
- ✅ Streaming response generation
|
||
- ✅ Error handling and recovery
|
||
- ✅ State management and routing logic
|
||
|
||
#### **📚 Documentation & Examples** *(Knowledge Management)*
|
||
- **Implementation Guide**: Comprehensive documentation in `docs/topics/USER_MANUAL_AGENT_IMPLEMENTATION.md`
|
||
- **Usage Examples**: Practical code examples and implementation patterns
|
||
- **Architecture Overview**: Technical details and design decisions
|
||
- **Migration Guide**: Step-by-step upgrade instructions
|
||
|
||
**Impact**: Transforms user manual functionality from simple retrieval to intelligent autonomous agent capable of multi-round conversations, tool usage, and sophisticated response generation while maintaining full backward compatibility.
|
||
|
||
## v1.0.8 - 2025-08-29 📚
|
||
|
||
### 📚 **User Manual Prompt Enhancement** *(Functional Improvement)*
|
||
|
||
#### **🎯 Enhanced User Manual Assistant Prompt** *(Content Update)*
|
||
- **Context Disambiguation Rules**: Added comprehensive disambiguation guidelines for overlapping concepts
|
||
- **Function Distinction**: Clear separation between Homepage functions (User) vs Admin Console functions (Administrator)
|
||
- **Management Clarity**: Differentiated between user management vs user group management operations
|
||
- **Role-based Operations**: Defined default roles for different operations (view/search for Users, edit/delete/configure for Administrators)
|
||
- **Clarification Protocol**: Added requirement to ask for clarification when user context is unclear
|
||
- **Response Structure Standards**: Implemented standardized response formatting
|
||
- **Step-by-Step Instructions**: Mandated complete procedural guidance with figures
|
||
- **Structured Format**: Required specific format for each step (description, screenshot, additional notes)
|
||
- **Business Rules Integration**: Ensured inclusion of all relevant business rules from source sections
|
||
- **Documentation Structure**: Maintained original documentation hierarchy and organization
|
||
- **Content Reproduction Rules**: Established strict content fidelity guidelines
|
||
- **Exact Wording**: Required copying exact wording and sequence from source sections
|
||
- **Complete Information**: Mandated inclusion of ALL information without summarization
|
||
- **Format Preservation**: Maintained original formatting and hierarchical structure
|
||
- **No Reorganization**: Prohibited modification or reorganization of original content
|
||
- **Reference Integration**: Successfully merged guidance from `.vibe/ref/user_manual_prompt-ref.txt`
|
||
- **Quality Assurance**: Enhanced accuracy and completeness of user manual responses
|
||
|
||
#### **📋 Reference File Analysis** *(Content Optimization)*
|
||
- **catonline-ref.txt Assessment**: Evaluated system background reference content
|
||
- **Content Alignment**: Confirmed existing content already covers CATOnline system background
|
||
- **Redundancy Avoidance**: Decided against merging to prevent duplicate instructions
|
||
- **Content Validation**: Verified accuracy and completeness of existing background information
|
||
- **user_manual_prompt-ref.txt Integration**: Successfully incorporated valuable operational guidelines
|
||
- **Value Assessment**: Identified high-value content missing from existing prompt
|
||
- **Strategic Merge**: Integrated content to enhance response quality without duplication
|
||
- **Instruction Optimization**: Improved prompt effectiveness while maintaining conciseness
|
||
|
||
## v1.0.7 - 2025-08-29 🎯
|
||
|
||
### 🎯 **Intent Recognition Enhancement** *(Functional Improvement)*
|
||
|
||
#### **📝 Enhanced Intent Classification Prompt** *(Content Update)*
|
||
- **Detailed Guidelines**: Added comprehensive classification criteria based on reference files
|
||
- **Content vs System Operation**: Clear distinction between standard/regulation content queries and CATOnline system operation queries
|
||
- **Standard_Regulation_RAG Examples**:
|
||
- "What regulations relate to intelligent driving?"
|
||
- "How do you test the safety of electric vehicles?"
|
||
- "What are the main points of GB/T 34567-2023?"
|
||
- "What is the scope of ISO 26262?"
|
||
- **User_Manual_RAG Examples**:
|
||
- "What is CATOnline (the system)?"
|
||
- "How to do search for standards, regulations, TRRC news and deliverables?"
|
||
- "How to create and update standards, regulations and their documents?"
|
||
- "How to download or export data?"
|
||
- **Classification Guidelines**: Added specific rules for edge cases and ambiguous queries
|
||
- **Reference Integration**: Incorporated guidance from `.vibe/ref/intent-ref-1.txt` and `.vibe/ref/intent-ref-2.txt`
|
||
|
||
#### **🏢 CATOnline Background Information Integration** *(Context Enhancement)*
|
||
- **Background Context**: Added comprehensive CATOnline system background information to intent recognition prompt
|
||
- **System Definition**: Integrated explanation that CATOnline is the China Automotive Technical Regulatory Online System
|
||
- **Feature Coverage**: Included details about CATOnline capabilities:
|
||
- TRRC process introductions and business areas
|
||
- Standards/laws/regulations/protocols search and viewing
|
||
- Document download and Excel export functionality
|
||
- Consumer test and voluntary certification checking
|
||
- Deliverable reminders and TRRC deliverable retrieval
|
||
- Admin features: popup configuration, working groups management, standards/regulations CRUD operations
|
||
- **TRRC Context**: Added clarification that TRRC stands for Technical Regulation Region China of Volkswagen
|
||
- **Enhanced Classification**: Background information helps improve intent classification accuracy for CATOnline-specific queries
|
||
|
||
#### **🧪 Testing & Validation** *(Quality Assurance)*
|
||
- **Intent Recognition Tests**: Verified enhanced prompt with multiple test scenarios
|
||
- **Multi-Intent Workflow**: Validated proper routing between Standard_Regulation_RAG and User_Manual_RAG
|
||
- **Edge Case Handling**: Tested classification accuracy for ambiguous queries
|
||
- **TRRC Edge Case**: Added specific handling for TRRC-related queries to distinguish between content vs. system operation
|
||
- **CATOnline Background Tests**: Created comprehensive test suite for CATOnline-specific scenarios
|
||
- **100% Accuracy**: Maintained perfect classification accuracy on all test suites including background-enhanced scenarios
|
||
|
||
## v1.0.6 - 2025-08-28 🔧
|
||
|
||
### 🔧 **Code Architecture Refactoring & Optimization** *(Technical Improvement)*
|
||
|
||
#### **🧹 Code Structure Cleanup** *(Breaking Fix)*
|
||
- **Duplicate State Removal**: Eliminated duplicate `AgentState` definitions across modules
|
||
- **Unified Definition**: Consolidated all state management to `/service/graph/state.py`
|
||
- **Import Cleanup**: Removed redundant AgentState from `graph.py`
|
||
- **Type Safety**: Ensured consistent state typing across all graph nodes
|
||
- **Circular Import Resolution**: Fixed circular dependency issues in module imports
|
||
- **Clean Dependencies**: Streamlined import statements and removed unused context variables
|
||
|
||
#### **📁 Module Separation & Organization** *(Code Organization)*
|
||
- **Intent Recognition Module**: Moved `intent_recognition_node` to dedicated `/service/graph/intent_recognition.py`
|
||
- **Pure Function**: Self-contained intent classification logic
|
||
- **LLM Integration**: Structured output with Pydantic Intent model
|
||
- **Context Handling**: Intelligent conversation history rendering
|
||
- **User Manual RAG Module**: Extracted `user_manual_rag_node` to `/service/graph/user_manual_rag.py`
|
||
- **Specialized Processing**: Dedicated user manual query handling
|
||
- **Tool Integration**: Direct integration with user manual retrieval tools
|
||
- **Stream Support**: Complete SSE streaming capabilities
|
||
- **Graph Simplification**: Cleaned up main `graph.py` by removing redundant code
|
||
|
||
#### **⚙️ Configuration Enhancement** *(Configuration)*
|
||
- **Prompt Externalization**: Moved all hardcoded prompts to `llm_prompt.yaml`
|
||
- **Intent Recognition Prompt**: Configurable intent classification instructions
|
||
- **User Manual Prompt**: Configurable user manual response template
|
||
- **Agent System Prompt**: Existing agent behavior remains configurable
|
||
- **Runtime Configuration**: All prompts now loaded dynamically from config file
|
||
- **Deployment Flexibility**: Different environments can use different prompt configurations
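
Loading prompts by name at runtime, without a hardcoded fallback, might look like the sketch below. The helper `get_prompt` and the in-memory dictionary standing in for the parsed `llm_prompt.yaml` are hypothetical; failing loudly on a missing prompt reflects the removal of default prompts noted in this release.

```python
from typing import Any, Mapping


def get_prompt(prompts: Mapping[str, Any], name: str) -> str:
    """Return a configured prompt, raising if it is missing or empty."""
    prompt = prompts.get(name)
    if not isinstance(prompt, str) or not prompt.strip():
        # No built-in default: a missing prompt is a configuration error.
        raise KeyError(f"Prompt '{name}' is not configured")
    return prompt


# Stand-in for the parsed contents of llm_prompt.yaml (illustrative values).
loaded_prompts = {
    "intent_recognition_prompt": "Classify the query...",
    "user_manual_prompt": "Answer from the user manual...",
}
intent_prompt = get_prompt(loaded_prompts, "intent_recognition_prompt")
```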
|
||
|
||
#### **🧪 Testing & Validation** *(Quality Assurance)*
|
||
- **Graph Compilation Tests**: Verified successful compilation after refactoring
|
||
- **Multi-Intent Workflow Tests**: End-to-end validation of both intent pathways
|
||
- **Module Integration Tests**: Confirmed proper module separation and imports
|
||
- **Configuration Loading Tests**: Validated dynamic prompt loading from config files
|
||
|
||
#### **📋 Technical Details**
|
||
- **Files Modified**:
|
||
- `/service/graph/graph.py` - Removed duplicate definitions, clean imports
|
||
- `/service/graph/state.py` - Single source of truth for AgentState
|
||
- `/service/graph/intent_recognition.py` - New dedicated module
|
||
- `/service/graph/user_manual_rag.py` - New dedicated module
|
||
- `/llm_prompt.yaml` - Added configurable prompts
|
||
- **Import Chain**: Fixed circular imports between graph nodes
|
||
- **Type Safety**: Consistent `AgentState` usage across all modules
|
||
- **Testing**: 100% pass rate on graph compilation and workflow tests
|
||
|
||
#### **🚀 Developer Experience**
|
||
- **Code Maintainability**: Better separation of concerns and module boundaries
|
||
- **Configuration Management**: Centralized prompt management for easier tuning
|
||
- **Debug Support**: Cleaner stack traces with resolved circular imports
|
||
- **Extension Ready**: Easier to add new intent types or modify existing behavior
|
||
|
||
#### **Internationalization & UX Improvements** *(User Experience)*
|
||
- **English Prompts**: Updated intent recognition prompts to use English for improved LLM classification accuracy
|
||
- **English User Manual Prompts**: Updated user manual RAG prompts to use English for consistency
|
||
- **Error Messages**: Converted all error messages to English for consistency
|
||
- **No Default Prompts**: Removed hardcoded fallback prompts, ensuring explicit configuration management
|
||
- **Enhanced Conversation Rendering**: Updated conversation history format to use `<user>...</user>` and `<ai>...</ai>` tags for better LLM parsing
|
||
- **Configuration Integration**: Added `intent_recognition_prompt` and `user_manual_prompt` to configuration loading system
|
||
|
||
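The tagged conversation format mentioned above can be illustrated with a minimal sketch. Only the `<user>` and `<ai>` tags come from this changelog; the function name `render_history`, the role names, and the message shape are assumptions made for illustration, not the project's actual code.

```python
from typing import Iterable, Mapping

# Hypothetical role-to-tag mapping; only the tag names are taken from the changelog.
ROLE_TAGS = {"user": "user", "assistant": "ai"}


def render_history(messages: Iterable[Mapping[str, str]]) -> str:
    """Render chat turns as <user>...</user> / <ai>...</ai> blocks, one per line."""
    rendered = []
    for message in messages:
        tag = ROLE_TAGS.get(message["role"])
        if tag is None:
            # Roles without a tag in this sketch (e.g. tool messages) are skipped.
            continue
        rendered.append(f"<{tag}>{message['content']}</{tag}>")
    return "\n".join(rendered)
```

Explicit tags give the model an unambiguous boundary between speakers, which is presumably why the format was preferred over plain-text prefixes.
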
#### **🎨 UI/UX Improvements** *(User Interface)*
- **Tool Icon Enhancement**: Updated `retrieve_system_usermanual` tool icon to `user-guide.png`
- **Visual Distinction**: Better visual differentiation between standard regulation and user manual tools
- **User Experience**: More intuitive icon representing user manual/guide functionality
- **Icon Asset**: Leveraged existing `user-guide.png` icon from public assets

## v1.0.5 - 2025-08-28 🎯

### 🎯 **Multi-Intent RAG System Implementation** *(Major Feature)*

#### **🧠 Intent Recognition Engine** *(New)*
- **Intent Classification**: LLM-powered intelligent intent recognition with context awareness
- **Supported Intents**:
  - `Standard_Regulation_RAG`: Manufacturing standards, regulations, and compliance queries
  - `User_Manual_RAG`: CATOnline system usage, features, and operational guidance
- **Technology**: Structured output with Pydantic models for reliable classification
- **Accuracy**: 100% classification accuracy in testing across Chinese and English queries
- **Context Awareness**: Leverages conversation history for improved intent disambiguation

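As a rough illustration of constraining classification to the two supported intents, the sketch below validates a raw label against a `Literal` type. The changelog states that the real implementation uses Pydantic structured output; this version deliberately uses only the standard library as a stand-in, and `IntentResult` and `parse_intent` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Literal, get_args

# The two intent labels listed in this release.
Intent = Literal["Standard_Regulation_RAG", "User_Manual_RAG"]


@dataclass(frozen=True)
class IntentResult:
    """Stand-in for the structured output model (assumed shape)."""
    intent: Intent


def parse_intent(raw_label: str) -> IntentResult:
    """Accept only labels that are members of the Intent literal."""
    label = raw_label.strip()
    if label not in get_args(Intent):
        raise ValueError(f"Unknown intent label: {raw_label!r}")
    return IntentResult(intent=label)
```

Rejecting unknown labels at the boundary keeps the downstream routing code free of string typos.
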
#### **🔄 Enhanced Workflow Architecture** *(Breaking Change)*
- **New Graph Structure**: `START → intent_recognition → [conditional_routing] → {Standard_RAG | User_Manual_RAG}`
- **Entry Point Change**: All queries now start with intent recognition instead of direct agent processing
- **Dual Processing Paths**:
  - **Standard_Regulation_RAG**: Multi-round agent workflow with tool orchestration (existing behavior)
  - **User_Manual_RAG**: Single-round specialized processing with user manual retrieval
- **Backward Compatibility**: Existing standard/regulation queries maintain full functionality

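The conditional routing step can be pictured as a small function that maps the `intent` field of the state to the next node. This is a sketch under assumptions: the node names `agent` and `user_manual_rag` are hypothetical, and the fallback to the standard path mirrors the error-recovery behavior this release describes rather than the project's exact code.

```python
from typing import Any, Mapping

DEFAULT_INTENT = "Standard_Regulation_RAG"

# Hypothetical mapping from intent label to the name of the next graph node.
INTENT_TO_NODE = {
    "Standard_Regulation_RAG": "agent",
    "User_Manual_RAG": "user_manual_rag",
}


def route_by_intent(state: Mapping[str, Any]) -> str:
    """Pick the next node; missing or unknown intents fall back to the standard path."""
    intent = state.get("intent") or DEFAULT_INTENT
    return INTENT_TO_NODE.get(intent, INTENT_TO_NODE[DEFAULT_INTENT])
```

A function of this shape is what a graph framework's conditional edge would typically call after the intent recognition node has written its result into the state.
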
#### **📚 User Manual RAG Specialization** *(New)*
- **Dedicated Node**: `user_manual_rag_node` for specialized user manual processing
- **Tool Integration**: Direct integration with `retrieve_system_usermanual` tool
- **Response Template**: Professional user manual assistance with structured guidance
- **Streaming Support**: Real-time token streaming for immediate user feedback
- **Error Handling**: Graceful degradation with support contact suggestions

#### **🏗️ Technical Architecture Improvements**
- **State Management**: Enhanced `AgentState` with `intent` field for workflow routing
- **Modular Design**: Separated user manual tools into dedicated module (`user_manual_tools.py`)
- **Type Safety**: Full TypeScript-style type annotations with Literal types for intent routing
- **Memory Persistence**: Both intent paths support PostgreSQL session memory and conversation history
- **Testing Suite**: Comprehensive test coverage including intent recognition and end-to-end workflow validation

#### **🚀 Performance & Reliability**
- **Smart Routing**: Eliminates unnecessary tool calls for user manual queries
- **Optimized Flow**: Single-round processing for user manual queries vs multi-round for standards
- **Error Recovery**: Intent recognition failure gracefully defaults to standard regulation processing
- **Session Management**: Complete session persistence across both intent pathways

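#### **📋 Query Classification Examples**
**Standard_Regulation_RAG Path**:
- "请问GB/T 18488标准的具体内容是什么?" (What are the specific contents of the GB/T 18488 standard?)
- "ISO 26262 functional safety standard requirements"
- "汽车安全法规相关规定" (Relevant provisions of automotive safety regulations)

**User_Manual_RAG Path**:
- "如何使用CATOnline系统进行搜索?" (How do I search using the CATOnline system?)
- "How do I log into the CATOnline system?"
- "CATOnline系统的用户管理功能怎么使用?" (How do I use the user management feature of the CATOnline system?)
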
#### **🔧 Implementation Files**
- **Core Logic**: Enhanced `service/graph/graph.py` with intent nodes and routing
- **Intent Recognition**: `intent_recognition_node()` function with LLM classification
- **User Manual Processing**: `user_manual_rag_node()` function with specialized handling
- **State Management**: Updated `service/graph/state.py` with intent support
- **Tool Organization**: New `service/graph/user_manual_tools.py` module
- **Documentation**: Comprehensive implementation guide in `docs/topics/MULTI_INTENT_IMPLEMENTATION.md`

#### **📈 Impact**
- **User Experience**: Intelligent query routing for more relevant responses
- **System Efficiency**: Optimized processing paths based on query type
- **Extensibility**: Framework ready for additional intent types
- **Maintainability**: Clear separation of concerns between different query domains

---

## v1.0.4 - 2025-08-27 🔧

### 🔧 **New Tool Implementation**

#### **📚 System User Manual Retrieval Tool** *(New)*
- **Tool Name**: `retrieve_system_usermanual`
- **Purpose**: Searches content chunks of the user manual for this system (CATOnline)
- **Integration**: Full LangGraph integration with the `@tool` decorator pattern
- **UI Support**: Complete frontend integration with multilingual UI labels
  - Chinese: "系统使用手册检索"
  - English: "System User Manual Retrieval"
- **Configuration**: Added `chunk_user_manual_index` support in SearchConfig
- **Error Handling**: Robust error handling with proper logging and fallback responses
- **Testing**: Comprehensive unit tests for tool structure and integration validation

#### **🎯 Technical Implementation Details**
- **Backend**: Added to `service/graph/tools.py` following LangGraph best practices
- **Frontend**: Integrated into `web/src/components/ToolUIs.tsx` with consistent styling
- **Translation**: Updated `web/src/utils/i18n.ts` with bilingual support
- **Configuration**: Enhanced `service/config.py` with user manual index configuration
- **Tool Registration**: Automatically included in tools list and schema generation

#### **📝 Note**
The search index `index-cat-usermanual-chunk-prd` referenced in the configuration is not yet available, but the tool framework is fully implemented and ready for use once the index is created.

## v1.0.3 - 2025-08-26 ✨

### ✨ **UI Enhancements & Example Questions**

#### **📱 Latest CSS Improvements** *(Just Updated)*
- **Enhanced Example Question Layout**: Increased min-width to 360px and max-width to 450px for better readability
- **Perfect Centering**: Added `justify-items: center` for professional grid alignment
- **Improved Spacing**: Enhanced padding and gap values for optimal visual hierarchy
- **Mobile Optimization**: Consistent responsive design with improved touch targets on mobile devices

#### **🎯 Welcome Page Example Questions**
- **Multilingual Support**: Added 4 interactive example questions with Chinese/English translations
- **Smart Interaction**: Click-to-send functionality using `useComposerRuntime()` hook for seamless assistant-ui integration
- **Responsive Design**: Auto-adjusting grid layout (2x2 on desktop, single column on mobile)
- **Professional Styling**: Card-based design with hover effects, shadows, and smooth animations

#### **🌐 Updated Branding & Messaging**
- **App Title**: Updated to "CATOnline AI助手" / "CATOnline AI Assistant"
- **Enhanced Descriptions**: Comprehensive service descriptions highlighting CATOnline semantic search capabilities
- **Detailed Welcome Messages**: Multi-paragraph welcome text explaining current service scope and upcoming features
- **Consistent Multilingual Content**: Perfect alignment between Chinese and English versions

#### **📝 Example Questions Added**
**Chinese**:
1. 电力储能用锂离子电池最新标准发布时间?
2. 如何测试电动汽车的充电性能?
3. 提供关于车辆通讯安全的法规
4. 自动驾驶L2和L3的定义

**English**:
1. When was the latest standard for lithium-ion batteries for power storage released?
2. How to test electric vehicle charging performance?
3. Provide regulations on vehicle communication security
4. Definition of L2 and L3 in autonomous driving

#### **🎨 Technical Implementation**
- **Custom Components**: Created `ExampleQuestionButton` component with proper TypeScript typing
- **CSS Enhancements**: Added responsive grid styles with mobile optimization
- **Architecture**: Seamlessly integrated with existing assistant-ui framework patterns
- **Language Detection**: Automatic language switching via URL parameters and browser detection

## v1.0.2 - 2025-08-26 🔧

### 🔧 **Error Handling & Code Quality Improvements**

#### **🛡️ DRY Error Handling System**
- **Backend Error Handler**: Added unified `error_handler.py` module with structured logging, decorators, and error categorization
- **Frontend Error Components**: Created ErrorBoundary and ErrorToast components with TypeScript support
- **Error Middleware**: Implemented centralized error handling middleware for FastAPI
- **Structured Logging**: JSON-formatted logs with timezone-aware timestamps
- **User-Friendly Messages**: Categorized error types (error/warning/network) with appropriate UI feedback

#### **🌐 Error Message Internationalization**
- **English Default**: All user-facing error messages now default to English for better accessibility
- **Consistent Messaging**: Updated error handler to provide clear, professional English error messages
- **Frontend Updates**: ErrorBoundary component now displays English error messages
- **Backend Messages**: Standardized API error responses in English across all endpoints

#### **🐛 Bug Fixes**
- **Configuration Loading**: Fixed `NameError: 'config' is not defined` in `main.py` by restructuring config loading order
- **Service Startup**: Resolved backend startup issues in both foreground and background modes
- **Deprecation Warnings**: Updated `datetime.utcnow()` to `datetime.now(timezone.utc)` for future compatibility
- **Type Safety**: Fixed TypeScript type conflicts in frontend error handling components

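The `datetime` change above matters for the JSON-formatted logs introduced in this release: `datetime.now(timezone.utc)` returns a timezone-aware value, whereas the deprecated `datetime.utcnow()` returns a naive one. A minimal sketch follows; the function name `log_record` and the field names are assumptions, not the project's logging schema.

```python
import json
from datetime import datetime, timezone


def log_record(level: str, message: str) -> str:
    """Build one JSON log line with a timezone-aware UTC timestamp."""
    record = {
        # isoformat() on an aware datetime includes the "+00:00" offset.
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "level": level,
        "message": message,
    }
    return json.dumps(record)
```

Because the offset is part of the serialized string, log consumers can parse the timestamp without guessing which timezone it was written in.
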
#### **🔄 Code Optimizations**
- **DRY Principles**: Eliminated code duplication in error handling across backend and frontend
- **Modular Architecture**: Separated error handling concerns into reusable, testable modules
- **Component Separation**: Split Toast functionality into distinct hook and component files
- **Clean Code**: Applied consistent naming conventions and removed redundant imports

---

## v1.0.1 - 2025-08-26 🔧

### 🔧 **Configuration Management Improvements**

#### **📋 Environment Configuration Extraction**
- **Centralized Configuration**: Extracted hardcoded environment settings to `config.yaml`
  - `max_tool_rounds`: Maximum tool calling rounds (configurable, default: 3)
  - `service.host` & `service.port`: Service binding configuration
  - `search.standard_regulation_index` & `search.chunk_index`: Search index names
  - `citation.base_url`: Citation link base URL for CAT system
- **Code Optimization**: Reduced duplicate `get_config()` calls in `graph.py` with module-level caching
- **Enhanced Maintainability**: Environment-specific values now externalized for easier deployment management

#### **🚀 Performance Optimizations**
- **Configuration Caching**: Implemented `get_cached_config()` to avoid repeated configuration loading
- **Reduced Code Duplication**: Eliminated 4 duplicate `get_config()` calls across the workflow
- **Memory Efficiency**: Single configuration instance shared across the application

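One common way to get a single shared configuration instance is to memoize the loader. The sketch below assumes this approach; only the name `get_cached_config()` and the `max_tool_rounds` default of 3 come from this changelog, while `load_config()` is a hypothetical stand-in for the real YAML loader and the caching mechanism itself may differ in the project.

```python
from functools import lru_cache
from typing import Any, Dict


def load_config() -> Dict[str, Any]:
    """Stand-in for an expensive loader, e.g. one that parses config.yaml."""
    return {"max_tool_rounds": 3}


@lru_cache(maxsize=1)
def get_cached_config() -> Dict[str, Any]:
    """Load the configuration once and return the same instance afterwards."""
    return load_config()
```

Every caller receives the same dictionary object, so the configuration should be treated as read-only after loading.
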
#### **✅ Quality Assurance**
- **Comprehensive Testing**: All configuration changes validated with existing test suite
- **Backward Compatibility**: No breaking changes to API or functionality
- **Configuration Validation**: Added verification of configuration loading and usage

---

## v1.0.0 - 2025-08-25 🎉

### 🚀 **STABLE RELEASE** - Agentic RAG System for Standards & Regulations

This marks the first stable release of our **Agentic RAG System** - a production-ready AI assistant for enterprise standards and regulations search and management.

---

### 🎯 **Core Features**

#### **🤖 Autonomous Agent Architecture**
- **LangGraph-Powered Workflow**: Multi-step autonomous agent using LangGraph OSS for intelligent tool orchestration
- **2-Phase Retrieval Strategy**: Intelligent metadata discovery followed by detailed content retrieval
- **Parallel Tool Execution**: Optimized parallel query processing for maximum information coverage
- **Multi-Round Intelligence**: Adaptive retrieval rounds based on information gaps and user requirements

#### **🔍 Advanced Retrieval System**
- **Dual Retrieval Tools**:
  - `retrieve_standard_regulation`: Standards/regulations metadata discovery
  - `retrieve_doc_chunk_standard_regulation`: Detailed document content chunks
- **Smart Query Optimization**: Automatic sub-query generation with bilingual support (Chinese/English)
- **Version Management**: Intelligent selection of latest published and current versions
- **Hybrid Search Integration**: Optimized for Azure AI Search's keyword + vector search capabilities

#### **💬 Real-time Streaming Interface**
- **Server-Sent Events (SSE)**: Real-time streaming responses with tool execution visibility
- **Assistant-UI Integration**: Modern conversational interface with tool call visualization
- **Progressive Enhancement**: Token-by-token streaming with tool progress indicators
- **Citation Tracking**: Real-time citation mapping and reference management

---

### 🛠 **Technical Architecture**

#### **Backend (Python + FastAPI)**
- **FastAPI Framework**: High-performance async API with comprehensive CORS support
- **PostgreSQL Memory**: Persistent conversation history with 7-day TTL
- **Configuration Management**: YAML-based configuration with environment variable support
- **Structured Logging**: JSON-formatted logs with request tracing and performance metrics

#### **Frontend (Next.js + Assistant-UI)**
- **Next.js 15**: Modern React framework with optimized performance
- **Assistant-UI Components**: Pre-built conversational UI elements with streaming support
- **Markdown Rendering**: Enhanced markdown with LaTeX formula support and external links
- **Responsive Design**: Mobile-friendly interface with dark/light theme support

#### **AI/ML Pipeline**
- **LLM Support**: OpenAI and Azure OpenAI integration with configurable models
- **Prompt Engineering**: Sophisticated system prompts with context-aware instructions
- **Citation System**: Automatic citation mapping with source tracking
- **Error Handling**: Graceful fallbacks with constructive user guidance

---

### 🔧 **Production Features**

#### **Memory & State Management**
- **PostgreSQL Integration**: Robust conversation persistence with automatic cleanup
- **Session Management**: User session isolation with configurable TTL
- **State Recovery**: Conversation context restoration across sessions

#### **Monitoring & Observability**
- **Structured Logging**: Comprehensive request/response logging with timing metrics
- **Error Tracking**: Detailed error reporting with stack traces and context
- **Performance Metrics**: Token usage tracking and response time monitoring

#### **Security & Reliability**
- **Input Validation**: Comprehensive request validation and sanitization
- **Rate Limiting**: Built-in protection against abuse
- **Error Isolation**: Graceful error handling without system crashes
- **Configuration Security**: Environment-based secrets management

---

### 📊 **Performance Metrics**

- **Response Time**: < 200ms for token streaming initiation
- **Context Capacity**: 100k tokens for extended conversations
- **Tool Efficiency**: Optimized "mostly 2" parallel queries strategy
- **Memory Management**: 7-day conversation retention with automatic cleanup
- **Concurrent Users**: Designed for enterprise-scale deployment

---

### 🎨 **User Experience**

#### **Intelligent Interaction**
- **Bilingual Support**: Seamless Chinese/English query processing and responses
- **Visual Content**: Smart image relevance checking and embedding
- **Citation Excellence**: Professional citation mapping with source links
- **Error Recovery**: Constructive suggestions when information is insufficient

#### **Professional Interface**
- **Tool Visualization**: Real-time tool execution progress with clear status indicators
- **Document Previews**: Rich preview of retrieved standards and regulations
- **Export Capabilities**: Easy copying and sharing of responses with citations
- **Accessibility**: WCAG-compliant interface design

---

### 🔄 **Deployment & Operations**

#### **Development Workflow**
- **UV Package Manager**: Fast, Rust-based Python dependency management
- **Hot Reload**: Development server with automatic code reloading
- **Testing Suite**: Comprehensive unit and integration tests
- **Documentation**: Complete API documentation and user guides

#### **Production Deployment**
- **Docker Support**: Containerized deployment with multi-stage builds
- **Environment Configuration**: Flexible configuration for different deployment environments
- **Health Checks**: Built-in health monitoring endpoints
- **Scaling Ready**: Designed for horizontal scaling and load balancing

---

### 📈 **Business Impact**

- **Enterprise Ready**: Production-grade system for standards and regulations management
- **Efficiency Gains**: Automated intelligent search replacing manual document review
- **Accuracy Improvement**: AI-powered relevance filtering and version management
- **User Satisfaction**: Intuitive interface with professional citation handling
- **Scalability**: Architecture supports growing enterprise needs

---

### 🎁 **What's Included**

- ✅ Complete source code with documentation
- ✅ Production deployment configurations
- ✅ Comprehensive testing suite
- ✅ User and administrator guides
- ✅ API documentation and examples
- ✅ Docker containerization setup
- ✅ Monitoring and logging configurations

---

### 🚀 **Getting Started**

```bash
# Clone and setup
git clone <repository>
cd agentic-rag-4

# Install dependencies
uv sync

# Configure environment
cp config.yaml.example config.yaml
# Edit config.yaml with your settings

# Start services
make dev-backend  # Start backend service
make dev-web      # Start frontend interface

# Access the application
open http://localhost:3000
```

---

**🎉 Thank you to all contributors who made this stable release possible!**

## v0.11.4 - 2025-08-25

### 📝 LLM Prompt Restructuring and Optimization
- **Major Workflow Restructuring**: Reorganized retrieval strategy for better clarity and efficiency
- **Simplified Workflow Structure**: Restructured "2-Phase Retrieval Strategy" section with clearer organization
  - Combined retrieval phases under unified "Retrieval Strategy (for Standards/Regulations)" section
  - Moved multi-round strategy explanation to the beginning for better flow
- **Enhanced Context Parameters**: Updated max_context_length from 96k to 100k tokens for better conversation handling
- **Query Strategy Optimization**: Refined sub-query generation approach
  - Changed from "2-3 parallel rewritten queries" to "parallel rewritten queries" for flexibility
  - Specified "2-3(mostly 2)" for sub-query generation to optimize efficiency
  - Reorganized language mixing strategy placement for better readability
- **Duplicate Rule Consolidation**: Added version selection rule to synthesis phase (step 4) for consistency
  - Ensures version prioritization applies throughout the entire workflow, not just metadata discovery
- **Enhanced Error Handling**: Improved "No-Answer with Suggestions" section
  - Added specific guidance to "propose 3–5 example rewrite queries" for better user assistance

### 🔧 Technical Improvements
- **Query Optimization**: Streamlined sub-query generation process for better performance
- **Workflow Consistency**: Ensured version selection rules apply consistently across all workflow phases
- **Parameter Tuning**: Increased context window capacity for handling longer conversations

### 🎯 Quality Enhancements
- **User Guidance**: Enhanced fallback suggestions with specific query rewrite examples
- **Retrieval Efficiency**: Optimized parallel query generation strategy
- **Version Management**: Extended version selection logic to synthesis phase for comprehensive coverage

### 📊 Impact
- **Performance**: More efficient query generation with "mostly 2" sub-queries approach
- **Consistency**: Unified version selection behavior across all workflow phases
- **User Experience**: Better guidance when retrieval yields insufficient results
- **Scalability**: Increased context capacity supports longer conversation histories

## v0.11.3 - 2025-08-25

### 📝 LLM Prompt Enhancement - Version Selection Rules
- **Standards/Regulations Version Management**: Added intelligent version selection logic to Phase 1 metadata discovery
- **Version Selection Rule**: Added rule to handle multiple versions of the same standard/regulation
  - When retrieval results contain similar items (likely different versions), default to the latest published and current version
  - Only applies when user hasn't specified a particular version requirement
- **Image Processing Enhancement**: Improved visual content handling instructions
  - Added relevance check by reviewing `<figcaption>` before embedding images
  - Ensures only relevant figures/images are included in responses
- **Terminology Refinement**: Updated "official version" to "published and current version" for better precision
  - Reflects the concept of "发布的现行" - emphasizing both official publication and current validity

### 🎯 Quality Improvements
- **Smart Version Prioritization**: Enhanced metadata discovery to automatically select the most appropriate document versions
- **Visual Content Validation**: Added systematic approach to verify image relevance before inclusion
- **Linguistic Precision**: Improved terminology to better reflect regulatory document status

### 📊 Impact
- **User Experience**: Reduces confusion when multiple document versions are available
- **Content Quality**: Ensures responses include only relevant visual aids
- **Regulatory Accuracy**: Better alignment with how regulatory documents are categorized and prioritized

## v0.11.2 - 2025-08-24

### 🔧 Configuration and Development Workflow Improvements
- **LLM Prompt Configuration**: Enhanced prompt wording and removed redundant "ALWAYS" requirement for Phase 2 retrieval
- **Workflow Flexibility**: Changed "ALWAYS follow this 2-phase strategy for ANY standards/regulations query" to "Follow this 2-phase strategy for standards/regulations query"
- **Phase Organization**: Reordered Phase 1 metadata discovery sections for better logical flow (Purpose → Tool → Query strategy)
- **Clearer Tool Description**: Enhanced Phase 2 tool description for better clarity
- **Sub-query Generation**: Improved instructions for generating different rewritten sub-queries
- **Configuration Updates**:
  - **Tool Loop Limit**: Commented out `max_tool_loops` setting in config to use default value (5 instead of 10)
  - **Service Configuration**: Updated default `max_tool_loops` from 3 to 5 in AppConfig for better balance
- **Frontend Dependencies**: Added `rehype-raw` dependency for enhanced HTML processing in markdown rendering

### 🎯 Code Organization
- **Development Workflow**: Enhanced prompt management and configuration structure
- **Documentation**: Updated project structure to reflect latest changes and improvements
- **Dependencies**: Added necessary frontend packages for improved markdown and HTML processing

### 📝 Development Notes
- **Prompt Engineering**: Refined retrieval strategy instructions for more flexible execution
- **Configuration Management**: Simplified configuration by using sensible defaults
- **Frontend Enhancement**: Added support for raw HTML processing in markdown content

## v0.11.1 - 2025-08-24

### 📝 LLM Prompt Optimization
- **English Wording Improvements**: Comprehensive optimization of LLM prompt for better clarity and professional tone
- **Grammar and Articles**: Fixed grammatical issues and article usage throughout the prompt
  - "for CATOnline system" → "for **the** CATOnline system"
  - "information got from retrieval tools" → "information **retrieved from** search tools"
  - "CATOnline is an standards" → "CATOnline is **a** standards"
- **Word Choice Enhancement**: Improved vocabulary and clarity
  - "anwser questions" → "**answer** questions" (spelling correction)
  - "Give a Citations Mapping" → "**Provide** a Citations Mapping"
  - "Response in the user's language" → "**Respond** in the user's language"
  - "refuse and redirect" → "**decline** and redirect"
- **Improved Flow and Structure**: Enhanced readability and professional presentation
  - "maintain core intent" → "maintain **the** core intent"
  - "in the below exact format" → "in the exact format **below**"
  - "citations_map is as:" → "citations_map **is:**"
- **Technical Accuracy**: Fixed technical description issues in Phase 2 query strategy
- **Consistency**: Ensured parallel structure and consistent terminology throughout

### 🎯 Quality Improvements
- **Professional Tone**: Enhanced overall professionalism of AI assistant instructions
- **Clarity**: Improved instruction clarity for better LLM understanding and execution
- **Readability**: Better structured sections with clearer headings and formatting

## v0.11.0 - 2025-08-24

### 🔧 HTML Comment Filtering Fix
- **Streaming Response Cleanup**: Fixed HTML comments leaking to client in streaming responses
- **Robust HTML Comment Removal**: Implemented comprehensive filtering using regex pattern `<!--.*?-->` with DOTALL flag
- **Citations Map Protection**: Specifically prevents `<!-- citations_map ... -->` comments from reaching client
- **Multi-Point Filtering**: Applied filtering in both `call_model` and `post_process_node` functions
- **Token Accumulation Strategy**: Enhanced streaming logic to accumulate tokens and batch-filter HTML comments

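Filtering comments from a token stream is harder than filtering a finished string, because a comment can be split across tokens. The sketch below shows one possible way to accumulate tokens and release only text that cannot belong to a comment. It reuses the regex from this release, but `filter_comments` and its hold-back logic are illustrative assumptions rather than the code in `call_model` or `post_process_node`.

```python
import re
from typing import Iterable, Iterator

COMMENT_RE = re.compile(r"<!--.*?-->", flags=re.DOTALL)
OPENER = "<!--"


def _safe_length(buffer: str) -> int:
    """Return how many leading characters of the buffer can be emitted now."""
    start = buffer.find(OPENER)
    if start != -1:
        return start  # an unclosed comment starts here; hold it back
    # Hold back a trailing partial opener such as "<", "<!" or "<!-".
    for size in range(len(OPENER) - 1, 0, -1):
        if buffer.endswith(OPENER[:size]):
            return len(buffer) - size
    return len(buffer)


def filter_comments(tokens: Iterable[str]) -> Iterator[str]:
    """Yield streamed text with HTML comments removed, even across token boundaries."""
    buffer = ""
    for token in tokens:
        buffer = COMMENT_RE.sub("", buffer + token)  # drop completed comments
        cut = _safe_length(buffer)
        if cut:
            yield buffer[:cut]
        buffer = buffer[cut:]
    # At end of stream, an unclosed comment is dropped; a lone "<" is ordinary text.
    if buffer and not buffer.startswith(OPENER):
        yield buffer
```

The trade-off is a small delay: text that might be the start of a comment is withheld until the next token resolves the ambiguity.
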
### 🛡️ Security and Data Integrity
- **Client-Side Protection**: Ensured no internal processing comments are exposed to end users
- **Citation Processing**: Maintained proper citation functionality while filtering internal metadata
- **Content Integrity**: Preserved all legitimate markdown content including citation links and references

### 🧪 Comprehensive Validation
- **HTML Comment Filtering Test**: Created dedicated test script `test_html_comment_filtering.py`
- **1700+ Event Analysis**: Validated 1714 streaming events with zero HTML comment leakage
- **Real HTTP API Testing**: Used actual streaming endpoint for authentic validation
- **Pattern Detection**: Comprehensive regex pattern matching for all HTML comment variations
- **All Existing Tests Maintained**: Confirmed no regression in existing functionality
  - **Unit Tests**: 41/41 passing ✅
  - **Multi-Round Tool Calls**: Working correctly ✅
  - **2-Phase Retrieval**: Functioning as expected ✅
  - **Streaming Response**: Clean and efficient ✅

### 📊 Technical Implementation Details
- **Streaming Logic Enhancement**:
  ```python
  # Remove HTML comments while preserving content
  content = re.sub(r'<!--.*?-->', '', content, flags=re.DOTALL)
  ```
- **Performance Optimization**: Minimal impact on streaming performance through efficient regex processing
- **Error Handling**: Robust handling of edge cases in comment filtering
- **Backward Compatibility**: Full compatibility with existing citation and markdown processing

### 🎯 Quality Assurance Results
- **Zero HTML Comments**: No `<!-- citations_map ... -->` or other HTML comments found in client output
- **Citation Functionality**: All citation links and references render correctly
- **Streaming Performance**: No degradation in response time or user experience
- **Cross-Platform Testing**: Validated on multiple query types and response patterns

## v0.10.0 - 2025-08-24

### 🎯 Optimal Multi-Round Architecture Implementation
- **Streaming Only at Final Step**: Refactored architecture to follow optimal "streaming only at final step" pattern
- **Non-Streaming Planning**: All tool calling phases now use non-streaming LLM calls for better stability
- **Streaming Final Synthesis**: Only the final response generation step streams to the user
- **Tool Results Accumulation**: Enhanced AgentState with `Annotated[List[Dict[str, Any]], reducer]` for proper tool result aggregation
- **Temporary Tool Disabling**: Tools are automatically disabled during final synthesis phase to prevent infinite loops
- **Simplified Routing Logic**: Streamlined `should_continue` logic based on tool_calls presence rather than complex state checks

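The state and routing changes above can be sketched as follows. This is a simplified illustration: `operator.add` stands in for the unnamed `reducer`, messages are modeled as plain dictionaries rather than framework message objects, and the branch names `tools` and `finalize` are hypothetical.

```python
import operator
from typing import Annotated, Any, Dict, List, TypedDict


class AgentState(TypedDict, total=False):
    """Assumed shape of the agent state; field names follow the changelog."""
    messages: List[Dict[str, Any]]
    # The annotation tells the graph framework to merge updates with the reducer
    # (list concatenation here) instead of overwriting the previous value.
    tool_results: Annotated[List[Dict[str, Any]], operator.add]
    tool_rounds: int
    max_tool_rounds: int


def should_continue(state: AgentState) -> str:
    """Route on tool_calls presence only: run tools if requested, else finish."""
    last_message = state["messages"][-1]
    if last_message.get("tool_calls"):
        return "tools"
    return "finalize"
```

Keeping the routing decision to a single check on the last message makes the control flow easy to reason about, with the round limit enforced elsewhere by disabling tools for the final synthesis call.
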
### 🔧 Architecture Optimization
- **Enhanced State Management**: Improved AgentState design for robust multi-round execution
  - Added `tool_results` accumulation with proper reducer function
  - Enhanced `tool_rounds` tracking with automatic increment logic
  - Simplified state updates and transitions between agent and tools nodes
- **Tool Execution Improvements**: Refined parallel tool execution and error handling
  - Fixed tool disabling logic to prevent termination issues
  - Enhanced logging for better debugging and monitoring
  - Improved tool result processing and aggregation
- **Graph Flow Optimization**: Streamlined workflow routing for better reliability
  - Simplified conditional routing logic
  - Enhanced error handling and recovery mechanisms
  - Improved final synthesis triggering and tool state management

### 🧪 Comprehensive Test Validation
- **All Tests Passing**: Achieved 100% test success rate across all test categories
  - **Unit Tests**: 41/41 passed - Core functionality validated
  - **Script Tests**: 10/10 passed - Multi-round, streaming, and 2-phase retrieval confirmed
  - **Integration Tests**: Properly skipped (service-dependent tests)
- **Test Framework Improvements**: Enhanced script tests with proper async pytest decorators
  - Fixed import order and pytest.mark.asyncio decorators in all script test files
  - Resolved async function compatibility issues
  - Improved test reliability and execution speed

### ✅ Feature Validation Complete
- **Multi-Round Tool Calls**: ✅ Automatic execution of 1-3 rounds confirmed via service logs
- **Parallel Tool Execution**: ✅ Concurrent tool execution within each round validated
- **2-Phase Retrieval Strategy**: ✅ Both metadata and content retrieval tools used systematically
- **Streaming Response**: ✅ Final response streams properly after all tool execution
- **Error Handling**: ✅ Robust error handling for tool failures, timeouts, and edge cases
- **Tool State Management**: ✅ Proper tool disabling during synthesis prevents infinite loops

### 📝 Documentation Updates
- **Implementation Notes**: Updated documentation to reflect optimal architecture
- **Test Coverage**: Comprehensive documentation of test validation results
- **Service Logs**: Confirmed multi-round behavior through actual service execution logs

## v0.9.0 - 2025-08-24

### 🎯 Multi-Round Parallel Tool Calling Implementation

- **Auto Multi-Round Tool Execution**: Implemented true automatic multi-round parallel tool calling capability
  - Added `tool_rounds` and `max_tool_rounds` tracking to `AgentState` (default: 3 rounds)
  - Enhanced agent node with round-based tool calling logic and round limits
  - Fixed workflow routing to ensure final synthesis after completing all tool rounds
  - Agent can now automatically execute multiple rounds of tool calls within a single user interaction
  - Each round supports parallel tool execution for maximum efficiency

### 🔍 2-Phase Retrieval Strategy Enforcement

- **Mandatory 2-Phase Retrieval**: Fixed agent to consistently follow 2-phase retrieval for content queries
- **Phase 1**: Metadata discovery using `retrieve_standard_regulation`
- **Phase 2**: Content chunk retrieval using `retrieve_doc_chunk_standard_regulation`
  - Updated system prompt to make 2-phase retrieval mandatory for content-focused queries
  - Enhanced query construction with `document_code` filtering for Phase 2
  - Agent now correctly uses both tools for queries requiring detailed content (testing methods, procedures, requirements)

### 🧪 Comprehensive Testing Framework

- **Multi-Round Test Suite**: Created extensive test scripts to validate new functionality
  - `test_2phase_retrieval.py`: Validates both metadata and content retrieval phases
  - `test_multi_round_tool_calls.py`: Tests multi-round automatic tool calling behavior
  - `test_streaming_multi_round.py`: Confirms streaming works with multi-round execution
  - All tests confirm proper parallel execution and multi-round behavior

### 🔧 Technical Enhancements

- **Workflow Routing Logic**: Improved `should_continue()` function for proper multi-round flow
  - Enhanced routing logic to handle tool completion and round progression
  - Fixed final synthesis routing after maximum rounds reached
  - Maintained streaming response capability throughout multi-round execution
- **State Management**: Enhanced AgentState with round tracking and management
- **Tool Integration**: Verified both retrieval tools work correctly in multi-round scenarios
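
The round-limited routing described for `should_continue()` might look roughly like the sketch below. The state shape (plain dict messages) and the node names `"tools"` and `"final_synthesis"` are assumptions made for illustration; only `tool_rounds`, `max_tool_rounds`, and the `tool_calls` check come from the entries above.

```python
from typing import Any, Dict, List, TypedDict


class RoundState(TypedDict):
    # Hypothetical, simplified state: messages are plain dicts here.
    messages: List[Dict[str, Any]]
    tool_rounds: int
    max_tool_rounds: int


def should_continue(state: RoundState) -> str:
    """Route to the tools node while the model asks for tools and rounds remain."""
    last_message = state["messages"][-1] if state["messages"] else {}
    wants_tools = bool(last_message.get("tool_calls"))
    if wants_tools and state["tool_rounds"] < state["max_tool_rounds"]:
        return "tools"
    # Either no tool calls were requested or the round budget is exhausted.
    return "final_synthesis"
```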

### ✅ Validation Results

- **Multi-Round Capability**: ✅ Agent executes 1-3 rounds of tool calls automatically
- **Parallel Execution**: ✅ Tools execute in parallel within each round
- **2-Phase Retrieval**: ✅ Agent uses both metadata and content retrieval tools
- **Streaming Response**: ✅ Full streaming support maintained throughout workflow
- **Round Management**: ✅ Proper progression and final synthesis after max rounds

## v0.8.7 - 2025-08-24

### 🛠 Tool Modularization

- **Tool Code Organization**: Extracted tool definitions and schemas into separate module
  - Created new `service/graph/tools.py` module containing all tool implementations
  - Moved `retrieve_standard_regulation` and `retrieve_doc_chunk_standard_regulation` functions
  - Added `get_tool_schemas()` and `get_tools_by_name()` utility functions
  - Updated `service/graph/graph.py` to import tools from the new module
  - Updated test imports to reference tools from the correct module location
  - Improved code maintainability and separation of concerns

## v0.8.6 - 2025-08-24

### 🔧 Configuration Restructuring

- **LLM Configuration Separation**: Extracted LLM parameters and prompt templates to dedicated `llm_prompt.yaml`
  - Created new `llm_prompt.yaml` file containing parameters and prompts sections
  - Added support for loading both `config.yaml` and `llm_prompt.yaml` configurations
  - Enhanced configuration models with `LLMParametersConfig` and `LLMPromptsConfig`
  - Added `get_max_context_length()` method for consistent context length access
  - Updated `message_trimmer.py` to use new configuration structure
  - Maintains backward compatibility with legacy configuration format

### 📂 File Structure Changes

- **New file**: `llm_prompt.yaml` - Contains all LLM-related parameters and prompt templates
- **Updated**: `service/config.py` - Enhanced to support dual configuration files
- **Updated**: `service/graph/message_trimmer.py` - Uses new configuration method

## v0.8.5 - 2025-08-24

### 🚀 Performance Improvements

- **Parallel Tool Execution**: Fixed sequential tool calling to implement true parallel execution
  - Modified `run_tools_with_streaming()` to use `asyncio.gather()` for concurrent tool calls
  - Added proper error handling and result aggregation for parallel execution
  - Improved tool execution performance when LLM calls multiple tools simultaneously
  - Enhanced logging to track parallel execution completion
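
As a rough illustration of the `asyncio.gather()` approach, the sketch below runs every requested tool concurrently and converts a failure into an error entry so that one failing tool does not cancel the others. The function name `run_tools_in_parallel`, the tool-call dict shape, and the result shape are assumptions, not the actual `run_tools_with_streaming()` implementation.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

ToolFn = Callable[..., Awaitable[Any]]


async def run_tools_in_parallel(
    tool_calls: List[Dict[str, Any]], tools: Dict[str, ToolFn]
) -> List[Dict[str, Any]]:
    """Execute all tool calls of one round concurrently, preserving call order."""

    async def run_one(call: Dict[str, Any]) -> Dict[str, Any]:
        try:
            result = await tools[call["name"]](**call["args"])
            return {"tool_call_id": call["id"], "result": result, "error": None}
        except Exception as exc:  # unknown tool, bad arguments, or tool failure
            return {"tool_call_id": call["id"], "result": None, "error": str(exc)}

    # gather() returns results in the same order as the awaitables passed in.
    return list(await asyncio.gather(*(run_one(call) for call in tool_calls)))
```

Catching exceptions inside `run_one` is one design option; passing `return_exceptions=True` to `asyncio.gather()` is another way to keep a single failure from discarding the other results.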

### 🔧 Technical Enhancements

- **Query Optimization Strategy**: Enhanced agent prompt to encourage multiple parallel tool calls
  - Agent now generates 1-3 rewritten queries before retrieval
  - Cross-language query generation (Chinese ↔ English) for broader coverage
  - Optimized for Azure AI Search's Hybrid Search capabilities
  - True parallel tool calling implementation in LangGraph workflow

## v0.8.4 - 2025-08-24

### 🚀 Agent Intelligence Improvements

- **Advanced Query Rewriting Strategy**: Enhanced agent system prompt with intelligent query optimization
  - Added mandatory query rewriting step before retrieval tool calls
  - Generates 1-3 rewritten queries to explore different aspects of user intent
  - Cross-language query generation (Chinese ↔ English) for broader search coverage
  - Optimized queries for Azure AI Search's Hybrid Search (keyword + vector search)
  - Parallel retrieval tool calling for comprehensive information gathering
  - Enhanced coverage through synonyms, technical terms, and alternative phrasings

## v0.8.3 - 2025-08-24

### 🎨 UI/UX Improvements

- **Citation Format Update**: Changed citation format from superscript HTML tags `<sup>1</sup>` to square brackets `[1]`
  - Updated agent system prompt to use square bracket citations for improved readability
  - Modified citation examples in configuration to reflect new format
  - Enhanced Markdown compatibility with bracket-style citations

### 🔧 Configuration Updates

- **Agent System Prompt Optimization**: Enhanced prompt engineering for better query rewriting capabilities
  - Added support for generating 1-3 rewritten queries based on conversation context
  - Improved parallel tool calling workflow for comprehensive information retrieval
  - Added cross-language query generation (Chinese ↔ English) for broader search coverage
  - Optimized query text for Azure AI Search's Hybrid Search (keyword + vector search)

## v0.8.2 - 2025-08-24

### 🐛 Code Quality Fixes

- **Removed Duplicate Route Definitions**: Fixed duplicate endpoint definitions in `main.py`
  - Removed duplicate `/api/chat`, `/api/ai-sdk/chat`, `/health`, and `/` route definitions
  - Removed duplicate `if __name__ == "__main__"` blocks
  - Standardized `/api/chat` endpoint to use proper SSE configuration (`text/event-stream`)
- **Code Deduplication**: Cleaned up redundant code that could cause routing conflicts
- **Consistent Headers**: Unified streaming response headers for better browser compatibility

## v0.8.1 - 2025-08-24

### 🧪 Integration Test Modernization

- **Complete Integration Test Rewrite**: Modernized all integration tests to match latest codebase features
- **Remote Service Testing**: All integration tests now connect to running service at `http://localhost:8000` using `httpx.AsyncClient`
- **LangGraph v0.6+ Compatibility**: Updated streaming contract validation for latest LangGraph features
- **PostgreSQL Memory Testing**: Added session persistence testing with PostgreSQL backend
- **AI SDK Endpoints**: Comprehensive testing of `/api/chat` and `/api/ai-sdk/chat` endpoints

### 🔄 Test Infrastructure Updates

- **Modern Async Patterns**: Converted all tests to use `pytest.mark.asyncio` and async/await
- **Server-Sent Events (SSE)**: Added streaming response validation with proper SSE format parsing
- **Citation Processing**: Testing of citation CSV format and tool result aggregation
- **Concurrent Testing**: Multi-session and rapid-fire request testing for performance validation

### 📁 Test File Organization

- **`test_api.py`**: Basic API endpoints, request validation, CORS/security headers, error handling
- **`test_full_workflow.py`**: End-to-end workflows, session continuity, real-world scenarios
- **`test_streaming_integration.py`**: Streaming behavior, performance, concurrent requests, content validation
- **`test_e2e_tool_ui.py`**: Complete tool UI workflows, multi-turn conversations, specialized queries
- **`test_mocked_streaming.py`**: Mocked streaming tests for internal validation without external dependencies

### 🎯 Test Coverage Enhancements

- **Real-World Scenarios**: Compliance officer and engineer research workflow testing
- **Performance Testing**: Response timing, large context handling, rapid request sequences
- **Error Recovery**: Session recovery after errors, timeout handling, malformed request validation
- **Content Validation**: Unicode support, encoding verification, response consistency testing

### ⚙️ Test Execution

- **Service Dependency**: Integration tests require running service (fail appropriately when service unavailable)
- **Flag-based Execution**: Use `--run-integration` flag to execute integration tests
- **Comprehensive Validation**: All tests validate response structure, streaming format, and business logic

## v0.8.0 - 2025-08-23

### 🚀 Major Changes - PostgreSQL Migration

- **Breaking Change**: Migrated session memory storage from Redis to PostgreSQL
- **Complete removal of Redis dependencies**: Removed `redis` and `langgraph-checkpoint-redis` packages
- **New PostgreSQL-based session persistence**: Using `langgraph-checkpoint-postgres` for robust session management
- **Azure Database for PostgreSQL**: Configured for production Azure environment with SSL security
- **7-day TTL**: Automatic cleanup of old conversation data with PostgreSQL-based retention policy

### 🔧 Session Memory Infrastructure

- **PostgreSQL Storage**: Implemented comprehensive session-level memory with PostgreSQL persistence
  - Created `PostgreSQLCheckpointerWrapper` for complete LangGraph checkpointer interface compatibility
  - Automatic schema migration and table creation via LangGraph PostgresSaver
  - Robust connection pooling with `psycopg[binary]` driver
  - Context-managed database connections with automatic cleanup
- **Backward Compatibility**: Full interface compatibility with existing Redis implementation
  - All checkpointer methods (sync/async): `get`, `put`, `list`, `get_tuple`, `put_writes`, etc.
  - Graceful fallback mechanisms for async methods not natively supported by PostgresSaver
  - Thread-safe execution with proper async/sync method bridging

### 🛠️ Technical Improvements

- **Configuration Updates**:
  - Added `postgresql` configuration section to `config.yaml`
  - Removed `redis` configuration sections completely
  - Updated all logging and comments from "Redis" to "PostgreSQL"
- **Memory Management**:
  - `PostgreSQLMemoryManager` for conditional PostgreSQL/in-memory checkpointer initialization
  - Connection testing and validation during startup
  - Improved error handling with detailed logging and connection diagnostics
- **Code Architecture**:
  - Updated `AgenticWorkflow` to use PostgreSQL checkpointer for session memory
  - Fixed variable name conflicts in `ai_sdk_chat.py` (`config` vs `graph_config`)
  - Proper state management using `TurnState` objects in workflow execution

### 🐛 Bug Fixes

- **Workflow Execution**: Fixed async method compatibility issues with PostgresSaver
  - Resolved `NotImplementedError` for `aget_tuple` and other async methods
  - Added fallback to sync methods with proper thread pool execution
  - Fixed LangGraph integration with correct `AgentState` format usage
- **Session History**: Restored conversation memory functionality
  - Fixed session history loading and persistence across conversation turns
  - Verified multi-turn conversations correctly remember previous context
  - Ensured proper message threading with session IDs
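
The "fallback to sync methods with thread pool execution" can be sketched as a thin async wrapper around a sync-only saver. `SyncOnlySaver` and `AsyncBridge` are hypothetical stand-ins with simplified signatures (a bare `thread_id` instead of a full config object); they are not the actual `PostgreSQLCheckpointerWrapper` or the PostgresSaver API.

```python
import asyncio
from typing import Any, Dict, Optional


class SyncOnlySaver:
    """Hypothetical checkpointer that only offers blocking methods."""

    def __init__(self) -> None:
        self._store: Dict[str, Any] = {}

    def put(self, thread_id: str, checkpoint: Any) -> None:
        self._store[thread_id] = checkpoint

    def get_tuple(self, thread_id: str) -> Optional[Any]:
        return self._store.get(thread_id)


class AsyncBridge:
    """Expose async methods by running the blocking ones in a worker thread."""

    def __init__(self, saver: SyncOnlySaver) -> None:
        self._saver = saver

    async def aput(self, thread_id: str, checkpoint: Any) -> None:
        # asyncio.to_thread keeps the event loop free while the sync call blocks.
        await asyncio.to_thread(self._saver.put, thread_id, checkpoint)

    async def aget_tuple(self, thread_id: str) -> Optional[Any]:
        return await asyncio.to_thread(self._saver.get_tuple, thread_id)
```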

### 🧹 Cleanup & Maintenance

- **Removed Legacy Code**:
  - Deleted `redis_memory.py` and all Redis-related implementations
  - Cleaned up temporary test files and development artifacts
  - Removed all `__pycache__` directories
  - Deleted obsolete backup and version files
- **Updated Documentation**:
  - All code comments updated from Redis to PostgreSQL references
  - Logging messages updated to reflect PostgreSQL usage
  - Maintained existing API documentation and interfaces

### ✅ Verification & Testing

- **Functional Testing**: All core features verified working with PostgreSQL backend
  - Chat functionality with tool calling and streaming responses
  - Session persistence across multiple conversation turns
  - PostgreSQL schema auto-creation and TTL cleanup functionality
  - Health check endpoints and service startup/shutdown procedures
- **Performance**: No degradation in response times or functionality
  - Maintained all existing streaming capabilities
  - Tool execution and result processing unchanged
  - Citation processing and response formatting intact

### 📈 Impact

- **Production Ready**: Fully migrated from Redis to Azure Database for PostgreSQL
- **Scalability**: Better long-term data management with relational database benefits
- **Reliability**: Enhanced data consistency and backup capabilities through PostgreSQL
- **Maintainability**: Simplified dependency management with single database backend

---

## v0.7.9 - 2025-08-23

### 🐛 Bug Fixes

- **Fixed**: Syntax errors in `service/graph/graph.py`
  - Fixed type annotation errors with message parameters by adding proper type casting
  - Fixed `graph.astream` call type errors by using proper `RunnableConfig` and `AgentState` typing
  - Added missing `cast` import for better type handling
  - Ensured compatibility with LangGraph and LangChain type system

---

## v0.7.8 - 2025-08-23

### 🔧 Configuration Updates

- **Breaking Change**: Replaced `max_tokens` with `max_context_length` in configuration
- **Added**: Optional `max_output_tokens` setting for LLM response length control
  - Default: `None` (no output token limit)
  - When set: Applied as `max_tokens` parameter to LLM calls
  - Provides flexibility to limit output length when needed
- Updated conversation history management to use 96k context length by default
- Improved token allocation: 85% for conversation history, 15% reserved for responses
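
The 85% / 15% split works out as follows. This is only an arithmetic sketch: it assumes "96k" means 96,000 tokens (it could equally mean 98,304), and `split_token_budget` is a hypothetical helper name.

```python
from typing import Tuple


def split_token_budget(max_context_length: int, history_percent: int = 85) -> Tuple[int, int]:
    """Return (history_tokens, response_tokens) using integer arithmetic."""
    history_tokens = max_context_length * history_percent // 100
    response_tokens = max_context_length - history_tokens
    return history_tokens, response_tokens


# Assumed reading of "96k" as 96,000 tokens.
history_budget, response_budget = split_token_budget(96_000)
```

Under that assumption, 81,600 tokens are available for conversation history and 14,400 remain reserved for the response.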

### 🔄 Conversation Management

- Enhanced conversation trimmer to handle larger context windows
- Updated trimming strategy to allow ending on AI messages for better conversation flow
- Improved error handling and fallback mechanisms in message trimming

### 📝 Documentation

- Updated conversation history management documentation
- Clarified distinction between context length and output token limits
- Added examples for optional output token limiting

---

## v0.7.7 - 2025-08-23

### Added

- **Conversation History Management**: Implemented automatic context length management
  - Added `ConversationTrimmer` class to handle conversation history trimming
  - Integrated with LangChain's `trim_messages` utility for intelligent message truncation
  - Automatic token counting and trimming to prevent context window overflow
  - Preserves system messages and maintains conversation validity
  - Fallback to message count-based trimming when token counting fails
  - Configurable token limits with 70% allocation for conversation history
  - Smart conversation flow preservation (starts with human, ends with human/tool)
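
The count-based fallback could be approximated as below. This is a simplified sketch on plain dict messages with assumed `role` values (`"system"`, `"human"`, `"ai"`, `"tool"`); it does not use LangChain's `trim_messages` and is not the actual `ConversationTrimmer` logic.

```python
from typing import Dict, List

Message = Dict[str, str]


def trim_by_count(messages: List[Message], max_messages: int) -> List[Message]:
    """Keep system messages plus the most recent non-system messages."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    kept = rest[-max_messages:] if max_messages > 0 else []
    # Drop leading messages until the window starts with a human turn,
    # so the trimmed history never opens on an orphaned AI or tool message.
    while kept and kept[0]["role"] != "human":
        kept.pop(0)
    return system + kept
```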

### Enhanced

- **Context Window Protection**: Prevents API failures due to exceeded token limits
  - Monitors conversation length and applies trimming when necessary
  - Maintains conversation quality while respecting LLM context constraints
  - Improves reliability for long-running conversations

## v0.7.6 - 2025-08-23

### Enhanced

- **Universal Tool Calling**: Implemented consistent forced tool calling across all query types
  - Modified `graph.py` to always use `tool_choice="required"` for better DeepSeek compatibility
  - Ensures reliable tool invocation for both technical and non-technical queries
  - Provides consistent behavior across all LLM providers (Azure, OpenAI, DeepSeek)
  - Maintains response quality while guaranteeing tool usage for retrieval-based queries

### Validated

- **DeepSeek Integration**: Comprehensive testing confirms optimal configuration
  - Verified that ChatOpenAI with custom endpoints fully supports DeepSeek models
  - Confirmed that forced tool calling resolves DeepSeek tool invocation issues
  - Tested both technical queries (GB/T standards) and general queries (greetings)
  - Established that current implementation requires no DeepSeek-specific handling

## v0.7.5 - 2025-01-18

### Improved

- **Code Simplification**: Removed unnecessary ChatDeepSeek dependency and complexity
  - Simplified LLMClient to use only ChatOpenAI for all OpenAI-compatible endpoints (including custom DeepSeek)
  - Removed unused `langchain-deepseek` dependency as ChatOpenAI handles custom DeepSeek endpoints perfectly
  - Cleaned up `_create_llm` method by removing DeepSeek-specific handling logic
  - Maintained full compatibility with existing tool calling functionality
  - Code is now more maintainable and follows KISS principle

## v0.7.4 - 2025-08-23

### Fixed

- **OpenAI Provider Tool Calling**: Fixed DeepSeek model tool calling issues for custom endpoints
  - Added `langchain-deepseek` dependency for better DeepSeek model support
  - Modified LLMClient to use ChatOpenAI for custom DeepSeek endpoints (instead of ChatDeepSeek which only works with official api.deepseek.com)
  - Implemented forced tool calling using `tool_choice="required"` for initial queries to ensure tool usage
  - Enhanced agent system prompt to explicitly require tool usage for all information queries
  - Resolved issue where DeepSeek models weren't calling tools consistently when using `provider: openai`
  - Now both Azure and OpenAI providers (including custom DeepSeek endpoints) work correctly with tool calling

### Enhanced

- **System Prompt Optimization**: Improved agent prompts for better tool usage reliability
  - Added explicit tool listing and mandatory workflow instructions
  - Enhanced prompts specifically for GB/T standards and technical information queries
  - Better handling of Chinese technical queries with forced tool retrieval

## v0.7.3 - 2025-08-23

### Fixed

- **Citation Display**: Fixed citation header visibility logic
  - Modified `_build_citation_markdown` function to only display "### 📘 Citations:" header when valid citations exist
  - Prevents empty citation sections from appearing when agent response doesn't contain citation mapping
  - Improved user experience by removing unnecessary empty citation headers
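
The header-visibility rule can be shown with a small sketch. The input shape (citation number mapped to a title) and the entry format are assumptions for illustration; only the "### 📘 Citations:" header text and the rule "no valid citations, no header" come from the entry above.

```python
from typing import Dict


def build_citation_markdown(citations: Dict[int, str]) -> str:
    """Return a citation block, or an empty string when nothing valid exists."""
    lines = []
    for number, title in sorted(citations.items()):
        # Skip entries that have no usable title (assumed validity check).
        if title.strip():
            lines.append(f"[{number}] {title.strip()}")
    if not lines:
        # No valid citations: emit nothing, not even the header.
        return ""
    return "### 📘 Citations:\n" + "\n".join(lines)
```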

## v0.7.2 - 2025-01-16

### Enhanced

- **Tool Conversation Context**: Added conversation history parameter support to retrieval tools
  - Both `retrieve_standard_regulation` and `retrieve_doc_chunk_standard_regulation` now accept `conversation_history` parameter
  - Enhanced agent node to autonomously use tools with conversation context for better multi-turn understanding
  - Improved tool call responses with contextual information for citations mapping
- **Citation Processing**: Improved citation mapping and metadata handling
  - Updated `_build_citation_markdown` to prioritize English titles over Chinese for internationalization
  - Enhanced `_normalize_result` function with dynamic structure and selective field removal
  - Removed noise fields (`@search.score`, `@search.rerankerScore`, `@search.captions`, `@subquery_id`) from tool responses
  - Improved tool result metadata structure with `@tool_call_id` and `@order_num` for accurate citation mapping
- **Agent Optimization**: Refined autonomous agent workflow for better tool usage
  - Function calling mode (not ReAct) to minimize LLM calls and token consumption
  - Enhanced multi-step tool loops with improved context passing between tool calls
  - Optimized retrieval API configurations with `include_trace: False` for cleaner responses
- **Session Management**: Improved session behavior for better user experience
  - Changed session ID generation to create new session on every page refresh
  - Switched from localStorage to sessionStorage for session ID persistence
  - New sessions start fresh conversations while maintaining session isolation per browser tab

### Fixed

- **Tool Configuration**: Updated retrieval API field selections and search parameters
  - Standardized field lists for `select`, `search_fields`, and `fields_for_gen_rerank` across tools
  - Removed deprecated `timestamp` and `x_Standard_Code` fields from standard regulation tool
  - Added missing metadata fields (`func_uuid`, `filepath`, `x_Standard_Regulation_Id`) for proper citation link generation

## v0.7.1 - 2025-01-16

### Fixed

- **Session Memory Bug**: Fixed critical multi-turn conversation context loss in webchat
  - **Root Cause**: `ai_sdk_chat.py` was creating new `TurnState` for each request without loading previous conversation history from Redis/LangGraph memory
  - **Additional Issue**: Frontend was generating new `session_id` for each request instead of maintaining persistent session
  - **Solution**: Refactored to let LangGraph's checkpointer handle session history automatically using `thread_id`
  - **Frontend Fix**: Added `useSessionId` hook to maintain persistent session ID in localStorage, passed via headers to backend
  - **Implementation**: Removed manual state creation, pass only new user message and `session_id` to compiled graph
  - **Validation**: Tested multi-turn conversations with same `session_id` - second message correctly references first message context
  - **Session Isolation**: Verified different sessions maintain separate conversation contexts without cross-contamination

### Enhanced

- **Memory Integration**: Improved LangGraph session memory reliability
  - Stream callback handling via contextvars for proper async streaming
  - Automatic fallback to in-memory checkpointer when Redis modules unavailable
  - Robust error handling for Redis connection issues while maintaining session functionality
- **Frontend Session Management**: Added persistent session ID management
  - `useSessionId` React hook for localStorage-based session persistence
  - Session ID passed via `X-Session-ID` header from frontend to backend
  - Graceful fallback to generated session ID if none provided
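
On the backend side, mapping the `X-Session-ID` header onto a LangGraph `thread_id` might look like the sketch below. The helper name `build_graph_config` and the generated ID format are assumptions; the `{"configurable": {"thread_id": ...}}` shape is the usual way a thread is selected for a LangGraph checkpointer.

```python
import uuid
from typing import Any, Dict, Mapping


def build_graph_config(headers: Mapping[str, str]) -> Dict[str, Any]:
    """Derive the graph config from request headers, generating an ID if absent."""
    # Real frameworks usually expose case-insensitive headers; a plain
    # mapping is used here to keep the sketch dependency-free.
    session_id = headers.get("X-Session-ID") or f"session_{uuid.uuid4().hex}"
    return {"configurable": {"thread_id": session_id}}
```

Passing the same `thread_id` on every turn is what lets the checkpointer reload earlier messages, so the request only needs to carry the new user message.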

## v0.7.0 - 2025-08-22

### Added

- **Redis Session Memory**: Implemented robust session-level memory with Redis persistence
  - Redis-based chat history storage with 7-day TTL using Azure Cache for Redis
  - LangGraph `RedisSaver` integration for session persistence and state management
  - Graceful fallback to `InMemorySaver` if Redis is unavailable or modules missing
  - Session-level memory isolation using `thread_id` for proper conversation context
  - Config validation with dedicated `RedisConfig` model for connection parameters
  - Session memory verification tests confirming isolation and persistence

### Enhanced

- **Memory Architecture**: Refactored from simple in-memory store to session-based graph memory
  - Migrated from `InMemoryStore` to LangGraph's checkpoint system
  - Updated `AgenticWorkflow` graph to use `MessagesState` with Redis persistence
  - Added `RedisMemoryManager` for conditional Redis/in-memory checkpointer initialization
  - Session-based conversation tracking via `session_id` as LangGraph `thread_id`

## v0.6.2 - 2025-08-22

### Added

- **Stream Filtering for Citations Mapping**: Implemented intelligent filtering of citations mapping HTML comments from token stream
  - Agent-generated citations mapping is now filtered from the client-side stream while preserved in the complete response
  - Added buffer-based detection of HTML comment boundaries (`<!--` and `-->`)
  - Ensures citations mapping CSV remains available for post-processing while not displaying to users
  - Maintains complete response integrity in state for `post_process_node` to access citations mapping
  - Enhanced token streaming logic with comment detection and filtering state management

### Improved

- **Optimized Stream Buffering Logic**: Enhanced token filtering to minimize latency
  - Non-comment tokens are now sent immediately to client without unnecessary buffering
  - Only potential HTML comment prefixes (`<`, `<!`, `<!-`) are buffered for detection
  - Reduced buffer size from 10 characters to 4 characters (minimum needed for `<!--`)
  - Improved user experience with faster token delivery for normal content
- **Citation List Block Return**: Changed citation list delivery from character-by-character streaming to single block return
  - Citations are now sent as a complete markdown block in post-processing
  - Improves rendering performance and reduces UI jitter
  - Better user experience with instant citation list appearance

### Technical

- **Stream Token Filtering Logic**: Enhanced `call_model` function in agent node with sophisticated filtering
  - Implements intelligent buffering that only delays tokens when necessary for comment detection
  - Maintains filtering state to handle multi-token HTML comments
  - Preserves all content in response while selectively filtering stream output
  - Compatible with existing streaming protocol and post-processing pipeline

## v0.6.1 - 2025-08-22

### Added

- **Citation List and Link Building**: Enhanced `post_process_node` to build complete citation lists with links
  - Added citation mapping extraction from agent responses using CSV format in HTML comments
  - Implemented citation markdown generation following `build_citations.py` logic
  - Added automatic link generation for CAT system with proper URL encoding
  - Added helper functions: `_extract_citations_mapping`, `_build_citation_markdown`, `_remove_citations_comment`
- **Frontend External Links Support**: Added `rehype-external-links` plugin for secure external link handling
  - Installed `rehype-external-links` v3.0.0 dependency in web frontend
  - Configured automatic `target="_blank"` and `rel="noopener noreferrer"` for external links
  - Enhanced security and UX for citation links and external references

### Fixed

- **Chat UI Link Rendering**: Fixed links not being properly rendered in the chat interface
  - Resolved component configuration conflict between `MyChat` and `AiAssistantMessage`
  - Updated `AiAssistantMessage` to properly use `MarkdownText` component with external links support
  - Added `@tailwindcss/typography` plugin for proper prose styling
  - Enhanced link styling with blue color and hover effects
  - Added intelligent content detection to handle both Markdown and HTML content
  - Installed `isomorphic-dompurify` for safe HTML sanitization
  - Enhanced Agent prompt to explicitly require Markdown-only output (no HTML tags)

### Changed

- **Enhanced Post-Processing**: `post_process_node` now processes citations mapping and generates structured citation lists
  - Extracts citations mapping CSV from agent response HTML comments
  - Builds proper citation markdown with document titles, headers, and clickable links
  - Streams citation markdown to client for real-time display
  - Maintains clean separation between agent response and citation processing

### Technical

- Added URL encoding support for document codes and titles
- Improved error handling in citation processing with fallback to error messages
- Maintained backward compatibility with existing streaming protocol
- Enhanced markdown rendering with proper external link security attributes

## v0.6.0 - 2025-08-22

### Changed

- **Removed `agent_done` event**: The streaming protocol no longer includes the deprecated `agent_done` event.
  - Removed handling in `AISDKEventAdapter` (`service/ai_sdk_adapter.py`).
  - Cleaned up commented-out `create_agent_done_event` in `service/sse.py` and related imports in `service/graph/graph.py`.
  - Updated tests to no longer expect `agent_done` events across unit and integration suites.

### Technical

- Simplified adapter logic by eliminating obsolete event type handling.
- Version bump to reflect breaking change in streaming protocol.

## v0.5.3 - 2025-01-27

### Fixed

- **Tool Result Retrieval**: Fixed agent not receiving tool results correctly
  - Fixed tool node serialization in `service/graph/graph.py`
  - Tool results now passed directly as dicts to agent instead of using `model_dump()`
  - Agent can now correctly retrieve and use tool results in conversation flow
  - Verified through SSE stream testing that tool results are properly transmitted

## v0.5.2 - 2025-01-27

### Changed

- **Simplified Data Structure**: Rewrote `_normalize_result` function to return dynamic data structure
  - Returns `Dict[str, Any]` instead of rigid `RetrievalResult` class
  - Automatically removes search-specific fields: `@search.score`, `@search.rerankerScore`, `@search.captions`, `@subquery_id`
  - Removes empty fields (None, empty string, empty list, empty dict)
  - Cleaner, more flexible result processing
|
||
|
||
### Removed
|
||
- **Removed Schema Dependencies**: Eliminated `service/schemas/retrieval.py`
|
||
- No longer need `RetrievalResult` class or `metadata` field
|
||
- Simplified `RetrievalResponse` class moved inline to `agentic_retrieval.py`
|
||
- Reduced code complexity and maintenance overhead
|
||
|
||
### Technical
|
||
- Updated `AgenticRetrieval` class to use dynamic result normalization
|
||
- Maintained backward compatibility with existing tool interfaces
|
||
- Improved data processing efficiency
|
||
|
||
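
The normalization rules listed for v0.5.2 can be sketched as follows. This is a minimal illustration, not the project's `_normalize_result`: the function name `normalize_result` and the decision to keep falsy-but-meaningful values such as `0` and `False` are assumptions.

```python
from typing import Any, Dict

# Search-specific keys named in the v0.5.2 entry above.
SEARCH_FIELDS = {"@search.score", "@search.rerankerScore", "@search.captions", "@subquery_id"}


def normalize_result(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Drop search metadata and empty values, keeping everything else as-is."""
    cleaned: Dict[str, Any] = {}
    for key, value in raw.items():
        if key in SEARCH_FIELDS:
            continue
        # Only None and empty str/list/dict count as empty; 0 and False are kept.
        if value is None or (isinstance(value, (str, list, dict)) and len(value) == 0):
            continue
        cleaned[key] = value
    return cleaned
```

Returning a plain dict means new index fields pass through without a schema change, which is the flexibility the entry describes.
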
## v0.5.1 - 2025-01-27

### Added

- **Citations Mapping CSV**: Added citations mapping CSV functionality to agent responses
  - Updated `agent_system_prompt` in `config.yaml` to instruct the LLM to generate a citations mapping CSV
  - Citations mapping CSV format: `{citation_number},{tool_call_id},{search_result_code}`
  - Citations mapping is embedded in an HTML comment at the end of the response: `<!-- citations_map ... -->`
  - Includes a brief example in the system prompt for clarity
  - Fully compatible with existing streaming and Markdown processing

### Technical

- Verified that the agent node and post-processing node support citations mapping output
- Confirmed that SSE streaming handles the citations mapping within Markdown content
- Created a validation test script to verify the output format

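
To make the citations mapping format concrete, here is a hedged sketch of how a consumer might split the trailing comment from the visible answer. The names `Citation` and `split_citations_map` are hypothetical, and the exact layout inside the comment (one CSV row per line) is an assumption based on the format string above.

```python
import csv
import io
import re
from typing import List, NamedTuple, Tuple


class Citation(NamedTuple):
    number: int
    tool_call_id: str
    result_code: str


# DOTALL lets the CSV body span several lines inside the comment.
_MAP_RE = re.compile(r"<!--\s*citations_map\s*(.*?)-->", re.DOTALL)


def split_citations_map(response: str) -> Tuple[str, List[Citation]]:
    """Return (visible_markdown, citations) for one agent response."""
    match = _MAP_RE.search(response)
    if match is None:
        return response, []
    citations: List[Citation] = []
    for row in csv.reader(io.StringIO(match.group(1).strip())):
        if len(row) != 3:
            continue  # skip blank or malformed rows rather than failing
        number, call_id, code = (cell.strip() for cell in row)
        if number.isdigit():
            citations.append(Citation(int(number), call_id, code))
    # Remove the comment so only the answer text is shown to the user.
    visible = (response[: match.start()] + response[match.end():]).rstrip()
    return visible, citations
```

Skipping malformed rows keeps one bad line from discarding the whole mapping; whether the real post-processing is that lenient is not stated in the changelog.
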
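## v0.5.0 - 2025-08-21

### Changed - Major Simplification

- **Simplified `post_process_node`**: Greatly simplified the post-processing node; it now returns only a brief summary of how many result entries each tool call produced
  - Removed the complex answer and citation extraction logic
  - Removed the multiple post-append event streams and the special `tool_summary` event
- **Tool summary as a regular message**: The tool execution summary is now returned directly as a regular AI message, rendered in Markdown
- **Unified message handling**: Removed the special event handling logic; the tool summary goes through the standard message stream and the frontend renders it as ordinary Markdown
- Significantly reduced code complexity and maintenance cost, and improved generality

### Removed

- **Simplified `AgentState` fields**: Removed the `citations_mapping_csv` field from `AgentState`
  - The field was only used for complex citation processing and is no longer needed
  - Kept the `stream_callback` field, because it is used for event streaming throughout the graph
  - Removed the `citations_mapping_csv` field from `TurnState` accordingly

- **Removed unused helper functions**:
  - `_extract_citations_from_markdown()`: complex logic for extracting citations from Markdown
  - `_generate_basic_citations()`: generated the basic citation mapping
  - `create_post_append_events()`: built the complex post-append event sequence (replaced by the simplified tool summary)
  - `create_tool_summary_event()`: built the special tool summary event (now handled as a regular message)
  - Simplified the codebase by removing citation processing logic that is no longer needed

- **Cleaned up the SSE module**: Removed business-specific event creation functions
  - Deleted `create_post_append_events()` and `create_tool_summary_event()` along with their tests
  - The SSE module now contains only generic event creation utilities
  - Improved module cohesion and reusability

### Added

- **Unified message handling architecture**: The tool execution summary is now processed through the standard LangGraph message flow
  - The tool summary is rendered in Markdown with a `**Tool Execution Summary**` heading
  - The frontend renders it as ordinary Markdown, with no special event handling logic
  - Improved the generality and consistency of the system

### Impact

- **Code complexity**: Significantly lower complexity in the post-processing logic
- **Maintainability**: A post-processing flow that is easier to understand and maintain
- **Performance**: Less event handling overhead and faster response times
- **Backward compatibility**: API interfaces remain compatible; only the internal implementation is simplified
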
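
As an illustration of the simplified summary, the sketch below renders per-tool result counts under the `**Tool Execution Summary**` heading. The function name `build_tool_summary` and the `name` / `results` keys are assumptions made for this example; the changelog does not specify the shape of the tool results.

```python
from typing import Any, Dict, List


def build_tool_summary(tool_results: List[Dict[str, Any]]) -> str:
    """Render per-tool result counts as a Markdown message body."""
    lines = ["**Tool Execution Summary**", ""]
    for item in tool_results:
        # "name" and "results" are assumed keys for this illustration.
        count = len(item.get("results", []))
        lines.append(f"- `{item['name']}`: {count} result(s)")
    return "\n".join(lines)
```

Because the output is plain Markdown, it can travel through the normal message stream without a dedicated event type.
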
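## v0.4.9 - 2024-12-21

### Changed

- Renamed the frontend directory: `web/src/lib` → `web/src/utils`
- Updated all related references to use the new directory structure
- Removed unused imports in `web/src/components/ToolUIs.tsx`
- Improved consistency of code organization; the `utils` name better reflects that the directory holds utility functions

### Fixed

- Fixed a frontend build error by removing references to non-existent schemas
- Confirmed that the frontend builds successfully and the service runs normally
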
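## v0.4.8 - 2024-12-21

### Removed

- Deleted the redundant `service/retrieval/schemas.py` file
- The static tool schemas defined in that file have been replaced by dynamic generation in `graph.py`
- Eliminated code duplication, simplified maintenance, and removed the risk of static and dynamic definitions diverging

### Improved

- Tool schemas are now generated entirely dynamically from tool object attributes
- Reduced code redundancy and improved maintainability
- Unified how tool schemas are defined to ensure consistency

### Technical

- Verified that the service still runs normally after the deletion
- Backward compatible, with no breaking changes
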
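## [0.4.7] - 2024-12-21

### Refactored

- Restructured the code directories for clearer semantics and better modularity
  - `service/tools/` → `service/retrieval/`
  - `service/tools/retrieval.py` → `service/retrieval/agentic_retrieval.py`
- Updated all related import paths so the code structure is clearer
- Cleaned up Python cache files to avoid import conflicts

### Verified

- Verified that the service starts normally after the refactor and all features work
- Tool calls, the agent flow, and the post-processing node all work correctly
- HTTP API calls and responses run smoothly
- No breaking changes; backward compatible

### Technical

- Improved code maintainability and readability
- Laid a better architectural foundation for future feature work
- Directory naming follows Python project best practices
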
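## [0.4.6] - 2024-12-21

### Improved

- Reduced how often the icon flashes during tool execution, for a better visual experience
- Lengthened the pulse animation from 2 seconds to 3-4 seconds to make it less distracting
- Changed the opacity variation from 0.6 to 0.75/0.85 for a softer effect
- Added a gentle scale effect (`pulse-gentle`) in place of the strong opacity change
- Added a small spinning loading indicator for better running-state feedback
- Optimized animation performance with smoother transitions

### Technical

- Added CSS animation classes: `animate-pulse-gentle`, `animate-spin-slow`
- Improved the visual design of the tool UI loading state
- Provided several animation intensities to suit different user preferences
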
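## [0.4.5] - 2024-12-21

### Fixed

- Fixed the tool call drawer showing raw JSON when expanded
- Added a formatted display for retrieval tool results, including document title, score, content preview, and metadata
- Added a "Formatted view / Raw data" toggle button so users can choose how to view results
- Improved the result display experience; document content supports line-clamped display
- Added CSS `line-clamp` utility classes for text truncation

### Improved

- Tool UI results are displayed in a more user-friendly and intuitive way
- Long document content is truncated in previews (cut off automatically beyond 200 characters)
- Improved the readability of retrieval results by highlighting key information
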
## [0.4.4] - 2024-12-21

### Changed

- Completely refactored `/web` codebase for DRY and best practices
- Created unified `ToolUIRenderer` component with TypeScript strict typing
- Eliminated all `any` types and improved type safety throughout
- Simplified tool UI generation with generic `createToolUI` factory function
- Fixed all TypeScript compilation errors and ESLint warnings
- Added missing dependencies: `@langchain/langgraph-sdk`, `@assistant-ui/react-langgraph`

### Removed

- All legacy test directories and components (`simplified`, `ui-test`, `chat-simplified`)
- Duplicate tool UI components (`EnhancedAssistant.tsx`, `ModernAssistant.tsx`, etc.)
- Empty directories and backup files
- TypeScript `any` type usage across API routes

### Fixed

- React Hooks usage in assistant-ui tool render functions
- TypeScript strict type checking compliance
- Build process now passes without errors or warnings
- Proper module exports and imports throughout codebase

### Technical

- Codebase now fully compliant with assistant-ui + LangGraph v0.6.0+ best practices
- All components properly typed with TypeScript strict mode
- Single source of truth for UI logic with `Assistant.tsx` component
- DRY tool UI implementation reduces code duplication by ~60%

## [0.4.3] - 2024-12-21

### ⚙️ Web UI Best Practices Implementation

- Updated frontend `/web` using `@assistant-ui/react@0.10.43`, `@assistant-ui/react-ui@0.1.8`, `@assistant-ui/react-markdown@0.10.9`, `@assistant-ui/react-data-stream@0.10.1`
- Improved Next.js API routes under `/web/src/app/api` for AI SDK Data Stream Protocol compatibility and enhanced error handling
- Added `EnhancedAssistant`, `SimpleAssistant`, and `FrontendTools` React components demonstrating assistant-ui best practices
- Created `docs/topics/ASSISTANT_UI_BEST_PRACTICES.md` guideline documentation
- Added unit tests in `tests/unit/test_assistant_ui_best_practices.py` validating dependencies, config, API routes, components, and documentation
- Switched to `pnpm` for dependency management with updated install scripts (`pnpm install`, `pnpm dev`)

### ✅ Tests

- All existing and new unit tests and integration tests passed, including best practices validation tests

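## v0.4.2 - 2025-08-20

### 🧹 Code Cleanup and Refactoring

**Code cleanup and refactoring**: Simplified the project structure and removed redundant code and configuration

#### File Refactoring

- **Renamed the main file**: `improved_graph.py` → `graph.py`, for a simpler file name
- **Renamed the function**: `build_improved_graph()` → `build_graph()`, for consistent naming
- **Removed redundant files**: Deleted the old `graph.py` backup and temporary files

#### Configuration Cleanup

- **Trimmed `config.yaml`**: Removed commented-out legacy settings and redundant fields
- **Removed outdated prompts**: Cleaned up legacy prompts and unused synthesis prompts
- **Unified logging configuration**: Simplified the structure of the logging configuration

#### Import Updates

- **Updated the main module**: Changed the import statements in `service/main.py`
- **Cleared caches**: Removed all `__pycache__` directories

#### Verification

- ✅ Service starts normally
- ✅ Health check passes
- ✅ API works as expected

---
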
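## v0.4.1 - 2025-08-20

### 🎨 Markdown Output Format Upgrade

**Major user experience improvement**: Agent output switched from JSON to Markdown, improving readability

#### Core Improvements

- **Markdown output**: The agent now generates Markdown responses with structured headings, lists, and citations
- **Enhanced citation handling**: Added the `_extract_citations_from_markdown()` function to extract citation information from Markdown text
- **Backward compatibility**: The post-process node supports both JSON (old format) and Markdown (new format) responses
- **Automatic format detection**: Detects the response format and processes it accordingly
- **Detailed logging**: Added debug logs that trace response format detection and processing

#### Technical Implementation

- **System prompt update**: Modified `agent_system_prompt` to explicitly require Markdown output
- **Dual-format handling**: Enhanced `post_process_node` to support both JSON and Markdown
- **Streaming event verification**: Ensured all streaming events (`tool_start`, `tool_result`, `tokens`, `agent_done`) work correctly
- **Service restart check**: Configuration changes take effect only after the service is restarted

#### Test Verification

- ✅ Streaming integration test confirms Markdown output
- ✅ Event stream verification passes
- ✅ Citation mapping is generated correctly
- ✅ `agent_done` event is sent correctly

---
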
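## v0.4.0 - 2025-08-20

### 🚀 LangGraph v0.6.0+ Best Practices Implementation

**Major architecture upgrade**: Completely refactored the LangGraph implementation to follow v0.6.0+ best practices and deliver a truly autonomous agent workflow

#### Core Improvements

- **TypedDict state management**: Replaced `BaseModel` with `TypedDict`, fully in line with the LangGraph v0.6.0+ standard
- **Function Calling Agent**: Implemented a pure function-calling mode and dropped ReAct, reducing the number of LLM calls and token consumption
- **Autonomous Tool Usage**: The agent picks suitable tools automatically based on context and supports consecutive tool calls that build on earlier outputs
- **Integrated Synthesis**: Merged the synthesis step into the agent node, avoiding an extra LLM call

#### Architecture Optimization

- **Simplified workflow**: Agent → Tools → Agent → Post-process (closer to the standard LangGraph pattern)
- **Fewer LLM calls**: Reduced from 3 LLM calls to 1-2, significantly lowering token consumption
- **Standardized tool binding**: Uses LangChain `bind_tools()` and the standard tool schema
- **Improved state passing**: Follows the LangGraph `add_messages` pattern

#### Technical Details

- **New file**: `service/graph/improved_graph.py` - implements the v0.6.0+ best practices
- **Agent System Prompt**: Updated to a prompt that supports autonomous function calling
- **Tool execution**: Simplified the execution logic while keeping streaming support
- **Post-processing node**: Handles only formatting and event emission and no longer calls the LLM

#### Testing and Verification

- **Test script**: `scripts/test_improved_langgraph.py` - verifies the new implementation
- **Tool calls**: ✅ Automatically calls `retrieve_standard_regulation` and `retrieve_doc_chunk_standard_regulation`
- **Event stream**: ✅ Supports streaming events such as `tool_start` and `tool_result`
- **State management**: ✅ Correct `TypedDict` state passing

#### Configuration Updates

- **Added**: `agent_system_prompt` - a system prompt designed for the autonomous agent
- **Backward compatible**: Existing configuration and interfaces remain unchanged
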
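
The `TypedDict` plus reducer pattern mentioned in v0.4.0 can be illustrated without any dependencies. The sketch below is not the project's `AgentState`: the field names are assumptions, and `append_messages` is a simplified stand-in for LangGraph's `add_messages` reducer (which additionally handles message IDs). It only shows the idea that a field's `Annotated` metadata names the function used to merge node updates.

```python
from typing import Annotated, Any, Dict, List, TypedDict, get_type_hints


def append_messages(existing: List[dict], new: List[dict]) -> List[dict]:
    """Stand-in reducer: new messages are appended, never overwritten."""
    return existing + new


class AgentState(TypedDict):
    # The Annotated metadata names the reducer used to merge updates to this field.
    messages: Annotated[List[dict], append_messages]
    session_id: str  # hypothetical field; plain fields are simply replaced


def apply_update(state: AgentState, update: Dict[str, Any]) -> AgentState:
    """Merge a node's partial update into the state, honoring reducers."""
    hints = get_type_hints(AgentState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        metadata = getattr(hints[key], "__metadata__", ())
        reducer = next((m for m in metadata if callable(m)), None)
        merged[key] = reducer(state[key], value) if reducer else value
    return merged  # type: ignore[return-value]
```

In a real graph the framework performs this merge itself; a node only returns the partial update, for example `{"messages": [reply]}`.
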
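## v0.3.6 - 2025-08-20

### Major LangGraph Optimization Implementation ⚡

- **Rolled out the LangGraph optimization plan**: Completed the implementation of LangGraph best practices in production code
- **Refactored the main components**:
  - Replaced the custom workflow with `StateGraph`, `add_node`, and `conditional_edges`
  - Adopted the `@tool` decorator pattern to keep tool definitions DRY
  - Simplified state management by using the standard LangGraph `AgentState`
  - Modularized node functions: `call_model`, `run_tools`, `synthesis_node`, `post_process_node`

### Technical Improvements

- **Code quality**: Follows the design patterns of the official LangGraph examples
- **Maintainability**: Less duplicated code, better readability and testability
- **Standardization**: Uses the community-accepted approach to LangGraph workflow orchestration
- **Dependency management**: Added `langgraph>=0.2.0` to the project dependencies

### Performance & Architecture

- **Expected performance gain**: Based on the earlier analysis, an estimated 35% performance improvement
- **Clearer control flow**: Uses `conditional_edges` for decision routing
- **Tool execution**: Standardized the tool call and result handling flow
- **Error handling**: Improved exception handling and fallback strategy

### Implementation Status

- ✅ Core LangGraph workflow implemented
- ✅ Tool decorator pattern in place
- ✅ State management optimized
- ✅ Dependencies updated and imports fixed
- ✅ **All integration tests pass** (4/4, 100% success rate)
- ✅ **All unit tests pass** (20/20, 100% success rate)
- ✅ **Workflow verified**: Tool calls, streaming responses, and conditional routing work correctly
- ✅ **API compatibility**: Fully compatible with the existing frontend and interfaces

### Test Results

- **Core functionality**: Service health, API docs, and graph construction all work
- **Workflow execution**: Verified the call_model → tools → synthesis flow
- **Tool calls**: Detected the expected tool call events (`retrieve_standard_regulation`, `retrieve_doc_chunk_standard_regulation`)
- **Streaming response**: 376 SSE events received and processed correctly
- **Session management**: Multi-turn conversation works
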
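## v0.3.5 - 2025-08-20

### Research & Analysis

- **LangGraph Implementation Optimization Research**
  - **Official example analysis**: Studied the official assistant-ui-langgraph-fastapi example
  - **Simplified version**: Implemented a simplified version based on LangGraph best practices (`simplified_graph.py`)
  - **Performance comparison**: The simplified version is 35% faster than the current implementation, with 50% less code
  - **Best practices applied**: Uses the `@tool` decorator, standard LangGraph patterns, and simplified state management

### Key Findings

- **More concise code**: Reduced from 400 lines to 200 lines
- **More standardized**: Follows LangGraph community conventions and best practices
- **Performance gain**: 35% improvement in execution time
- **Maintainability**: A more modular and testable code structure

### Next Steps

- Bring the simplified version to feature parity with the current version
- Consider migrating gradually to the standard LangGraph pattern
- Keep the existing SSE streaming and citation functionality
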
## v0.3.4 - 2025-08-20

### Housekeeping

- **Code Organization**
  - **Temporary script migration**: Moved all temporary test and demo scripts from `scripts/` to `tests/tmp/`
  - **Script separation**: The `scripts/` directory now contains only production scripts (service management and so on)
  - **Clean architecture**: Improved code maintainability and the clarity of the directory structure

### Moved Files

- `scripts/startup_demo.py` → `tests/tmp/startup_demo.py`
- `scripts/test_startup_modes.py` → `tests/tmp/test_startup_modes.py`

### Directory Structure Clean-up

- **`scripts/`**: Production scripts only (`start_service.sh`, `stop_service.sh`, etc.)
- **`tests/tmp/`**: All temporary test and demo scripts
- **`.tmp/`**: Temporary files for debugging and development

## v0.3.3 - 2025-08-20

### Enhanced

- **Service Startup Improvements**
  - **Foreground by default**: The service now runs in the foreground by default, which makes debugging and live log viewing easier during development
  - **Graceful shutdown**: Foreground mode supports stopping the service gracefully with `Ctrl+C`
  - **Multiple startup modes**: Supports foreground, background, and development modes
  - **Improved script**: `scripts/start_service.sh` accepts the `--background` and `--dev` flags
  - **Enhanced Makefile**: Added the `make start-bg` command for background startup
  - **Detailed usage guide**: Added `docs/SERVICE_STARTUP_GUIDE.md` with full instructions

### Service Management Commands

- `make start` - run in the foreground (default, recommended for development)
- `make start-bg` - run in the background (suited to production)
- `make dev-backend` - development mode (auto-reload)
- `make stop` - stop the service
- `make status` - check service status

### Script Options

- `./scripts/start_service.sh` - run in the foreground (default)
- `./scripts/start_service.sh --background` - run in the background
- `./scripts/start_service.sh --dev` - development mode

### Documentation

- Added `docs/SERVICE_STARTUP_GUIDE.md` - a detailed service startup guide
- Updated `README.md` to reflect the new startup options and best practices
- Updated the Makefile help text

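## v0.3.2 - 2025-08-20

### Enhanced

- **UI Improvements**
  - **Slower icon pulse**: Changed the icon animation during tool execution from a fast pulse to a slow 2-second pulse (`animate-pulse-slow`) to reduce visual distraction
  - **Avatars removed**: Hid the assistant and user avatars to give chat content more room
  - **Layout**: Widened the main container's maximum width from `max-w-4xl` to `max-w-5xl` to use the space freed by removing avatars
  - **Message spacing**: Increased the spacing above the assistant reply content (`margin-top: 1.5rem`) to better separate the tool call box from the answer
  - **Auto-hiding scrollbar**: Added an auto-hiding scrollbar style to the chat area for a cleaner look
  - **Message background**: Added a light background (`bg-muted/30`) to the assistant message area to improve readability
  - **Waiting animations**: Enabled the assistant-ui animations shown while waiting for message content, including the "AI is thinking..." indicator, typing dots, a shimmer effect on tool calls, and message appearance animations
  - **Tool status colors**: Adjusted the color of tool call progress text to match the overall design system palette
  - **Tool status alignment**: Repositioned tool call progress text so it aligns horizontally with the tool title
  - **CSS improvements**: Hid avatar elements with CSS selectors and adjusted the message layout to reclaim the space avatars occupied

### Technical Details

- Added the `animate-pulse-slow` custom animation class (2-second cycle, opacity fading between 0.6 and 1.0)
- Hid the `[data-testid="avatar"]` and `.aui-avatar` elements with CSS
- Set the message container's `margin-left` and `padding-left` to 0
- Tool icons use `animate-pulse-slow` instead of `animate-pulse`
- Added `margin-top: 1.5rem` to the assistant message content area to increase its distance from the tool call box
- Scrollbar styles: `scrollbar-hide` (WebKit) and `scrollbar-width: none` (Firefox)
- assistant-ui waiting animations include:
  - `.aui-composer-attachment-root[data-state="loading"]`: loading-state pulse animation
  - `.aui-message[data-loading="true"]`: typing dots animation while a message loads
  - `.aui-tool-call[data-state="loading"]`: shimmer effect on tool calls
  - `.aui-thread[data-state="running"] .aui-composer::before`: "AI is thinking..." indicator
- Tool status color system:
  - `.tool-status-running`: Primary blue (80% opacity) - running state
  - `.tool-status-processing`: Warm amber (80% opacity) - processing state
  - `.tool-status-complete`: Emerald green - complete state
  - `.tool-status-error`: Destructive red (80% opacity) - error state
- Tool layout: uses `justify-between` to align the title and status text horizontally
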
## v0.3.1 - 2025-08-20

### Enhanced

- **UI Animations**: Applied `assistant-ui` animation effects with fade-in and slide-in for tool calls and responses using custom Tailwind CSS utilities.
- **Tool Icons**: Configured `retrieve_standard_regulation` tool to use `legal-document.png` icon and `retrieve_doc_chunk_standard_regulation` to use `search.png`.
- **Component Updates**: Updated `ToolUIs.tsx` to integrate Next.js `Image` component for custom icons.
- **CSS Enhancements**: Defined custom keyframes and utility classes in `globals.css` for animation support.
- **Tailwind Config**: Added `tailwindcss-animate` and `@assistant-ui/react-ui/tailwindcss` plugins in `tailwind.config.ts`.

## v0.3.0 - 2025-08-20

### Added

- **Function-call based autonomous agent**
  - LLM-driven dynamic tool selection and multi-round iteration
  - Integration of `retrieve_standard_regulation` and `retrieve_doc_chunk_standard_regulation` tools via OpenAI function calling
- **LLM client enhancements**: `bind_tools()`, `ainvoke_with_tools()` for function-calling support
- **Agent workflow refactoring**: `AgentNode` and `AgentWorkflow` redesigned for autonomous execution
- **Configuration updates**: New prompts in `config.yaml` (`agent_system_prompt`, `synthesis_system_prompt`, `synthesis_user_prompt`)
- **Test scripts**: Added `scripts/test_autonomous_agent.py` and `scripts/test_autonomous_api.py`
- **Documentation**: Created `docs/topics/AUTONOMOUS_AGENT_UPGRADE.md` covering the new architecture

### Changed

- Refactored RAG pipeline to function-call based autonomy
- Backward-compatible CLI/API endpoints and prompts maintained

### Fixed

- N/A

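## v0.2.9

### Added

- **🌍 Multi-Language Support**
  - **Automatic language detection**: Switches the interface language automatically based on the browser's preferred language
  - **URL parameter override**: The `?lang=zh` or `?lang=en` URL parameter forces a specific language
  - **Language switcher**: A language toggle button in the top-right corner of the page
  - **Persistent storage**: The user's language preference is saved to localStorage
  - **Full localization**: Covers all UI elements, including page titles, tool names, status messages, and button text

### Technical Features

- **i18n architecture**: Complete internationalization infrastructure
  - Type-safe translation system (`lib/i18n.ts`)
  - React Hook integration (`hooks/useTranslation.ts`)
  - Live language switching
- **URL state sync**: The selected language is synced to the URL automatically, so localized links can be shared directly
- **Event-driven updates**: A reactive language switching mechanism based on custom events

### Languages Supported

- **Chinese** (zh): Complete Chinese interface, including tool call status and result display
- **English** (en): Complete English interface with accurately translated technical terms

### User Experience

- **Smart defaults**:
  1. Use the language specified by the URL parameter first
  2. Otherwise use the user's saved language preference
  3. Finally fall back to the browser's preferred language
- **Seamless switching**: Language changes take effect immediately, with no page refresh
- **Developer friendly**: New languages are easy to add, and translation strings are managed in one place
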
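
The language priority order from the v0.2.9 entry can be sketched as a small pure function. The actual implementation lives in the TypeScript frontend; this Python version is only an illustration, and `resolve_language`, `SUPPORTED`, and the final `DEFAULT_LANG` fallback are assumptions (the changelog does not say what happens when no source yields a supported language).

```python
from typing import Optional

SUPPORTED = ("zh", "en")
DEFAULT_LANG = "en"  # assumed last-resort fallback; not stated in the changelog


def resolve_language(
    url_lang: Optional[str],
    saved_lang: Optional[str],
    browser_lang: Optional[str],
) -> str:
    """Pick the UI language: URL parameter, then saved preference, then browser."""
    for candidate in (url_lang, saved_lang, browser_lang):
        if not candidate:
            continue
        # Reduce tags such as "zh-CN" or "en_US" to their primary subtag.
        primary = candidate.replace("_", "-").split("-")[0].lower()
        if primary in SUPPORTED:
            return primary
    return DEFAULT_LANG
```

An unsupported value at a higher-priority level (for example `?lang=fr`) is skipped rather than treated as an error, so the next source still gets a chance.
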
## v0.2.8

### Enhanced

- **Tool UI Redesign**: Completely redesigned tool call UI with assistant-ui pre-built components
- **Drawer-style Interface**: Tool calls now display as collapsible cards by default, showing only name and status
- **Expandable Details**: Click to expand/collapse tool details (query, results, etc.)
- **Simplified Components**: Removed complex inline styling in favor of Tailwind CSS classes
- **Better UX**: Tool calls are less intrusive while remaining accessible
- **Status Indicators**: Clear visual feedback for running, completed, and error states
- **Chinese Localization**: Tool names and status messages in Chinese for better user experience

### Technical

- **Tailwind Integration**: Enhanced Tailwind config with full shadcn/ui color variables and animation support
  - Added `tailwindcss-animate` dependency via pnpm
  - Configured `@assistant-ui/react-ui/tailwindcss` with shadcn theme support
  - Added comprehensive CSS variables for consistent theming
- **Component Architecture**: Improved separation of concerns with cleaner component structure
- **State Management**: Added local state management for tool expansion/collapse functionality

## v0.2.7

### Changed

- **Script Organization**: Moved `start_service.sh` and `stop_service.sh` into the `/scripts` directory for better structure.
- **Makefile Updates**: Updated `make start`, `make stop`, and `make dev-backend` to reference scripts in `/scripts`.
- **VSCode Tasks**: Adjusted `.vscode/tasks.json` to run service management scripts from `/scripts`.

## v0.2.6

### Fixed

- **Markdown Rendering**: Enabled rendering of assistant messages as markdown in the chat UI.
  - Correctly pass `assistantMessage.components.Text` to the `Thread` component.
  - Updated CSS import to use `@assistant-ui/react-markdown/styles/dot.css`.

### Added

- **MarkdownText Component**: Introduced `MarkdownText` via `makeMarkdownText()` in `web/src/components/ui/markdown-text.tsx`.
- **Thread Configuration**: Updated `web/src/app/page.tsx` to configure `Thread` for markdown with `assistantMessage.components`.

### Changed

- **CSS Imports**: Replaced incorrect markdown CSS imports in `globals.css` with the correct path from `@assistant-ui/react-markdown`.

## v0.2.5

### Fixed

- **React Infinite Loop Error**: Resolved "Maximum update depth exceeded" error in tool UI registration
  - **Problem**: Incorrect usage of the `useToolUIs` hook caused a `setState` loop that triggered endless `forceStoreRerender` calls
  - **Solution**: Adopted correct assistant-ui pattern - direct component usage instead of manual registration
  - **Implementation**: Place tool UI components directly inside AssistantRuntimeProvider (not via setToolUI)
  - **UI Stability**: The frontend now loads normally with no React runtime errors

### Added

- **Tool UI Components**: Implemented custom assistant-ui tool UI components for enhanced user experience
  - **RetrieveStandardRegulationUI**: Visual component for standard regulation search with query display and result summary
  - **RetrieveDocChunkStandardRegulationUI**: Visual component for document chunk retrieval with content preview
  - **Tool UI Registration**: Proper registration system using useToolUIs hook and setToolUI method
  - **Visual Feedback**: Tool calls now display as interactive UI elements instead of raw JSON data

### Enhanced

- **Interactive Tool Display**: Tool calls now rendered as branded UI components with:
  - 🔍 Search icons and status indicators (Searching... / Processing...)
  - Query display with formatted text
  - Result summaries with document codes, titles, and content previews
  - Color-coded status (blue for running, green/orange for results)
  - Responsive design with proper spacing and typography

### Technical

- **Frontend Architecture**: Updated page.tsx to properly register tool UI components
  - Import useToolUIs hook from @assistant-ui/react
  - Created ToolUIRegistration component for clean separation of concerns
  - TypeScript-safe implementation with proper type handling for args, result, and status

## v0.2.4

### Fixed

- **Post-Append Events Display**: Fixed missing UI display of post-processing events
  - **Problem**: Last 3 post-append events were sent as type 2 (data) events but not displayed in UI
  - **Solution**: Modified AI SDK adapter to convert post-append events to visible text streams
  - **post_append_2**: Tool execution summary now displays as formatted text: "🛠️ **Tool Execution Summary**"
  - **post_append_3**: Notice message now displays as formatted text: "⚠️ **AI can make mistakes. Please check important info.**"
  - **UI Compliance**: All three post-append events now visible in assistant-ui interface

### Enhanced

- **User Experience**: Post-processing information now properly integrated into chat flow
  - Tool execution summaries provide transparency about backend operations
  - Warning notices ensure users are informed about AI limitations
  - Formatted display improves readability and user awareness

## v0.2.3

### Verified

- **Post-Processing Node Compliance**: Confirmed full compliance with prompt.md specification
  - ✅ Post-append event 1: Agent's final answer + citations_mapping_csv (excluding tool raw prints)
  - ✅ Post-append event 2: Consolidated printout of all tool call outputs used for this turn
  - ✅ Post-append event 3: Trailing notice "AI can make mistakes. Please check important info."
  - All three events sent in correct order after agent completion
  - Events properly formatted in AI SDK Data Stream Protocol (type 2 - data events)

### Debugging Tools Added

- **Debug Scripts**: Added comprehensive debugging utilities for post-processing verification
  - `debug_ai_sdk_raw.py`: Inspects raw AI SDK endpoint responses for post-append events
  - `test_post_append_final.py`: Validates all three post-append events in correct order
  - `debug_post_append_format.py`: Analyzes post-append event structure and content
  - Server-side logging in PostProcessNode for event generation verification

### Tests

- **Post-Append Compliance Test**: Complete validation of prompt.md requirements
  - ✅ Total chunks: 864, all post-append events found at correct positions (861, 862, 863)
  - ✅ Post-append 1: Contains answer (854 chars) + citations (494 chars)
  - ✅ Post-append 2: Contains tool outputs (2 tools executed)
  - ✅ Post-append 3: Contains exact notice message as specified
  - **Final Result**: FULLY COMPLIANT with prompt.md specification

## v0.2.2

### Fixed

- **UI Content Display**: Fixed PostProcessNode content not appearing in assistant-ui interface
  - Modified AI SDK adapter to stream final answers as text events (type 0)
  - Updated adapter to extract answer content from post_append_1 events correctly
  - Fixed event formatting to ensure proper UI rendering compatibility

### Tests

- **Integration Test Success**: Complete workflow validation confirms perfect system integration
  - ✅ AI SDK endpoint streaming protocol fully operational
  - ✅ Tool call events (type 9) and tool result events (type a) working correctly
  - ✅ Text streaming events (type 0) rendering final answers properly
  - ✅ Assistant-ui compatibility with LangGraph backend confirmed
  - **Test Results**: 2 tool calls, 2 tool results, 509 text events, 1 finish event
  - **Content Validation**: Complete answer with citations, references, and proper formatting
  - **UI Rendering**: Real-time streaming display with tool execution visualization

## v0.2.1

### Fixed

- **Message Format Compatibility**: Fixed assistant-ui to backend message format conversion
  - assistant-ui sends `content: [{"type": "text", "text": "message"}]` array format
  - Backend expects `content: "message"` string format
  - Added transformation logic in `/web/src/app/api/chat/route.ts` to convert formats
  - Resolved Pydantic validation error: "Input should be a valid string [type=string_type]"
- **End-to-End Chat Flow**: Verified complete user input → format conversion → tool execution → streaming response pipeline

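
The format conversion described in the fix above can be sketched as follows. The real transformation is in the Next.js route (`route.ts`); this Python version only illustrates the logic, and the names `flatten_content` and `convert_messages`, as well as the choice to concatenate multiple text parts without a separator, are assumptions.

```python
from typing import Any, Dict, List, Union

Content = Union[str, List[Dict[str, Any]]]


def flatten_content(content: Content) -> str:
    """Collapse assistant-ui content parts into the plain string the backend expects."""
    if isinstance(content, str):
        return content  # already in the backend's string format
    # Keep only text parts; joining with "" is an assumption about how parts combine.
    return "".join(part.get("text", "") for part in content if part.get("type") == "text")


def convert_messages(messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return copies of the messages with string content; other keys are untouched."""
    return [{**m, "content": flatten_content(m.get("content", ""))} for m in messages]
```

Passing already-flat strings through unchanged keeps the converter safe to apply to mixed histories.
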
### Added

- **Assistant-UI Integration**: Complete integration with @assistant-ui/react framework for professional chat interface
- **Data Stream Protocol**: Full implementation of Vercel AI SDK Data Stream Protocol for real-time streaming
- **Custom Tool UIs**: Rich visual components for different tool types:
  - Document retrieval UI with relevance scoring and source information
  - Web search UI with result links and snippets
  - Python code execution UI with stdout/stderr display
  - URL fetching UI with page content preview
  - Code analysis UI with suggestions and feedback
- **Next.js 15 Frontend**: Modern React 19 + TypeScript + Tailwind CSS v3 web application
- **Responsive Design**: Mobile-friendly interface with dark/light theme support
- **Streaming Visualization**: Real-time display of AI reasoning steps and tool executions

### Enhanced

- **Simplified UI Architecture**: Streamlined web interface with minimal code and default styling
  - Removed custom tool UI components in favor of assistant-ui defaults
  - Reduced `/web/src/app/page.tsx` to essential AssistantRuntimeProvider and Thread components
  - Simplified `/web/src/app/globals.css` to basic reset and assistant-ui imports only
  - Minimized `/web/tailwind.config.ts` configuration for cleaner build
  - Removed unnecessary dependencies for lighter bundle size
- **Backend Protocol Compliance**: Updated AI SDK adapter to match official Data Stream Protocol specification
- **Event Format**: Standardized to `TYPE_ID:JSON\n` format for all streaming events
- **Tool Call Visualization**: Step-by-step visualization of multi-tool workflows
- **Error Handling**: Comprehensive error states and recovery mechanisms
- **Performance**: Optimized streaming and rendering for smooth user experience

### Technical Implementation

- **Protocol Mapping**: Proper mapping of LangGraph events to Data Stream Protocol types:
  - Type 0: Text streaming (tokens)
  - Type 9: Tool calls with arguments
  - Type a: Tool results
  - Type d: Message completion
  - Type 3: Error handling
- **Runtime Integration**: `useDataStreamRuntime` for seamless assistant-ui integration
- **API Proxy**: Next.js API route for backend communication with proper headers
- **Component Architecture**: Modular tool UI components with makeAssistantToolUI

### Integration Testing Results ✅

- **Frontend Service**: Successfully deployed on localhost:3000 with Next.js 15 + Turbopack
- **Backend Service**: Healthy and responsive on localhost:8000 (FastAPI + LangGraph)
- **API Proxy**: Correct routing from `/api/chat` to backend AI SDK endpoint with format conversion
- **Message Format**: assistant-ui array format correctly converted to backend string format
- **Streaming Protocol**: Data Stream Protocol events properly formatted and transmitted
- **Tool Execution**: Multi-step tool calls working (retrieve_standard_regulation, etc.)
- **UI Rendering**: assistant-ui components properly rendered with default styling
- **End-to-End Flow**: Complete user query → tool execution → streaming response pipeline verified
  - Format conversion: assistant-ui array format → backend string format
  - Tool execution validation: retrieve_standard_regulation, retrieve_doc_chunk_standard_regulation
  - Real-time streaming with proper Data Stream Protocol compliance
  - Content relevance verification: automotive safety standards and testing procedures

### Documentation

- **Protocol Reference**: Enhanced `docs/topics/AI_SDK_UI.md` with implementation details
- **Integration Guide**: Comprehensive setup and testing procedures
- **API Compatibility**: Dual endpoint support for legacy and modern integrations

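
The `TYPE_ID:JSON\n` framing and the type mapping listed for v0.2.1 can be illustrated with a small encoder. This is a sketch under assumptions: `encode_event` and the constant names are hypothetical, and the payload field names used in the usage note follow the shape commonly documented for the Data Stream Protocol rather than anything shown in this changelog.

```python
import json
from typing import Any

# Type IDs as listed in the protocol mapping above.
TEXT, TOOL_CALL, TOOL_RESULT, FINISH, ERROR = "0", "9", "a", "d", "3"


def encode_event(type_id: str, payload: Any) -> str:
    """Serialize one stream part as TYPE_ID:JSON followed by a newline."""
    # json.dumps escapes embedded newlines, so each part stays on a single line.
    body = json.dumps(payload, ensure_ascii=False, separators=(",", ":"))
    return f"{type_id}:{body}\n"
```

For example, `encode_event(TEXT, "Hello")` yields `0:"Hello"` plus a newline, and a tool call could be sent as `encode_event(TOOL_CALL, {"toolCallId": "c1", "toolName": "retrieve_standard_regulation", "args": {"query": "x"}})`.
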
## v0.1.7

### Changed

- **Simplified Web UI**: Replaced Tailwind CSS with inline styles for simpler, more maintainable code
- **Reduced Dependencies**: Removed complex styling frameworks in favor of vanilla CSS-in-JS approach
- **Cleaner Interface**: Simplified chatbot UI with essential functionality and clean default styling
- **Streamlined Code**: Reduced component complexity by removing unnecessary features like timestamps and session display

### Improved

- **Code Maintainability**: Easier to understand and modify without external CSS framework dependencies
- **Performance**: Lighter bundle size without Tailwind CSS classes
- **Accessibility**: Cleaner DOM structure with semantic HTML and inline styles

### Removed

- **Tailwind CSS Classes**: Replaced complex utility classes with simple inline styles
- **Timestamp Display**: Removed message timestamps for cleaner interface
- **Session ID Display**: Simplified footer by removing session information
- **Complex Animations**: Simplified loading indicators and removed complex animations

### Technical Details

- Maintained all core functionality (streaming, error handling, message management)
- Preserved AI SDK Data Stream Protocol compatibility
- Kept responsive design with percentage-based layouts
- Used standard CSS properties for styling (flexbox, basic colors, borders)

# v0.1.6
|
||
|
||
### Fixed
- **Web UI Component Error**: Resolved "The default export is not a React Component in '/page'" error caused by empty `page.tsx` file
- **AI SDK v5 Compatibility**: Fixed compatibility issues with Vercel AI SDK v5 API changes by implementing custom streaming solution
- **TypeScript Errors**: Resolved compilation errors related to deprecated `useChat` hook properties in AI SDK v5
- **Frontend Dependencies**: Ensured all required AI SDK dependencies are properly installed and configured

### Changed
- **Custom Streaming Implementation**: Replaced AI SDK v5 `useChat` hook with custom streaming solution for better control and compatibility
- **Direct Protocol Handling**: Implemented direct AI SDK Data Stream Protocol parsing in frontend for real-time message updates
- **Enhanced Error Handling**: Added comprehensive error handling for network issues and streaming failures
- **Message State Management**: Improved message state management with TypeScript interfaces and proper typing

### Technical Implementation
- **Custom Stream Reader**: Implemented `ReadableStream` processing with `TextDecoder` for chunk-by-chunk data handling
- **Protocol Parsing**: Direct parsing of AI SDK protocol lines (`0:`, `9:`, `a:`, `d:`, `2:`) in frontend
- **Real-time Updates**: Optimized message content updates during streaming for smooth user experience
- **Session Management**: Added session ID generation and tracking for conversation context
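
The chunk-by-chunk reading and protocol-line parsing listed above can be sketched as follows. The frontend does this in TypeScript with `ReadableStream` and `TextDecoder`; this Python version only mirrors the same logic for illustration. The function names and the `(part_type, payload)` tuple shape are hypothetical, and the part-type labels are this sketch's own names for the five prefixes.

```python
import codecs
import json

# Prefixes named in this entry; the labels on the right are illustrative names.
PART_TYPES = {
    "0": "text",
    "9": "tool_call",
    "a": "tool_result",
    "d": "finish",
    "2": "data",
}


def parse_line(line):
    """Split one protocol line '<prefix>:<json>' into (part_type, payload)."""
    prefix, sep, raw = line.partition(":")
    if not sep or prefix not in PART_TYPES:
        return None  # unknown prefix or empty line: skip it
    return PART_TYPES[prefix], json.loads(raw)


def iter_parts(chunks):
    """Decode byte chunks incrementally and yield one parsed part per full line."""
    # The incremental decoder holds back a multi-byte character that is
    # split across two chunks, much like TextDecoder in streaming mode.
    decoder = codecs.getincrementaldecoder("utf-8")()
    buffer = ""
    for chunk in chunks:
        buffer += decoder.decode(chunk)
        # Only complete lines are parsed; the trailing fragment stays buffered.
        *lines, buffer = buffer.split("\n")
        for line in lines:
            part = parse_line(line)
            if part is not None:
                yield part


def collect_text(chunks):
    """Concatenate all text parts, i.e. the visible assistant message."""
    return "".join(payload for kind, payload in iter_parts(chunks) if kind == "text")
```

Buffering the trailing fragment matters because network chunks do not respect line boundaries: a single `0:` line may arrive split across two reads.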

### Validated
- ✅ Frontend compiles without TypeScript errors
- ✅ Chat interface loads successfully at http://localhost:3000
- ✅ Custom streaming implementation works with backend AI SDK endpoint
- ✅ Real-time message updates during streaming responses
- ✅ Error handling for failed requests and network issues

# v0.1.5

### Added
- **Web UI Chatbot**: Created comprehensive Next.js chatbot interface using Vercel AI SDK Elements in `/web` directory
- **AI SDK Protocol Adapter**: Implemented `service/ai_sdk_adapter.py` to convert internal SSE events to Vercel AI SDK Data Stream Protocol
- **AI SDK Compatible Endpoint**: Added new `/api/ai-sdk/chat` endpoint for frontend integration while maintaining backward compatibility
- **Frontend API Proxy**: Created Next.js API route `/api/chat/route.ts` to proxy requests between frontend and backend
- **Streaming UI Components**: Integrated real-time streaming display for tool calls, intermediate steps, and final answers
- **End-to-End Testing**: Added `test_ai_sdk_endpoint.py` for backend AI SDK endpoint validation

### Changed
- **Protocol Implementation**: Fully migrated to Vercel AI SDK Data Stream Protocol (SSE) for client-service communication
- **Event Type Mapping**: Enhanced event handling to support AI SDK protocol types (`9:`, `a:`, `0:`, `d:`, `2:`)
- **Multi-line SSE Processing**: Improved adapter to correctly handle multi-line SSE events from internal system
- **Frontend Architecture**: Established modern React-based chat interface with TypeScript and Tailwind CSS

### Technical Implementation
- **Frontend Stack**: Next.js 15.4.7, Vercel AI SDK (`ai`, `@ai-sdk/react`, `@ai-sdk/ui-utils`), TypeScript, Tailwind CSS
- **Backend Adapter**: Protocol conversion layer between internal LangGraph events and AI SDK format
- **Streaming Pipeline**: End-to-end streaming from LangGraph → Internal SSE → AI SDK Protocol → Frontend UI
- **Tool Call Visualization**: Real-time display of multi-step agent workflow including retrieval and generation phases
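
The conversion layer described above might look roughly like the sketch below. It is not the code in `service/ai_sdk_adapter.py`: the internal event names (`token`, `tool_start`, `tool_result`, `done`) and their fields are assumptions made for illustration, and only the `<prefix>:<json>` line format and the five prefixes come from this entry.

```python
import json


def format_part(prefix, payload):
    """Serialize one Data Stream Protocol part as a single newline-terminated line."""
    return f"{prefix}:{json.dumps(payload, ensure_ascii=False)}\n"


def convert_event(event_type, data):
    """Map a hypothetical internal event to one protocol line."""
    if event_type == "token":
        # Text parts carry a JSON-encoded string.
        return format_part("0", data["delta"])
    if event_type == "tool_start":
        return format_part("9", {
            "toolCallId": data["id"],
            "toolName": data["name"],
            "args": data.get("args", {}),
        })
    if event_type == "tool_result":
        return format_part("a", {"toolCallId": data["id"], "result": data["result"]})
    if event_type == "done":
        # A real finish part may also report token usage; omitted here.
        return format_part("d", {"finishReason": "stop"})
    # Anything else is forwarded as a generic data part (a JSON array).
    return format_part("2", [{"type": event_type, **data}])
```

JSON-encoding every payload keeps each part on one physical line, because newlines inside the text are escaped rather than emitted raw.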

### Validated
- ✅ Backend AI SDK endpoint streaming compatibility
- ✅ Frontend-backend protocol integration
- ✅ Tool call event mapping and display
- ✅ Multi-line SSE event parsing
- ✅ End-to-end chat workflow functionality
- ✅ Service deployed and accessible at http://localhost:3001

### Documentation
- **Protocol Reference**: Enhanced `docs/topics/AI_SDK_UI.md` with implementation details
- **Integration Guide**: Comprehensive setup and testing procedures
- **API Compatibility**: Dual endpoint support for legacy and modern integrations

# v0.1.4

### Fixed
- **Streaming Token Display**: Fixed streaming test script to correctly read token content from `delta` field
- **Event Parsing**: Resolved issue where streaming logs showed empty answer tokens due to incorrect field access
- **Stream Validation**: Verified streaming API returns proper token content and LLM responses

### Added
- **Debug Script**: Added `debug_llm_stream.py` to inspect streaming chunk structure and validate token flow
- **Stream Testing**: Enhanced streaming test with proper token parsing and validation

### Changed
- **Test Script Enhancement**: Updated `scripts/test_real_streaming.py` to display actual streamed tokens correctly
- **Event Processing**: Improved streaming event parsing and display logic for better debugging

# v0.1.3

### Added
- **Jinja2 Template Support**: Added comprehensive Jinja2 template rendering for LLM prompts
- **Template Utilities**: Created `service/utils/templates.py` for robust template processing
- **Template Validation**: Added test script `test_templates.py` to verify template rendering
- **Enhanced VS Code Debug Support**: Complete debugging configuration for development workflow

### Changed
- **Template Engine Migration**: Replaced Python `.format()` with Jinja2 template rendering
- **Variable Substitution**: Fixed template variable replacement in user and system prompts
- **Template Variables**: Added support for `output_language`, `user_query`, `conversation_history`, and `reference_document_chunks`
- **Error Handling**: Improved template rendering error handling and logging

### Fixed
- **Variable Substitution Bug**: Fixed issue where `{{variable}}` syntax was not being replaced in prompts
- **Template Context**: Ensured all required variables are properly passed to template renderer
- **Language Support**: Added configurable output language support (default: zh-CN)

### Technical Details
- Added `jinja2>=3.1.0` dependency to `pyproject.toml`
- Updated `service/graph/graph.py` to use Jinja2 template rendering
- Template variables now support complex data structures and safe rendering
- All template variables are properly escaped and validated
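
A short sketch of why `{{variable}}` placeholders survived `.format()` and how Jinja2 renders them. The prompt text and the `render_prompt` helper are hypothetical and are not taken from `service/utils/templates.py`; choosing `StrictUndefined` is likewise an assumption about how missing variables could be surfaced.

```python
from jinja2 import Environment, StrictUndefined

template_text = "Answer in {{ output_language }}. Question: {{ user_query }}"

# str.format() treats doubled braces as escaped literal braces,
# so no substitution happens and single braces are left behind.
formatted = template_text.format(output_language="zh-CN", user_query="What changed?")

# StrictUndefined makes a missing variable raise instead of rendering as empty text.
env = Environment(undefined=StrictUndefined)


def render_prompt(text, **context):
    """Render a prompt template with the given variables."""
    return env.from_string(text).render(**context)


rendered = render_prompt(
    template_text, output_language="zh-CN", user_query="What changed?"
)
```

Here `formatted` still contains the placeholder names in single braces, while `rendered` has both variables substituted.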

# v0.1.2

### Fixed
- Fixed configuration access pattern: refactored `config.prompts.rag` to use `config.get_rag_prompts()` method
- Fixed Azure OpenAI endpoint configuration: corrected `base_url` to use root endpoint without API path
- Fixed Azure OpenAI API version mismatch: updated `api_version` from "2024-02-01" to "2024-02-15-preview"
- Fixed streaming API error handling to properly propagate HTTP errors without silent failures

### Changed
- Improved error handling in streaming responses to surface external service errors
- Enhanced service stability by ensuring config/code consistency

### Validated
- Streaming API end-to-end functionality with tool execution and answer generation
- Azure OpenAI integration with correct endpoint configuration
- Error propagation and robust exception handling in streaming workflow

# v0.1.1

### Added
- Added service startup and stop scripts (`start_service.sh`, `stop_service.sh`)
- Added comprehensive service setup documentation (`SERVICE_SETUP.md`)
- Added support for environment variable substitution with default values (`${VAR:-default}`)
- Added LLM configuration structure in config.yaml for better organization
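
The `${VAR:-default}` substitution mentioned above can be sketched with a single regular expression. This is an illustrative helper, not the logic in `service/config.py`: the name `substitute_env` and the choice to leave an unresolved `${VAR}` untouched are assumptions.

```python
import os
import re

# Matches ${NAME} and ${NAME:-default}; the default may be empty.
_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def substitute_env(text, env=None):
    """Expand ${VAR} and ${VAR:-default} placeholders in a config string."""
    env = os.environ if env is None else env

    def replace(match):
        name, default = match.group(1), match.group(2)
        value = env.get(name)
        if default is None:
            # Plain ${VAR}: keep the placeholder when the variable is unset.
            return value if value is not None else match.group(0)
        # Shell-style ':-' semantics: unset or empty falls back to the default.
        return value if value else default

    return _PATTERN.sub(replace, text)
```

For example, `substitute_env("port: ${PORT:-8000}", {})` yields `port: 8000`, while passing `{"PORT": "9000"}` yields `port: 9000`.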

### Changed
- Updated `docs/config.yaml` based on `.coding/config.yaml` configuration
- Moved `config.yaml` to root directory for easier access
- Restructured configuration to support `llm.rag` section for prompts and parameters
- Improved `service/config.py` to handle new configuration structure
- Enhanced environment variable substitution logic

### Fixed
- Fixed SSE event parsing logic in integration test script to correctly associate `event:` and `data:` lines
- Improved streaming event validation for tool execution, error handling, and answer generation
- Fixed configuration loading to work with root directory placement
- Fixed port mismatch in integration test script to connect to correct service port
- Fixed prompt access issue: changed from `config.prompts.rag` to `config.get_rag_prompts()` method
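
Associating `event:` and `data:` lines, as in the SSE parsing fix above, comes down to accumulating fields until a blank line ends the event. The sketch below follows the standard Server-Sent Events framing rules and is not the integration test script itself; `parse_sse` is a hypothetical name and the event names in the usage note are made up.

```python
def parse_sse(lines):
    """Yield (event, data) pairs from an iterable of SSE text lines."""
    event, data = "message", []  # "message" is the default SSE event type
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event: emit and reset.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # Strip at most one leading space; multiple data lines are
            # joined with newlines when the event is emitted.
            data.append(value[1:] if value.startswith(" ") else value)
```

Feeding it `["event: tool_result", "data: {}", ""]` produces one `("tool_result", "{}")` pair. An event that is never followed by a blank line is not emitted, which matches how browsers dispatch SSE events.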

### Added
- Added comprehensive integration tests for streaming functionality
- Added robust error handling for missing OpenAI API key scenarios
- Added event streaming validation for tool results, errors, and completion events
- Added configurable port/host support in test scripts for flexible service connection

## Previous Changes

- Initial implementation of Agentic RAG system
- FastAPI-based streaming endpoints
- LangGraph-inspired workflow orchestration
- Retrieval tool integration
- Memory management with TTL
- Web client with EventSource streaming