update for 1. optimization 2. Chinese/English toggle

2026-06-10 11:10:36 +08:00
parent e7963b267e
commit 9212747e1b
42 changed files with 7866 additions and 278 deletions


@@ -0,0 +1,459 @@
# Compliance Analysis Enhancement Design
**Date:** 2026-06-08
**Directions:** A (Analysis Quality) + B (History & Reports) + C (Deep Chat)
**Approach:** Three independent but coordinated feature sets sharing one DB schema (method one / structured tables).
---
## Goals
1. **A — Analysis Quality:** Parallel clause processing (3-5× speed), fix `highlight_terms` bug (always returns empty), add LLM retry with tenacity, reserve `PassThroughReranker` for future cross-encoder work.
2. **B — Analysis History & Reports:** Auto-save every completed analysis to PostgreSQL, history rail in UI, per-record DOCX export, delete with confirmation.
3. **C — Deep Chat:** Per-finding persistent chat threads grounded in real retrieved text, LLM-generated suggestion questions, multi-turn memory.
---
## Architecture Overview
### Layering Rules (must not be violated)
```
api/routes/ → thin HTTP handlers, SSE generators only
application/ → orchestration logic (pipeline.py)
domain/ports/ → ABCs, no implementation
infrastructure/ → DB, docx, external calls
shared/bootstrap.py → composition root, wires everything
```
New business logic goes in `application/compliance/pipeline.py` and domain ports. Never in `services/*` or `workflows/*`.
### Shared Database Schema (B + C)
Three tables, created together so C's FK references are valid from day one:
```sql
CREATE TABLE compliance_analyses (
    id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    created_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
    created_by      VARCHAR(255),
    doc_name        VARCHAR(500),
    standard_name   VARCHAR(500),
    risk_score      INTEGER,
    conclusion      TEXT,
    actions         JSONB,
    para_text       TEXT,
    highlight_terms JSONB
);

CREATE TABLE compliance_findings (
    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    analysis_id UUID NOT NULL REFERENCES compliance_analyses(id) ON DELETE CASCADE,
    seq         INTEGER NOT NULL,
    title       VARCHAR(500),
    description TEXT,
    status      VARCHAR(50),
    clause_ref  VARCHAR(200)
);

CREATE TABLE finding_chat_messages (
    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    analysis_id UUID NOT NULL REFERENCES compliance_analyses(id) ON DELETE CASCADE,
    finding_id  UUID NOT NULL REFERENCES compliance_findings(id) ON DELETE CASCADE,
    role        VARCHAR(20) NOT NULL, -- 'user' | 'assistant'
    content     TEXT NOT NULL,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);
```
---
## Direction A — Analysis Quality
### A1: Parallel Clause Processing
**Current:** The route handler runs a sequential `for i, clause in enumerate(clauses)` loop. Each iteration calls `retrieve_for_clause()` and then `check_clause_compliance()` through `asyncio.to_thread`, so clauses are processed strictly one at a time.
**Change:** Extract a `process_single_clause(clause, idx, ...) -> dict` function in `pipeline.py`, then replace the loop with `asyncio.gather`:
```python
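import asyncio

async def run_clauses_parallel(clauses, retrieval_svc, llm_client, standard_name, para_text):
    # Each clause runs in the default thread pool; gather() keeps results in input order.
    tasks = [
        asyncio.to_thread(process_single_clause, clause, i, retrieval_svc, llm_client, standard_name, para_text)
        for i, clause in enumerate(clauses)
    ]
    # return_exceptions=True puts a failed clause's exception in its slot instead of raising.
    return await asyncio.gather(*tasks, return_exceptions=True)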
```
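Results are yielded to the SSE stream in original order, which `asyncio.gather` preserves. Because `return_exceptions=True` is set, a failing clause shows up as an exception object in its result slot; the handler emits it as a `{type: "error", clause_index: i}` event rather than crashing the whole stream.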
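The result handling described above can be sketched as follows. This is a minimal illustration, not the project's code: `process_single_clause` here is a hypothetical stand-in with a reduced signature, and `to_sse_events` plus the `finding` event type are assumed names used only for the example.

```python
import asyncio
import json


def process_single_clause(clause: str, idx: int) -> dict:
    """Hypothetical stand-in: the real function would retrieve and check the clause."""
    if not clause.strip():
        raise ValueError("empty clause")
    return {"type": "finding", "clause_index": idx, "text": clause.upper()}


def to_sse_events(results: list) -> list[str]:
    """Turn gather() results into SSE lines, mapping exceptions to error events."""
    events = []
    for i, result in enumerate(results):
        if isinstance(result, BaseException):
            payload = {"type": "error", "clause_index": i, "text": str(result)}
        else:
            payload = result
        events.append(f"data: {json.dumps(payload)}\n\n")
    return events


async def demo(clauses: list[str]) -> list[str]:
    tasks = [asyncio.to_thread(process_single_clause, c, i) for i, c in enumerate(clauses)]
    # Exceptions come back as values in their original slot rather than being raised.
    results = await asyncio.gather(*tasks, return_exceptions=True)
    return to_sse_events(results)


# The blank second clause fails, the other two still produce events.
events = asyncio.run(demo(["first clause", "   ", "third clause"]))
```

Because the index of each result matches the index of its clause, the error event can carry `clause_index` without any extra bookkeeping.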
### A2: Fix highlight_terms
**Root cause:** `synthesize_conclusion()` passes the LLM response through `json.loads()` but the LLM often wraps output in markdown fences (` ```json ... ``` `), causing a parse failure and silent fallback to `[]`.
**Fix in `pipeline.py`:**
```python
import json
import re

def _extract_json(text: str) -> dict:
    """Strip markdown fences then parse JSON. Raises ValueError on failure."""
    # Drop a leading ``` or ```json fence and a trailing ``` fence, if present.
    cleaned = re.sub(r"^```(?:json)?\s*|\s*```$", "", text.strip(), flags=re.MULTILINE)
    return json.loads(cleaned)
```
Apply `_extract_json` in `synthesize_conclusion()` instead of bare `json.loads`. Wrap with `@retry` (see A3) so transient parse failures get a second attempt.
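A small check of the root cause and the fix, using only the standard library. The sample payload is made up for illustration, and `extract_json` is a local copy of the helper so the snippet runs on its own.

```python
import json
import re


def extract_json(text: str) -> dict:
    """Local copy of the fence-stripping helper, for demonstration only."""
    cleaned = re.sub(r"^```(?:json)?\s*|\s*```$", "", text.strip(), flags=re.MULTILINE)
    return json.loads(cleaned)


# Illustrative LLM output wrapped in a markdown fence.
fenced = '```json\n{"highlight_terms": ["encryption", "access control"]}\n```'

# Bare json.loads rejects the fenced text, which is what led to the empty fallback.
try:
    json.loads(fenced)
    bare_parse_ok = True
except ValueError:
    bare_parse_ok = False

# After stripping the fences the same text parses normally.
terms = extract_json(fenced)["highlight_terms"]
```

Unfenced JSON passes through unchanged, so the helper is safe to apply to every response.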
### A3: LLM Retry with tenacity
`tenacity` is already in `requirements.txt` but unused. Add to all LLM calls in `pipeline.py`:
```python
import httpx
from tenacity import retry, stop_after_attempt, wait_exponential, retry_if_exception_type

@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=1, max=4),
    retry=retry_if_exception_type((httpx.HTTPError, ValueError)),
    reraise=True,
)
def _call_llm_with_retry(client, prompt: str) -> str:
    """Call LLM and return raw text. Retries on HTTP errors and JSON parse failures."""
    ...
```
On final failure, the calling function catches and emits `{type: "error", text: "LLM call failed after 3 attempts"}` to the SSE stream.
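The retry-then-report behaviour can be exercised without network access or real delays. The sketch below is a plain-Python stand-in for the policy, not tenacity itself: `call_with_retry` and `emit_llm_event` are hypothetical names, the backoff formula only approximates the intended 1 to 4 second range, and the `chunk` event type is borrowed from the chat stream for illustration.

```python
import time
from typing import Callable


def call_with_retry(
    fn: Callable[[], str],
    attempts: int = 3,
    retry_on: tuple = (ValueError,),
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fn up to `attempts` times and re-raise the last error (like reraise=True)."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise
            # Exponential backoff clamped to [1, 4] seconds; tenacity's schedule may differ.
            sleep(min(4.0, max(1.0, 2.0 ** (attempt - 1))))
    raise RuntimeError("attempts must be >= 1")


def emit_llm_event(fn: Callable[[], str], sleep: Callable[[float], None] = time.sleep) -> dict:
    """Convert the final outcome into an SSE-style payload."""
    try:
        return {"type": "chunk", "text": call_with_retry(fn, sleep=sleep)}
    except ValueError:
        return {"type": "error", "text": "LLM call failed after 3 attempts"}


calls = {"n": 0}


def flaky() -> str:
    """Fails twice with a parse-style error, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ValueError("malformed JSON")
    return "ok"


def always_bad() -> str:
    raise ValueError("malformed JSON")


delays: list[float] = []
ok_event = emit_llm_event(flaky, sleep=delays.append)          # records delays, no waiting
failed_event = emit_llm_event(always_bad, sleep=lambda s: None)
```

Injecting `sleep` keeps the policy testable: a unit test can record the requested delays instead of actually waiting.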
### A4: PassThroughReranker (future-ready stub)
`domain/retrieval/ports.py` already defines a `Reranker` ABC. Add the no-op implementation:
**New file:** `backend/app/infrastructure/retrieval/reranker.py`
```python
from app.domain.retrieval.ports import Reranker, RetrievedChunk

class PassThroughReranker(Reranker):
    """No-op reranker. Replace with CrossEncoderReranker when a local model is available."""

    def rerank(self, query: str, chunks: list[RetrievedChunk], top_k: int) -> list[RetrievedChunk]:
        return chunks[:top_k]
```
Register in `shared/bootstrap.py` as the default `Reranker` implementation.
### A — Files Changed
| File | Action |
|------|--------|
| `backend/app/application/compliance/pipeline.py` | Add `process_single_clause`, `run_clauses_parallel`, `_extract_json`, `_call_llm_with_retry` |
| `backend/app/api/routes/compliance.py` | Replace sequential loop with `await run_clauses_parallel(...)` |
| `backend/app/infrastructure/retrieval/reranker.py` | New — `PassThroughReranker` |
| `backend/app/shared/bootstrap.py` | Register `PassThroughReranker` |
---
## Direction B — History & Reports
### B1: Domain Port
**New file:** `backend/app/domain/compliance/ports.py`
```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

@dataclass
class FindingRecord:
    id: str
    analysis_id: str
    seq: int
    title: str
    description: str
    status: str
    clause_ref: Optional[str] = None

@dataclass
class AnalysisRecord:
    id: str
    created_at: datetime
    created_by: Optional[str]
    doc_name: str
    standard_name: str
    risk_score: int
    conclusion: str
    actions: list
    para_text: str
    highlight_terms: list
    findings: list[FindingRecord] = field(default_factory=list)

class ComplianceRepository(ABC):
    @abstractmethod
    def save_analysis(self, record: AnalysisRecord) -> str: ...

    @abstractmethod
    def list_analyses(self, limit: int = 50, offset: int = 0) -> list[AnalysisRecord]: ...

    @abstractmethod
    def get_analysis(self, analysis_id: str) -> Optional[AnalysisRecord]: ...

    @abstractmethod
    def delete_analysis(self, analysis_id: str) -> None: ...

    @abstractmethod
    def save_message(self, analysis_id: str, finding_id: str, role: str, content: str) -> str: ...

    @abstractmethod
    def get_messages(self, finding_id: str) -> list[dict]: ...
```
### B2: PostgresComplianceRepository
**New file:** `backend/app/infrastructure/compliance/repository.py`
Implements `ComplianceRepository` using `psycopg2` (already in requirements). Connection string from `settings.DATABASE_URL`. Key methods:
- `save_analysis`: INSERT into `compliance_analyses`, then bulk INSERT findings into `compliance_findings`, return `analysis_id` (UUID string).
- `list_analyses`: SELECT joined against `compliance_findings` to compute a per-analysis finding count, ORDER BY `created_at DESC`, supports limit/offset.
- `get_analysis`: SELECT analysis + all findings by `analysis_id`.
- `delete_analysis`: DELETE cascades to findings and chat messages via FK.
- `save_message` / `get_messages`: INSERT/SELECT on `finding_chat_messages`.
Uses a simple connection pool (`psycopg2.pool.ThreadedConnectionPool` with `minconn=1`, `maxconn=5`).
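The cascade that `delete_analysis` relies on can be tried out locally. The sketch below deliberately swaps PostgreSQL for an in-memory SQLite database so it runs without a server: column types are simplified to `TEXT`, most columns are omitted, and the helper names and sample rows are illustrative assumptions rather than the repository's implementation.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when this is enabled
conn.executescript(
    """
    CREATE TABLE compliance_analyses (id TEXT PRIMARY KEY, doc_name TEXT);
    CREATE TABLE compliance_findings (
        id TEXT PRIMARY KEY,
        analysis_id TEXT NOT NULL REFERENCES compliance_analyses(id) ON DELETE CASCADE,
        seq INTEGER NOT NULL,
        title TEXT
    );
    CREATE TABLE finding_chat_messages (
        id TEXT PRIMARY KEY,
        analysis_id TEXT NOT NULL REFERENCES compliance_analyses(id) ON DELETE CASCADE,
        finding_id TEXT NOT NULL REFERENCES compliance_findings(id) ON DELETE CASCADE,
        role TEXT NOT NULL,
        content TEXT NOT NULL
    );
    """
)


def save_analysis(doc_name: str, titles: list[str]) -> tuple[str, list[str]]:
    """Insert one analysis plus its findings; return (analysis_id, finding_ids)."""
    analysis_id = str(uuid.uuid4())
    conn.execute("INSERT INTO compliance_analyses VALUES (?, ?)", (analysis_id, doc_name))
    finding_ids = [str(uuid.uuid4()) for _ in titles]
    rows = [(fid, analysis_id, seq, title)
            for seq, (fid, title) in enumerate(zip(finding_ids, titles), start=1)]
    conn.executemany("INSERT INTO compliance_findings VALUES (?, ?, ?, ?)", rows)
    return analysis_id, finding_ids


def count(table: str) -> int:
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


analysis_id, finding_ids = save_analysis("doc.pdf", ["Example finding 1", "Example finding 2"])
conn.execute(
    "INSERT INTO finding_chat_messages VALUES (?, ?, ?, ?, ?)",
    (str(uuid.uuid4()), analysis_id, finding_ids[0], "user", "How do I fix this?"),
)
before = (count("compliance_findings"), count("finding_chat_messages"))

# A single DELETE on the parent row removes findings and chat messages via the FKs.
conn.execute("DELETE FROM compliance_analyses WHERE id = ?", (analysis_id,))
after = (count("compliance_findings"), count("finding_chat_messages"))
```

In PostgreSQL the foreign keys are enforced by default, so the `PRAGMA` line has no counterpart there.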
### B3: Auto-save Hook
In the SSE generator in `compliance.py`, after the `done` event is assembled:
```python
# After yielding the done event
# (needs `from datetime import datetime, timezone` at module level)
if repo is not None:
    record = AnalysisRecord(
        id="",  # will be assigned by DB
        created_at=datetime.now(timezone.utc),  # timezone-aware, matches TIMESTAMPTZ
        created_by=current_user,
        doc_name=doc_name,
        standard_name=standard_name,
        risk_score=done_payload["risk_score"],
        conclusion=done_payload["conclusion"],
        actions=done_payload["actions"],
        para_text=done_payload["para_text"],
        highlight_terms=done_payload["highlight_terms"],
        findings=[FindingRecord(...) for f in accumulated_findings],
    )
    analysis_id = await asyncio.to_thread(repo.save_analysis, record)
    # Emit an extra SSE event so frontend receives the analysis_id
    yield f"data: {json.dumps({'type': 'saved', 'analysis_id': analysis_id})}\n\n"
```
### B4: New API Endpoints
Added to `backend/app/api/routes/compliance.py`:
```
GET /api/v1/compliance/history
  Query params: limit=20&offset=0
  Response: [{id, created_at, doc_name, standard_name, risk_score, finding_count}]
GET /api/v1/compliance/history/{analysis_id}
  Response: full AnalysisRecord including findings list
DELETE /api/v1/compliance/history/{analysis_id}
  Response: 204 No Content
GET /api/v1/compliance/history/{analysis_id}/download
  Response: DOCX file (application/vnd.openxmlformats-officedocument.wordprocessingml.document)
```
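The list endpoint returns a summary rather than the full record. A minimal sketch of that projection, assuming trimmed-down copies of the data classes and hypothetical helper names (`to_history_item`, `paginate`); the real handler would delegate ordering and paging to the repository's SQL.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class FindingRecord:  # trimmed copy, only what the example needs
    id: str
    seq: int
    title: str


@dataclass
class AnalysisRecord:  # trimmed copy, only what the example needs
    id: str
    created_at: datetime
    doc_name: str
    standard_name: str
    risk_score: int
    findings: list[FindingRecord] = field(default_factory=list)


def to_history_item(record: AnalysisRecord) -> dict:
    """Project a full record onto the summary shape returned by the history list."""
    return {
        "id": record.id,
        "created_at": record.created_at.isoformat(),
        "doc_name": record.doc_name,
        "standard_name": record.standard_name,
        "risk_score": record.risk_score,
        "finding_count": len(record.findings),
    }


def paginate(records: list[AnalysisRecord], limit: int = 20, offset: int = 0) -> list[dict]:
    """Newest first, then apply offset/limit as the query params describe."""
    ordered = sorted(records, key=lambda r: r.created_at, reverse=True)
    return [to_history_item(r) for r in ordered[offset:offset + limit]]


# Illustrative data; "Standard A" is a placeholder name.
records = [
    AnalysisRecord("a1", datetime(2026, 6, 7, tzinfo=timezone.utc), "csms.pdf", "Standard A", 15),
    AnalysisRecord("a2", datetime(2026, 6, 8, tzinfo=timezone.utc), "doc.pdf", "Standard A", 72,
                   findings=[FindingRecord("f1", 1, "Example finding")]),
]
page = paginate(records)
```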
### B5: DOCX Export
**New file:** `backend/app/infrastructure/compliance/docx_export.py`
Uses `python-docx` (already in requirements). Generates a structured report:
- Cover: document name, standard, date, risk score badge
- Executive summary: conclusion paragraph
- Findings table: seq / title / status / clause_ref / description
- Action items: numbered list
- Footer: generated by AI Regulation Analysis System
```python
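from io import BytesIO
from docx import Document
from app.domain.compliance.ports import AnalysisRecord

def generate_docx(record: AnalysisRecord) -> bytes:
    """Generate a DOCX compliance report and return as bytes."""
    doc = Document()
    # ... build document ...
    buf = BytesIO()
    doc.save(buf)  # python-docx accepts a file-like object as well as a path
    return buf.getvalue()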
```
### B6: Frontend — History Rail
`CompliancePage.tsx` gains a left rail (same layout pattern as RagChat's `history-pane`):
```
┌──────────────┬─────────────────────────────────┐
│ History │ Main Analysis Area │
│ ────────── │ │
│ 2026-06-08 │ (current analysis or loaded │
│ doc.pdf │ read-only historical record) │
│ ⚠ 72 [↓][×]│ │
│ ────────── │ │
│ 2026-06-07 │ │
│ csms.pdf │ │
│ ✓ 15 [↓][×]│ │
└──────────────┴─────────────────────────────────┘
```
- `[↓]` triggers `GET /history/{id}/download` and saves the DOCX file
- `[×]` shows a confirmation dialog, then calls `DELETE /history/{id}`
- Clicking a row loads that analysis into the main area in read-only mode
- `PageStateContext.ComplianceState` gains `analysisId: string | null` and `isReadOnly: boolean`
On mount, the rail calls `GET /history?limit=20` to populate the list. The list re-fetches after delete or after a new analysis completes (triggered by the `saved` SSE event).
### B — Files Changed
| File | Action |
|------|--------|
| `backend/app/domain/compliance/ports.py` | New — `ComplianceRepository` ABC + data classes |
| `backend/app/infrastructure/compliance/repository.py` | New — `PostgresComplianceRepository` |
| `backend/app/infrastructure/compliance/docx_export.py` | New — `generate_docx()` |
| `backend/app/api/routes/compliance.py` | Add history endpoints + auto-save hook |
| `backend/app/shared/bootstrap.py` | Register `PostgresComplianceRepository` |
| `frontend/src/pages/Compliance/CompliancePage.tsx` | Add History Rail |
| `frontend/src/contexts/PageStateContext.tsx` | Add `analysisId`, `isReadOnly` to `ComplianceState` |
---
## Direction C — Deep Chat
### C1: New Chat Endpoints
Finding-scoped endpoints supersede the existing `/compliance/chat/{segment_id}` endpoint, which stays in place for backward compatibility but is marked deprecated:
```
POST /api/v1/compliance/analyses/{analysis_id}/findings/{finding_id}/chat
  Body: {query: string}
  Response: SSE stream — chunk / done / error events
GET /api/v1/compliance/analyses/{analysis_id}/findings/{finding_id}/chat
  Response: [{id, role, content, created_at}]
POST /api/v1/compliance/analyses/{analysis_id}/findings/{finding_id}/suggestions
  Response: {questions: [string, string, string]}
```
### C2: Grounded Context Construction
New function in `pipeline.py`:
```python
def build_finding_context(finding: FindingRecord, analysis: AnalysisRecord) -> str:
    """
    Build a grounded system context string for a finding chat thread.
    Combines finding details with analysis metadata for LLM grounding.
    """
    return (
        f"Document: {analysis.doc_name}\n"
        f"Standard: {analysis.standard_name}\n"
        f"Finding [{finding.seq}]: {finding.title}\n"
        f"Status: {finding.status}\n"
        f"Clause reference: {finding.clause_ref or 'N/A'}\n"
        f"Description: {finding.description}\n"
        f"Overall conclusion: {analysis.conclusion}\n"
    )
```
This string is prepended to the system prompt for every chat call — replacing the fragile `segment_context` approach.
### C3: Multi-turn Context
Chat handler fetches existing messages from `finding_chat_messages` via `repo.get_messages(finding_id)` and prepends them to the LLM call as `[{"role": "user"/"assistant", "content": "..."}]` message history. Max history: 10 most recent messages (5 turns) to avoid token overflow.
After each LLM response, both the user message and assistant message are saved via `repo.save_message()`.
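The message assembly for a chat call might look like the sketch below. It assumes `get_messages` returns rows oldest-first with `role` and `content` keys, and it puts only the grounded context in the system message (the rest of the system prompt is left out); `build_chat_messages` is a hypothetical helper name.

```python
MAX_HISTORY = 10  # 10 most recent messages, i.e. 5 user/assistant turns


def build_chat_messages(context: str, history: list[dict], query: str) -> list[dict]:
    """Assemble system context, the most recent history, and the new user query."""
    recent = history[-MAX_HISTORY:]  # keep only the tail to bound token usage
    messages = [{"role": "system", "content": context}]
    # Copy only role/content so DB-only fields (id, created_at) never reach the LLM.
    messages += [{"role": m["role"], "content": m["content"]} for m in recent]
    messages.append({"role": "user", "content": query})
    return messages


# Illustrative stored thread: 7 turns = 14 messages, oldest first.
history = []
for turn in range(7):
    history.append({"id": f"u{turn}", "role": "user", "content": f"question {turn}"})
    history.append({"id": f"a{turn}", "role": "assistant", "content": f"answer {turn}"})

messages = build_chat_messages("Finding [1]: example context", history, "What should we do next?")
```

With 14 stored messages, the two oldest turns are dropped and the call carries 12 messages in total: one system message, ten history messages, and the new query.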
### C4: Suggestion Generation
New function in `pipeline.py`:
```python
SUGGESTION_PROMPTS = {
    "non_compliant": "Generate 3 questions focused on remediation steps and timeline.",
    "partial": "Generate 3 questions focused on identifying the compliance gap.",
    "compliant": "Generate 3 questions focused on maintaining and evidencing compliance.",
}

def generate_suggestions(finding: FindingRecord, analysis: AnalysisRecord, llm_client) -> list[str]:
    """
    Generate 3 context-aware follow-up questions for a finding chat thread.
    Returns a list of 3 question strings. Falls back to generic questions on error.
    """
    focus = SUGGESTION_PROMPTS.get(finding.status, SUGGESTION_PROMPTS["partial"])
    context = build_finding_context(finding, analysis)
    prompt = f"{context}\n\n{focus}\nReturn JSON: {{\"questions\": [\"...\", \"...\", \"...\"]}}"
    # ... call LLM, parse JSON, return list ...
    # Fallback on error:
    return ["What are the specific requirements?", "What is the remediation timeline?", "Which regulation clause applies?"]
```
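The elided "parse JSON, return list" step could be handled along these lines. This is a sketch under assumptions: `parse_suggestions` is a hypothetical helper, and treating any response with fewer than three usable questions as an error is one possible policy, not something the design prescribes.

```python
import json
import re

FALLBACK_QUESTIONS = [
    "What are the specific requirements?",
    "What is the remediation timeline?",
    "Which regulation clause applies?",
]


def parse_suggestions(raw: str) -> list[str]:
    """Return exactly 3 questions from the LLM response, or the generic fallback."""
    try:
        # Same fence-stripping idea as _extract_json in A2.
        cleaned = re.sub(r"^```(?:json)?\s*|\s*```$", "", raw.strip(), flags=re.MULTILINE)
        questions = json.loads(cleaned)["questions"]
    except (ValueError, KeyError, TypeError):
        return list(FALLBACK_QUESTIONS)
    if not isinstance(questions, list):
        return list(FALLBACK_QUESTIONS)
    valid = [q.strip() for q in questions if isinstance(q, str) and q.strip()]
    return valid[:3] if len(valid) >= 3 else list(FALLBACK_QUESTIONS)


good = parse_suggestions('```json\n{"questions": ["Q1?", "Q2?", "Q3?", "Q4?"]}\n```')
short = parse_suggestions('{"questions": ["Only one?"]}')
garbage = parse_suggestions("Sorry, I cannot help with that.")
```

Returning a copy of the fallback list keeps callers from mutating the shared constant.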
### C5: Frontend — Finding Chat Drawer
New component: `frontend/src/pages/Compliance/FindingChatDrawer.tsx`
Drawer slides in from the right (CSS: `position: fixed; right: 0; width: 420px`), reusing existing CSS variables (`--surface`, `--border`, `--accent`).
Structure:
- Header: finding title + close button
- Suggestions section: 3 chip buttons (only shown before first user message; hidden after)
- Message list: scrollable, same bubble style as RagChat
- Composer: textarea + send button, same pattern as RagChat composer
State managed in `PageStateContext.ComplianceState`:
- `activeFindingId: string | null` — which finding's drawer is open
- Drawer open/close controlled by `activeFindingId !== null`
On open:
1. `GET /analyses/{id}/findings/{fid}/chat` → restore history
2. If history is empty: `POST /analyses/{id}/findings/{fid}/suggestions` → show chips
Each finding card in `CompliancePage.tsx` gains a `💬 Chat` button that sets `activeFindingId`.
### C — Files Changed
| File | Action |
|------|--------|
| `backend/app/api/routes/compliance.py` | Add 3 new finding-chat endpoints |
| `backend/app/application/compliance/pipeline.py` | Add `build_finding_context`, `generate_suggestions` |
| `backend/app/infrastructure/compliance/repository.py` | Add `save_message`, `get_messages` (already in port) |
| `frontend/src/pages/Compliance/FindingChatDrawer.tsx` | New component |
| `frontend/src/pages/Compliance/CompliancePage.tsx` | Add Chat button to finding cards, render drawer |
| `frontend/src/contexts/PageStateContext.tsx` | Add `activeFindingId` to `ComplianceState` |
---
## Implementation Order
Direction A must be completed first (parallel processing changes the route handler that B's auto-save hook attaches to). B must be completed before C (C's FK references require B's tables and repository).
```
A (parallel + bug fixes + reranker stub)
└→ B (schema migration + history + DOCX)
└→ C (finding chat + suggestions)
```
---
## Non-Goals
- PDF export (DOCX only; users convert via Word/WPS)
- Cross-encoder reranking (stub reserved, not implemented)
- Scheduled/automatic crawling
- User-level history isolation (all users share history — global visibility)
- Prompt version management or A/B testing
---
## Constraints
- Backend comments and docstrings: English only
- No new top-level libraries beyond those already in `requirements.txt` (`tenacity`, `python-docx`, `psycopg2-binary` are all present)
- `DOCUMENT_REPOSITORY_BACKEND=postgres` → `PostgresComplianceRepository`; any other value → raise `NotImplementedError` with a clear message (no mock fallback for compliance history)
- Git commits are made by the user, never automated
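
The backend-selection constraint above could be wired in the composition root roughly as follows. This is a sketch only: the factory name `build_compliance_repository` is hypothetical, the repository class is reduced to a placeholder, and reading `DATABASE_URL` from a plain mapping stands in for the project's settings object.

```python
class PostgresComplianceRepository:
    """Placeholder for the real repository; it only stores the connection string."""

    def __init__(self, dsn: str) -> None:
        self.dsn = dsn


def build_compliance_repository(env: dict[str, str]) -> PostgresComplianceRepository:
    """Return the Postgres repository, or fail loudly for any other backend."""
    backend = env.get("DOCUMENT_REPOSITORY_BACKEND", "")
    if backend == "postgres":
        return PostgresComplianceRepository(env["DATABASE_URL"])
    # No mock fallback: compliance history is only defined for PostgreSQL.
    raise NotImplementedError(
        f"Compliance history requires DOCUMENT_REPOSITORY_BACKEND=postgres, got {backend!r}"
    )


repo = build_compliance_repository(
    {"DOCUMENT_REPOSITORY_BACKEND": "postgres", "DATABASE_URL": "postgresql://localhost/example"}
)
```

In the application the mapping would typically be `os.environ` or the settings module; failing at startup makes a misconfigured backend visible before the first analysis is run.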


@@ -0,0 +1,421 @@
# Internationalisation (i18n) Design — Frontend Chinese/English Toggle
**Date:** 2026-06-08
**Scope:** UI framework strings only (nav labels, button labels, status messages, placeholders). Mock data, API-returned content, and domain regulation text are explicitly excluded.
---
## Goals
Add a language toggle button (EN ↔ 中) in the Sidebar footer, immediately left of the existing theme-toggle button, so users can switch the UI between English and Simplified Chinese. Default language is English on every page load; preference is not persisted across sessions.
---
## Architecture
### Approach
Custom `LanguageContext` following the same pattern as the existing `ThemeContext`. No external library dependencies. Translation strings live in two TypeScript modules (`locales/en.ts` and `locales/zh.ts`) that export identical-shape objects.
### Layering
```
src/
├── contexts/
│   └── LanguageContext.tsx   # type Lang, LanguageProvider, useLanguage()
└── locales/
    ├── en.ts                 # English translations (default)
    └── zh.ts                 # Simplified Chinese translations
```
`LanguageProvider` wraps the entire app in `App.tsx` — outermost provider so every component can consume it.
### Context interface
```ts
type Lang = 'en' | 'zh';

interface LanguageContextValue {
  lang: Lang;
  t: Translations; // typed translation object
  toggleLang: () => void;
}
```
`useState<Lang>('en')` — hardcoded default, no localStorage read on mount.
### Translation object shape (both files export `Translations`)
```ts
export interface Translations {
  nav: {
    groupMain: string;
    groupWorkbench: string;
    groupChat: string;
    overview: string;
    signals: string;
    status: string;
    documents: string;
    compliance: string;
    chat: string;
  };
  sidebar: {
    toggleTheme: string;
    toggleLang: string;
    signOut: string;
  };
  overview: {
    eyebrow: string;
    heroTitle: string;
    heroDesc: string;
    openDashboard: string;
    jumpToChat: string;
    sectionHowItWorks: string;
    sectionScreens: string;
    stepUpload: string; stepUploadDesc: string;
    stepProcess: string; stepProcessDesc: string;
    stepMonitor: string; stepMonitorDesc: string;
    stepAnalyze: string; stepAnalyzeDesc: string;
    stepReview: string; stepReviewDesc: string;
    stepChat: string; stepChatDesc: string;
    statScreens: string;
    statFlows: string;
    statReviewPosture: string;
    navLiveHealth: string;
    navRegulatoryChanges: string;
    navUploadDocs: string;
    navComplianceWorkspace: string;
    navChatCited: string;
    navKPIs: string;
  };
  signals: {
    topbarTitle: string;
    topbarSub: string;
    searchPlaceholder: string;
    refreshBtn: string;
    crawlingBtn: string;
    statTotal: string;
    statHigh: string;
    statMedium: string;
    statLast90: string;
    badgeFinal: string;
    badgeDraft: string;
    badgeUrgent: string;
    badgePublished: string;
    emptySelectSignal: string;
    runAnalysis: string;
    stopBtn: string;
    sourceLink: string;
    tabOverview: string;
    tabObligations: string;
    tabImpact: string;
    tabChanges: string;
    cardScopeHeader: string;
    cardObligationsHeader: string;
    obligationsEmpty: string;
    colObligationDesc: string;
    colSubject: string;
    colType: string;
    colDeadline: string;
    deadlinePending: string;
    cardAffectedDocs: string;
    noAffectedDocs: string;
    cardAIImpact: string;
    footerText: string;
    statusConnecting: string;
    statusNoStream: string;
    statusCrawling: string;
    statusProcessing: string;
    statusComplete: string;
    statusUpdateComplete: string;
    statusError: string;
    statusConnFailed: string;
  };
  status: {
    topbarTitle: string;
    searchPlaceholder: string;
    exportBtn: string;
    refreshBtn: string;
    newUploadBtn: string;
    statTotal: string;
    statIndexed: string;
    statFailed: string;
    statChunks: string;
    statCoverage: string;
    cardHealth: string;
    badgeOnline: string;
    badgeError: string;
    badgeDegraded: string;
    badgeUnknown: string;
    healthEndpointError: string;
    serviceEnabled: string;
    serviceDisabled: string;
    serviceNotLoaded: string;
    cardConfig: string;
    labelLLMProvider: string;
    labelLLMModel: string;
    labelEmbeddingModel: string;
    labelEmbeddingDim: string;
    labelMilvusCollection: string;
    labelParserBackend: string;
    labelChunkBackend: string;
    labelParserFailureMode: string;
    configLoadError: string;
    cardBreakdown: string;
    breakdownIndexed: string;
    breakdownProcessing: string;
    breakdownFailed: string;
    cardRuntime: string;
    labelActiveSessions: string;
    labelSessionCapacity: string;
    labelReranker: string;
    labelBM25: string;
    statusActive: string;
    statusUnavailable: string;
    footerAllOk: string;
    footerDegraded: string;
    footerChecking: string;
  };
  docs: {
    topbarTitle: string;
    searchPlaceholder: string;
    refreshBtn: string;
    uploadBtn: string;
    confirmDeleteTitle: string;
    cancelBtn: string;
    deleteBtn: string;
    filterAll: string;
    filterReady: string;
    filterProcessing: string;
    filterFailed: string;
    filterPending: string;
    filterAllTypes: string;
    selectedCount: string; // '{n} document(s) selected'; use {n} placeholder
    deleteSelected: string;
    colName: string;
    colStatus: string;
    colUploaded: string;
    colChunks: string;
    colSize: string;
    colType: string;
    colActions: string;
    loading: string;
    emptyNoDocuments: string;
    emptyNoMatch: string;
    footerCount: string; // '{n} of {m} document(s)'
    titleDownload: string;
    titleRetry: string;
    titleDelete: string;
    confirmSingle: string; // '{name}' placeholder
    confirmBatch: string; // '{n}' placeholder
  };
  compliance: {
    topbarTitle: string;
    searchPlaceholder: string;
    clearBtn: string;
    exportBtn: string;
    exportJSON: string;
    exportText: string;
    newAnalysisBtn: string;
    analyzeBtn: string; // referenced by the locale files ('Analyze')
    statusAnalyzing: string;
    statusComplete: string;
    statusError: string;
    emptyTitle: string;
    emptyDesc: string;
    colRetrieved: string; // 'Retrieved Regulations {count}'
    retrievingMsg: string;
    defaultRegulation: string;
    matchSuffix: string;
    colParagraph: string;
    extractingMsg: string;
    noTextExtracted: string;
    stagesHeader: string;
    stageExtraction: string;
    stageClauseSplit: string;
    stageRetrieval: string;
    stageSynthesis: string;
    colFindings: string; // 'Findings {count}'
    gapInProgress: string;
    askAIBtn: string;
    chatBtn: string;
    conclusionHeader: string;
    riskScoreTooltip: string;
    statusCovered: string;
    statusGap: string;
    statusCritical: string;
    statusInfo: string;
    sourceTypePasted: string;
    sourceTypeIndexed: string;
    sourceTypeUploaded: string;
    chatSidebarHeader: string;
    chatThinking: string;
    quickQ1: string;
    quickQ2: string;
    quickQ3: string;
    chatPlaceholder: string;
    sendBtn: string;
    analysisFailed: string;
    exportReportHeader: string;
    exportSectionParagraph: string;
    exportSectionFindings: string;
    exportSectionConclusion: string;
    exportSectionActions: string;
    historyHeader: string;
    downloadReport: string;
    historyEmpty: string;
    historyDeleteConfirm: string;
    drawerClose: string;
    drawerChatEmpty: string;
    drawerSuggestionsHeader: string;
  };
  ragchat: {
    topbarTitle: string;
    exportBtn: string;
    quickPromptsHeader: string;
    inputPlaceholder: string;
    citationsHeader: string; // 'Sources {count}'
    citationsEmpty: string;
    jumpToSource: string; // 'Jump to source [N]'
    apiError: string;
    quickPrompt1: string;
    quickPrompt2: string;
    quickPrompt3: string;
    quickPrompt4: string;
  };
}
```
---
## Language Toggle Button
Location: `Sidebar.tsx` footer `<div style={{ display: 'flex', gap: 4 }}>`.
Inserted **left of** the existing theme button:
```tsx
<button className="theme-btn" onClick={toggleLang} title={t.sidebar.toggleLang}>
  {lang === 'en' ? 'EN' : '中'}
</button>
```
- Reuses existing `theme-btn` CSS class — no new styles needed.
- Displays a short label for the current language: `EN` or `中`.
- `title` attribute (tooltip) translates with the rest of the UI.
---
## Translation Files (representative values)
### `locales/en.ts` (English — default)
Key values (representative; full file contains all keys above):
```ts
nav: { groupMain: 'Main', groupWorkbench: 'Workbench', groupChat: 'Chat',
overview: 'Overview', signals: 'Regulatory Signals', status: 'System Status',
documents: 'Documents', compliance: 'Compliance Analysis', chat: 'Regulation Q&A' },
sidebar: { toggleTheme: 'Toggle theme', toggleLang: 'Switch language', signOut: 'Sign out' },
signals: { refreshBtn: 'Refresh Sources', crawlingBtn: 'Crawling...', ... },
docs: { uploadBtn: 'Upload document', deleteBtn: 'Delete', cancelBtn: 'Cancel', ... },
compliance: { newAnalysisBtn: 'New analysis', analyzeBtn: 'Analyze', sendBtn: 'Send', ... },
ragchat: { exportBtn: 'Export chat', inputPlaceholder: 'Ask about your regulations…', ... },
```
### `locales/zh.ts` (Simplified Chinese)
Key values:
```ts
nav: { groupMain: '主菜单', groupWorkbench: '工作台', groupChat: '对话',
overview: '概览', signals: '法规信号', status: '系统状态',
documents: '文档管理', compliance: '合规分析', chat: '法规问答' },
sidebar: { toggleTheme: '切换主题', toggleLang: '切换语言', signOut: '退出' },
signals: { refreshBtn: '刷新数据源', crawlingBtn: '抓取中...', ... },
docs: { uploadBtn: '上传文档', deleteBtn: '删除', cancelBtn: '取消', ... },
compliance: { newAnalysisBtn: '新建分析', analyzeBtn: '开始分析', sendBtn: '发送', ... },
ragchat: { exportBtn: '导出对话', inputPlaceholder: '请输入关于法规的问题…', ... },
```
---
## App.tsx Provider Wrapping
```tsx
// Before
<ThemeProvider>
  <AuthProvider>
    <PageStateProvider>
      <AppRouter />
    </PageStateProvider>
  </AuthProvider>
</ThemeProvider>

// After
<LanguageProvider>
  <ThemeProvider>
    <AuthProvider>
      <PageStateProvider>
        <AppRouter />
      </PageStateProvider>
    </AuthProvider>
  </ThemeProvider>
</LanguageProvider>
```
`LanguageProvider` is outermost so it is available to all components including the theme toggle itself.
---
## Usage in Components
```tsx
import { useLanguage } from '../../contexts/LanguageContext';

function MyComponent() {
  const { t } = useLanguage();
  return <button>{t.docs.uploadBtn}</button>;
}
```
No wrapping needed — `t` is always the correct object for the current language.
---
## Files Changed
| File | Action |
|------|--------|
| `src/contexts/LanguageContext.tsx` | New — `LanguageProvider`, `useLanguage()`, `Lang` type |
| `src/locales/en.ts` | New — complete English `Translations` object |
| `src/locales/zh.ts` | New — complete Chinese `Translations` object |
| `src/App.tsx` | Add `<LanguageProvider>` wrapper |
| `src/components/layout/Sidebar.tsx` | Add language toggle button; replace nav group titles and labels with `t.nav.*` |
| `src/pages/Overview/OverviewPage.tsx` | Replace all UI strings with `t.overview.*` |
| `src/pages/Perception/PerceptionPage.tsx` | Replace all UI strings with `t.signals.*` |
| `src/pages/Status/StatusPage.tsx` | Replace all UI strings with `t.status.*` |
| `src/pages/Docs/DocsPage.tsx` | Replace all UI strings with `t.docs.*` |
| `src/pages/Compliance/CompliancePage.tsx` | Replace all UI strings with `t.compliance.*` |
| `src/pages/RagChat/RagChatPage.tsx` | Replace all UI strings with `t.ragchat.*` |
| `src/pages/Compliance/HistoryRail.tsx` | Replace UI strings with `t.compliance.*` |
| `src/pages/Compliance/FindingChatDrawer.tsx` | Replace UI strings with `t.compliance.*` |
---
## Non-Goals
- Persistence across sessions (no localStorage for language preference)
- More than two languages
- RTL layout support
- Pluralisation helpers (simple string substitution with `{n}` placeholders is sufficient — callers replace via `t.docs.selectedCount.replace('{n}', String(count))`)
- Translation of API-returned content, mock data, regulation names, or document file names
- Date/number formatting localisation
---
## Constraints
- Zero new npm dependencies
- Follow existing `ThemeContext` pattern exactly
- Backend comments/docstrings: English only (no backend changes in this feature)
- Git commits made by the user, never automated