update for 1. optimization 2. Chinese/English switching

2026-06-10 11:10:36 +08:00
parent e7963b267e
commit 9212747e1b
42 changed files with 7866 additions and 278 deletions
--- a/docs/superpowers/specs/2026-06-08-compliance-enhancement-design.md
+++ b/docs/superpowers/specs/2026-06-08-compliance-enhancement-design.md
@@ -0,0 +1,459 @@
+# Compliance Analysis Enhancement Design
+
+**Date:** 2026-06-08
+**Directions:** A (Analysis Quality) + B (History & Reports) + C (Deep Chat)
+**Approach:** Three independent but coordinated feature sets sharing one DB schema (method one / structured tables).
+
+---
+
+## Goals
+
+1. **A — Analysis Quality:** Parallel clause processing (3-5× speed), fix `highlight_terms` bug (always returns empty), add LLM retry with tenacity, reserve `PassThroughReranker` for future cross-encoder work.
+2. **B — Analysis History & Reports:** Auto-save every completed analysis to PostgreSQL, history rail in UI, per-record DOCX export, delete with confirmation.
+3. **C — Deep Chat:** Per-finding persistent chat threads grounded in real retrieved text, LLM-generated suggestion questions, multi-turn memory.
+
+---
+
+## Architecture Overview
+
+### Layering Rules (must not be violated)
+
+```
+api/routes/         →  thin HTTP handlers, SSE generators only
+application/        →  orchestration logic (pipeline.py)
+domain/ports/       →  ABCs, no implementation
+infrastructure/     →  DB, docx, external calls
+shared/bootstrap.py →  composition root, wires everything
+```
+
+New business logic goes in `application/compliance/pipeline.py` and domain ports. Never in `services/*` or `workflows/*`.
+
+### Shared Database Schema (B + C)
+
+Three tables, created together so C's FK references are valid from day one:
+
+```sql
+CREATE TABLE compliance_analyses (
+    id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+    created_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
+    created_by      VARCHAR(255),
+    doc_name        VARCHAR(500),
+    standard_name   VARCHAR(500),
+    risk_score      INTEGER,
+    conclusion      TEXT,
+    actions         JSONB,
+    para_text       TEXT,
+    highlight_terms JSONB
+);
+
+CREATE TABLE compliance_findings (
+    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+    analysis_id UUID NOT NULL REFERENCES compliance_analyses(id) ON DELETE CASCADE,
+    seq         INTEGER NOT NULL,
+    title       VARCHAR(500),
+    description TEXT,
+    status      VARCHAR(50),
+    clause_ref  VARCHAR(200)
+);
+
+CREATE TABLE finding_chat_messages (
+    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+    analysis_id UUID NOT NULL REFERENCES compliance_analyses(id) ON DELETE CASCADE,
+    finding_id  UUID NOT NULL REFERENCES compliance_findings(id) ON DELETE CASCADE,
+    role        VARCHAR(20) NOT NULL,  -- 'user' | 'assistant'
+    content     TEXT NOT NULL,
+    created_at  TIMESTAMPTZ NOT NULL DEFAULT now()
+);
+```
+
+---
+
+## Direction A — Analysis Quality
+
+### A1: Parallel Clause Processing
+
+**Current:** The route handler runs a sequential `for i, clause in enumerate(clauses)` loop. Each iteration awaits `retrieve_for_clause()` and then `check_clause_compliance()` through `asyncio.to_thread`, so clauses are processed one at a time.
+
+**Change:** Extract a `process_single_clause(clause, idx, ...) -> dict` function in `pipeline.py`, then replace the loop with `asyncio.gather`:
+
+```python
+import asyncio
+
+
+async def run_clauses_parallel(clauses, retrieval_svc, llm_client, standard_name, para_text):
+    tasks = [
+        asyncio.to_thread(process_single_clause, clause, i, retrieval_svc, llm_client, standard_name, para_text)
+        for i, clause in enumerate(clauses)
+    ]
+    return await asyncio.gather(*tasks, return_exceptions=True)
+```
+
+Results are yielded to the SSE stream in original order. Exceptions from individual clauses are caught and emitted as `{type: "error", clause_index: i}` events rather than crashing the whole stream.
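
The ordering and error-isolation behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the project's code: `process_single_clause` is a stand-in that fails on an empty clause, and `to_sse_events`, `run_demo`, and the `result` event type are hypothetical names.

```python
import asyncio
import json


def process_single_clause(clause: str, idx: int) -> dict:
    """Stand-in for the real per-clause work (retrieval + LLM check)."""
    if not clause:
        raise ValueError("empty clause")
    return {"clause_index": idx, "status": "checked"}


def to_sse_events(results: list) -> list:
    """Turn gather() results into SSE lines, preserving clause order."""
    events = []
    for i, result in enumerate(results):
        if isinstance(result, Exception):
            # A failed clause becomes an error event instead of aborting the stream.
            payload = {"type": "error", "clause_index": i}
        else:
            payload = {"type": "result", **result}  # "result" is a hypothetical event type
        events.append(f"data: {json.dumps(payload)}\n\n")
    return events


async def run_demo(clauses: list) -> list:
    tasks = [asyncio.to_thread(process_single_clause, c, i) for i, c in enumerate(clauses)]
    # return_exceptions=True returns exceptions in place; order matches `tasks`.
    results = await asyncio.gather(*tasks, return_exceptions=True)
    return to_sse_events(results)


events = asyncio.run(run_demo(["clause one", "", "clause three"]))
```

Because `asyncio.gather` returns results in task order regardless of completion order, no extra sorting step is needed before emitting events.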
+
+### A2: Fix highlight_terms
+
+**Root cause:** `synthesize_conclusion()` passes the LLM response through `json.loads()` but the LLM often wraps output in markdown fences (` ```json ... ``` `), causing a parse failure and silent fallback to `[]`.
+
+**Fix in `pipeline.py`:**
+
+```python
+import json
+import re
+
+def _extract_json(text: str) -> dict:
+    """Strip markdown fences then parse JSON. Raises ValueError on failure."""
+    cleaned = re.sub(r"^```(?:json)?\s*|\s*```$", "", text.strip(), flags=re.MULTILINE)
+    return json.loads(cleaned)
+```
+
+Apply `_extract_json` in `synthesize_conclusion()` instead of bare `json.loads`. Wrap with `@retry` (see A3) so transient parse failures get a second attempt.
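
As a quick check of the behaviour described above, the helper can be exercised on a fenced reply, a bare reply, and a non-JSON reply. The sample strings are invented for illustration, and the pattern is rebuilt from a `FENCE` constant (an assumption of this sketch) so the example itself contains no literal fence; it is otherwise the same regular expression.

```python
import json
import re

FENCE = "`" * 3  # built at runtime so this example contains no literal fence


def _extract_json(text: str) -> dict:
    """Strip markdown fences then parse JSON. Raises ValueError on failure."""
    pattern = rf"^{FENCE}(?:json)?\s*|\s*{FENCE}$"
    cleaned = re.sub(pattern, "", text.strip(), flags=re.MULTILINE)
    return json.loads(cleaned)


# Invented sample replies: one wrapped in a json fence, one bare.
fenced_reply = f'{FENCE}json\n{{"highlight_terms": ["audit log"]}}\n{FENCE}'
bare_reply = '{"highlight_terms": []}'

fenced_result = _extract_json(fenced_reply)
bare_result = _extract_json(bare_reply)

try:
    _extract_json("Sorry, I cannot answer that.")
    parse_failed = False
except ValueError:
    # json.JSONDecodeError subclasses ValueError, so a ValueError-based
    # retry predicate also covers parse failures.
    parse_failed = True
```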
+
+### A3: LLM Retry with tenacity
+
+`tenacity` is already in `requirements.txt` but unused. Add to all LLM calls in `pipeline.py`:
+
+```python
+import httpx
+from tenacity import retry, stop_after_attempt, wait_exponential, retry_if_exception_type
+
+@retry(
+    stop=stop_after_attempt(3),
+    wait=wait_exponential(multiplier=1, min=1, max=4),
+    retry=retry_if_exception_type((httpx.HTTPError, ValueError)),
+    reraise=True,
+)
+def _call_llm_with_retry(client, prompt: str) -> str:
+    """Call LLM and return raw text. Retries on HTTP errors and JSON parse failures."""
+    ...
+```
+
+On final failure, the calling function catches and emits `{type: "error", text: "LLM call failed after 3 attempts"}` to the SSE stream.
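
A minimal sketch of the caller side, assuming the retry wrapper has already exhausted its attempts. To stay dependency-free, `LLMTransportError` is a hypothetical stand-in for `httpx.HTTPError`, the `_call_llm_with_retry` body below simply raises, and `llm_text_or_error_event` is an illustrative name.

```python
import json
from typing import Optional, Tuple


class LLMTransportError(Exception):
    """Hypothetical stand-in for httpx.HTTPError so the sketch needs no third-party import."""


def _call_llm_with_retry(client, prompt: str) -> str:
    """Stand-in: behaves like the decorated function after its final failed attempt."""
    raise LLMTransportError("upstream timeout")


def llm_text_or_error_event(client, prompt: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (text, None) on success or (None, sse_error_line) once retries are exhausted."""
    try:
        return _call_llm_with_retry(client, prompt), None
    except (LLMTransportError, ValueError):
        # With reraise=True, tenacity re-raises the last underlying exception,
        # so the caller catches the original types rather than RetryError.
        payload = {"type": "error", "text": "LLM call failed after 3 attempts"}
        return None, f"data: {json.dumps(payload)}\n\n"


text, error_event = llm_text_or_error_event(client=None, prompt="check clause 4.2")
```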
+
+### A4: PassThroughReranker (future-ready stub)
+
+`domain/retrieval/ports.py` already defines a `Reranker` ABC. Add the no-op implementation:
+
+**New file:** `backend/app/infrastructure/retrieval/reranker.py`
+
+```python
+from app.domain.retrieval.ports import Reranker, RetrievedChunk
+
+class PassThroughReranker(Reranker):
+    """No-op reranker. Replace with CrossEncoderReranker when a local model is available."""
+
+    def rerank(self, query: str, chunks: list[RetrievedChunk], top_k: int) -> list[RetrievedChunk]:
+        return chunks[:top_k]
+```
+
+Register in `shared/bootstrap.py` as the default `Reranker` implementation.
+
+### A — Files Changed
+
+| File | Action |
+|------|--------|
+| `backend/app/application/compliance/pipeline.py` | Add `process_single_clause`, `run_clauses_parallel`, `_extract_json`, `_call_llm_with_retry` |
+| `backend/app/api/routes/compliance.py` | Replace sequential loop with `await run_clauses_parallel(...)` |
+| `backend/app/infrastructure/retrieval/reranker.py` | New — `PassThroughReranker` |
+| `backend/app/shared/bootstrap.py` | Register `PassThroughReranker` |
+
+---
+
+## Direction B — History & Reports
+
+### B1: Domain Port
+
+**New file:** `backend/app/domain/compliance/ports.py`
+
+```python
+from abc import ABC, abstractmethod
+from dataclasses import dataclass, field
+from datetime import datetime
+from typing import Optional
+
+@dataclass
+class FindingRecord:
+    id: str
+    analysis_id: str
+    seq: int
+    title: str
+    description: str
+    status: str
+    clause_ref: Optional[str] = None
+
+@dataclass
+class AnalysisRecord:
+    id: str
+    created_at: datetime
+    created_by: Optional[str]
+    doc_name: str
+    standard_name: str
+    risk_score: int
+    conclusion: str
+    actions: list
+    para_text: str
+    highlight_terms: list
+    findings: list[FindingRecord] = field(default_factory=list)
+
+class ComplianceRepository(ABC):
+    @abstractmethod
+    def save_analysis(self, record: AnalysisRecord) -> str: ...
+    @abstractmethod
+    def list_analyses(self, limit: int = 50, offset: int = 0) -> list[AnalysisRecord]: ...
+    @abstractmethod
+    def get_analysis(self, analysis_id: str) -> Optional[AnalysisRecord]: ...
+    @abstractmethod
+    def delete_analysis(self, analysis_id: str) -> None: ...
+    @abstractmethod
+    def save_message(self, analysis_id: str, finding_id: str, role: str, content: str) -> str: ...
+    @abstractmethod
+    def get_messages(self, finding_id: str) -> list[dict]: ...
+```
+
+### B2: PostgresComplianceRepository
+
+**New file:** `backend/app/infrastructure/compliance/repository.py`
+
+Implements `ComplianceRepository` using `psycopg2` (already in requirements). Connection string from `settings.DATABASE_URL`. Key methods:
+
+- `save_analysis`: INSERT into `compliance_analyses`, then bulk INSERT findings into `compliance_findings`, return `analysis_id` (UUID string).
+- `list_analyses`: SELECT with JOIN on findings count, ORDER BY `created_at DESC`, supports limit/offset.
+- `get_analysis`: SELECT analysis + all findings by `analysis_id`.
+- `delete_analysis`: DELETE cascades to findings and chat messages via FK.
+- `save_message` / `get_messages`: INSERT/SELECT on `finding_chat_messages`.
+
+Uses a connection pool (simple `psycopg2.pool.ThreadedConnectionPool`, min=1, max=5).
+
+### B3: Auto-save Hook
+
+In the SSE generator in `compliance.py`, after the `done` event is assembled:
+
+```python
+# After yielding the done event
+if repo is not None:
+    record = AnalysisRecord(
+        id="",  # will be assigned by DB
+        created_at=datetime.now(timezone.utc),  # timezone-aware to match TIMESTAMPTZ; utcnow() is naive and deprecated since Python 3.12
+        created_by=current_user,
+        doc_name=doc_name,
+        standard_name=standard_name,
+        risk_score=done_payload["risk_score"],
+        conclusion=done_payload["conclusion"],
+        actions=done_payload["actions"],
+        para_text=done_payload["para_text"],
+        highlight_terms=done_payload["highlight_terms"],
+        findings=[FindingRecord(...) for f in accumulated_findings],
+    )
+    analysis_id = await asyncio.to_thread(repo.save_analysis, record)
+    # Emit an extra SSE event so frontend receives the analysis_id
+    yield f"data: {json.dumps({'type': 'saved', 'analysis_id': analysis_id})}\n\n"
+```
+
+### B4: New API Endpoints
+
+Added to `backend/app/api/routes/compliance.py`:
+
+```
+GET    /api/v1/compliance/history
+       Query params: limit=20&offset=0
+       Response: [{id, created_at, doc_name, standard_name, risk_score, finding_count}]
+
+GET    /api/v1/compliance/history/{analysis_id}
+       Response: full AnalysisRecord including findings list
+
+DELETE /api/v1/compliance/history/{analysis_id}
+       Response: 204 No Content
+
+GET    /api/v1/compliance/history/{analysis_id}/download
+       Response: DOCX file (application/vnd.openxmlformats-officedocument.wordprocessingml.document)
+```
+
+### B5: DOCX Export
+
+**New file:** `backend/app/infrastructure/compliance/docx_export.py`
+
+Uses `python-docx` (already in requirements). Generates a structured report:
+
+- Cover: document name, standard, date, risk score badge
+- Executive summary: conclusion paragraph
+- Findings table: seq / title / status / clause_ref / description
+- Action items: numbered list
+- Footer: generated by AI Regulation Analysis System
+
+```python
+def generate_docx(record: AnalysisRecord) -> bytes:
+    """Generate a DOCX compliance report and return as bytes."""
+    doc = Document()
+    # ... build document ...
+    buf = BytesIO()
+    doc.save(buf)
+    return buf.getvalue()
+```
+
+### B6: Frontend — History Rail
+
+`CompliancePage.tsx` gains a left rail (same layout pattern as RagChat's `history-pane`):
+
+```
+┌──────────────┬─────────────────────────────────┐
+│ History      │  Main Analysis Area              │
+│ ──────────   │                                  │
+│ 2026-06-08   │  (current analysis or loaded     │
+│ doc.pdf      │   read-only historical record)   │
+│ ⚠ 72  [↓][×]│                                  │
+│ ──────────   │                                  │
+│ 2026-06-07   │                                  │
+│ csms.pdf     │                                  │
+│ ✓ 15  [↓][×]│                                  │
+└──────────────┴─────────────────────────────────┘
+```
+
+- `[↓]` triggers `GET /history/{id}/download` and saves the DOCX file
+- `[×]` shows a confirmation dialog, then calls `DELETE /history/{id}`
+- Clicking a row loads that analysis into the main area in read-only mode
+- `PageStateContext.ComplianceState` gains `analysisId: string | null` and `isReadOnly: boolean`
+
+On mount, the rail calls `GET /history?limit=20` to populate the list. The list re-fetches after delete or after a new analysis completes (triggered by the `saved` SSE event).
+
+### B — Files Changed
+
+| File | Action |
+|------|--------|
+| `backend/app/domain/compliance/ports.py` | New — `ComplianceRepository` ABC + data classes |
+| `backend/app/infrastructure/compliance/repository.py` | New — `PostgresComplianceRepository` |
+| `backend/app/infrastructure/compliance/docx_export.py` | New — `generate_docx()` |
+| `backend/app/api/routes/compliance.py` | Add history endpoints + auto-save hook |
+| `backend/app/shared/bootstrap.py` | Register `PostgresComplianceRepository` |
+| `frontend/src/pages/Compliance/CompliancePage.tsx` | Add History Rail |
+| `frontend/src/contexts/PageStateContext.tsx` | Add `analysisId`, `isReadOnly` to `ComplianceState` |
+
+---
+
+## Direction C — Deep Chat
+
+### C1: New Chat Endpoints
+
+The existing `/compliance/chat/{segment_id}` endpoint is deprecated but kept for backward compatibility. New clients use the finding-scoped endpoints:
+
+```
+POST /api/v1/compliance/analyses/{analysis_id}/findings/{finding_id}/chat
+     Body: {query: string}
+     Response: SSE stream — chunk / done / error events
+
+GET  /api/v1/compliance/analyses/{analysis_id}/findings/{finding_id}/chat
+     Response: [{id, role, content, created_at}]
+
+POST /api/v1/compliance/analyses/{analysis_id}/findings/{finding_id}/suggestions
+     Response: {questions: [string, string, string]}
+```
+
+### C2: Grounded Context Construction
+
+New function in `pipeline.py`:
+
+```python
+def build_finding_context(finding: FindingRecord, analysis: AnalysisRecord) -> str:
+    """
+    Build a grounded system context string for a finding chat thread.
+    Combines finding details with analysis metadata for LLM grounding.
+    """
+    return (
+        f"Document: {analysis.doc_name}\n"
+        f"Standard: {analysis.standard_name}\n"
+        f"Finding [{finding.seq}]: {finding.title}\n"
+        f"Status: {finding.status}\n"
+        f"Clause reference: {finding.clause_ref or 'N/A'}\n"
+        f"Description: {finding.description}\n"
+        f"Overall conclusion: {analysis.conclusion}\n"
+    )
+```
+
+This string is prepended to the system prompt for every chat call — replacing the fragile `segment_context` approach.
+
+### C3: Multi-turn Context
+
+Chat handler fetches existing messages from `finding_chat_messages` via `repo.get_messages(finding_id)` and prepends them to the LLM call as `[{"role": "user"/"assistant", "content": "..."}]` message history. Max history: 10 most recent messages (5 turns) to avoid token overflow.
+
+After each LLM response, both the user message and assistant message are saved via `repo.save_message()`.
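
The message assembly described above might look like the sketch below. It assumes `repo.get_messages()` returns dicts with `role` and `content` keys ordered oldest-first; `build_chat_messages` and `MAX_HISTORY_MESSAGES` are hypothetical names.

```python
MAX_HISTORY_MESSAGES = 10  # 5 user/assistant turns, per the limit above


def build_chat_messages(context: str, history: list, query: str) -> list:
    """Assemble the LLM message list: system context, recent history, new user query."""
    recent = history[-MAX_HISTORY_MESSAGES:]  # assumes history is ordered oldest-first
    messages = [{"role": "system", "content": context}]
    messages += [{"role": m["role"], "content": m["content"]} for m in recent]
    messages.append({"role": "user", "content": query})
    return messages


# 12 stored messages (6 turns); only the last 10 should be replayed.
stored = [
    {"role": "user" if i % 2 == 0 else "assistant", "content": f"message {i}"}
    for i in range(12)
]
messages = build_chat_messages("Finding [1]: Missing audit log", stored, "What should we fix first?")
```

Truncating by message count keeps the sketch simple; a token-based budget would be a stricter guard against overflow but is outside what this spec describes.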
+
+### C4: Suggestion Generation
+
+New function in `pipeline.py`:
+
+```python
+SUGGESTION_PROMPTS = {
+    "non_compliant": "Generate 3 questions focused on remediation steps and timeline.",
+    "partial": "Generate 3 questions focused on identifying the compliance gap.",
+    "compliant": "Generate 3 questions focused on maintaining and evidencing compliance.",
+}
+
+def generate_suggestions(finding: FindingRecord, analysis: AnalysisRecord, llm_client) -> list[str]:
+    """
+    Generate 3 context-aware follow-up questions for a finding chat thread.
+    Returns a list of 3 question strings. Falls back to generic questions on error.
+    """
+    focus = SUGGESTION_PROMPTS.get(finding.status, SUGGESTION_PROMPTS["partial"])
+    context = build_finding_context(finding, analysis)
+    prompt = f"{context}\n\n{focus}\nReturn JSON: {{\"questions\": [\"...\", \"...\", \"...\"]}}"
+    # ... call LLM, parse JSON, return list ...
+    # Fallback on error:
+    return ["What are the specific requirements?", "What is the remediation timeline?", "Which regulation clause applies?"]
+```
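
The elided parse-and-fallback step could be handled along these lines. This is a sketch, not the actual implementation: `parse_suggestions` and `FALLBACK_QUESTIONS` are hypothetical names, and the raw reply is assumed to have had any markdown fences stripped already.

```python
import json

FALLBACK_QUESTIONS = [
    "What are the specific requirements?",
    "What is the remediation timeline?",
    "Which regulation clause applies?",
]


def parse_suggestions(raw: str) -> list:
    """Return exactly 3 questions from the LLM reply, or the generic fallback."""
    try:
        questions = json.loads(raw)["questions"]
    except (ValueError, KeyError, TypeError):
        # Not JSON, missing key, or a non-object payload: use the fallback.
        return list(FALLBACK_QUESTIONS)
    valid = (
        isinstance(questions, list)
        and len(questions) == 3
        and all(isinstance(q, str) and q.strip() for q in questions)
    )
    return questions if valid else list(FALLBACK_QUESTIONS)


good = parse_suggestions('{"questions": ["Q1?", "Q2?", "Q3?"]}')
bad = parse_suggestions("not json at all")
short = parse_suggestions('{"questions": ["only one"]}')
```

Validating the shape as well as the syntax means the drawer always receives three chips, even when the model returns well-formed JSON with the wrong number of questions.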
+
+### C5: Frontend — Finding Chat Drawer
+
+New component: `frontend/src/pages/Compliance/FindingChatDrawer.tsx`
+
+Drawer slides in from the right (CSS: `position: fixed; right: 0; width: 420px`), reusing existing CSS variables (`--surface`, `--border`, `--accent`).
+
+Structure:
+- Header: finding title + close button
+- Suggestions section: 3 chip buttons (only shown before first user message; hidden after)
+- Message list: scrollable, same bubble style as RagChat
+- Composer: textarea + send button, same pattern as RagChat composer
+
+State managed in `PageStateContext.ComplianceState`:
+- `activeFindingId: string | null` — which finding's drawer is open
+- Drawer open/close controlled by `activeFindingId !== null`
+
+On open:
+1. `GET /analyses/{id}/findings/{fid}/chat` → restore history
+2. If history is empty: `POST /analyses/{id}/findings/{fid}/suggestions` → show chips
+
+Each finding card in `CompliancePage.tsx` gains a `💬 Chat` button that sets `activeFindingId`.
+
+### C — Files Changed
+
+| File | Action |
+|------|--------|
+| `backend/app/api/routes/compliance.py` | Add 3 new finding-chat endpoints |
+| `backend/app/application/compliance/pipeline.py` | Add `build_finding_context`, `generate_suggestions` |
+| `backend/app/infrastructure/compliance/repository.py` | Add `save_message`, `get_messages` (already in port) |
+| `frontend/src/pages/Compliance/FindingChatDrawer.tsx` | New component |
+| `frontend/src/pages/Compliance/CompliancePage.tsx` | Add Chat button to finding cards, render drawer |
+| `frontend/src/contexts/PageStateContext.tsx` | Add `activeFindingId` to `ComplianceState` |
+
+---
+
+## Implementation Order
+
+Direction A must be completed first (parallel processing changes the route handler that B's auto-save hook attaches to). B must be completed before C (C's FK references require B's tables and repository).
+
+```
+A (parallel + bug fixes + reranker stub)
+  └→ B (schema migration + history + DOCX)
+       └→ C (finding chat + suggestions)
+```
+
+---
+
+## Non-Goals
+
+- PDF export (DOCX only; users convert via Word/WPS)
+- Cross-encoder reranking (stub reserved, not implemented)
+- Scheduled/automatic crawling
+- User-level history isolation (all users share history — global visibility)
+- Prompt version management or A/B testing
+
+---
+
+## Constraints
+
+- Backend comments and docstrings: English only
+- No new top-level libraries beyond those already in `requirements.txt` (`tenacity`, `python-docx`, `psycopg2-binary` are all present)
+- `DOCUMENT_REPOSITORY_BACKEND=postgres` → `PostgresComplianceRepository`; any other value → raise `NotImplementedError` with a clear message (no mock fallback for compliance history)
+- Git commits are made by the user, never automated
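
The backend-selection constraint above can be illustrated with a small resolver. This is a sketch only: `resolve_compliance_backend` is a hypothetical helper that returns the class name as a string, whereas the real composition root in `shared/bootstrap.py` would construct the repository itself.

```python
def resolve_compliance_backend(backend_value: str) -> str:
    """Return the repository class name for a DOCUMENT_REPOSITORY_BACKEND value."""
    if backend_value == "postgres":
        return "PostgresComplianceRepository"
    # No mock fallback: fail loudly so a misconfigured deployment is noticed early.
    raise NotImplementedError(
        "Compliance history requires DOCUMENT_REPOSITORY_BACKEND=postgres, "
        f"got {backend_value!r}"
    )


selected = resolve_compliance_backend("postgres")

try:
    resolve_compliance_backend("memory")
    rejected = False
    message = ""
except NotImplementedError as exc:
    rejected = True
    message = str(exc)
```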
--- a/docs/superpowers/specs/2026-06-08-i18n-design.md
+++ b/docs/superpowers/specs/2026-06-08-i18n-design.md
@@ -0,0 +1,421 @@
+# Internationalisation (i18n) Design — Frontend Chinese/English Toggle
+
+**Date:** 2026-06-08
+**Scope:** UI framework strings only (nav labels, button labels, status messages, placeholders). Mock data, API-returned content, and domain regulation text are explicitly excluded.
+
+---
+
+## Goals
+
+Add a language toggle button (EN ↔ 中) in the Sidebar footer, immediately left of the existing theme-toggle button, so users can switch the UI between English and Simplified Chinese. Default language is English on every page load; preference is not persisted across sessions.
+
+---
+
+## Architecture
+
+### Approach
+
+Custom `LanguageContext` following the same pattern as the existing `ThemeContext`. No external library dependencies. Translation strings live in two TypeScript modules (`locales/en.ts` and `locales/zh.ts`) that export identical-shape objects.
+
+### Layering
+
+```
+src/
+├── contexts/
+│   └── LanguageContext.tsx    # type Lang, LanguageProvider, useLanguage()
+└── locales/
+    ├── en.ts                  # English translations (default)
+    └── zh.ts                  # Simplified Chinese translations
+```
+
+`LanguageProvider` wraps the entire app in `App.tsx` — outermost provider so every component can consume it.
+
+### Context interface
+
+```ts
+type Lang = 'en' | 'zh';
+
+interface LanguageContextValue {
+  lang: Lang;
+  t: Translations;       // typed translation object
+  toggleLang: () => void;
+}
+```
+
+`useState<Lang>('en')` — hardcoded default, no localStorage read on mount.
+
+### Translation object shape (both files export `Translations`)
+
+```ts
+export interface Translations {
+  nav: {
+    groupMain: string;
+    groupWorkbench: string;
+    groupChat: string;
+    overview: string;
+    signals: string;
+    status: string;
+    documents: string;
+    compliance: string;
+    chat: string;
+  };
+  sidebar: {
+    toggleTheme: string;
+    toggleLang: string;
+    signOut: string;
+  };
+  overview: {
+    eyebrow: string;
+    heroTitle: string;
+    heroDesc: string;
+    openDashboard: string;
+    jumpToChat: string;
+    sectionHowItWorks: string;
+    sectionScreens: string;
+    stepUpload: string; stepUploadDesc: string;
+    stepProcess: string; stepProcessDesc: string;
+    stepMonitor: string; stepMonitorDesc: string;
+    stepAnalyze: string; stepAnalyzeDesc: string;
+    stepReview: string; stepReviewDesc: string;
+    stepChat: string; stepChatDesc: string;
+    statScreens: string;
+    statFlows: string;
+    statReviewPosture: string;
+    navLiveHealth: string;
+    navRegulatoryChanges: string;
+    navUploadDocs: string;
+    navComplianceWorkspace: string;
+    navChatCited: string;
+    navKPIs: string;
+  };
+  signals: {
+    topbarTitle: string;
+    topbarSub: string;
+    searchPlaceholder: string;
+    refreshBtn: string;
+    crawlingBtn: string;
+    statTotal: string;
+    statHigh: string;
+    statMedium: string;
+    statLast90: string;
+    badgeFinal: string;
+    badgeDraft: string;
+    badgeUrgent: string;
+    badgePublished: string;
+    emptySelectSignal: string;
+    runAnalysis: string;
+    stopBtn: string;
+    sourceLink: string;
+    tabOverview: string;
+    tabObligations: string;
+    tabImpact: string;
+    tabChanges: string;
+    cardScopeHeader: string;
+    cardObligationsHeader: string;
+    obligationsEmpty: string;
+    colObligationDesc: string;
+    colSubject: string;
+    colType: string;
+    colDeadline: string;
+    deadlinePending: string;
+    cardAffectedDocs: string;
+    noAffectedDocs: string;
+    cardAIImpact: string;
+    footerText: string;
+    statusConnecting: string;
+    statusNoStream: string;
+    statusCrawling: string;
+    statusProcessing: string;
+    statusComplete: string;
+    statusUpdateComplete: string;
+    statusError: string;
+    statusConnFailed: string;
+  };
+  status: {
+    topbarTitle: string;
+    searchPlaceholder: string;
+    exportBtn: string;
+    refreshBtn: string;
+    newUploadBtn: string;
+    statTotal: string;
+    statIndexed: string;
+    statFailed: string;
+    statChunks: string;
+    statCoverage: string;
+    cardHealth: string;
+    badgeOnline: string;
+    badgeError: string;
+    badgeDegraded: string;
+    badgeUnknown: string;
+    healthEndpointError: string;
+    serviceEnabled: string;
+    serviceDisabled: string;
+    serviceNotLoaded: string;
+    cardConfig: string;
+    labelLLMProvider: string;
+    labelLLMModel: string;
+    labelEmbeddingModel: string;
+    labelEmbeddingDim: string;
+    labelMilvusCollection: string;
+    labelParserBackend: string;
+    labelChunkBackend: string;
+    labelParserFailureMode: string;
+    configLoadError: string;
+    cardBreakdown: string;
+    breakdownIndexed: string;
+    breakdownProcessing: string;
+    breakdownFailed: string;
+    cardRuntime: string;
+    labelActiveSessions: string;
+    labelSessionCapacity: string;
+    labelReranker: string;
+    labelBM25: string;
+    statusActive: string;
+    statusUnavailable: string;
+    footerAllOk: string;
+    footerDegraded: string;
+    footerChecking: string;
+  };
+  docs: {
+    topbarTitle: string;
+    searchPlaceholder: string;
+    refreshBtn: string;
+    uploadBtn: string;
+    confirmDeleteTitle: string;
+    cancelBtn: string;
+    deleteBtn: string;
+    filterAll: string;
+    filterReady: string;
+    filterProcessing: string;
+    filterFailed: string;
+    filterPending: string;
+    filterAllTypes: string;
+    selectedCount: string;   // '{n} document(s) selected' — use {n} placeholder
+    deleteSelected: string;
+    colName: string;
+    colStatus: string;
+    colUploaded: string;
+    colChunks: string;
+    colSize: string;
+    colType: string;
+    colActions: string;
+    loading: string;
+    emptyNoDocuments: string;
+    emptyNoMatch: string;
+    footerCount: string;     // '{n} of {m} document(s)'
+    titleDownload: string;
+    titleRetry: string;
+    titleDelete: string;
+    confirmSingle: string;   // '{name}' placeholder
+    confirmBatch: string;    // '{n}' placeholder
+  };
+  compliance: {
+    topbarTitle: string;
+    searchPlaceholder: string;
+    clearBtn: string;
+    exportBtn: string;
+    exportJSON: string;
+    exportText: string;
+    newAnalysisBtn: string;
+    analyzeBtn: string;
+    statusAnalyzing: string;
+    statusComplete: string;
+    statusError: string;
+    emptyTitle: string;
+    emptyDesc: string;
+    colRetrieved: string;    // 'Retrieved Regulations {count}'
+    retrievingMsg: string;
+    defaultRegulation: string;
+    matchSuffix: string;
+    colParagraph: string;
+    extractingMsg: string;
+    noTextExtracted: string;
+    stagesHeader: string;
+    stageExtraction: string;
+    stageClauseSplit: string;
+    stageRetrieval: string;
+    stageSynthesis: string;
+    colFindings: string;     // 'Findings {count}'
+    gapInProgress: string;
+    askAIBtn: string;
+    chatBtn: string;
+    conclusionHeader: string;
+    riskScoreTooltip: string;
+    statusCovered: string;
+    statusGap: string;
+    statusCritical: string;
+    statusInfo: string;
+    sourceTypePasted: string;
+    sourceTypeIndexed: string;
+    sourceTypeUploaded: string;
+    chatSidebarHeader: string;
+    chatThinking: string;
+    quickQ1: string;
+    quickQ2: string;
+    quickQ3: string;
+    chatPlaceholder: string;
+    sendBtn: string;
+    analysisFailed: string;
+    exportReportHeader: string;
+    exportSectionParagraph: string;
+    exportSectionFindings: string;
+    exportSectionConclusion: string;
+    exportSectionActions: string;
+    historyHeader: string;
+    downloadReport: string;
+    historyEmpty: string;
+    historyDeleteConfirm: string;
+    drawerClose: string;
+    drawerChatEmpty: string;
+    drawerSuggestionsHeader: string;
+  };
+  ragchat: {
+    topbarTitle: string;
+    exportBtn: string;
+    quickPromptsHeader: string;
+    inputPlaceholder: string;
+    citationsHeader: string;    // 'Sources {count}'
+    citationsEmpty: string;
+    jumpToSource: string;       // 'Jump to source [N]'
+    apiError: string;
+    quickPrompt1: string;
+    quickPrompt2: string;
+    quickPrompt3: string;
+    quickPrompt4: string;
+  };
+}
+```
+
+---
+
+## Language Toggle Button
+
+Location: `Sidebar.tsx` footer `<div style={{ display: 'flex', gap: 4 }}>`.
+
+Inserted **left of** the existing theme button:
+
+```tsx
+<button className="theme-btn" onClick={toggleLang} title={t.sidebar.toggleLang}>
+  {lang === 'en' ? 'EN' : '中'}
+</button>
+```
+
+- Reuses existing `theme-btn` CSS class — no new styles needed.
+- Displays a short label for the current language: `EN` or `中`.
+- `title` attribute (tooltip) translates with the rest of the UI.
+
+---
+
+## Translation Files (representative values)
+
+### `locales/en.ts` (English — default)
+
+Key values (representative; full file contains all keys above):
+
+```ts
+nav: { groupMain: 'Main', groupWorkbench: 'Workbench', groupChat: 'Chat',
+       overview: 'Overview', signals: 'Regulatory Signals', status: 'System Status',
+       documents: 'Documents', compliance: 'Compliance Analysis', chat: 'Regulation Q&A' },
+sidebar: { toggleTheme: 'Toggle theme', toggleLang: 'Switch language', signOut: 'Sign out' },
+signals: { refreshBtn: 'Refresh Sources', crawlingBtn: 'Crawling...', ... },
+docs: { uploadBtn: 'Upload document', deleteBtn: 'Delete', cancelBtn: 'Cancel', ... },
+compliance: { newAnalysisBtn: 'New analysis', analyzeBtn: 'Analyze', sendBtn: 'Send', ... },
+ragchat: { exportBtn: 'Export chat', inputPlaceholder: 'Ask about your regulations…', ... },
+```
+
+### `locales/zh.ts` (Simplified Chinese)
+
+Key values:
+
+```ts
+nav: { groupMain: '主菜单', groupWorkbench: '工作台', groupChat: '对话',
+       overview: '概览', signals: '法规信号', status: '系统状态',
+       documents: '文档管理', compliance: '合规分析', chat: '法规问答' },
+sidebar: { toggleTheme: '切换主题', toggleLang: '切换语言', signOut: '退出' },
+signals: { refreshBtn: '刷新数据源', crawlingBtn: '抓取中...', ... },
+docs: { uploadBtn: '上传文档', deleteBtn: '删除', cancelBtn: '取消', ... },
+compliance: { newAnalysisBtn: '新建分析', analyzeBtn: '开始分析', sendBtn: '发送', ... },
+ragchat: { exportBtn: '导出对话', inputPlaceholder: '请输入关于法规的问题…', ... },
+```
+
+---
+
+## App.tsx Provider Wrapping
+
+```tsx
+// Before
+<ThemeProvider>
+  <AuthProvider>
+    <PageStateProvider>
+      <AppRouter />
+    </PageStateProvider>
+  </AuthProvider>
+</ThemeProvider>
+
+// After
+<LanguageProvider>
+  <ThemeProvider>
+    <AuthProvider>
+      <PageStateProvider>
+        <AppRouter />
+      </PageStateProvider>
+    </AuthProvider>
+  </ThemeProvider>
+</LanguageProvider>
+```
+
+`LanguageProvider` is outermost so it is available to all components including the theme toggle itself.
+
+---
+
+## Usage in Components
+
+```tsx
+import { useLanguage } from '../../contexts/LanguageContext';
+
+function MyComponent() {
+  const { t } = useLanguage();
+  return <button>{t.docs.uploadBtn}</button>;
+}
+```
+
+No wrapping needed — `t` is always the correct object for the current language.
+
+---
+
+## Files Changed
+
+| File | Action |
+|------|--------|
+| `src/contexts/LanguageContext.tsx` | New — `LanguageProvider`, `useLanguage()`, `Lang` type |
+| `src/locales/en.ts` | New — complete English `Translations` object |
+| `src/locales/zh.ts` | New — complete Chinese `Translations` object |
+| `src/App.tsx` | Add `<LanguageProvider>` wrapper |
+| `src/components/layout/Sidebar.tsx` | Add language toggle button; replace nav group titles and labels with `t.nav.*` |
+| `src/pages/Overview/OverviewPage.tsx` | Replace all UI strings with `t.overview.*` |
+| `src/pages/Perception/PerceptionPage.tsx` | Replace all UI strings with `t.signals.*` |
+| `src/pages/Status/StatusPage.tsx` | Replace all UI strings with `t.status.*` |
+| `src/pages/Docs/DocsPage.tsx` | Replace all UI strings with `t.docs.*` |
+| `src/pages/Compliance/CompliancePage.tsx` | Replace all UI strings with `t.compliance.*` |
+| `src/pages/RagChat/RagChatPage.tsx` | Replace all UI strings with `t.ragchat.*` |
+| `src/pages/Compliance/HistoryRail.tsx` | Replace UI strings with `t.compliance.*` |
+| `src/pages/Compliance/FindingChatDrawer.tsx` | Replace UI strings with `t.compliance.*` |
+
+---
+
+## Non-Goals
+
+- Persistence across sessions (no localStorage for language preference)
+- More than two languages
+- RTL layout support
+- Pluralisation helpers (simple string substitution with `{n}` placeholders is sufficient — callers replace via `t.docs.selectedCount.replace('{n}', String(count))`)
+- Translation of API-returned content, mock data, regulation names, or document file names
+- Date/number formatting localisation
+
+---
+
+## Constraints
+
+- Zero new npm dependencies
+- Follow existing `ThemeContext` pattern exactly
+- Backend comments/docstrings: English only (no backend changes in this feature)
+- Git commits made by the user, never automated