# System Status Module Optimization Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Upgrade the 系统状态 (System Status) page from a static one-shot snapshot into a reliable observability dashboard with loading states, service health checks, auto-refresh, BM25/Reranker visibility, and config display fixes.

**Architecture:** Backend adds a unified `/status/health` aggregate endpoint (Milvus + MinIO + BM25 + Reranker + Sessions) and a TTL cache on `/status/stats`. Frontend adds loading/error states, a refresh button, auto-polling while documents are processing, a health services panel, and config value truncation fixes.

**Tech Stack:** FastAPI, Python 3.11, React 18, TypeScript, CSS-in-JS (inline styles with theme context)

---

## File Map

| File | Action | Purpose |
|------|--------|---------|
| `backend/app/api/routes/status.py` | Modify | Add `/health` endpoint, TTL cache on `/stats`, import session store |
| `frontend/src/api/status.ts` | Modify | Add `SystemHealth` type + `getSystemHealth()` |
| `frontend/src/api/index.ts` | Modify | Export `SystemHealth` interface |
| `frontend/src/pages/Status/StatusPage.tsx` | Modify | Loading/error states, refresh, auto-poll, health panel, config fix |

---

## Task 1: Backend — Add `/status/health` Endpoint + TTL Cache on `/status/stats`

**Files:**
- Modify: `backend/app/api/routes/status.py`

- [ ] **Step 1: Read the current `status.py`**

```
File: backend/app/api/routes/status.py
```

Note existing imports and route signatures before editing.
- [ ] **Step 2: Replace `status.py` with the new version**

Replace the entire content of `backend/app/api/routes/status.py` with:

```python
"""Define API routes for status."""

import time
from typing import Any

from fastapi import APIRouter

from app.config.settings import settings
from app.shared.bootstrap import (
    get_bm25_retriever,
    get_binary_store,
    get_conversation_store,
    get_document_query_service,
    get_vector_index,
)

router = APIRouter(prefix="/status", tags=["系统状态"])

# ---------------------------------------------------------------------------
# Simple TTL cache for /stats (avoids O(N) doc scan on every request)
# ---------------------------------------------------------------------------
_stats_cache: dict[str, Any] = {}
_stats_cache_time: float = 0.0
_STATS_TTL_SECONDS: float = 10.0


def _invalidate_stats_cache() -> None:
    """Force the next /stats request to recompute the statistics."""
    global _stats_cache_time
    _stats_cache_time = 0.0


@router.get("/stats")
async def get_stats():
    """Return document statistics (cached for 10 s)."""
    global _stats_cache, _stats_cache_time
    now = time.time()
    if _stats_cache and (now - _stats_cache_time) < _STATS_TTL_SECONDS:
        return _stats_cache

    documents = get_document_query_service().list_documents()
    indexed = sum(1 for d in documents if d.status.value == "indexed")
    failed = sum(1 for d in documents if d.status.value == "failed")
    _stats_cache = {
        "documents_total": len(documents),
        "documents_indexed": indexed,
        "documents_failed": failed,
        "chunks_total": sum(d.chunk_count for d in documents),
    }
    _stats_cache_time = now
    return _stats_cache


@router.get("/config")
async def get_config():
    """Return system configuration."""
    return {
        "embedding_model": settings.embedding_model,
        "embedding_dim": settings.embedding_dim,
        "embedding_base_url": settings.embedding_base_url,
        "milvus_collection": settings.milvus_collection,
        "parser_backend": settings.parser_backend,
        "chunk_backend": settings.chunk_backend,
        "artifact_prefix": settings.document_parse_artifact_prefix,
        "parser_failure_mode": settings.parser_failure_mode,
        "llm_provider": settings.llm_provider,
        "llm_model": settings.llm_model,
        "document_metadata_path": settings.document_metadata_path,
    }


@router.get("/milvus/health")
async def milvus_health():
    """Return Milvus health (kept for backwards compat)."""
    return get_vector_index().health()


@router.get("/health")
async def get_health():
    """Return aggregate health of all backend services."""
    # --- Milvus ---
    try:
        milvus_info = get_vector_index().health()
        milvus_status = "ok" if milvus_info.get("connected") else "error"
    except Exception as exc:  # noqa: BLE001
        milvus_info = {"error": str(exc)}
        milvus_status = "error"

    # --- MinIO ---
    try:
        minio_connected = get_binary_store().client.connected
        minio_status = "ok" if minio_connected else "error"
    except Exception:  # noqa: BLE001
        minio_status = "error"
        minio_connected = False

    # --- BM25 ---
    # Guarded like the other probes so one failing service cannot turn the
    # whole aggregate endpoint into a 500.
    try:
        bm25_available = get_bm25_retriever() is not None
    except Exception:  # noqa: BLE001
        bm25_available = False

    # --- Sessions ---
    try:
        session_count = len(get_conversation_store().list_sessions())
    except Exception:  # noqa: BLE001
        session_count = 0

    return {
        "milvus": {"status": milvus_status, **milvus_info},
        "minio": {"status": minio_status, "connected": minio_connected},
        "bm25": {"available": bm25_available},
        "reranker": {
            "enabled": settings.reranker_enabled,
            "model": settings.reranker_model if settings.reranker_enabled else None,
        },
        "sessions": {
            "active": session_count,
            "max": settings.session_max_sessions,
        },
    }
```

- [ ] **Step 3: Verify Python syntax**

```bash
cd backend && python -c "from app.api.routes.status import router; print('OK')"
```

Expected output: `OK`

- [ ] **Step 4: Commit**

```bash
git add backend/app/api/routes/status.py
git commit -m "feat(status): add /health aggregate endpoint and 10s TTL cache on /stats"
```

---

## Task 2: Frontend API — Add `SystemHealth` Type + `getSystemHealth()`

**Files:**
- Modify: `frontend/src/api/index.ts`
- Modify: `frontend/src/api/status.ts`

- [ ] **Step 1: Add `SystemHealth` interface to `frontend/src/api/index.ts`**

Open
`frontend/src/api/index.ts`. After the `SystemConfig` interface (around line 240), add:

```typescript
export interface ServiceHealth {
  status: 'ok' | 'error' | 'unknown';
  error?: string;
}

export interface SystemHealth {
  milvus: ServiceHealth & {
    connected?: boolean;
    collection_name?: string;
    num_entities?: number;
  };
  minio: ServiceHealth & { connected?: boolean };
  bm25: { available: boolean };
  reranker: { enabled: boolean; model?: string | null };
  sessions: { active: number; max: number };
}
```

- [ ] **Step 2: Add `getSystemHealth()` to `frontend/src/api/status.ts`**

Open `frontend/src/api/status.ts`. The current content is:

```typescript
import { fetchAPI, type SystemConfig, type SystemStats } from './index';

export async function getSystemStats(): Promise<SystemStats> {
  return fetchAPI('/status/stats');
}

export async function getSystemConfig(): Promise<SystemConfig> {
  return fetchAPI('/status/config');
}

export async function getMilvusHealth(): Promise<{ connected: boolean; collections: string[] }> {
  return fetchAPI('/status/milvus/health');
}

export type { SystemConfig, SystemStats };
```

Replace with:

```typescript
import { fetchAPI, type SystemConfig, type SystemHealth, type SystemStats } from './index';

export async function getSystemStats(): Promise<SystemStats> {
  return fetchAPI('/status/stats');
}

export async function getSystemConfig(): Promise<SystemConfig> {
  return fetchAPI('/status/config');
}

export async function getSystemHealth(): Promise<SystemHealth> {
  return fetchAPI('/status/health');
}

export type { SystemConfig, SystemHealth, SystemStats };
```

- [ ] **Step 3: Commit**

```bash
git add frontend/src/api/index.ts frontend/src/api/status.ts
git commit -m "feat(status): add SystemHealth type and getSystemHealth() API function"
```

---

## Task 3: Frontend — Loading/Error States + Refresh Button + Auto-Poll

**Files:**
- Modify: `frontend/src/pages/Status/StatusPage.tsx`

This task replaces the `useEffect` + `loadData` pattern and the top of the component with loading/error/refresh support.
Auto-poll fires every 5 s while any document is `parsing` or `pending`.

- [ ] **Step 1: Read current `StatusPage.tsx`**

```
File: frontend/src/pages/Status/StatusPage.tsx
```

Identify: the `useState` block, `loadData` function, and the single `useEffect`.

- [ ] **Step 2: Update the React import to include `useCallback`**

Find and replace the React import at the top of the file:

**Old:**

```typescript
import React, { useEffect, useState } from 'react';
```

**New:**

```typescript
import React, { useCallback, useEffect, useState } from 'react';
```

- [ ] **Step 3: Add `SystemHealth` to the API import**

**Old:**

```typescript
import { getSystemStats, getSystemConfig, type SystemStats, type SystemConfig } from '../../api/status';
```

**New:**

```typescript
import { getSystemStats, getSystemConfig, getSystemHealth, type SystemStats, type SystemConfig, type SystemHealth } from '../../api/status';
```

- [ ] **Step 4: Replace the state declarations and loadData + useEffect block inside the component**

Find the block starting with `const [stats, setStats]` and ending after the closing `}, []);` of the first `useEffect`. Replace it entirely with:

```typescript
const [stats, setStats] = useState<SystemStats>({
  documents_total: 0,
  documents_indexed: 0,
  documents_failed: 0,
  chunks_total: 0,
});
const [config, setConfig] = useState<SystemConfig | null>(null);
const [docs, setDocs] = useState([]);
const [health, setHealth] = useState<SystemHealth | null>(null);
const [loading, setLoading] = useState(true);
const [error, setError] = useState<string | null>(null);

const loadData = useCallback(async () => {
  setLoading(true);
  setError(null);
  try {
    const [statsRes, configRes, docsRes, healthRes] = await Promise.all([
      getSystemStats(),
      getSystemConfig(),
      getDocumentList(),
      getSystemHealth(),
    ]);
    setStats(statsRes);
    setConfig(configRes);
    setDocs(docsRes.docs);
    setHealth(healthRes);
  } catch (err) {
    setError(err instanceof Error ? err.message : 'Failed to load status data');
  } finally {
    setLoading(false);
  }
}, []);

// Initial load
useEffect(() => {
  void loadData();
}, [loadData]);

// Auto-poll every 5 s while any document is still processing
useEffect(() => {
  const hasProcessing = docs.some(d => d.status === 'parsing' || d.status === 'pending');
  if (!hasProcessing) return;
  const id = window.setInterval(() => void loadData(), 5000);
  return () => window.clearInterval(id);
}, [docs, loadData]);
```

- [ ] **Step 5: Add loading banner and error banner at the top of the returned JSX**

Find the `return (` statement in the component. Inside ``, right after ``, add:

```typescript
{/* Loading overlay */}
{loading && (
  LOADING...
)}

{/* Error banner */}
{error && (
  {error}
)}
```

- [ ] **Step 6: Add a refresh button to the stats section header**

Find the `` that wraps the 4 stats cards. Add a header row with a refresh button just before the grid ``:

```typescript

DOCUMENT STATISTICS

{/* existing 4-card grid stays here unchanged */}
```

- [ ] **Step 7: Add CSS keyframe for the spinner**

Find the `` wrapping element. Add a `

- [ ] **Step 8: Commit**

```bash
git add frontend/src/pages/Status/StatusPage.tsx
git commit -m "feat(status): add loading/error states, refresh button, auto-poll for processing docs"
```

---

## Task 4: Frontend — Service Health Panel

**Files:**
- Modify: `frontend/src/pages/Status/StatusPage.tsx`

Adds a new section after the stats cards that shows a colored status badge for each service (Milvus, MinIO, BM25, Reranker) plus the active session count.

- [ ] **Step 1: Add the `ServiceBadge` helper component** (add just before `StatsCard`)

```typescript
const ServiceBadge = ({
  label,
  status,
  detail,
}: {
  label: string;
  status: 'ok' | 'error' | 'unknown' | boolean;
  detail?: string;
}) => {
  const { theme } = useTheme();
  const isOk = status === 'ok' || status === true;
  const color = isOk ? theme.green : '#d64545';
  // The glyph is the same for every state; only `color` changes.
  const dot = '●';
  return (
{dot} {label}
{detail ?? (isOk ? 'OK' : 'ERROR')}
  );
};
```

- [ ] **Step 2: Insert the health section between the stats section and the SYSTEM CONFIGURATION section**

Find `` for SYSTEM CONFIGURATION. Just before it, insert:

```typescript

SERVICE HEALTH

```

- [ ] **Step 3: Commit**

```bash
git add frontend/src/pages/Status/StatusPage.tsx
git commit -m "feat(status): add SERVICE HEALTH panel with Milvus/MinIO/BM25/Reranker/Session badges"
```

---

## Task 5: Frontend — Config Value Truncation Fix + Failed Docs Highlight

**Files:**
- Modify: `frontend/src/pages/Status/StatusPage.tsx`

- [ ] **Step 1: Fix config value truncation**

In the SYSTEM CONFIGURATION section, find the `` that renders the config value `v`. It currently renders as:

```typescript
{v}
```

Replace it with:

```typescript
{v}
```

This applies to both the MODELS and STORAGE AND PATHS sub-sections since they share the same `.map(([k, v]) => ...)` render pattern.

- [ ] **Step 2: Highlight failed documents in the document list**

In the DOCUMENT INDEX section, find the `docs.map(d => ...)` container ``. Change the border to highlight failed docs:

**Old:**

```typescript
border: `1px solid ${theme.border}`,
```

**New:**

```typescript
border: `1px solid ${
  d.status === 'failed'
    ? '#d64545'
    : d.status === 'parsing' || d.status === 'pending'
      ? theme.accent + '80'
      : theme.border
}`,
```

- [ ] **Step 3: Add pulsing indicator for in-progress documents**

In the same `docs.map` block, the status badge currently always shows a static green background. Update it to show different colors per status:

**Old:**

```typescript
{d.status.toUpperCase()}
```

**New:**

```typescript
{d.status === 'parsing' ? '⟳ ' : ''}{d.status.toUpperCase()}
```

- [ ] **Step 4: Commit**

```bash
git add frontend/src/pages/Status/StatusPage.tsx
git commit -m "fix(status): config value ellipsis truncation, failed doc highlight, parsing doc pulse"
```

---

## Self-Review Checklist

- [x] Task 1 covers P2 (TTL cache) + P1 (service health backend)
- [x] Task 2 adds the `SystemHealth` type that Tasks 3–4 depend on
- [x] Task 3 covers P0 (loading/error/refresh/auto-poll)
- [x] Task 4 covers P1 (BM25/reranker/sessions visibility)
- [x] Task 5 covers P1 (config truncation) + P1 (failed doc visual)
- [x] All `getSystemHealth` references match the function name defined in Task 2
- [x] `health?.milvus.status` type matches `ServiceHealth.status: 'ok' | 'error' | 'unknown'`
- [x] Backend `/health` endpoint uses only already-imported bootstrap functions
- [x] No TBD or TODO left in plan
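
---

## Appendix: TTL Cache Behavior Sketch

The `/status/stats` cache in Task 1 reads the wall clock through module-level globals, which makes its expiry hard to exercise in a unit test. As a rough illustration only, the same hit / expire / invalidate logic could be isolated as below. `TTLCache`, `FakeClock`, and `scan_documents` are hypothetical names that do not exist in the codebase, and the injectable clock is an assumption made for testability, not part of the plan.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """Cache one computed value for a fixed number of seconds (hypothetical helper)."""

    def __init__(
        self,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._value: Any = None
        self._expires_at: Optional[float] = None  # None means nothing is cached

    def get(self, compute: Callable[[], Any]) -> Any:
        """Return the cached value, recomputing it once the TTL has elapsed."""
        now = self._clock()
        if self._expires_at is not None and now < self._expires_at:
            return self._value
        self._value = compute()
        self._expires_at = now + self._ttl
        return self._value

    def invalidate(self) -> None:
        """Force the next get() to recompute, e.g. after a document upload."""
        self._expires_at = None


class FakeClock:
    """Manually advanced clock for deterministic tests (hypothetical helper)."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now


if __name__ == "__main__":
    clock = FakeClock()
    cache = TTLCache(ttl_seconds=10.0, clock=clock)
    scans = []

    def scan_documents() -> dict:
        # Stand-in for the O(N) document scan behind /status/stats.
        scans.append(1)
        return {"documents_total": len(scans)}

    cache.get(scan_documents)   # miss: runs the scan
    clock.now = 5.0
    cache.get(scan_documents)   # hit: still inside the 10 s window
    clock.now = 10.0
    cache.get(scan_documents)   # expired: runs the scan again
    print(f"scans executed: {len(scans)}")
```

Tracking an explicit "nothing cached" state, rather than resetting a timestamp to `0.0`, keeps invalidation correct regardless of what the clock currently reads.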