AIRegulation-DocAnalysis/docs/superpowers/plans/2026-05-21-system-status-optimizations.md

System Status Module Optimization Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Upgrade the System Status (系统状态) page from a static one-shot snapshot into a reliable observability dashboard with loading states, service health checks, auto-refresh, BM25/Reranker visibility, and config display fixes.

Architecture: Backend adds a unified /status/health aggregate endpoint (Milvus + MinIO + BM25 + Reranker + Sessions) and a TTL cache on /status/stats. Frontend adds loading/error states, a refresh button, auto-polling while documents are processing, a health services panel, and config value truncation fixes.

Tech Stack: FastAPI, Python 3.11, React 18, TypeScript, CSS-in-JS (inline styles with theme context)
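
An aggregate health endpoint is only useful if one failing dependency cannot take the whole response down with it. The following is a minimal sketch of that isolation pattern, not the project's implementation: `run_check`, `healthy_probe`, and `broken_probe` are hypothetical names introduced for illustration, and the probes stand in for the real Milvus and MinIO clients.

```python
from typing import Any, Callable


def run_check(probe: Callable[[], dict[str, Any]]) -> dict[str, Any]:
    """Run one health probe and turn any exception into an error entry."""
    try:
        info = probe()
        status = "ok" if info.get("connected") else "error"
        # "status" is written last so the probe cannot overwrite it.
        return {**info, "status": status}
    except Exception as exc:  # a health endpoint should report, not raise
        return {"status": "error", "error": str(exc)}


# Hypothetical probes standing in for real service clients.
def healthy_probe() -> dict[str, Any]:
    return {"connected": True, "num_entities": 42}


def broken_probe() -> dict[str, Any]:
    raise ConnectionError("connection refused")


report = {
    "milvus": run_check(healthy_probe),
    "minio": run_check(broken_probe),
}
print(report)
```

With this shape, a broken MinIO connection still yields a complete response in which only the `minio` entry carries `status: "error"`.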


File Map

| File | Action | Purpose |
| --- | --- | --- |
| backend/app/api/routes/status.py | Modify | Add /health endpoint, TTL cache on /stats, import session store |
| frontend/src/api/status.ts | Modify | Add SystemHealth type + getSystemHealth() |
| frontend/src/api/index.ts | Modify | Export SystemHealth interface |
| frontend/src/pages/Status/StatusPage.tsx | Modify | Loading/error states, refresh, auto-poll, health panel, config fix |

Task 1: Backend — Add /status/health Endpoint + TTL Cache on /status/stats

Files:

  • Modify: backend/app/api/routes/status.py

  • Step 1: Read the current status.py

File: backend/app/api/routes/status.py

Note existing imports and route signatures before editing.

  • Step 2: Replace status.py with the new version

Replace the entire content of backend/app/api/routes/status.py with:

"""Define API routes for status."""

import time
from typing import Any

from fastapi import APIRouter

from app.config.settings import settings
from app.shared.bootstrap import (
    get_bm25_retriever,
    get_binary_store,
    get_conversation_store,
    get_document_query_service,
    get_vector_index,
)

router = APIRouter(prefix="/status", tags=["系统状态"])

# ---------------------------------------------------------------------------
# Simple TTL cache for /stats (avoids O(N) doc scan on every request)
# ---------------------------------------------------------------------------
_stats_cache: dict[str, Any] = {}
_stats_cache_time: float = 0.0
_STATS_TTL_SECONDS: float = 10.0


def _invalidate_stats_cache() -> None:
    global _stats_cache_time
    _stats_cache_time = 0.0


@router.get("/stats")
async def get_stats():
    """Return document statistics (cached for 10 s)."""
    global _stats_cache, _stats_cache_time
    now = time.time()
    if _stats_cache and (now - _stats_cache_time) < _STATS_TTL_SECONDS:
        return _stats_cache

    documents = get_document_query_service().list_documents()
    indexed = sum(1 for d in documents if d.status.value == "indexed")
    failed = sum(1 for d in documents if d.status.value == "failed")
    _stats_cache = {
        "documents_total": len(documents),
        "documents_indexed": indexed,
        "documents_failed": failed,
        "chunks_total": sum(d.chunk_count for d in documents),
    }
    _stats_cache_time = now
    return _stats_cache


@router.get("/config")
async def get_config():
    """Return system configuration."""
    return {
        "embedding_model": settings.embedding_model,
        "embedding_dim": settings.embedding_dim,
        "embedding_base_url": settings.embedding_base_url,
        "milvus_collection": settings.milvus_collection,
        "parser_backend": settings.parser_backend,
        "chunk_backend": settings.chunk_backend,
        "artifact_prefix": settings.document_parse_artifact_prefix,
        "parser_failure_mode": settings.parser_failure_mode,
        "llm_provider": settings.llm_provider,
        "llm_model": settings.llm_model,
        "document_metadata_path": settings.document_metadata_path,
    }


@router.get("/milvus/health")
async def milvus_health():
    """Return Milvus health (kept for backwards compat)."""
    return get_vector_index().health()


@router.get("/health")
async def get_health():
    """Return aggregate health of all backend services."""
    # --- Milvus ---
    try:
        milvus_info = get_vector_index().health()
        milvus_status = "ok" if milvus_info.get("connected") else "error"
    except Exception as exc:  # noqa: BLE001
        milvus_info = {}
        milvus_status = "error"
        milvus_info["error"] = str(exc)

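    # --- MinIO ---
    minio_error: str | None = None
    try:
        minio_connected = bool(get_binary_store().client.connected)
        minio_status = "ok" if minio_connected else "error"
    except Exception as exc:  # noqa: BLE001
        minio_status = "error"
        minio_connected = False
        minio_error = str(exc)

    # --- BM25 ---
    # Guarded like the other checks so one failing service cannot turn
    # the whole endpoint into a 500.
    try:
        bm25_available = get_bm25_retriever() is not None
    except Exception:  # noqa: BLE001
        bm25_available = False

    # --- Sessions ---
    try:
        session_count = len(get_conversation_store().list_sessions())
    except Exception:  # noqa: BLE001
        session_count = 0

    minio_payload: dict[str, Any] = {"status": minio_status, "connected": minio_connected}
    if minio_error is not None:
        minio_payload["error"] = minio_error

    return {
        # "status" goes last so a "status" key inside milvus_info cannot overwrite it.
        "milvus": {**milvus_info, "status": milvus_status},
        "minio": minio_payload,
        "bm25": {"available": bm25_available},
        "reranker": {
            "enabled": settings.reranker_enabled,
            "model": settings.reranker_model if settings.reranker_enabled else None,
        },
        "sessions": {
            "active": session_count,
            "max": settings.session_max_sessions,
        },
    }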
  • Step 3: Verify that the module imports cleanly
cd backend && python -c "from app.api.routes.status import router; print('OK')"

Expected output: OK

  • Step 4: Commit
git add backend/app/api/routes/status.py
git commit -m "feat(status): add /health aggregate endpoint and 10s TTL cache on /stats"
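
The import check above confirms the module loads but says nothing about the cache timing. The TTL rule can be exercised in isolation with an injected clock. This is a sketch under stated assumptions: the `TTLCache` class, `load_stats`, and the fake clock are hypothetical and not part of status.py, which uses module-level globals instead. It also shows the effect of an explicit invalidation, which status.py defines (`_invalidate_stats_cache`) but which no step in this plan calls.

```python
import time
from typing import Any, Callable


class TTLCache:
    """Cache a single computed value for ttl_seconds (illustrative only)."""

    def __init__(
        self,
        loader: Callable[[], Any],
        ttl_seconds: float = 10.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Any = None
        self._expires_at: float | None = None  # None means nothing is cached

    def get(self) -> Any:
        now = self._clock()
        if self._expires_at is not None and now < self._expires_at:
            return self._value  # still fresh: skip the expensive loader
        self._value = self._loader()
        self._expires_at = now + self._ttl
        return self._value

    def invalidate(self) -> None:
        self._expires_at = None


calls: list[float] = []
fake_now = [100.0]  # mutable cell acting as a controllable clock


def load_stats() -> dict[str, int]:
    calls.append(fake_now[0])
    return {"documents_total": len(calls)}


cache = TTLCache(load_stats, ttl_seconds=10.0, clock=lambda: fake_now[0])
first = cache.get()    # miss: loader runs
fake_now[0] = 105.0
second = cache.get()   # hit: entry is 5 s old
fake_now[0] = 110.0
third = cache.get()    # miss: entry is 10 s old, TTL expired
cache.invalidate()
fourth = cache.get()   # miss: explicitly invalidated
print(len(calls))
```

If document uploads or deletions should be reflected immediately, calling the invalidation hook from those handlers would be one option; whether that is wanted is a design decision this plan leaves open.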

Task 2: Frontend API — Add SystemHealth Type + getSystemHealth()

Files:

  • Modify: frontend/src/api/index.ts

  • Modify: frontend/src/api/status.ts

  • Step 1: Add SystemHealth interface to frontend/src/api/index.ts

Open frontend/src/api/index.ts. After the SystemConfig interface (around line 240), add:

export interface ServiceHealth {
  status: 'ok' | 'error' | 'unknown';
  error?: string;
}

export interface SystemHealth {
  milvus: ServiceHealth & {
    connected?: boolean;
    collection_name?: string;
    num_entities?: number;
  };
  minio: ServiceHealth & { connected?: boolean };
  bm25: { available: boolean };
  reranker: { enabled: boolean; model?: string | null };
  sessions: { active: number; max: number };
}
  • Step 2: Add getSystemHealth() to frontend/src/api/status.ts

Open frontend/src/api/status.ts. The current content is:

import { fetchAPI, type SystemConfig, type SystemStats } from './index';

export async function getSystemStats(): Promise<SystemStats> {
  return fetchAPI<SystemStats>('/status/stats');
}

export async function getSystemConfig(): Promise<SystemConfig> {
  return fetchAPI<SystemConfig>('/status/config');
}

export async function getMilvusHealth(): Promise<{ connected: boolean; collections: string[] }> {
  return fetchAPI('/status/milvus/health');
}

export type { SystemConfig, SystemStats };

Replace with:

import { fetchAPI, type SystemConfig, type SystemHealth, type SystemStats } from './index';

export async function getSystemStats(): Promise<SystemStats> {
  return fetchAPI<SystemStats>('/status/stats');
}

export async function getSystemConfig(): Promise<SystemConfig> {
  return fetchAPI<SystemConfig>('/status/config');
}

export async function getSystemHealth(): Promise<SystemHealth> {
  return fetchAPI<SystemHealth>('/status/health');
}

// Kept because the backend still serves /status/milvus/health for backwards compatibility.
export async function getMilvusHealth(): Promise<{ connected: boolean; collections: string[] }> {
  return fetchAPI('/status/milvus/health');
}

export type { SystemConfig, SystemHealth, SystemStats };
  • Step 3: Commit
git add frontend/src/api/index.ts frontend/src/api/status.ts
git commit -m "feat(status): add SystemHealth type and getSystemHealth() API function"
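
Because the SystemHealth interface is written by hand, it can drift from what /status/health actually returns. One way to catch that on the backend side is a small contract check over the response dictionary. The sketch below is an assumption-laden illustration: `REQUIRED_KEYS`, `ALLOWED_STATUS`, and `find_contract_problems` are hypothetical names, and the required keys are simply the non-optional fields of the interface defined in Step 1.

```python
from typing import Any

# Non-optional fields of the SystemHealth interface, per section.
REQUIRED_KEYS: dict[str, tuple[str, ...]] = {
    "milvus": ("status",),
    "minio": ("status",),
    "bm25": ("available",),
    "reranker": ("enabled",),
    "sessions": ("active", "max"),
}
ALLOWED_STATUS = {"ok", "error", "unknown"}


def find_contract_problems(payload: dict[str, Any]) -> list[str]:
    """Return human-readable mismatches between a payload and SystemHealth."""
    problems: list[str] = []
    for section, keys in REQUIRED_KEYS.items():
        body = payload.get(section)
        if not isinstance(body, dict):
            problems.append(f"{section}: missing section")
            continue
        for key in keys:
            if key not in body:
                problems.append(f"{section}.{key}: missing")
        if "status" in keys and "status" in body and body["status"] not in ALLOWED_STATUS:
            problems.append(f"{section}.status: unexpected value {body['status']!r}")
    return problems


good = {
    "milvus": {"status": "ok", "connected": True},
    "minio": {"status": "ok", "connected": True},
    "bm25": {"available": True},
    "reranker": {"enabled": False, "model": None},
    "sessions": {"active": 3, "max": 100},
}
bad = {
    "milvus": {"status": "down"},
    "minio": {"status": "ok", "connected": True},
    "bm25": {},
    "reranker": {"enabled": False},
}
print(find_contract_problems(good))
print(find_contract_problems(bad))
```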

Task 3: Frontend — Loading/Error States + Refresh Button + Auto-Poll

Files:

  • Modify: frontend/src/pages/Status/StatusPage.tsx

This task replaces the useEffect + loadData pattern and the top of the component with loading/error/refresh support. Auto-poll fires every 5 s while any document is parsing or pending.
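
The 5 s poll interval interacts with the 10 s TTL on /status/stats from Task 1: roughly every other poll is answered from the cache, so the stats cards can lag the document list by up to one TTL. The simulation below is a sketch of that arithmetic only; `simulate_polls` is a hypothetical helper, it assumes polls arrive at exact multiples of the interval, and it reuses the same freshness test as the backend (`now - cached_at < ttl`).

```python
def simulate_polls(poll_interval: float, ttl: float, polls: int) -> list[bool]:
    """Per poll, report whether /status/stats recomputes (True) or serves the cache (False)."""
    cached_at: float | None = None
    recomputed: list[bool] = []
    for i in range(polls):
        now = i * poll_interval
        fresh = cached_at is not None and (now - cached_at) < ttl
        if not fresh:
            cached_at = now  # the backend rescans documents and restarts the TTL
        recomputed.append(not fresh)
    return recomputed


result = simulate_polls(poll_interval=5.0, ttl=10.0, polls=6)
print(result)  # [True, False, True, False, True, False]
```

If fresher stats matter while documents are processing, the options are a shorter TTL or a longer poll interval; the plan as written accepts the lag.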

  • Step 1: Read current StatusPage.tsx
File: frontend/src/pages/Status/StatusPage.tsx

Identify: the useState block, loadData function, and the single useEffect.

  • Step 2: Update the React import

At the top of the file, add useCallback to the React import:

Old:

import React, { useEffect, useState } from 'react';

New:

import React, { useCallback, useEffect, useState } from 'react';
  • Step 3: Add SystemHealth to the API import

Old:

import { getSystemStats, getSystemConfig, type SystemStats, type SystemConfig } from '../../api/status';

New:

import { getSystemStats, getSystemConfig, getSystemHealth, type SystemStats, type SystemConfig, type SystemHealth } from '../../api/status';
  • Step 4: Replace the state declarations and loadData + useEffect block inside the component

Find the block starting with const [stats, setStats] and ending after the closing }, []); of the first useEffect. Replace it entirely with:

  const [stats, setStats] = useState<SystemStats>({
    documents_total: 0,
    documents_indexed: 0,
    documents_failed: 0,
    chunks_total: 0,
  });
  const [config, setConfig] = useState<SystemConfig | null>(null);
  const [docs, setDocs] = useState<DocInfo[]>([]);
  const [health, setHealth] = useState<SystemHealth | null>(null);
  const [loading, setLoading] = useState(true);
  const [error, setError] = useState<string | null>(null);

  const loadData = useCallback(async () => {
    setLoading(true);
    setError(null);
    try {
      const [statsRes, configRes, docsRes, healthRes] = await Promise.all([
        getSystemStats(),
        getSystemConfig(),
        getDocumentList(),
        getSystemHealth(),
      ]);
      setStats(statsRes);
      setConfig(configRes);
      setDocs(docsRes.docs);
      setHealth(healthRes);
    } catch (err) {
      setError(err instanceof Error ? err.message : 'Failed to load status data');
    } finally {
      setLoading(false);
    }
  }, []);

  // Initial load
  useEffect(() => {
    void loadData();
  }, [loadData]);

  // Auto-poll every 5 s while any document is still processing
  useEffect(() => {
    const hasProcessing = docs.some(d => d.status === 'parsing' || d.status === 'pending');
    if (!hasProcessing) return;
    const id = window.setInterval(() => void loadData(), 5000);
    return () => window.clearInterval(id);
  }, [docs, loadData]);
  • Step 5: Add loading banner and error banner at the top of the returned JSX

Find the return ( statement in the component. Inside <Content>, right after <TPattern />, add:

      {/* Loading banner */}
      {loading && (
        <div style={{ display: 'flex', alignItems: 'center', gap: 8, marginBottom: 16, color: theme.text3, fontSize: 13 }}>
          <span style={{ display: 'inline-block', width: 12, height: 12, borderRadius: '50%', border: `2px solid ${theme.accent}`, borderTopColor: 'transparent', animation: 'spin 0.8s linear infinite' }} />
          <span className="mono">LOADING...</span>
        </div>
      )}

      {/* Error banner */}
      {error && (
        <div style={{
          marginBottom: 16,
          padding: '12px 16px',
          background: '#d6454520',
          border: '1px solid #d64545',
          borderRadius: 8,
          display: 'flex',
          alignItems: 'center',
          justifyContent: 'space-between',
        }}>
          <span style={{ fontSize: 13, color: '#d64545' }}>{error}</span>
          <button
            onClick={() => void loadData()}
            style={{ background: 'none', border: '1px solid #d64545', borderRadius: 6, color: '#d64545', cursor: 'pointer', padding: '4px 10px', fontSize: 12 }}
          >
            重试
          </button>
        </div>
      )}
  • Step 6: Add a refresh button to the stats section header

Find the <section style={{ marginBottom: 48 }}> that wraps the 4 stats cards. Add a header row with a refresh button just before the grid <div>:

      <section style={{ marginBottom: 48 }}>
        <div style={{ display: 'flex', alignItems: 'center', justifyContent: 'space-between', marginBottom: 16 }}>
          <h2 style={{ fontSize: 14, fontWeight: 600, color: theme.accent, letterSpacing: '1px', margin: 0 }}>
            DOCUMENT STATISTICS
          </h2>
          <button
            onClick={() => void loadData()}
            disabled={loading}
            style={{
              background: 'none',
              border: `1px solid ${theme.border}`,
              borderRadius: 6,
              color: theme.text3,
              cursor: loading ? 'not-allowed' : 'pointer',
              padding: '4px 12px',
              fontSize: 11,
              opacity: loading ? 0.5 : 1,
            }}
          >
             刷新
          </button>
        </div>
        {/* existing 4-card grid stays here unchanged */}
  • Step 7: Add CSS keyframe for the spinner

Find the <Content> wrapping element. Add a <style> tag as the very first child inside <Content>:

      <style>{`@keyframes spin { to { transform: rotate(360deg); } }`}</style>
  • Step 8: Commit
git add frontend/src/pages/Status/StatusPage.tsx
git commit -m "feat(status): add loading/error states, refresh button, auto-poll for processing docs"

Task 4: Frontend — Service Health Panel

Files:

  • Modify: frontend/src/pages/Status/StatusPage.tsx

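Adds a new section after the stats cards that shows a colored status badge for each service (Milvus, MinIO, BM25, Reranker), the active session count, and the configured LLM. The LLM badge only echoes the provider and model from /status/config; it is not a live health check.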

  • Step 1: Add the ServiceBadge helper component (add just before StatsCard)
const ServiceBadge = ({
  label,
  status,
  detail,
}: {
  label: string;
  status: 'ok' | 'error' | 'unknown' | boolean;
  detail?: string;
}) => {
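  const { theme } = useTheme();
  const isOk = status === 'ok' || status === true;
  // 'unknown' (health not loaded yet) gets a neutral color instead of the error red.
  const color = isOk ? theme.green : status === 'unknown' ? theme.text3 : '#d64545';
  const dot = '●';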
  return (
    <div style={{
      display: 'flex',
      alignItems: 'center',
      justifyContent: 'space-between',
      padding: '12px 16px',
      background: theme.bgCard,
      borderRadius: 10,
      border: `1px solid ${theme.border}`,
    }}>
      <div style={{ display: 'flex', alignItems: 'center', gap: 8 }}>
        <span style={{ color, fontSize: 10 }}>{dot}</span>
        <span className="mono" style={{ fontSize: 12, color: theme.text2 }}>{label}</span>
      </div>
      <span style={{ fontSize: 12, color, fontWeight: 600 }}>
        {detail ?? (isOk ? 'OK' : 'ERROR')}
      </span>
    </div>
  );
};
  • Step 2: Insert the health section between the stats section and the SYSTEM CONFIGURATION section

Find <section style={{ marginBottom: 48 }}> for SYSTEM CONFIGURATION. Just before it, insert:

      <section style={{ marginBottom: 48 }}>
        <h2 style={{ fontSize: 14, fontWeight: 600, color: theme.accent, marginBottom: 20, letterSpacing: '1px' }}>
          SERVICE HEALTH
        </h2>
        <div style={{ display: 'grid', gridTemplateColumns: 'repeat(3, 1fr)', gap: 12, marginBottom: 12 }}>
          <ServiceBadge
            label="MILVUS"
            status={health?.milvus.status ?? 'unknown'}
            detail={health ? (health.milvus.status === 'ok' ? `${health.milvus.num_entities ?? 0} entities` : 'disconnected') : '—'}
          />
          <ServiceBadge
            label="MINIO"
            status={health?.minio.status ?? 'unknown'}
            detail={health ? (health.minio.status === 'ok' ? 'connected' : 'disconnected') : '—'}
          />
          <ServiceBadge
            label="BM25 HYBRID"
            status={health?.bm25.available ?? false}
            detail={health ? (health.bm25.available ? 'enabled' : 'unavailable') : '—'}
          />
        </div>
        <div style={{ display: 'grid', gridTemplateColumns: 'repeat(3, 1fr)', gap: 12 }}>
          <ServiceBadge
            label="RERANKER"
            status={health?.reranker.enabled ?? false}
            detail={health ? (health.reranker.enabled ? health.reranker.model ?? 'enabled' : 'disabled') : '—'}
          />
          <ServiceBadge
            label="SESSIONS"
            status="ok"
            detail={health ? `${health.sessions.active} / ${health.sessions.max}` : '—'}
          />
          <ServiceBadge
            label="LLM"
            status="ok"
            detail={config ? `${config.llm_provider} · ${config.llm_model}` : '—'}
          />
        </div>
      </section>
  • Step 3: Commit
git add frontend/src/pages/Status/StatusPage.tsx
git commit -m "feat(status): add SERVICE HEALTH panel with Milvus/MinIO/BM25/Reranker/Session badges"

Task 5: Frontend — Config Value Truncation Fix + Failed Docs Highlight

Files:

  • Modify: frontend/src/pages/Status/StatusPage.tsx

  • Step 1: Fix config value truncation

In the SYSTEM CONFIGURATION section, find the <span> that renders the config value v — it currently renders as:

<span style={{ fontSize: 14, fontWeight: 500 }}>{v}</span>

Replace it with:

<span
  title={String(v)}
  style={{
    display: 'inline-block', // maxWidth and overflow have no effect on a plain inline span
    fontSize: 13,
    fontWeight: 500,
    maxWidth: 240,
    overflow: 'hidden',
    textOverflow: 'ellipsis',
    whiteSpace: 'nowrap',
    cursor: 'help',
  }}
>
  {v}
</span>

This applies to both the MODELS and STORAGE AND PATHS sub-sections since they share the same .map(([k, v]) => ...) render pattern.

  • Step 2: Highlight failed documents in the document list

In the DOCUMENT INDEX section, find the docs.map(d => ...) container <div>. Change the border to highlight failed docs:

Old:

border: `1px solid ${theme.border}`,

New:

border: `1px solid ${d.status === 'failed' ? '#d64545' : d.status === 'parsing' || d.status === 'pending' ? theme.accent + '80' : theme.border}`,
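  • Step 3: Add a distinct badge for in-progress documents

In the same docs.map block, the status badge currently distinguishes only failed documents (red) from everything else (green). Update it so parsing and pending documents get the accent color and slightly reduced opacity, with a ⟳ prefix while parsing: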

Old:

<div style={{
  padding: '4px 12px',
  background: d.status === 'failed' ? '#d64545' : theme.green,
  borderRadius: 6,
}}>
  <span className="mono" style={{ fontSize: 10, fontWeight: 600, color: '#fff' }}>
    {d.status.toUpperCase()}
  </span>
</div>

New:

<div style={{
  padding: '4px 12px',
  background:
    d.status === 'failed' ? '#d64545' :
    d.status === 'parsing' || d.status === 'pending' ? theme.accent :
    theme.green,
  borderRadius: 6,
  opacity: d.status === 'parsing' || d.status === 'pending' ? 0.85 : 1,
}}>
  <span className="mono" style={{ fontSize: 10, fontWeight: 600, color: '#fff' }}>
    {d.status === 'parsing' ? '⟳ ' : ''}{d.status.toUpperCase()}
  </span>
</div>
  • Step 4: Commit
git add frontend/src/pages/Status/StatusPage.tsx
git commit -m "fix(status): config value ellipsis truncation, failed doc highlight, in-progress doc badge"

Self-Review Checklist

  • Task 1 covers P2 (TTL cache) + P1 (service health backend)
  • Task 2 adds the SystemHealth type that Tasks 3 and 4 depend on
  • Task 3 covers P0 (loading/error/refresh/auto-poll)
  • Task 4 covers P1 (BM25/reranker/sessions visibility)
  • Task 5 covers P1 (config truncation) + P1 (failed doc visual)
  • All getSystemHealth references match the function name defined in Task 2
  • health?.milvus.status type matches ServiceHealth.status: 'ok' | 'error' | 'unknown'
  • Backend /health endpoint uses only already-imported bootstrap functions
  • No TBD or TODO left in plan