Add RAGAS evaluation web console (FastAPI + vanilla JS)

- webapp/: FastAPI backend with runs/scenarios/evaluations API routers;
  services for run_reader, report_builder, scenario_scanner, task_manager
  (lazy ragas import — server boots even without ragas); Pydantic models
- webapp/static/: single-page console (layout A: left-nav + main area);
  report detail with metric cards, Chart.js distribution histogram,
  grouping table, lowest-score sample review; trigger evaluation + log polling
- webmain.py: uvicorn entry point (alongside existing main.py CLI)
- start.bat: Windows one-click launcher with env checks and auto-browser open
- rag_eval/datasets/: implement missing loader + normalizer modules
  (load_dataset_records, normalize_records) required by evaluator
- scripts/seed_sample_run.py: generate realistic demo run artifacts
- .gitignore: exclude datasets/ data files but keep rag_eval/datasets/ source
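
The lazy ragas import mentioned for task_manager could look roughly like the sketch below. This is an illustration only, not the committed code: `lazy_import` and `run_evaluation_task` are hypothetical names, and the only assumption about ragas is that it is importable as `ragas` when installed.

```python
import importlib
from functools import lru_cache
from types import ModuleType


@lru_cache(maxsize=None)
def lazy_import(name: str) -> ModuleType:
    """Import a module on first use instead of at server start-up.

    Hypothetical helper: a missing package raises RuntimeError only when
    a caller actually needs it, so the web server itself can still boot.
    Successful imports are cached; failures are retried on the next call.
    """
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        raise RuntimeError(
            f"{name} is not installed; evaluations are unavailable"
        ) from exc


def run_evaluation_task(scenario: str) -> str:
    """Hypothetical task body: ragas is resolved here, not at module import."""
    ragas = lazy_import("ragas")
    version = getattr(ragas, "__version__", "unknown")
    return f"ragas {version} ready for {scenario}"
```

With this shape, read-only routes (listing runs, building reports) never touch ragas, and only triggering an evaluation surfaces the missing dependency as an error.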

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
2026-06-15 15:53:57 +08:00
parent 9cbdc1d95d
commit e89695e490
26 changed files with 2496 additions and 2 deletions

webapp/static/index.html Normal file

@@ -0,0 +1,118 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Siemens RAGAS Evaluation Console</title>
<link rel="stylesheet" href="/static/css/app.css" />
<script src="https://cdn.jsdelivr.net/npm/chart.js@4.4.1/dist/chart.umd.min.js"></script>
</head>
<body>
<div class="app">
<!-- Left navigation (layout A) -->
<aside class="sidebar">
<div class="brand">
<div class="brand-mark">RAGAS</div>
<div class="brand-sub">Evaluation Console</div>
</div>
<nav class="nav">
<button class="nav-item" data-view="runs">
<span class="nav-ico"></span><span>Run list</span>
</button>
<button class="nav-item" data-view="new">
<span class="nav-ico"></span><span>New evaluation</span>
</button>
<button class="nav-item" data-view="report" data-requires-run="1">
<span class="nav-ico"></span><span>Report details</span>
</button>
</nav>
<div class="sidebar-foot">
<span class="dot" id="health-dot"></span>
<span id="health-text">Connecting…</span>
</div>
</aside>
<!-- Main content area -->
<main class="main">
<header class="topbar">
<h1 id="view-title">Run list</h1>
<button class="btn btn-ghost" id="refresh-btn">Refresh</button>
</header>
<!-- Run list view -->
<section class="view" id="view-runs">
<div id="runs-container" class="runs-grid"></div>
<div class="empty" id="runs-empty" hidden>
<p>No evaluation runs yet.</p>
<p class="muted">Trigger one from "New evaluation", or run the sample data script: <code>python scripts/seed_sample_run.py</code></p>
</div>
</section>
<!-- New evaluation view -->
<section class="view" id="view-new" hidden>
<div class="panel">
<h2>Select a scenario and run</h2>
<p class="muted">Select a scenario config from <code>scenarios/</code>, then click Run to follow live status and logs below.</p>
<div class="scenario-list" id="scenario-list"></div>
<div class="run-actions">
<button class="btn btn-primary" id="run-btn" disabled>Run evaluation</button>
<span class="selected-scenario muted" id="selected-scenario">No scenario selected</span>
</div>
</div>
<div class="panel" id="task-panel" hidden>
<div class="task-head">
<h2>Evaluation progress</h2>
<span class="badge" id="task-status">queued</span>
</div>
<pre class="log-box" id="task-log"></pre>
<div class="task-actions">
<button class="btn btn-primary" id="view-report-btn" hidden>View report</button>
</div>
</div>
</section>
<!-- Report details view -->
<section class="view" id="view-report" hidden>
<div class="empty" id="report-empty">
<p>Select a run from "Run list" first.</p>
</div>
<div id="report-content" hidden>
<!-- Top metadata bar -->
<div class="report-meta" id="report-meta"></div>
<!-- ① Metric mean cards -->
<div class="section-label">① Metric means overview</div>
<div class="metric-cards" id="metric-cards"></div>
<!-- ② Distribution + ③ grouping, side by side -->
<div class="report-row">
<div class="panel report-half">
<div class="panel-head">
<div class="section-label tight">② Score distribution</div>
<select id="dist-metric-select" class="select"></select>
</div>
<canvas id="dist-chart" height="160"></canvas>
<p class="muted tiny">Exposes long-tail failing samples</p>
</div>
<div class="panel report-half">
<div class="section-label tight">③ Group means</div>
<div id="grouping-tabs" class="grouping-tabs"></div>
<div id="grouping-table"></div>
<p class="muted tiny">Pinpoints weak categories</p>
</div>
</div>
<!-- ④ Lowest-scoring samples, reviewed one by one -->
<div class="section-label">④ Lowest-scoring samples (click to expand and review one by one)</div>
<div class="lowest-table" id="lowest-table"></div>
</div>
</section>
</main>
</div>
<script src="/static/js/api.js"></script>
<script src="/static/js/report.js"></script>
<script src="/static/js/runner.js"></script>
<script src="/static/js/app.js"></script>
</body>
</html>