Files
wangwei 754a30ad59 feat(session-async): add /api/score/session_async with incremental session report aggregation
- New POST /api/score/session_async endpoint: same session_id calls append to one shared report
- New GET /api/score/sessions/{session_id}: returns call_count, metric_means, all job records
- New GET /api/score/session/jobs/{job_id}: individual call status
- SessionScoreJobManager: deterministic run_id from session_id, per-session mutex for CSV append, advisor regenerated on every call
- SessionScoreRequest (extends ScoreRequest + session_id), SessionScoreJobResponse, SessionStatus models added
- 24 new tests, all passing

chore(weighted-score): comment out 综合加权得分 display and computation

- report.js: hide 综合加权得分 card in report detail page
- score_jobs.js: hide 综合 chip in async job list
- report_builder.py: overall_ws=None (computation disabled)
- summary.py: weighted_score summary line disabled
- evaluator.py: weighted_score/sample_weight columns no longer written to scores.csv
- score.py /api/score: weighted_score always returns null
- score_job_manager.py + session_score_manager.py: weighted=None
- Updated 3 tests to match new behaviour (6 pre-existing failures unchanged)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-26 16:09:33 +08:00

54 lines
3.9 KiB
HTML
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

<h2>优化顾问模块 — 整体架构与数据流</h2>
<p class="subtitle">新增 rag_eval/advisor/ 包,插入 run_scenario() 末尾,复用已有 llm 实例</p>
<div class="mockup">
<div class="mockup-header">执行链路(变更前 → 变更后)</div>
<div class="mockup-body" style="font-family:monospace;font-size:13px;line-height:2">
<span style="color:#94a3b8">run_scenario()</span><br>
&nbsp;&nbsp;→ load_scenario()&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#94a3b8"># 读 YAML解析 Scenario + optimization_advisor 字段</span><br>
&nbsp;&nbsp;→ build_models()&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#94a3b8"># 已有:创建 llm, embeddings</span><br>
&nbsp;&nbsp;→ build_metric_pipeline()&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#94a3b8"># 已有</span><br>
&nbsp;&nbsp;→ Evaluator.evaluate()&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#94a3b8"># 已有:打分 → EvaluationResult</span><br>
&nbsp;&nbsp;→ write_run_artifacts()&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#94a3b8"># 已有scores.csv / summary.md / ...</span><br>
&nbsp;&nbsp;<span style="color:#4ade80;font-weight:bold">→ run_advisor(result, scenario, llm)&nbsp;&nbsp;&nbsp;# 新增 3 行</span><br>
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#4ade80">&nbsp;&nbsp;→ rules.diagnose(score_rows)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# 规则引擎:识别异常指标 + 方向</span><br>
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#4ade80">&nbsp;&nbsp;→ llm_analyzer.analyze(diag, samples)&nbsp;# LLM结合低分样本生成中文建议</span><br>
&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#4ade80">&nbsp;&nbsp;→ writer.write(advice, paths)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# 写 optimization_advice.md + 日志</span>
</div>
</div>
<div class="section">
<h3>新增文件一览</h3>
<div class="mockup">
<div class="mockup-body" style="font-family:monospace;font-size:13px;line-height:1.9">
rag_eval/advisor/<br>
&nbsp;&nbsp;__init__.py&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#94a3b8">← 暴露 run_advisor(),是外部唯一入口</span><br>
&nbsp;&nbsp;rules.py&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#94a3b8">← 纯函数,无 LLM可单独单测</span><br>
&nbsp;&nbsp;llm_analyzer.py <span style="color:#94a3b8">← 接收 llm 实例 + 诊断结构 → 中文 Markdown</span><br>
&nbsp;&nbsp;writer.py&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color:#94a3b8">← 写 optimization_advice.md打日志摘要</span><br>
<br>
rag_eval/shared/models.py&nbsp;&nbsp;&nbsp;<span style="color:#fbbf24">← 修改Scenario 加 optimization_advisor 字段</span><br>
rag_eval/config/schema.py&nbsp;&nbsp;&nbsp;<span style="color:#fbbf24">← 修改ScenarioModel 加字段</span><br>
rag_eval/execution/runner.py&nbsp;<span style="color:#fbbf24">← 修改:末尾加 3 行调用</span><br>
rag_eval/reporting/artifacts.py <span style="color:#fbbf24">← 修改RunArtifactPaths 加 advice_md 路径</span>
</div>
</div>
</div>
<div class="section">
<h3>输出产物</h3>
<div class="mockup">
<div class="mockup-body" style="font-family:monospace;font-size:13px;line-height:1.9">
outputs/online/siemens-pdf-question-bank/&lt;run_id&gt;/<br>
&nbsp;&nbsp;scenario.snapshot.yaml<br>
&nbsp;&nbsp;scores.csv<br>
&nbsp;&nbsp;invalid.csv<br>
&nbsp;&nbsp;summary.md<br>
&nbsp;&nbsp;metadata.json<br>
&nbsp;&nbsp;<span style="color:#4ade80;font-weight:bold">optimization_advice.md&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;← 新增</span>
</div>
</div>
</div>
<p style="margin-top:1rem;color:#94a3b8;font-size:13px">整体看起来 OK 吗?这是新模块与现有链路的接入方式。</p>