siemens_ragas

Author	SHA1	Message	Date
wangwei	b870ed8730	feat: make contexts optional in /api/score. When contexts is absent, metrics that require retrieved_contexts (faithfulness, context_recall, context_precision, noise_sensitivity) are automatically skipped and appear in skipped_metrics. Only answer_relevancy, factual_correctness, and semantic_similarity remain computable without contexts. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-24 14:42:03 +08:00
wangwei	a781ba1e4a	config: set default judge_model=gpt-5, embedding_model=text-embedding-3-small. gpt-5.4/5.5/5.2/5.4-mini/5.4-nano are incompatible with RAGAS 0.4.3 because they require max_completion_tokens instead of max_tokens. gpt-5 / gpt-4.1 support max_tokens and the json_object mode required by RAGAS. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-23 15:29:01 +08:00
wangwei	2ad2c1ea9d	docs: update /api/score example to use gpt-5.4 and text-embedding-3-small Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-23 15:11:34 +08:00
wangwei	1304fec1c4	fix: change ScoreRequest json_schema_extra from examples list to example dict. Swagger UI "Try it out" was sending the {summary, value} wrapper as the request body instead of just the value contents, causing 422 errors. The 'example' (singular) key is correctly used as the schema-level example by Swagger UI. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-23 10:03:46 +08:00
wangwei	761faf9c42	feat: add ScoreRequest/ScoreResponse models and SCORE_API_TOKEN setting Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-22 15:00:05 +08:00
wangwei	835614189e	feat: ScenarioInfo exposes metric_weights and doc_weights from YAML Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-18 17:05:26 +08:00
wangwei	ce0d2291b0	feat: yaml_patcher and ProfileApplyRequest support metric_weights and doc_weights Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-18 17:02:21 +08:00
wangwei	91c0dab4f9	fix(advisor): fix LLM API call, wire advice_markdown to webapp, update .env.example timeouts - llm_analyzer.py: use llm.langchain_llm.ainvoke() (correct RAGAS 0.4.3 API) - webapp/models.py: add advice_markdown field to ReportData - webapp/services/run_reader.py: add read_advice_markdown() reading optimization_advice.md - webapp/services/report_builder.py: pass advice_markdown into ReportData - .env.example: OPENAI_TIMEOUT_SECONDS 30→180, RAGAS_METRIC_TIMEOUT_SECONDS 45→300 Co-Authored-By: Claude <noreply@anthropic.com>	2026-06-16 17:12:32 +08:00
wangwei	b98af29449	feat: add LLMProfile pydantic models	2026-06-16 16:10:37 +08:00
wangwei	e89695e490	Add RAGAS evaluation web console (FastAPI + vanilla JS) - webapp/: FastAPI backend with runs/scenarios/evaluations API routers; services for run_reader, report_builder, scenario_scanner, task_manager (lazy ragas import — server boots even without ragas); Pydantic models - webapp/static/: single-page console (layout A: left-nav + main area); report detail with metric cards, Chart.js distribution histogram, grouping table, lowest-score sample review; trigger evaluation + log polling - webmain.py: uvicorn entry point (alongside existing main.py CLI) - start.bat: Windows one-click launcher with env checks and auto-browser open - rag_eval/datasets/: implement missing loader + normalizer modules (load_dataset_records, normalize_records) required by evaluator - scripts/seed_sample_run.py: generate realistic demo run artifacts - .gitignore: exclude datasets/ data files but keep rag_eval/datasets/ source Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>	2026-06-15 15:53:57 +08:00

10 Commits
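
Note on b870ed8730: the skipping rule described in that commit message can be sketched roughly as follows. This is a minimal illustration, not the repository's code. The metric names and the skipped_metrics idea come from the commit message; the helper name split_metrics, the two grouping constants, and the choice to treat an empty contexts list the same as a missing one are assumptions made for the example.

```python
from typing import List, Optional, Sequence, Tuple

# Metric names are taken from the commit message. The constants and the
# helper below are hypothetical and only illustrate the described rule.
CONTEXT_METRICS = (
    "faithfulness",
    "context_recall",
    "context_precision",
    "noise_sensitivity",
)
CONTEXT_FREE_METRICS = (
    "answer_relevancy",
    "factual_correctness",
    "semantic_similarity",
)


def split_metrics(
    requested: Sequence[str],
    contexts: Optional[Sequence[str]],
) -> Tuple[List[str], List[str]]:
    """Return (runnable, skipped) metric names for one score request."""
    # Assumption: an empty list counts as "no contexts", same as None.
    has_contexts = bool(contexts)
    runnable: List[str] = []
    skipped: List[str] = []
    for name in requested:
        if name in CONTEXT_METRICS and not has_contexts:
            # Needs retrieved_contexts, so it would be reported as skipped.
            skipped.append(name)
        else:
            runnable.append(name)
    return runnable, skipped


if __name__ == "__main__":
    runnable, skipped = split_metrics(
        ["faithfulness", "answer_relevancy", "context_recall"],
        contexts=None,
    )
    print("runnable:", runnable)
    print("skipped_metrics:", skipped)
```

With contexts supplied, every requested metric ends up in the runnable list and the skipped list is empty. Order is preserved so a response can report metrics in the order they were requested.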