max_tokens=1 triggers 'min-output limit' errors on gpt-5.x models.
Using 8 tokens is still cheap but satisfies all known model minimums.
Falls back to max_completion_tokens=8 if max_tokens is not supported.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
text-embedding-* and other embedding models must call /embeddings not
/chat/completions. Added _is_embedding_model() heuristic that checks model
name keywords to route to the correct endpoint automatically.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Frontend test functionality was implemented but never committed to git.
Re-adds:
- profiles.js: testCard(), testForm(), _showTestResult(), test btn in renderCard
- api.js: testProfile(id) and probeConnectivity(body) methods
- index.html: 测试连通性 button + result div in profile form
- app.css: .btn-test and .profile-test-result styles
Backend /probe and /{id}/test endpoints were already present in llm_profiles.py.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
#view-apidocs has 'display: flex' in CSS which overrides the browser's
default '[hidden] { display: none }' user-agent style, causing the API
docs iframe to remain visible and bleed into the LLM config page.
Fix: add explicit '#view-apidocs[hidden] { display: none }' rule.
Also exclude apidocs from @media print to prevent iframe printing.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Newer setuptools (Linux) raises error when multiple top-level dirs are found.
Explicitly include only rag_eval/apps/webapp and exclude runtime data dirs.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Prevents CRLF line endings on .sh files which cause '/usr/bin/env: bash\r'
errors when running on Linux.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Each API module now logs:
- evaluations: trigger (scenario path, task_id), status polls, list
- runs: list (count), detail (run_id, metrics, sample counts)
- scenarios: list (total, valid, error counts)
- pipeline: submit (docs_path, models, max_docs), status polls, list
- llm_profiles: CRUD ops (name, model, id), probe/test (model, ok, latency), apply (patched fields)
- score: already had per-request logging
Global middleware (webapp.access logger):
- Every API request: METHOD path -> status (latency_ms) at INFO
- Static file requests demoted to DEBUG to reduce noise
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Swagger UI Try it out was sending the {summary, value} wrapper as request body
instead of just the value contents, causing 422 errors. The 'example' (singular)
key is correctly used as the schema-level example by Swagger UI.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Log incoming request (client, content-type, metrics, has_gt) on each /api/score call
- Log scoring result (latency, skipped metrics, scores) on success
- Register global RequestValidationError handler: logs url/content-type/errors
so 422 causes are visible in server log without checking HTTP response body
- Fix jsonable_encoder for exc.errors() to handle non-serializable ctx objects
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add detailed Chinese route docstring covering all 7 metrics, contexts format,
ground_truth optional behavior, and Bearer auth instructions
- Add 200 response content example for Swagger UI Try-it-out
- Bump app version to 0.3.0
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- app.js: hash-based router (#runs / #new / #profiles / #report/{runId})
- navigate() pushes history entries for back/forward support
- _restoreSession() reads hash on load and popstate
- sessionStorage fallback for same-tab refreshes
- run-card highlights selected run (.run-card.selected)
- runner.js: use App.navigate() for report redirect; persist lastRunId to sessionStorage
- index.html: report nav button starts disabled (enabled on run select/restore)
- app.css: .run-card.selected with petrol border + ring
Co-Authored-By: Claude <noreply@anthropic.com>