Commit Graph

6 Commits

Author SHA1 Message Date
wangwei
f8e308b7dc fix: use max_tokens=8 for chat model connectivity test
max_tokens=1 triggers 'min-output limit' errors on gpt-5.x models.
Using 8 tokens is still cheap but satisfies all known model minimums.
Falls back to max_completion_tokens=8 if max_tokens is not supported.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-23 15:03:27 +08:00
wangwei
fb420656ec fix: use /embeddings endpoint for embedding models in connectivity test
text-embedding-* and other embedding models must call /embeddings not
/chat/completions. Added _is_embedding_model() heuristic that checks model
name keywords to route to the correct endpoint automatically.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-23 14:53:32 +08:00
wangwei
05419db1f9 fix: support max_completion_tokens for newer models (gpt-5.x) in connectivity test
Newer OpenAI models (gpt-5.4 etc.) reject max_tokens and require
max_completion_tokens. Try max_completion_tokens first, fall back to
max_tokens for older models / compatible APIs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-23 14:51:28 +08:00
wangwei
ac410e7a5d feat: add detailed logging to all API routes and global access log middleware
Each API module now logs:
- evaluations: trigger (scenario path, task_id), status polls, list
- runs: list (count), detail (run_id, metrics, sample counts)
- scenarios: list (total, valid, error counts)
- pipeline: submit (docs_path, models, max_docs), status polls, list
- llm_profiles: CRUD ops (name, model, id), probe/test (model, ok, latency), apply (patched fields)
- score: already had per-request logging

Global middleware (webapp.access logger):
- Every API request: METHOD path -> status (latency_ms) at INFO
- Static file requests demoted to DEBUG to reduce noise

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-23 10:35:00 +08:00
wangwei
ce0d2291b0 feat: yaml_patcher and ProfileApplyRequest support metric_weights and doc_weights
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-18 17:02:21 +08:00
wangwei
b19054bd66 feat: add /api/llm-profiles CRUD router 2026-06-16 16:18:40 +08:00