initial commits

2026-06-26 17:02:21 +08:00
commit 2851fa01cf
28 changed files with 2411 additions and 0 deletions

docs/AI_NEXUS_CLAUDE.md Normal file

@@ -0,0 +1,243 @@
# AI Nexus Claude Documentation
Source: `AI Nexus Product Documentation _ Models _ Anthropic _ Claude _ One Developer Portal.pdf`
Extracted locally on 2026-06-26.
## Important Notice
AI Nexus temporarily does not support the Anthropic Messages API. Providers had to be changed on short notice due to governance regulations, and AI Nexus is actively working on enabling the Messages API. Until then, users are asked to wait for updates.
Anthropic may restrict access to services by region. Users or their organizational units must verify whether country-specific access is permitted before using the service.
## Overview
Claude is a family of AI assistants created by Anthropic. It is designed to be helpful, honest, and safe in conversations. Claude can answer questions, write and edit text, summarize documents, and help with coding in a natural chat-like way.
Claude models are available in different sizes:
- A fast lightweight model for simple tasks.
- A balanced model for everyday use.
- A more powerful model for complex reasoning and analysis.
Claude is used in chatbots, research tools, and workplace assistants where reliability and clear, thoughtful responses matter. Claude models are often strong choices for coding tasks.
## Access
To request access, refer to the internal "How to get access to the models" documentation.
The documented endpoint is:
```text
https://genai-nexus.api.corpinter.net
```
The service uses AWS Bedrock Runtime with an internal Nexus endpoint.
## Available Models
| Model | Production |
| --- | --- |
| `claude-sonnet-4.6` | Yes |
| `claude-opus-4.6` | Yes |
| `claude-haiku-4.5` | Yes |
AI Nexus recommends `claude-sonnet-4.6` as the cost-effective default. According to AI Nexus, it matches or exceeds `claude-opus-4.6` on most benchmarks at lower cost. Use Opus only when the use case specifically requires Opus-level capabilities.
## Converse API Workaround
AI Nexus uses Anthropic models provided by AWS Bedrock. Because the Anthropic Messages API is not currently supported, AI Nexus recommends using the AWS Converse API.
### Python Text Generation
```python
import os

import boto3

# Alternatively, set the key in the shell:
# export AWS_BEARER_TOKEN_BEDROCK=${your-bedrock-api-key}
os.environ["AWS_BEARER_TOKEN_BEDROCK"] = "<nexus-api-key>"

client = boto3.client(
    service_name="bedrock-runtime",
    endpoint_url="https://genai-nexus.api.corpinter.net",
    region_name="nexus",  # required but internally overridden
)

response = client.converse(
    modelId="claude-sonnet-4",
    messages=[{"role": "user", "content": [{"text": "Hello"}]}],
)

# The first content block of the assistant message holds the generated text.
print(response["output"]["message"]["content"][0]["text"])
```
### HTTP Text Generation
```bash
curl https://genai-nexus.api.corpinter.net/model/<model-id>/converse \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  -d '{
    "model": "claude-sonnet-4",
    "messages": [
      {
        "role": "user",
        "content": [{"text": "Hello, Claude"}]
      }
    ]
  }'
```
## Streaming Responses
Streaming returns partial output as soon as tokens or text deltas are produced. This shortens the time until the first character appears and is useful for chat UIs, live drafting, assistants, and long answers.
Core event types:
- `contentBlockDelta`: contains a delta, usually `delta.text`, with newly generated text.
- `messageStop`: signals the end of generation. Inspect `stopReason` if needed.
- `contentBlockStart` / `contentBlockStop`: structural boundaries that can appear with tool use or multimodal output.
- `metadata`: optional interim metadata such as token counts.
- `error`: error event. The caller should abort the current display and handle retry or logging.
### Python Streaming
```python
import os

import boto3

# Alternatively, set the key in the shell:
# export AWS_BEARER_TOKEN_BEDROCK=${your-bedrock-api-key}
os.environ["AWS_BEARER_TOKEN_BEDROCK"] = "<nexus-api-key>"

client = boto3.client(
    service_name="bedrock-runtime",
    endpoint_url="https://genai-nexus.api.corpinter.net",
    region_name="nexus",  # required but internally overridden
)

response = client.converse_stream(
    modelId="claude-sonnet-4",
    messages=[
        {
            "role": "user",
            "content": [{"text": "What is the meaning of life?"}],
        }
    ],
)

stream = response.get("stream")
collected = []  # text deltas in arrival order
for event in stream:
    if "contentBlockDelta" in event:
        delta = event["contentBlockDelta"]["delta"]
        text = delta.get("text")
        if text:
            collected.append(text)
            print(text, end="", flush=True)
    if "messageStop" in event:
        # Generation has ended; stop reading the stream.
        break

# Join the deltas to get the complete answer.
full_text = "".join(collected)
```
### HTTP Streaming
Raw HTTP streaming uses chunked transfer. This request initiates streaming generation and receives incremental chunks:
```bash
curl -N \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  https://genai-nexus.api.corpinter.net/model/<model-id>/converse-stream \
  -d '{
    "model": "claude-sonnet-4",
    "messages": [
      {"role": "user", "content": [{"text": "Say hello"}]}
    ]
  }'
```
Simplified request schema:
```http
POST /model/<model-id>/converse-stream HTTP/1.1
Content-Type: application/json
```
## Image Understanding
Images are passed through the `messages` parameter using an image content block.
### HTTP Image Example
```bash
# Base64 encode the image into IMG_B64.
# Use raw bytes, not a data URI.
IMG_B64=$(base64 -i ./image.png | tr -d '\n')
curl \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  https://genai-nexus.api.corpinter.net/model/<model-id>/converse \
  -d '{
    "model": "claude-sonnet-4",
    "messages": [
      {
        "role": "user",
        "content": [
          {"text": "Describe the image"},
          {
            "image": {
              "format": "png",
              "source": {"bytes": "'$IMG_B64'"}
            }
          }
        ]
      }
    ]
  }'
```
### Python Image Example
```python
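image_filepath = "./image.png"  # example path; replace with your own file

# Converse expects a format name such as "png" or "jpeg", so normalize "jpg".
image_ext = image_filepath.rsplit(".", 1)[-1].lower()
if image_ext == "jpg":
    image_ext = "jpeg"

with open(image_filepath, "rb") as f:
    image = f.read()  # pass raw bytes here, not base64 text

messages = [
    {
        "role": "user",
        "content": [
            {"text": "Describe the image"},
            {
                "image": {
                    "format": image_ext,
                    "source": {
                        "bytes": image,
                    },
                }
            },
        ],
    }
]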
```
## Implications for `nexus-claude-api`
Claude Code expects Anthropic Messages API behavior, while AI Nexus currently documents Converse API behavior. The local proxy should therefore:
- Expose an Anthropic-compatible `/v1/messages` endpoint locally.
- Translate Anthropic Messages requests to Bedrock Converse requests.
- Translate Bedrock Converse and Converse Stream responses back to Anthropic-compatible responses.
- Use `AWS_BEARER_TOKEN_BEDROCK` or `NEXUS_API_KEY` as the outbound Nexus credential.
- Avoid changing Claude Code workflows or requiring users to call Converse directly.

docs/PRD.md Normal file

@@ -0,0 +1,86 @@
# nexus-claude-api PRD
## Product Overview
`nexus-claude-api` is a local Python API proxy that lets Claude Code use company-approved Claude models through AI Nexus.
AI Nexus currently documents that the Anthropic Messages API is temporarily unsupported and recommends AWS Bedrock Converse API as the workaround. Claude Code expects an Anthropic-compatible Messages API. This project bridges that gap locally.
Reference: [AI_NEXUS_CLAUDE.md](AI_NEXUS_CLAUDE.md)
## Users
Target users are internal developers who:
- Have an AI Nexus API key.
- Want to use Claude Code with company-approved Claude models.
- Work primarily in local development environments.
- Need Claude Code workflows such as code editing, tool use, and streaming responses.
## Goals
- Provide a local Anthropic-compatible API for Claude Code.
- Start locally with a command similar to `copilot-api`.
- Convert Anthropic Messages requests to AI Nexus Bedrock Converse requests.
- Convert Nexus responses back to Anthropic Messages responses.
- Support text, streaming, tools, tool results, and image inputs.
- Provide model discovery and token-count compatibility endpoints.
- Keep credentials local and avoid logging secrets.
## Non-goals
The first version will not include:
- OpenAI-compatible `/v1/chat/completions`.
- A dashboard.
- Nexus API key provisioning.
- Multi-user hosted gateway deployment.
- Database persistence.
- Direct Anthropic public API forwarding.
## Supported Models
The local API exposes:
- `claude-sonnet-4.6`
- `claude-opus-4.6`
- `claude-haiku-4.5`
Defaults:
- Main model: `claude-sonnet-4.6`
- Small model: `claude-haiku-4.5`
`claude-sonnet-4.6` is the default because the AI Nexus documentation recommends it as the cost-effective default for most use cases.
## User Stories
- As a developer, I can store my Nexus key in ignored local config or set `NEXUS_API_KEY`, then run `nexus-claude-api start --claude-code` to get a Claude Code launch command.
- As a Claude Code user, I can run coding workflows through local `http://127.0.0.1:4141`.
- As a Claude Code user, I can receive streaming model output.
- As a Claude Code user, I can use tool calls and tool results.
- As a multimodal user, I can send images through Claude-compatible image content blocks.
- As a developer debugging setup, I can enable verbose logs without exposing tokens.
## Acceptance Criteria
- `uv run nexus-claude-api start --port 4141 --claude-code` starts a local server.
- The server binds to `127.0.0.1` by default.
- Missing Nexus credentials fail fast with a clear error.
- `GET /health` returns healthy status.
- `GET /v1/models` returns the supported Claude models.
- `POST /v1/messages` works for non-streaming text generation.
- `POST /v1/messages` works for streaming text generation.
- Tool use and tool result conversion are covered by tests.
- Image block conversion is covered by tests.
- `POST /v1/messages/count_tokens` returns an Anthropic-compatible token count response.
- Claude Code can be launched with the printed environment command.
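The `count_tokens` criterion above could be met with a local approximation when no tokenizer is available. The following is a minimal sketch, assuming roughly four characters per token; that ratio is a heuristic chosen for illustration and is not specified by the AI Nexus documentation. The function names `approximate_token_count` and `count_tokens_response` are hypothetical.
```python
import json
import math


def approximate_token_count(payload: dict) -> int:
    """Estimate input tokens for an Anthropic-style request body.

    Assumption: about 4 characters per token. This is a rough heuristic,
    not the tokenizer the model actually uses.
    """
    parts = []
    system = payload.get("system")
    if system:
        parts.append(system if isinstance(system, str) else json.dumps(system))
    for message in payload.get("messages", []):
        content = message.get("content", "")
        parts.append(content if isinstance(content, str) else json.dumps(content))
    total_chars = sum(len(part) for part in parts)
    # Never report zero tokens, even for an empty request.
    return max(1, math.ceil(total_chars / 4))


def count_tokens_response(payload: dict) -> dict:
    """Build the body for POST /v1/messages/count_tokens."""
    return {"input_tokens": approximate_token_count(payload)}
```
An approximation like this keeps the endpoint free of network calls, at the cost of counts that can differ noticeably from the backend's real usage numbers.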
## Security Requirements
- Do not persist API keys automatically.
- If the user chooses hardcoded local configuration, keep it in ignored `nexus-claude-api.local.json`.
- Do not print or log API keys.
- Redact authorization headers in debug logs.
- Bind locally by default.
- Allow `0.0.0.0` only when explicitly requested.
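The redaction requirements above could be implemented with two small helpers applied before anything reaches the debug log. This is a sketch under assumptions: the helper names, the `<redacted>` placeholder, and the set of sensitive header names are illustrative choices, not part of the project.
```python
import re

# Assumption: these header names cover the credentials the proxy may see.
SENSITIVE_HEADERS = {"authorization", "x-api-key", "proxy-authorization"}
_BEARER_RE = re.compile(r"(?i)(bearer\s+)[^\s\"']+")


def redact_headers(headers: dict) -> dict:
    """Return a copy of the headers that is safe to write to debug logs."""
    return {
        name: "<redacted>" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }


def redact_text(text: str) -> str:
    """Mask bearer tokens embedded in free-form log text."""
    return _BEARER_RE.sub(r"\1<redacted>", text)
```
Redacting at the point of logging, rather than trusting each call site to omit secrets, keeps verbose mode safe even when new log statements are added later.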

docs/REQUIREMENTS_DESIGN.md Normal file

@@ -0,0 +1,217 @@
# nexus-claude-api Requirements Design
## Technical Stack
- Python `>=3.11`
- Package manager: `uv`
- Web framework: FastAPI
- ASGI server: Uvicorn
- Nexus client: boto3 Bedrock Runtime client
- Validation: Pydantic
- CLI: standard library `argparse`
- Tests: pytest
All project dependencies must be managed by `uv` and the project virtual environment. Do not install global Python dependencies.
## Project Structure
```text
nexus-claude-api/
  pyproject.toml
  README.md
  docs/
    AI_NEXUS_CLAUDE.md
    PRD.md
    REQUIREMENTS_DESIGN.md
  src/
    nexus_claude_api/
      __init__.py
      __main__.py
      cli.py
      config.py
      errors.py
      models.py
      nexus_client.py
      server.py
      shell.py
      tokens.py
      routes/
        health.py
        messages.py
        models.py
      translators/
        anthropic_to_bedrock.py
        bedrock_to_anthropic.py
        stream.py
  tests/
```
## CLI Contract
Primary command:
```powershell
uv run nexus-claude-api start --port 4141 --claude-code
```
Options:
- `--host`: default `127.0.0.1`
- `--port`, `-p`: default `4141`
- `--endpoint-url`: default `https://genai-nexus.api.corpinter.net`
- `--api-key`: optional; fallback to ignored local config, `NEXUS_API_KEY`, then `AWS_BEARER_TOKEN_BEDROCK`
- `--model`: default `claude-sonnet-4.6`
- `--small-model`: default `claude-haiku-4.5`
- `--claude-code`: print Claude Code launch command
- `--verbose`, `-v`: debug logging without secrets
When `--claude-code` is used, print a PowerShell command that sets:
- `ANTHROPIC_BASE_URL`
- `ANTHROPIC_AUTH_TOKEN`
- `ANTHROPIC_MODEL`
- `ANTHROPIC_DEFAULT_SONNET_MODEL`
- `ANTHROPIC_DEFAULT_OPUS_MODEL`
- `ANTHROPIC_SMALL_FAST_MODEL`
- `ANTHROPIC_DEFAULT_HAIKU_MODEL`
- `DISABLE_NON_ESSENTIAL_MODEL_CALLS`
- `CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC`
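Generating the launch command from the variables above might look like the sketch below. Several values are assumptions made for illustration: pointing the Opus default at the main model, using `"1"` for the two disable flags, and ending the command with `claude`. The function name `build_claude_code_command` is hypothetical.
```python
def build_claude_code_command(
    host: str = "127.0.0.1",
    port: int = 4141,
    model: str = "claude-sonnet-4.6",
    small_model: str = "claude-haiku-4.5",
) -> str:
    """Return a PowerShell one-liner that points Claude Code at the proxy."""
    env = {
        "ANTHROPIC_BASE_URL": f"http://{host}:{port}",
        "ANTHROPIC_AUTH_TOKEN": "dummy",  # placeholder, never the Nexus key
        "ANTHROPIC_MODEL": model,
        "ANTHROPIC_DEFAULT_SONNET_MODEL": model,
        "ANTHROPIC_DEFAULT_OPUS_MODEL": model,  # assumption: reuse main model
        "ANTHROPIC_SMALL_FAST_MODEL": small_model,
        "ANTHROPIC_DEFAULT_HAIKU_MODEL": small_model,
        "DISABLE_NON_ESSENTIAL_MODEL_CALLS": "1",  # assumption: "1" enables
        "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1",  # assumption
    }
    # PowerShell sets process environment variables with $env:NAME = "value".
    assignments = "; ".join(
        f'$env:{name} = "{value}"' for name, value in env.items()
    )
    return f"{assignments}; claude"
```
Printing a single line lets the user paste it into a second terminal while the server keeps running in the first.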
## HTTP API Contract
Expose:
- `GET /`
- `GET /health`
- `GET /v1/models`
- `POST /v1/messages`
- `POST /v1/messages/count_tokens`
The `--claude-code` launch command sets `ANTHROPIC_AUTH_TOKEN` to `dummy` because Claude Code expects an Anthropic auth token variable to exist. The value is not the Nexus key.
Inbound authentication headers are accepted for compatibility but are not validated by default because the service is local. Outbound Nexus authentication uses, in order of precedence, `--api-key`, the ignored local `nexus-claude-api.local.json`, `NEXUS_API_KEY`, or `AWS_BEARER_TOKEN_BEDROCK`.
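The outbound credential lookup could be sketched as follows, using the fallback order given for `--api-key` in the CLI options. The function name `resolve_api_key` and the `api_key` field inside the local JSON file are assumptions; the source does not define the file's schema.
```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_api_key(
    cli_api_key: Optional[str] = None,
    config_path: Path = Path("nexus-claude-api.local.json"),
    environ: Mapping[str, str] = os.environ,
) -> str:
    """Return the Nexus key from the first source that provides one."""
    if cli_api_key:
        return cli_api_key
    if config_path.is_file():
        config = json.loads(config_path.read_text(encoding="utf-8"))
        key = config.get("api_key")  # hypothetical field name
        if key:
            return key
    for name in ("NEXUS_API_KEY", "AWS_BEARER_TOKEN_BEDROCK"):
        if environ.get(name):
            return environ[name]
    # Fail fast so a missing key surfaces at startup, not on first request.
    raise RuntimeError(
        "No Nexus API key found. Pass --api-key, create "
        "nexus-claude-api.local.json, or set NEXUS_API_KEY."
    )
```
Passing `environ` and `config_path` as parameters keeps the function easy to test without touching the real environment.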
## Model Mapping
Public local model IDs:
- `claude-sonnet-4.6`
- `claude-opus-4.6`
- `claude-haiku-4.5`
Backend IDs are resolved through a mapping table. By default each public ID maps to the same backend ID, and these common short aliases are also accepted:
- `claude-sonnet-4` -> `claude-sonnet-4.6`
- `claude-opus-4` -> `claude-opus-4.6`
- `claude-haiku-4` -> `claude-haiku-4.5`
If Nexus requires different backend IDs, update the mapping without changing Claude Code-facing model IDs.
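A minimal sketch of the mapping table described above. The names `MODEL_ALIASES`, `BACKEND_MODEL_IDS`, and `resolve_backend_model` are hypothetical, and raising `ValueError` for unknown models is an assumed behavior.
```python
SUPPORTED_MODELS = ("claude-sonnet-4.6", "claude-opus-4.6", "claude-haiku-4.5")

# Short aliases accepted from clients, resolved to public model IDs.
MODEL_ALIASES = {
    "claude-sonnet-4": "claude-sonnet-4.6",
    "claude-opus-4": "claude-opus-4.6",
    "claude-haiku-4": "claude-haiku-4.5",
}

# Public ID -> backend ID. Identity by default; edit only this table if
# Nexus requires different backend IDs.
BACKEND_MODEL_IDS = {model: model for model in SUPPORTED_MODELS}


def resolve_backend_model(requested: str) -> str:
    """Map a client-supplied model name to the ID sent to Nexus."""
    public_id = MODEL_ALIASES.get(requested, requested)
    try:
        return BACKEND_MODEL_IDS[public_id]
    except KeyError:
        raise ValueError(f"Unsupported model: {requested!r}") from None
```
Keeping aliases and backend IDs in separate tables means a backend rename touches one dictionary and leaves the Claude Code-facing names unchanged.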
## Request Translation
Anthropic to Bedrock:
- `model` -> `modelId`
- `messages[].role` -> `role`
- string content -> `{ "text": "..." }`
- `{ "type": "text", "text": "..." }` -> `{ "text": "..." }`
- Anthropic image block -> Bedrock image block
- assistant `tool_use` -> Bedrock `toolUse`
- user `tool_result` -> Bedrock `toolResult`
- `system` -> Bedrock `system`
- `max_tokens`, `temperature`, `top_p` -> `inferenceConfig`
- `stop_sequences` -> `stopSequences`
- `tools` and `tool_choice` -> `toolConfig`
Unsupported content blocks should return `400 invalid_request_error`.
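The mapping above can be sketched as a pair of functions. This is an illustration under assumptions, not the project's implementation: the names `convert_block`, `anthropic_to_bedrock`, and `UnsupportedBlockError` are hypothetical, only base64 image sources are handled, and `tools` / `tool_choice` are left out for brevity.
```python
import base64


class UnsupportedBlockError(ValueError):
    """Raised for content the proxy cannot translate (maps to HTTP 400)."""


def convert_block(block) -> dict:
    """Translate one Anthropic content block to a Bedrock Converse block."""
    if isinstance(block, str):
        return {"text": block}
    kind = block.get("type")
    if kind == "text":
        return {"text": block["text"]}
    if kind == "image" and block["source"].get("type") == "base64":
        source = block["source"]
        return {
            "image": {
                "format": source["media_type"].split("/")[-1],  # image/png -> png
                "source": {"bytes": base64.b64decode(source["data"])},
            }
        }
    if kind == "tool_use":
        return {
            "toolUse": {
                "toolUseId": block["id"],
                "name": block["name"],
                "input": block["input"],
            }
        }
    if kind == "tool_result":
        content = block.get("content", "")
        items = [content] if isinstance(content, str) else content
        return {
            "toolResult": {
                "toolUseId": block["tool_use_id"],
                "content": [convert_block(item) for item in items],
                "status": "error" if block.get("is_error") else "success",
            }
        }
    raise UnsupportedBlockError(f"Unsupported content block type: {kind!r}")


def anthropic_to_bedrock(request: dict) -> dict:
    """Build keyword arguments for a Bedrock Converse call."""
    messages = []
    for message in request["messages"]:
        content = message["content"]
        blocks = [content] if isinstance(content, str) else content
        messages.append(
            {"role": message["role"], "content": [convert_block(b) for b in blocks]}
        )
    payload = {"modelId": request["model"], "messages": messages}

    system = request.get("system")
    if system:
        blocks = [system] if isinstance(system, str) else system
        payload["system"] = [convert_block(b) for b in blocks]

    inference = {}
    for source_key, target_key in (
        ("max_tokens", "maxTokens"),
        ("temperature", "temperature"),
        ("top_p", "topP"),
        ("stop_sequences", "stopSequences"),
    ):
        if request.get(source_key) is not None:
            inference[target_key] = request[source_key]
    if inference:
        payload["inferenceConfig"] = inference
    return payload
```
A route handler could catch `UnsupportedBlockError` and return the `400 invalid_request_error` response, so translation code stays free of HTTP concerns.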
## Response Translation
Bedrock to Anthropic:
- Bedrock text content -> Anthropic text block.
- Bedrock `toolUse` -> Anthropic `tool_use`.
- Bedrock usage -> Anthropic `usage`.
- Bedrock stop reason maps to Anthropic stop reason.
Stop reason mapping:
- `end_turn` -> `end_turn`
- `max_tokens` -> `max_tokens`
- `stop_sequence` -> `stop_sequence`
- `tool_use` -> `tool_use`
- unknown -> `end_turn`
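The response and stop reason mappings above could be combined as in the sketch below. The function name `bedrock_to_anthropic` is hypothetical, and the locally generated `msg_` identifier is an assumption, since the source does not say where the message ID comes from.
```python
import uuid

# Stop reasons listed above; anything else falls back to "end_turn".
STOP_REASONS = {
    "end_turn": "end_turn",
    "max_tokens": "max_tokens",
    "stop_sequence": "stop_sequence",
    "tool_use": "tool_use",
}


def bedrock_to_anthropic(response: dict, model: str) -> dict:
    """Translate a Bedrock Converse response to an Anthropic message."""
    content = []
    for block in response["output"]["message"]["content"]:
        if "text" in block:
            content.append({"type": "text", "text": block["text"]})
        elif "toolUse" in block:
            tool = block["toolUse"]
            content.append(
                {
                    "type": "tool_use",
                    "id": tool["toolUseId"],
                    "name": tool["name"],
                    "input": tool["input"],
                }
            )
    usage = response.get("usage", {})
    return {
        "id": f"msg_{uuid.uuid4().hex}",  # assumption: generated locally
        "type": "message",
        "role": "assistant",
        "model": model,
        "content": content,
        "stop_reason": STOP_REASONS.get(response.get("stopReason"), "end_turn"),
        "stop_sequence": None,
        "usage": {
            "input_tokens": usage.get("inputTokens", 0),
            "output_tokens": usage.get("outputTokens", 0),
        },
    }
```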
## Streaming Translation
Use `converse_stream`.
Translate Bedrock stream events to Anthropic SSE events:
- `messageStart` -> `message_start`
- `contentBlockStart` -> `content_block_start`
- `contentBlockDelta.delta.text` -> `content_block_delta` with `text_delta`
- tool input deltas -> `content_block_delta` with `input_json_delta`
- `contentBlockStop` -> `content_block_stop`
- `messageStop` -> `message_delta`, then `message_stop`
- `metadata.usage` -> usage update on final `message_delta`
- backend error -> `error`
SSE frame format:
```text
event: <event_type>
data: <json>
```
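Framing and one of the event translations could be sketched as below. The helper names `sse_frame` and `translate_text_delta` are hypothetical, and only the text delta case is shown; the other events in the list would follow the same pattern.
```python
import json
from typing import Optional


def sse_frame(event_type: str, data: dict) -> str:
    """Serialize one server-sent event; a blank line terminates the frame."""
    body = json.dumps(data, separators=(",", ":"))
    return f"event: {event_type}\ndata: {body}\n\n"


def translate_text_delta(event: dict) -> Optional[str]:
    """Translate a Bedrock text delta event, or return None for other events."""
    delta_event = event.get("contentBlockDelta")
    if not delta_event or "text" not in delta_event.get("delta", {}):
        return None
    return sse_frame(
        "content_block_delta",
        {
            "type": "content_block_delta",
            "index": delta_event.get("contentBlockIndex", 0),
            "delta": {"type": "text_delta", "text": delta_event["delta"]["text"]},
        },
    )
```
Serializing the JSON on a single line matters here, because a newline inside the `data:` field would split the frame.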
## Error Handling
Return Anthropic-compatible errors:
```json
{
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "message": "..."
  }
}
```
Status mapping:
- invalid request: `400`
- missing local Nexus credential: startup failure
- Nexus auth failure: `401` or `403`
- Nexus throttling: `429`
- Nexus network/timeout: `502` or `504`
- unexpected server error: `500`
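A small helper could pair each status in the mapping above with an Anthropic-style error body. This is a sketch: the function name `error_response` is hypothetical, and using `api_error` for `502` and `504` is an assumption, since the source only specifies the status codes.
```python
# HTTP status -> Anthropic error type.
ERROR_TYPES = {
    400: "invalid_request_error",
    401: "authentication_error",
    403: "permission_error",
    429: "rate_limit_error",
    500: "api_error",
    502: "api_error",  # assumption: upstream network failure
    504: "api_error",  # assumption: upstream timeout
}


def error_response(status: int, message: str) -> tuple[int, dict]:
    """Return the HTTP status and an Anthropic-compatible error body."""
    error_type = ERROR_TYPES.get(status, "api_error")
    body = {"type": "error", "error": {"type": error_type, "message": message}}
    return status, body
```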
## Testing
Unit tests:
- Minimal Anthropic text request -> Bedrock payload.
- System prompt conversion.
- Image block conversion.
- Tool config conversion.
- Tool use and tool result conversion.
- Bedrock text response -> Anthropic response.
- Bedrock tool response -> Anthropic tool response.
- Bedrock streaming events -> Anthropic SSE sequence.
- Token counting approximation.
Route tests:
- `GET /health`
- `GET /v1/models`
- `POST /v1/messages` non-stream
- `POST /v1/messages` stream
- `POST /v1/messages/count_tokens`
CLI tests:
- `nexus-claude-api --help`
- Claude Code command generation.
- Missing API key validation.