409 lines
8.9 KiB
Markdown
409 lines
8.9 KiB
Markdown
|
|
# Multi-Agent Software Delivery System
|
|||
|
|
|
|||
|
|
基于 CrewAI + Qwen3.5-flash 的多智能体软件交付系统,支持 SSE 实时推送。
|
|||
|
|
|
|||
|
|
## 📋 功能特性
|
|||
|
|
|
|||
|
|
- **多智能体协作**: 4 个专业 Agent 协同完成软件交付
|
|||
|
|
- ProductManager: 产品需求分析
|
|||
|
|
- QAEngineer: 测试计划制定
|
|||
|
|
- SoftwareDeveloper: 技术方案设计
|
|||
|
|
- Coordinator: 质量审核与交付
|
|||
|
|
|
|||
|
|
- **实时通信**: 基于 SSE 协议实时推送执行日志
|
|||
|
|
- **异步处理**: 任务异步启动,不阻塞 API 响应
|
|||
|
|
- **并发支持**: 多用户同时请求互不干扰
|
|||
|
|
|
|||
|
|
## 🔑 关键技术点
|
|||
|
|
|
|||
|
|
### 1. SSE 数据格式设计
|
|||
|
|
|
|||
|
|
统一的 JSON 格式,便于前端解析:
|
|||
|
|
|
|||
|
|
```json
|
|||
|
|
{
|
|||
|
|
"task_id": "550e8400-e29b...",
|
|||
|
|
"sequence": 1,
|
|||
|
|
"agent_name": "ProductManager",
|
|||
|
|
"event_type": "thought",
|
|||
|
|
"content": "正在分析用户需求,提取关键指标...",
|
|||
|
|
"timestamp": "2023-10-27T10:00:00Z"
|
|||
|
|
}
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
**字段说明**:
|
|||
|
|
- `task_id`: 任务唯一标识,用于区分不同用户的请求
|
|||
|
|
- `sequence`: 序列号(每个 task_id 独立递增),用于检测消息丢失
|
|||
|
|
- `agent_name`: 发送事件的 Agent 名称
|
|||
|
|
- `event_type`: 事件类型(start/thought/action/output/end/error)
|
|||
|
|
- `content`: 事件内容
|
|||
|
|
- `timestamp`: UTC 时间戳(ISO 8601 格式)
|
|||
|
|
|
|||
|
|
### 2. 并发处理逻辑
|
|||
|
|
|
|||
|
|
**核心挑战**:CrewAI 默认是同步运行的,而 FastAPI 和 SSE 需要异步。
|
|||
|
|
|
|||
|
|
**解决方案**:
|
|||
|
|
|
|||
|
|
```python
|
|||
|
|
# 在 TaskStreamQueue 中实现线程安全的消息发布
|
|||
|
|
def put_nowait(self, event: StreamEvent) -> bool:
|
|||
|
|
"""
|
|||
|
|
从同步线程(如 CrewAI 事件处理器)安全地发布事件
|
|||
|
|
|
|||
|
|
使用 run_coroutine_threadsafe 将协程提交到事件循环执行
|
|||
|
|
这是实现 CrewAI(同步)与 SSE(异步)集成的关键
|
|||
|
|
"""
|
|||
|
|
future = asyncio.run_coroutine_threadsafe(
|
|||
|
|
self.queue.put(event),
|
|||
|
|
self._loop
|
|||
|
|
)
|
|||
|
|
future.result(timeout=5.0)
|
|||
|
|
return True
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
**关键实现**:
|
|||
|
|
1. 使用 `asyncio.Queue` 实现异步非阻塞消息队列
|
|||
|
|
2. 通过 `asyncio.Lock` 保证并发安全
|
|||
|
|
3. 每个 `task_id` 独立队列,实现任务隔离
|
|||
|
|
4. 使用 `run_coroutine_threadsafe` 从同步线程安全发布事件
|
|||
|
|
5. 确保 `stream_manager` 能安全地在线程间传递消息
|
|||
|
|
|
|||
|
|
### 3. 多任务隔离机制
|
|||
|
|
|
|||
|
|
```python
|
|||
|
|
class StreamManager:
|
|||
|
|
"""全局流管理器 - 管理所有任务的 SSE 流"""
|
|||
|
|
|
|||
|
|
def __init__(self):
|
|||
|
|
# task_id -> TaskStreamQueue 映射
|
|||
|
|
self.streams: Dict[str, TaskStreamQueue] = {}
|
|||
|
|
self._lock = asyncio.Lock() # 并发控制
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
- 每个 `task_id` 对应独立的 `TaskStreamQueue`
|
|||
|
|
- 使用 `asyncio.Lock` 保护字典操作
|
|||
|
|
- 定期清理已完成的流(默认每小时)
|
|||
|
|
- 序列号计数器按 `task_id` 独立维护
|
|||
|
|
|
|||
|
|
## 🚀 快速开始
|
|||
|
|
|
|||
|
|
### 1. 安装依赖
|
|||
|
|
|
|||
|
|
```bash
|
|||
|
|
pip install -r requirements.txt
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### 2. 测试模块
|
|||
|
|
|
|||
|
|
运行测试脚本验证所有模块正常:
|
|||
|
|
|
|||
|
|
```bash
|
|||
|
|
python test_import.py
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
预期输出:
|
|||
|
|
```
|
|||
|
|
[OK] Module Imports
|
|||
|
|
[OK] Agent Creation
|
|||
|
|
[OK] Stream Manager
|
|||
|
|
[OK] API Endpoints
|
|||
|
|
|
|||
|
|
[SUCCESS] All tests passed! System is ready.
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### 3. 配置环境变量
|
|||
|
|
|
|||
|
|
复制 `.env.example` 为 `.env` 并填入 DashScope API Key:
|
|||
|
|
|
|||
|
|
```bash
|
|||
|
|
cp .env.example .env
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
编辑 `.env` 文件:
|
|||
|
|
```
|
|||
|
|
DASHSCOPE_API_KEY=sk-your-actual-api-key
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
获取 API Key: https://dashscope.console.aliyun.com/
|
|||
|
|
|
|||
|
|
### 4. 启动服务
|
|||
|
|
|
|||
|
|
```bash
|
|||
|
|
python main.py
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
或使用 uvicorn:
|
|||
|
|
```bash
|
|||
|
|
uvicorn main:app --host 0.0.0.0 --port 8000 --reload
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### 4. 访问服务
|
|||
|
|
|
|||
|
|
- **API 文档**: http://localhost:8000/docs
|
|||
|
|
- **测试 UI**: http://localhost:8000/test-ui
|
|||
|
|
|
|||
|
|
## 📡 API 接口
|
|||
|
|
|
|||
|
|
### POST /api/run_task
|
|||
|
|
|
|||
|
|
启动多智能体任务
|
|||
|
|
|
|||
|
|
**请求体**:
|
|||
|
|
```json
|
|||
|
|
{
|
|||
|
|
"user_requirement": "开发一个在线商城系统...",
|
|||
|
|
"skip_confirmation": true
|
|||
|
|
}
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
**响应**:
|
|||
|
|
```json
|
|||
|
|
{
|
|||
|
|
"task_id": "550e8400-e29b-41d4-a716-446655440000",
|
|||
|
|
"status": "started",
|
|||
|
|
"message": "任务已启动..."
|
|||
|
|
}
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### GET /api/stream/{task_id}
|
|||
|
|
|
|||
|
|
SSE 端点,订阅任务执行日志
|
|||
|
|
|
|||
|
|
**事件格式**:
|
|||
|
|
```json
|
|||
|
|
{
|
|||
|
|
"agent": "ProductManager",
|
|||
|
|
"type": "thought",
|
|||
|
|
"content": "正在分析用户需求...",
|
|||
|
|
"timestamp": "2024-01-01T12:00:00",
|
|||
|
|
"task_id": "uuid"
|
|||
|
|
}
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
**事件类型**:
|
|||
|
|
- `start`: 任务开始
|
|||
|
|
- `agent_start`: Agent 开始执行
|
|||
|
|
- `thought`: Agent 思考过程
|
|||
|
|
- `action`: Agent 执行动作
|
|||
|
|
- `output`: Agent 输出结果
|
|||
|
|
- `step_end`: 步骤完成
|
|||
|
|
- `end`: 任务结束
|
|||
|
|
- `error`: 发生错误
|
|||
|
|
|
|||
|
|
### GET /api/task/{task_id}/status
|
|||
|
|
|
|||
|
|
查询任务状态
|
|||
|
|
|
|||
|
|
### GET /api/streams
|
|||
|
|
|
|||
|
|
列出所有活跃的 SSE 流
|
|||
|
|
|
|||
|
|
## 🧪 使用示例
|
|||
|
|
|
|||
|
|
### 使用 curl 测试
|
|||
|
|
|
|||
|
|
```bash
|
|||
|
|
# 1. 启动任务
|
|||
|
|
curl -X POST http://localhost:8000/api/run_task \
|
|||
|
|
-H "Content-Type: application/json" \
|
|||
|
|
-d '{
|
|||
|
|
"user_requirement": "开发一个简单的待办事项应用",
|
|||
|
|
"skip_confirmation": true
|
|||
|
|
}'
|
|||
|
|
|
|||
|
|
# 2. 订阅 SSE 流 (使用返回的 task_id)
|
|||
|
|
curl -N http://localhost:8000/api/stream/{task_id}
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### Python 客户端示例
|
|||
|
|
|
|||
|
|
```python
|
|||
|
|
import requests
|
|||
|
|
import json
|
|||
|
|
|
|||
|
|
# 启动任务
|
|||
|
|
response = requests.post(
|
|||
|
|
'http://localhost:8000/api/run_task',
|
|||
|
|
json={
|
|||
|
|
'user_requirement': '开发一个博客系统',
|
|||
|
|
'skip_confirmation': True
|
|||
|
|
}
|
|||
|
|
)
|
|||
|
|
task_data = response.json()
|
|||
|
|
task_id = task_data['task_id']
|
|||
|
|
|
|||
|
|
# 订阅 SSE 流
|
|||
|
|
import eventstream
|
|||
|
|
|
|||
|
|
with requests.get(
|
|||
|
|
f'http://localhost:8000/api/stream/{task_id}',
|
|||
|
|
stream=True
|
|||
|
|
) as r:
|
|||
|
|
for line in r.iter_lines():
|
|||
|
|
if line:
|
|||
|
|
data = line.decode('utf-8').replace('data: ', '')
|
|||
|
|
event = json.loads(data)
|
|||
|
|
print(f"[{event['agent']}] {event['type']}: {event['content']}")
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
## 🏗️ 项目结构
|
|||
|
|
|
|||
|
|
```
|
|||
|
|
.
|
|||
|
|
├── main.py # FastAPI 入口和路由
|
|||
|
|
├── crew_factory.py # CrewAI 工厂和事件处理器
|
|||
|
|
├── agents_config.py # Agent 配置和 Prompt 模板
|
|||
|
|
├── stream_manager.py # SSE 流管理和消息队列
|
|||
|
|
├── requirements.txt # Python 依赖
|
|||
|
|
├── test_import.py # 模块测试脚本
|
|||
|
|
├── Dockerfile # Docker 镜像构建文件
|
|||
|
|
├── docker-compose.yml # Docker Compose 配置
|
|||
|
|
├── nginx.conf # Nginx 反向代理配置
|
|||
|
|
├── .env.example # 环境变量示例
|
|||
|
|
└── README.md # 本文档
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
## ⚙️ 配置说明
|
|||
|
|
|
|||
|
|
### LLM 配置
|
|||
|
|
|
|||
|
|
在 `agents_config.py` 中配置:
|
|||
|
|
|
|||
|
|
```python
|
|||
|
|
QWEN_MODEL_CONFIG = {
|
|||
|
|
"model": "qwen-plus", # Qwen3.5-flash
|
|||
|
|
"base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
|
|||
|
|
"api_key_env": "DASHSCOPE_API_KEY",
|
|||
|
|
}
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### Agent 角色定制
|
|||
|
|
|
|||
|
|
在 `agents_config.py` 中修改各 Agent 的:
|
|||
|
|
- `role`: 角色名称
|
|||
|
|
- `goal`: 目标
|
|||
|
|
- `backstory`: 背景描述
|
|||
|
|
- `TASK_TEMPLATES`: 任务 Prompt 模板
|
|||
|
|
|
|||
|
|
## 🔧 高级用法
|
|||
|
|
|
|||
|
|
### 自定义事件处理器
|
|||
|
|
|
|||
|
|
继承 `CrewEventsHandler` 类并重写回调方法:
|
|||
|
|
|
|||
|
|
```python
|
|||
|
|
class MyCustomHandler(CrewEventsHandler):
|
|||
|
|
def on_agent_output(self, agent, output):
|
|||
|
|
# 自定义处理逻辑
|
|||
|
|
pass
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### 调整并发策略
|
|||
|
|
|
|||
|
|
在 `stream_manager.py` 中调整队列大小和清理策略:
|
|||
|
|
|
|||
|
|
```python
|
|||
|
|
TaskStreamQueue(task_id, max_size=1000) # 默认 1000
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
## 🐛 故障排查
|
|||
|
|
|
|||
|
|
### 问题:SSE 连接立即断开
|
|||
|
|
|
|||
|
|
**解决**: 确保 Nginx 配置中禁用了缓冲:
|
|||
|
|
```nginx
|
|||
|
|
proxy_buffering off;
|
|||
|
|
proxy_cache off;
|
|||
|
|
proxy_request_buffering off;
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### 问题:LLM 调用失败
|
|||
|
|
|
|||
|
|
**检查**:
|
|||
|
|
1. DASHSCOPE_API_KEY 是否正确配置
|
|||
|
|
2. 网络连接是否正常
|
|||
|
|
3. API Key 是否有足够额度
|
|||
|
|
|
|||
|
|
### 问题:内存占用过高
|
|||
|
|
|
|||
|
|
**解决**: 调整 `cleanup_old_streams` 的调用频率和保留时间
|
|||
|
|
|
|||
|
|
## 🏗️ Docker 部署
|
|||
|
|
|
|||
|
|
### 使用 Docker Compose(推荐)
|
|||
|
|
|
|||
|
|
```bash
|
|||
|
|
# 1. 设置环境变量
|
|||
|
|
export DASHSCOPE_API_KEY=sk-your-api-key
|
|||
|
|
|
|||
|
|
# 2. 启动服务
|
|||
|
|
docker-compose up -d
|
|||
|
|
|
|||
|
|
# 3. 查看日志
|
|||
|
|
docker-compose logs -f multi-agent-system
|
|||
|
|
|
|||
|
|
# 4. 停止服务
|
|||
|
|
docker-compose down
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### 生产环境部署(带 Nginx)
|
|||
|
|
|
|||
|
|
```bash
|
|||
|
|
# 使用 production profile 启动(包含 Nginx)
|
|||
|
|
docker-compose --profile production up -d
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
**目录结构**:
|
|||
|
|
```
|
|||
|
|
.
|
|||
|
|
├── docker-compose.yml # Docker Compose 配置
|
|||
|
|
├── Dockerfile # 应用镜像构建文件
|
|||
|
|
├── nginx.conf # Nginx 配置(反向代理 + SSE 支持)
|
|||
|
|
├── .env # 环境变量文件
|
|||
|
|
└── ...
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### Nginx 关键配置
|
|||
|
|
|
|||
|
|
对于 SSE 流端点,Nginx 必须禁用缓冲:
|
|||
|
|
|
|||
|
|
```nginx
|
|||
|
|
location /api/stream/ {
|
|||
|
|
proxy_pass http://multi-agent-system:8000;
|
|||
|
|
|
|||
|
|
# HTTP/1.1 支持(SSE 必需)
|
|||
|
|
proxy_http_version 1.1;
|
|||
|
|
proxy_set_header Connection "";
|
|||
|
|
|
|||
|
|
# 禁用缓冲(关键)
|
|||
|
|
proxy_buffering off;
|
|||
|
|
proxy_cache off;
|
|||
|
|
proxy_request_buffering off;
|
|||
|
|
|
|||
|
|
# 长连接超时
|
|||
|
|
proxy_read_timeout 300s;
|
|||
|
|
}
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
## 📝 注意事项
|
|||
|
|
|
|||
|
|
1. **生产环境部署**:
|
|||
|
|
- 限制 CORS 允许的来源域名
|
|||
|
|
- 添加 API 认证机制
|
|||
|
|
- 使用 Redis 等持久化队列替代内存队列
|
|||
|
|
|
|||
|
|
2. **性能优化**:
|
|||
|
|
- 调整 SSE 队列大小避免内存溢出
|
|||
|
|
- 限制单个任务的超时时间
|
|||
|
|
- 实现任务优先级队列
|
|||
|
|
|
|||
|
|
3. **安全考虑**:
|
|||
|
|
- 验证用户输入防止注入攻击
|
|||
|
|
- 限制请求频率防止滥用
|
|||
|
|
- 记录审计日志
|
|||
|
|
|
|||
|
|
## 📄 License
|
|||
|
|
|
|||
|
|
MIT License
|