api-server/docs/ai-runtime-internal-api-protocol.md
wangdl eea9e3e7c6
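feat: API-Runtime internal communication protocol and DTOs (API-AI-001)
Defines the complete protocol for the 9 internal/runtime endpoints: Poll/Lock/Heartbeat/Snapshot/
Credential Resolve/Result/Fail/InvocationLog/Health. Adds the RuntimeInternalDto
type file and reuses InternalAuthGuard for authentication, so it can be aligned directly with the Rust side.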

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-11 20:35:20 +08:00


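Internal Communication Protocol Between the API and the Rust Runtime

1. Overview

This document defines the internal HTTP communication protocol between the main API and the Rust Heavy Runtime.

Communication directions:

  • Runtime → API: poll for jobs, submit results, submit logs
  • API → Runtime: health check (optional)

2. Authentication

All /internal/runtime/* endpoints are protected by InternalAuthGuard.

Request headers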

x-internal-api-key: <RUNTIME_SERVICE_TOKEN>
x-runtime-instance-id: runtime-001
  • x-internal-api-key: must match the API's INTERNAL_API_KEY environment variable
  • x-runtime-instance-id: identifier of the Runtime instance; recorded in logs
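
The two headers above can be assembled on the Runtime side roughly as follows. This is a minimal sketch: buildInternalHeaders is a hypothetical helper name, and the content-type entry is an assumption added because the endpoints exchange JSON bodies.

```typescript
// Hypothetical helper: builds the request headers for /internal/runtime/* calls.
function buildInternalHeaders(
  serviceToken: string,
  runtimeInstanceId: string
): Record<string, string> {
  if (serviceToken.length === 0) {
    // Failing early avoids sending an unauthenticated request.
    throw new Error("service token must not be empty");
  }
  return {
    "x-internal-api-key": serviceToken,
    "x-runtime-instance-id": runtimeInstanceId,
    // Assumption: request bodies are JSON.
    "content-type": "application/json",
  };
}

const headers = buildInternalHeaders("example-token", "runtime-001");
```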

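Security constraints

  • Ordinary user JWTs cannot access internal endpoints
  • The service token cannot access the regular user-facing API
  • The Runtime cannot use internal endpoints to access data not required by the current job

3. Error Response Format

On failure, every internal endpoint returns: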

{
  "statusCode": 410,
  "errorCode": "SNAPSHOT_EXPIRED",
  "message": "Snapshot has expired for this job",
  "timestamp": "2026-06-11T10:00:00.000Z"
}

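Error codes

| Error code | HTTP | Description | retryable |
| --- | --- | --- | --- |
| JOB_NOT_FOUND | 404 | Job does not exist | false |
| JOB_ALREADY_LOCKED | 409 | Already locked by another Runtime | true |
| SNAPSHOT_EXPIRED | 410 | Snapshot has expired | true |
| SNAPSHOT_NOT_FOUND | 404 | Snapshot does not exist | false |
| CREDENTIAL_NOT_FOUND | 404 | Credential does not exist | false |
| CREDENTIAL_INVALID | 422 | Credential is invalid | false |
| RESULT_ALREADY_EXISTS | 409 | Duplicate submission | false |
| RESULT_SCHEMA_UNSUPPORTED | 422 | Schema version not supported | false |
| RUNTIME_VERSION_INCOMPATIBLE | 422 | Runtime version incompatible | false |
| INTERNAL_ERROR | 500 | Internal error | true |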
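
A client can mirror the error body and the retryable column as a small lookup. The sketch below is illustrative only: the InternalErrorResponse interface and isRetryable function are hypothetical names, and treating unknown error codes as non-retryable is an assumption, not something the protocol specifies.

```typescript
// Shape of the error body returned by internal endpoints.
interface InternalErrorResponse {
  statusCode: number;
  errorCode: string;
  message: string;
  timestamp: string;
}

// Retryable flags copied from the error code table.
const RETRYABLE_BY_CODE: Record<string, boolean> = {
  JOB_NOT_FOUND: false,
  JOB_ALREADY_LOCKED: true,
  SNAPSHOT_EXPIRED: true,
  SNAPSHOT_NOT_FOUND: false,
  CREDENTIAL_NOT_FOUND: false,
  CREDENTIAL_INVALID: false,
  RESULT_ALREADY_EXISTS: false,
  RESULT_SCHEMA_UNSUPPORTED: false,
  RUNTIME_VERSION_INCOMPATIBLE: false,
  INTERNAL_ERROR: true,
};

function isRetryable(err: InternalErrorResponse): boolean {
  // Assumption: codes missing from the table are treated as non-retryable.
  return RETRYABLE_BY_CODE[err.errorCode] === true;
}
```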

4. Endpoint Details

4.1 Poll Jobs

POST /internal/runtime/jobs/poll

The Runtime polls for jobs awaiting execution. The API filters for compatible jobs based on the Runtime's supportedJobTypes and capabilities.

Request

{
  "runtimeInstanceId": "runtime-001",
  "supportedJobTypes": ["learning_state_analysis", "quiz_generation"],
  "limit": 5,
  "capabilities": {
    "supportedSnapshotVersions": ["ai_snapshot_v1"],
    "supportedOutputSchemaVersions": ["analysis_output_v1", "quiz_output_v1"]
  }
}

Response 200

{
  "jobs": [
    {
      "id": "job-abc123",
      "jobType": "learning_state_analysis",
      "targetType": "material",
      "targetId": "mat-xyz",
      "priority": 0,
      "snapshotId": "snap-001",
      "promptVersion": "learning_state_v1",
      "outputSchemaVersion": "analysis_output_v1"
    }
  ]
}
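
The filtering step described for Poll Jobs could look roughly like this on the API side. This is a sketch under assumptions: filterCompatibleJobs is a hypothetical name, the job summary does not carry a snapshot version so only jobType and outputSchemaVersion are checked, and applying limit after filtering is a guess at the intended behavior.

```typescript
interface PollCapabilities {
  supportedSnapshotVersions: string[];
  supportedOutputSchemaVersions: string[];
}

interface PollRequest {
  runtimeInstanceId: string;
  supportedJobTypes: string[];
  limit: number;
  capabilities: PollCapabilities;
}

interface JobSummary {
  id: string;
  jobType: string;
  priority: number;
  snapshotId: string;
  promptVersion: string;
  outputSchemaVersion: string;
}

// Keeps only jobs whose type and output schema the Runtime declared support for,
// then caps the list at the requested limit.
function filterCompatibleJobs(pending: JobSummary[], req: PollRequest): JobSummary[] {
  const compatible = pending.filter(
    (job) =>
      req.supportedJobTypes.indexOf(job.jobType) !== -1 &&
      req.capabilities.supportedOutputSchemaVersions.indexOf(job.outputSchemaVersion) !== -1
  );
  return compatible.slice(0, Math.max(0, req.limit));
}
```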

4.2 Lock Job

POST /internal/runtime/jobs/{jobId}/lock

The Runtime locks a job to obtain the right to execute it.

Request

{
  "runtimeInstanceId": "runtime-001"
}

Response 200

{
  "jobId": "job-abc123",
  "status": "locked",
  "lockUntil": 1700000000123
}

4.3 Heartbeat

POST /internal/runtime/jobs/{jobId}/heartbeat

The Runtime extends the lock's validity period.

Request

{
  "runtimeInstanceId": "runtime-001"
}

Response 204: empty body; only lockUntil is extended
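
On the Runtime side, the lockUntil value from the Lock response can drive when to send the next heartbeat. The sketch below assumes lockUntil is a Unix timestamp in milliseconds, as in the example above. Sending a heartbeat after half of the remaining lock time is a hypothetical policy; the protocol does not prescribe an interval.

```typescript
interface LockResponse {
  jobId: string;
  status: string;
  lockUntil: number; // assumed to be epoch milliseconds
}

// True while the lock is held and has not yet expired.
function isLockValid(lock: LockResponse, nowMs: number): boolean {
  return lock.status === "locked" && nowMs < lock.lockUntil;
}

// Hypothetical policy: send the next heartbeat after half of the remaining lock time.
function nextHeartbeatDelayMs(lock: LockResponse, nowMs: number): number {
  const remaining = lock.lockUntil - nowMs;
  return remaining > 0 ? Math.floor(remaining / 2) : 0;
}
```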

4.4 Get Snapshot

GET /internal/runtime/jobs/{jobId}/snapshot

The Runtime fetches the LearningAnalysisSnapshot associated with the job.

Response 200

{
  "jobId": "job-abc123",
  "snapshotId": "snap-001",
  "snapshotVersion": "ai_snapshot_v1",
  "privacyScope": { "allowDocumentContent": true },
  "userProfile": { "learningGoal": "exam", "currentLevel": "intermediate" },
  "aiSettings": { "allowAiAnalysis": true },
  "learningBehaviorSummary": { "totalActiveSeconds": 3600 },
  "materialProgressSummary": { "progress": 0.6 },
  "behaviorSignals": { "engagementSignal": "high" },
  "scoreSignals": { "masteryRiskScore": 0.3 },
  "constraints": { "dailyAvailableMinutes": 60 },
  "allowedModelFields": ["learningGoal", "currentLevel"]
}

Errors

  • 404 SNAPSHOT_NOT_FOUND: the snapshot does not exist
  • 410 SNAPSHOT_EXPIRED: the snapshot has expired; the Runtime should submit a retryable failure
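
One way the Runtime might turn these snapshot errors into a Submit Failure request body is sketched below. The failRequestForSnapshotError helper is hypothetical, and reusing the API's error code as the failure's errorCode is an assumption. The retryable values follow the rule above for 410 and the error code table for 404.

```typescript
interface FailRequest {
  runtimeInstanceId: string;
  errorCode: string;
  errorMessage: string;
  retryable: boolean;
}

type SnapshotErrorCode = "SNAPSHOT_NOT_FOUND" | "SNAPSHOT_EXPIRED";

// Hypothetical mapping from a snapshot fetch error to a fail request body.
function failRequestForSnapshotError(
  runtimeInstanceId: string,
  errorCode: SnapshotErrorCode,
  message: string
): FailRequest {
  return {
    runtimeInstanceId: runtimeInstanceId,
    // Assumption: the API error code is passed through unchanged.
    errorCode: errorCode,
    errorMessage: message,
    // Only an expired snapshot is reported as retryable.
    retryable: errorCode === "SNAPSHOT_EXPIRED",
  };
}
```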

4.5 Resolve Credential

POST /internal/runtime/model-credentials/resolve

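The Runtime obtains the credential used for model calls. In platform_key mode the API returns the platform key; in user_deepseek_key mode it decrypts the user's key and returns it.

Request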

{
  "jobId": "job-abc123",
  "apiKeyMode": "user_deepseek_key",
  "credentialId": "cred-001",
  "provider": "deepseek"
}

Response 200

{
  "provider": "deepseek",
  "model": "deepseek-chat",
  "baseUrl": "https://api.deepseek.com/v1",
  "apiKey": "sk-xxxx",
  "apiKeyMode": "user_deepseek_key"
}

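Security requirements

  • The plaintext apiKey appears only transiently in the response and is never written to logs
  • The apiKey is never returned to iOS / Admin
  • A user key must belong to job.userId
  • For the platform key, the Runtime prefers its own environment variable; returning it from the API is optional

Errors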

  • 404 CREDENTIAL_NOT_FOUND
  • 422 CREDENTIAL_INVALID
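
To help keep the plaintext key out of logs, the Runtime could log a redacted copy of the resolved credential instead of the response itself. This is a minimal sketch: toLoggableCredential and the "[REDACTED]" placeholder are hypothetical choices, not part of the protocol.

```typescript
interface ResolvedCredential {
  provider: string;
  model: string;
  baseUrl: string;
  apiKey: string;
  apiKeyMode: string;
}

// Returns a copy that is safe to log; the original object is left untouched.
function toLoggableCredential(cred: ResolvedCredential): ResolvedCredential {
  return {
    provider: cred.provider,
    model: cred.model,
    baseUrl: cred.baseUrl,
    apiKey: "[REDACTED]", // hypothetical placeholder value
    apiKeyMode: cred.apiKeyMode,
  };
}
```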

4.6 Submit Result

POST /internal/runtime/jobs/{jobId}/result

The Runtime submits the result of a successful execution.

Request

{
  "runtimeInstanceId": "runtime-001",
  "schemaVersion": "analysis_output_v1",
  "status": "succeeded",
  "rawOutput": { "learningState": "in_progress", "confidence": 0.85 },
  "validatedOutput": { "learningState": "in_progress", "riskLevel": "low" },
  "validationErrors": [],
  "usage": {
    "inputTokens": 1200,
    "outputTokens": 450,
    "totalTokens": 1650,
    "latencyMs": 3200,
    "costEstimate": 3
  },
  "attemptNo": 0,
  "outputHash": "sha256-abc123"
}

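Idempotency rules

  • resultIdempotencyKey = jobId + attemptNo + outputHash
  • Resubmitting with the same key returns 200 (idempotent)
  • If a succeeded result already exists and the outputHash differs, the API returns 409 RESULT_ALREADY_EXISTS

Response 201: created

Errors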

  • 409 RESULT_ALREADY_EXISTS
  • 422 RESULT_SCHEMA_UNSUPPORTED
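
The idempotency rules for Submit Result can be expressed as a small decision function on the API side. This is only a sketch: the ":" separator in the key, the StoredResult shape, and both function names are assumptions. The case of a different attemptNo with the same outputHash is not covered by the rules, and this sketch accepts it as a new result.

```typescript
interface StoredResult {
  attemptNo: number;
  outputHash: string;
  status: string;
}

// The source only says the key combines these three fields; the separator is an assumption.
function idempotencyKey(jobId: string, attemptNo: number, outputHash: string): string {
  return [jobId, String(attemptNo), outputHash].join(":");
}

// Returns the HTTP status the API would answer with for a result submission.
function decideSubmitStatus(
  existing: StoredResult | undefined,
  attemptNo: number,
  outputHash: string
): number {
  if (existing === undefined) {
    return 201; // first submission: created
  }
  if (existing.attemptNo === attemptNo && existing.outputHash === outputHash) {
    return 200; // same key: idempotent replay
  }
  if (existing.status === "succeeded" && existing.outputHash !== outputHash) {
    return 409; // RESULT_ALREADY_EXISTS
  }
  return 201; // not covered by the rules above; accepted here as an assumption
}
```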

4.7 Submit Failure

POST /internal/runtime/jobs/{jobId}/fail

The Runtime reports why an execution failed.

Request

{
  "runtimeInstanceId": "runtime-001",
  "errorCode": "MODEL_TIMEOUT",
  "errorMessage": "DeepSeek request timed out after 30s",
  "retryable": true,
  "rawError": "connection timeout"
}

Handling rules

  • retryable=true and retryCount < maxRetryCount: the job returns to pending
  • retryable=false, or maxRetryCount has been reached: the job becomes failed
  • rawError must not contain the apiKey

Response 200: acknowledged
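
The handling rules for Submit Failure reduce to a single condition, and the rawError constraint can be enforced with a scrub step before submission. Both helpers below are hypothetical sketches. The "[REDACTED]" placeholder is an assumption, and the scrub only catches exact occurrences of the key.

```typescript
type JobStatusAfterFailure = "pending" | "failed";

// API side: decides where the job goes after a reported failure.
function nextStatusAfterFailure(
  retryable: boolean,
  retryCount: number,
  maxRetryCount: number
): JobStatusAfterFailure {
  return retryable && retryCount < maxRetryCount ? "pending" : "failed";
}

// Runtime side: removes exact occurrences of the key from the raw error text.
function scrubSecret(rawError: string, apiKey: string): string {
  if (apiKey.length === 0) {
    return rawError; // nothing to scrub
  }
  return rawError.split(apiKey).join("[REDACTED]");
}
```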

4.8 Submit Invocation Logs

POST /internal/runtime/invocation-logs

The Runtime submits model invocation logs (in batches).

Request

{
  "logs": [
    {
      "jobId": "job-abc123",
      "provider": "deepseek",
      "model": "deepseek-chat",
      "apiKeyMode": "user_deepseek_key",
      "credentialId": "cred-001",
      "promptName": "learning_state_analysis",
      "promptVersion": "learning_state_v1",
      "outputSchemaVersion": "analysis_output_v1",
      "inputTokens": 1200,
      "outputTokens": 450,
      "totalTokens": 1650,
      "latencyMs": 3200,
      "costEstimate": 3,
      "success": true,
      "retryCount": 0,
      "runtimeInstanceId": "runtime-001",
      "traceId": "trace-xyz",
      "correlationId": "corr-abc"
    }
  ]
}

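Constraints

  • The apiKey field is not allowed
  • Failed invocations must also be logged
  • A failed log submission must not crash the main task

Response 201: created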
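
The ban on the apiKey field can be checked before a batch is sent or accepted. The sketch below uses a hypothetical indicesWithApiKey helper that reports which log entries carry the field. It inspects top-level properties only, which is a simplification.

```typescript
type InvocationLog = Record<string, unknown>;

// Returns the positions of log entries that contain a top-level apiKey field.
function indicesWithApiKey(logs: InvocationLog[]): number[] {
  const offending: number[] = [];
  logs.forEach((log, index) => {
    if (Object.prototype.hasOwnProperty.call(log, "apiKey")) {
      offending.push(index);
    }
  });
  return offending;
}
```

A caller could reject the batch, or drop the offending entries, whenever the returned list is non-empty.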

4.9 Health (optional)

GET /internal/runtime/health

The API queries the Runtime's health status. This endpoint is exposed by the Runtime, not by the API.

Response 200

{
  "runtimeInstanceId": "runtime-001",
  "status": "ok",
  "version": "0.1.0",
  "startedAt": 1700000000000,
  "lastJobAt": 1700000000123,
  "activeJobs": 2
}

5. Endpoint Overview

| Method | Path | Caller | Auth |
| --- | --- | --- | --- |
| POST | /internal/runtime/jobs/poll | Runtime | InternalAuthGuard |
| POST | /internal/runtime/jobs/{jobId}/lock | Runtime | InternalAuthGuard |
| POST | /internal/runtime/jobs/{jobId}/heartbeat | Runtime | InternalAuthGuard |
| GET | /internal/runtime/jobs/{jobId}/snapshot | Runtime | InternalAuthGuard |
| POST | /internal/runtime/model-credentials/resolve | Runtime | InternalAuthGuard |
| POST | /internal/runtime/jobs/{jobId}/result | Runtime | InternalAuthGuard |
| POST | /internal/runtime/jobs/{jobId}/fail | Runtime | InternalAuthGuard |
| POST | /internal/runtime/invocation-logs | Runtime | InternalAuthGuard |
| GET | /internal/runtime/health | API | N/A (checks the external Runtime) |

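6. Acceptance Checklist

  • Every internal endpoint has a DTO definition (runtime-internal.dto.ts)
  • Every internal endpoint has an authentication design (reusing InternalAuthGuard)
  • Every failure response includes errorCode / message
  • Runtime results support a structured payload (validatedOutput)
  • Runtime failures support the retryable flag
  • The credential resolve endpoint explicitly does not log the plaintext key
  • Endpoint and field names can be aligned directly with the Runtime project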