---
title: "Agentic Streaming Inference"
created: 2026-06-20
updated: 2026-06-20
type: concept
tags: ["inference", "streaming", "agent", "framework", "real-time"]
sources: ["https://arxiv.org/abs/2606.17800"]
---
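# Agentic Streaming Inference
**Agentic Streaming Inference** is the **training-free inference framework** proposed by [[maineCoon|MaineCoon]]: it wraps a frozen generator in three agentic controllers and achieves stable streaming generation on the order of a thousand seconds without modifying the model weights.
## Architecture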
```
Viewer ← Stream ← [Buffer Controller] → [Frozen Generator + KV-Cache]
↑ Timing ↑ Memory ↑ Content
[Cache Manager] ←→ [Director: Planner + Observer]
```
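Each of the three controllers has a single job, so **content, memory, and timing are kept separate**:
| Controller | Responsibility | Core mechanism |
|--------|------|---------|
| **Director** (Planner + Observer) | Content stream | A Gemma 4 26B agent writes prompts and monitors quality |
| **[[agentic-cache-manager|Cache Manager]]** | Memory | bounded keep-set + drift control |
| **[[look-ahead-buffer-controller|Buffer Controller]]** | Timing / pacing | A pace gate manages the generation lead |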
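## Key design principles
### 1. Separation of concerns
- **Agent (Planner/Observer)** handles cognition: what to generate and when, whether the output has degraded, and how to repair it
- **Engine (Generator)** handles execution: it generates continuously at a fixed cadence and is never interrupted
- **Manager (Cache/Buffer)** handles governance: what to remember and when to emit output
### 2. Never interrupt the stream
- The Generator runs at a fixed cadence and is never subjected to start/stop/step control
- All corrections are injected forward through the prompt stream; the stream is never reset
- The Observer inspects at the generation head (ahead of playback), so repairs are complete before the viewer sees the affected content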
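The forward-injection rule can be illustrated with a minimal sketch. It assumes a hypothetical `PromptStream` queue and a stand-in `run_generator` loop (neither name comes from the paper): the loop emits one chunk per tick without pausing, and a correction only changes which prompt later ticks read.

```python
from collections import deque


class PromptStream:
    """Hypothetical forward-only prompt queue (illustrative, not the paper's API)."""

    def __init__(self, default_prompt):
        self._pending = deque()
        self._current = default_prompt

    def inject(self, prompt):
        # Corrections are queued for future ticks; nothing already generated is rewound.
        self._pending.append(prompt)

    def next_prompt(self):
        # With nothing pending, keep continuing under the last prompt.
        if self._pending:
            self._current = self._pending.popleft()
        return self._current


def run_generator(stream, num_chunks, on_chunk=None):
    """Emit exactly one chunk per tick; the loop has no pause or reset path."""
    chunks = []
    for tick in range(num_chunks):
        prompt = stream.next_prompt()
        chunk = (tick, prompt)  # stand-in for a real generated chunk
        chunks.append(chunk)
        if on_chunk is not None:
            on_chunk(tick, chunk, stream)  # e.g. an observer at the generation head
    return chunks
```

An observer callback that calls `stream.inject(...)` at tick `t` affects ticks `t + 1` onward, while every tick still produces a chunk.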
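### 3. Graceful degradation
- If segmentation, inspection, or planning fails → fall back to a coarser-grained signal or a safe continuation
- A failure anywhere on the Observer side **never stalls the stream**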
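One way to read the degradation rule is as a fallback chain ordered from finest to coarsest signal. The sketch below is an assumption about structure only; `first_successful` and the example check names are hypothetical.

```python
def first_successful(checks, fallback):
    """Try checks from finest to coarsest; return the first usable result.

    Any exception or None result degrades to the next check. If every
    check fails, the caller gets `fallback` (e.g. a safe continuation),
    so the caller is never blocked by an observer-side failure.
    """
    for check in checks:
        try:
            result = check()
        except Exception:
            continue  # this level failed; degrade to the next, coarser one
        if result is not None:
            return result
    return fallback
```

For example, `first_successful([segment_check, global_check], "safe-continuation")` returns the coarse result when `segment_check` raises, and the fallback when both fail.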
## Director: Planner + Observer
**Planner** emits a structured prompt on a fixed beat:
```
[VISUAL] character appearance + [SPEECH] dialogue line + [SOUNDS] ambient sound + tags
```
It maintains a bounded planning history and a record of the lines already spoken, so that dialogue is not repeated.
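A minimal sketch of the Planner's bookkeeping, assuming a hypothetical `BeatPrompt` record and `PlannerMemory` class. The space-separated rendering, the history size, and the case-insensitive duplicate test are illustrative choices, not details from the paper.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class BeatPrompt:
    visual: str
    speech: str
    sounds: str
    tags: tuple = ()

    def render(self):
        # Assumed serialization: the three tagged fields, then free-form tags.
        parts = [
            f"[VISUAL] {self.visual}",
            f"[SPEECH] {self.speech}",
            f"[SOUNDS] {self.sounds}",
        ]
        parts.extend(self.tags)
        return " ".join(parts)


class PlannerMemory:
    def __init__(self, max_history=8):
        self.history = deque(maxlen=max_history)  # bounded planning history
        self.spoken = set()                       # normalized lines already spoken

    def accept(self, beat):
        """Record the beat unless its speech line repeats an earlier one."""
        key = beat.speech.strip().lower()
        if key and key in self.spoken:
            return False
        if key:
            self.spoken.add(key)
        self.history.append(beat)
        return True
```

`deque(maxlen=...)` drops the oldest beat automatically, which keeps the planning context bounded however long the stream runs.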
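**Observer** monitors quality at the generation frontier:
- Five photometric drift metrics (cheap, run on every frame)
- Periodic VLM checks for semantic defects
- Repairs are applied through the [[forward-repair-ladder|forward-repair ladder]]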
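The two-tier schedule (a cheap check on every frame, an expensive check periodically) might look like the sketch below. This page does not list the five photometric metrics, so the sketch substitutes a single hypothetical one, mean-luminance drift against a reference frame; `DriftMonitor`, the threshold, and the VLM period are likewise assumptions.

```python
def mean_luma(frame):
    """Mean of a frame given as a flat list of luminance values in [0, 1]."""
    return sum(frame) / len(frame)


class DriftMonitor:
    def __init__(self, reference, threshold=0.1, vlm_every=4):
        self.ref = mean_luma(reference)
        self.threshold = threshold   # hypothetical drift tolerance
        self.vlm_every = vlm_every   # hypothetical period of the expensive check
        self.frames_seen = 0

    def observe(self, frame):
        """Run the cheap metric on every frame; flag when a VLM check is due."""
        self.frames_seen += 1
        drift = abs(mean_luma(frame) - self.ref)
        return {
            "drift": drift,
            "drifted": drift > self.threshold,
            "run_vlm": self.frames_seen % self.vlm_every == 0,
        }
```

A real observer would compute several such metrics per frame and route any flagged result to the repair ladder rather than returning a dictionary.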
**Feeder & Fast Lane**: prompts are queued asynchronously; the fast lane replaces beats that have not yet been generated, without affecting chunks already in flight.
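The fast-lane rule can be sketched as a beat-indexed queue in which only beats the generator has not yet consumed may be overwritten. The `Feeder` class and its method names are hypothetical; the lock stands in for whatever synchronization the asynchronous queue actually uses.

```python
import threading


class Feeder:
    def __init__(self):
        self._lock = threading.Lock()
        self._pending = {}    # beat index -> prompt, not yet handed to the generator
        self._next_beat = 0   # first beat index the generator has not consumed

    def submit(self, beat, prompt):
        """Normal lane: queue a prompt for a future beat that has none yet."""
        with self._lock:
            if beat < self._next_beat or beat in self._pending:
                return False
            self._pending[beat] = prompt
            return True

    def fast_lane(self, beat, prompt):
        """Fast lane: replace a queued prompt, but only for unconsumed beats."""
        with self._lock:
            if beat < self._next_beat:
                return False  # already in flight or generated; leave it alone
            self._pending[beat] = prompt
            return True

    def take(self):
        """Called by the generator once per beat; None means nothing was planned."""
        with self._lock:
            prompt = self._pending.pop(self._next_beat, None)
            self._next_beat += 1
            return prompt
```

Because `take()` advances the beat index even when it returns `None`, a late or missing prompt never blocks the generator; the caller can continue with its previous prompt.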
## References
- [[maineCoon|MaineCoon paper]], Section 4
- [[agentic-cache-manager|Agentic Cache Manager]]
- [[look-ahead-buffer-controller|Look-Ahead Buffer Controller]]
- [[forward-repair-ladder|Forward-Repair Ladder]]