---
title: "Cache Health Observability"
created: 2026-05-11
updated: 2026-05-11
type: concept
tags: [observability, monitoring, cache, system-design]
sources: [[prompt-caching-architecture]]
---

# Cache Health Observability

## Definition

Cache Health Observability is the end-to-end monitoring of the runtime state of a [[prompt-caching|Prompt Caching]] system. It consists of three core metrics and the alerting mechanisms attached to them.

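## Metrics

| Metric | Definition | Alert threshold |
|------|------|----------|
| [[cache-hit-ratio\|CHR]] | Share of requests served from the cache | Alert when < 95% |
| Invalidation Point ID | Byte offset at which a cache invalidation first occurs | Recorded on every invalidation |
| Cost Efficiency Score | Token difference between cache-off and cache-on runs | Quantified per experiment |
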
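
As a rough illustration of the three metrics in the table, the sketch below computes each one in plain Python. All function names are hypothetical. It assumes the Invalidation Point ID is the UTF-8 byte offset of the first difference between the previous and current prompt, and that the Cost Efficiency Score is the raw token difference between a cache-off and a cache-on run; the source does not pin down either definition more precisely.

```python
from typing import Optional, Sequence

CHR_ALERT_THRESHOLD = 0.95  # from the table: alert when CHR < 95%


def cache_hit_ratio(hits: Sequence[bool]) -> float:
    """Fraction of requests served from the cache (1.0 for an empty window)."""
    if not hits:
        return 1.0
    return sum(hits) / len(hits)


def chr_needs_alert(ratio: float, threshold: float = CHR_ALERT_THRESHOLD) -> bool:
    """True when the hit ratio has fallen below the alert threshold."""
    return ratio < threshold


def invalidation_offset(previous: str, current: str) -> Optional[int]:
    """Byte offset (UTF-8) where `current` first diverges from `previous`.

    Returns None when `previous` is a full prefix of `current`, i.e. the
    cached prefix is still valid under this (assumed) prefix-matching model.
    """
    old, new = previous.encode("utf-8"), current.encode("utf-8")
    for i, (a, b) in enumerate(zip(old, new)):
        if a != b:
            return i
    if len(new) < len(old):
        return len(new)  # current prompt was truncated inside the old prefix
    return None


def cost_efficiency_score(tokens_cache_off: int, tokens_cache_on: int) -> int:
    """Assumed definition: tokens saved by enabling the cache."""
    return tokens_cache_off - tokens_cache_on
```

Logging the offset rather than only a hit/miss flag makes it possible to see which part of the prompt changed when the cache was invalidated.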
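## Implementation

- Add instrumentation in the API call wrapper layer
- Record the `cache_hit` field of every request
- Sync the data to the monitoring dashboard in real time
- Trigger linked alerts (SSH, Slack, etc.) when CHR drops sharply
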
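
The steps above could be sketched as a thin wrapper around the API call. This is a minimal illustration, not a reference implementation: `CacheHealthMonitor`, `call_api`, and `on_alert` are hypothetical names, and the response is assumed to be a dict exposing a boolean `cache_hit` field. A dashboard exporter or a Slack notifier would be plugged in through the callback.

```python
from collections import deque
from typing import Any, Callable, Deque, Dict


class CacheHealthMonitor:
    """Wraps an API call and tracks a rolling cache hit ratio (CHR)."""

    def __init__(
        self,
        call_api: Callable[..., Dict[str, Any]],
        on_alert: Callable[[float], None],
        window: int = 100,
        threshold: float = 0.95,
    ) -> None:
        self._call_api = call_api
        self._on_alert = on_alert
        self._threshold = threshold
        # Only the most recent `window` requests count toward the ratio.
        self._hits: Deque[bool] = deque(maxlen=window)

    @property
    def hit_ratio(self) -> float:
        return sum(self._hits) / len(self._hits) if self._hits else 1.0

    def request(self, *args: Any, **kwargs: Any) -> Dict[str, Any]:
        response = self._call_api(*args, **kwargs)
        # Assumption: the response dict carries a boolean `cache_hit` field.
        self._hits.append(bool(response.get("cache_hit", False)))
        # Alert only once the window is full, to avoid noise at start-up.
        if len(self._hits) == self._hits.maxlen and self.hit_ratio < self._threshold:
            self._on_alert(self.hit_ratio)
        return response
```

Using a fixed-size window means a sharp drop shows up quickly instead of being averaged away by a long history of hits.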
## Related Concepts

- [[cache-hit-ratio|Cache Hit Ratio]]
- [[prompt-caching|Prompt Caching]]
- [[cache-invalidation|Cache Invalidation]]