20260514: Add new content
concepts/cache-health-observability.md
@@ -0,0 +1,35 @@
---
title: "Cache Health Observability"
created: 2026-05-11
updated: 2026-05-11
type: concept
tags: [observability, monitoring, cache, system-design]
sources: [[prompt-caching-architecture]]
---

# Cache Health Observability

## Definition

Cache Health Observability is a monitoring framework that covers the full runtime state of a [[prompt-caching|Prompt Caching]] system. It consists of three core metrics and their associated alerting mechanisms.

## Metrics

| Metric | Definition | Alert threshold |
|--------|------------|-----------------|
| [[cache-hit-ratio\|CHR]] | Share of requests served from the cache | Alert when < 95% |
| Invalidation Point ID | Byte offset at which the invalidation first appears | Recorded on every invalidation |
| Cost Efficiency Score | Token difference between cache-off and cache-on runs | Quantified per experiment |

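The three metrics can be sketched as small pure functions. This is a minimal illustration, not a reference implementation: the function names are hypothetical, the Invalidation Point ID is assumed to be the first byte at which the current prompt diverges from the previously cached one, and the Cost Efficiency Score is assumed to be a plain subtraction of token counts from two otherwise identical runs.

```python
from typing import Optional

# Alert threshold taken from the metrics table (CHR < 95%).
CHR_ALERT_THRESHOLD = 0.95


def cache_hit_ratio(hits: int, total: int) -> float:
    """Fraction of requests served from the cache; 0.0 when there is no traffic."""
    return hits / total if total else 0.0


def invalidation_point(previous: bytes, current: bytes) -> Optional[int]:
    """Byte offset of the first difference between two prompts.

    Assumption: an invalidation starts where the current prompt first
    diverges from the previous one. Returns None when both are identical.
    """
    for offset, (a, b) in enumerate(zip(previous, current)):
        if a != b:
            return offset
    if len(previous) != len(current):
        # One prompt is a strict prefix of the other.
        return min(len(previous), len(current))
    return None


def cost_efficiency_score(tokens_cache_off: int, tokens_cache_on: int) -> int:
    """Token difference between a cache-off run and a cache-on run (assumed definition)."""
    return tokens_cache_off - tokens_cache_on
```

For example, `invalidation_point(b"system: v1", b"system: v2")` reports the offset of the changed version digit, which is the kind of record the table asks for on every invalidation.
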
## Implementation

- Add instrumentation in the API call wrapper layer
- Record the `cache_hit` field of every request
- Sync the data to the monitoring dashboard in real time
- Trigger linked alerts (SSH, Slack, etc.) when CHR drops sharply

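The steps above could be wired together roughly as follows. This is a sketch under stated assumptions: `CacheHealthMonitor`, `call_api`, and `alert` are hypothetical names, the wrapped call is assumed to return a dict carrying a boolean `cache_hit` field, and the sliding window of 100 requests is an arbitrary choice. Only the 95% threshold comes from the metrics table.

```python
from collections import deque
from typing import Any, Callable, Deque, Dict


class CacheHealthMonitor:
    """Wraps an API call, records `cache_hit`, and alerts when CHR drops."""

    def __init__(
        self,
        call_api: Callable[..., Dict[str, Any]],  # hypothetical wrapped API call
        alert: Callable[[str], None],             # hypothetical alert sink (e.g. a Slack hook)
        threshold: float = 0.95,
        window: int = 100,                        # assumed sliding-window size
    ) -> None:
        self._call_api = call_api
        self._alert = alert
        self._threshold = threshold
        self._recent: Deque[bool] = deque(maxlen=window)
        self._alerting = False

    def hit_ratio(self) -> float:
        """CHR over the sliding window; an empty window is treated as healthy."""
        if not self._recent:
            return 1.0
        return sum(self._recent) / len(self._recent)

    def request(self, *args: Any, **kwargs: Any) -> Dict[str, Any]:
        response = self._call_api(*args, **kwargs)
        # Assumption: the response exposes a boolean `cache_hit` field.
        self._recent.append(bool(response.get("cache_hit", False)))

        # Only evaluate once the window is full, to avoid noisy early alerts.
        if len(self._recent) == self._recent.maxlen:
            below = self.hit_ratio() < self._threshold
            if below and not self._alerting:
                self._alert(f"CHR dropped to {self.hit_ratio():.2%}")
            # Fire once per dip; re-arm when CHR recovers.
            self._alerting = below
        return response
```

Keeping the alert sink as a plain callback means the same wrapper can feed a dashboard exporter or a chat notification without changing the instrumentation itself.
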
## Related concepts

- [[cache-hit-ratio|Cache hit ratio]]
- [[prompt-caching|Prompt Caching]]
- [[cache-invalidation|Cache invalidation]]