---
title: "Cache Health Observability"
created: 2026-05-11
updated: 2026-05-11
type: concept
tags: [observability, monitoring, cache, system-design]
sources: [[prompt-caching-architecture]]
---
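# Cache Health Observability
## Definition
Cache Health Observability is a comprehensive monitoring framework for the runtime state of a [[prompt-caching|Prompt Caching]] system. It comprises three core metrics and their associated alerting mechanisms.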
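## Metrics
| Metric | Definition | Alert threshold |
|------|------|----------|
| [[cache-hit-ratio|CHR]] | Share of requests served from the cache | Alert when < 95% |
| Invalidation Point ID | Byte offset at which the invalidation first occurs | Logged on every invalidation |
| Cost Efficiency Score | Token cost with the cache off vs. on | Quantified per experiment |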
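
The first two metrics can be sketched in a few lines of Python. This is a minimal illustration, not the note's actual implementation: the function names are hypothetical, and it assumes the invalidation point is the first byte at which a new prompt diverges from the previously cached one. The Cost Efficiency Score is left out because the note only says it is measured experimentally.

```python
from typing import Iterable, Optional

# From the table above: alert when CHR falls below 95%.
CHR_ALERT_THRESHOLD = 0.95


def cache_hit_ratio(hits: Iterable[bool]) -> Optional[float]:
    """Return the share of requests that hit the cache, or None if there are none."""
    flags = list(hits)
    if not flags:
        return None
    return sum(flags) / len(flags)


def should_alert(chr_value: Optional[float],
                 threshold: float = CHR_ALERT_THRESHOLD) -> bool:
    """True when a CHR value exists and is strictly below the threshold."""
    return chr_value is not None and chr_value < threshold


def first_divergence_offset(previous: bytes, current: bytes) -> Optional[int]:
    """Byte offset where `current` first differs from `previous` (assumed definition).

    Returns None when one prompt is a prefix of the other, since no byte
    inside the shared prefix has changed.
    """
    for offset, (old, new) in enumerate(zip(previous, current)):
        if old != new:
            return offset
    return None
```

For example, `first_divergence_offset(b"system: A\nuser", b"system: B\nuser")` returns `8`, the offset of the first changed byte.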
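## Engineering Implementation
- Add instrumentation in the API call wrapper layer
- Record the `cache_hit` field for every request
- Sync to the monitoring dashboard in real time
- Trigger linked alerts (SSH/Slack) when CHR drops sharply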
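
The steps above could look roughly like the following wrapper. It is a sketch under several assumptions: `CacheHealthMonitor`, `call_api`, and `on_alert` are hypothetical names, the response is assumed to be a dict with a `cached_tokens` count (real providers name this field differently), and the in-memory `records` list stands in for the dashboard sink. Alert delivery is left to whatever callable is passed in.

```python
from collections import deque
from typing import Any, Callable, Deque, Dict, List


class CacheHealthMonitor:
    """Wraps an API call, records `cache_hit` per request, and alerts on low CHR."""

    def __init__(self,
                 call_api: Callable[[str], Dict[str, Any]],
                 on_alert: Callable[[str], None],
                 window: int = 100,
                 threshold: float = 0.95) -> None:
        self._call_api = call_api          # hypothetical underlying API call
        self._on_alert = on_alert          # hypothetical alert sink (e.g. a chat webhook)
        self._window: Deque[bool] = deque(maxlen=window)
        self._threshold = threshold
        self.records: List[Dict[str, Any]] = []  # stand-in for the dashboard feed

    def request(self, prompt: str) -> Dict[str, Any]:
        response = self._call_api(prompt)
        # Assumption: any non-zero cached token count means a cache hit.
        cache_hit = response.get("cached_tokens", 0) > 0
        self.records.append({
            "prompt_bytes": len(prompt.encode("utf-8")),
            "cache_hit": cache_hit,
        })
        self._window.append(cache_hit)
        # Only evaluate CHR once the window is full, to avoid noisy early alerts.
        if len(self._window) == self._window.maxlen:
            chr_value = sum(self._window) / len(self._window)
            if chr_value < self._threshold:
                self._on_alert(f"CHR dropped to {chr_value:.1%}")
        return response
```

Using a fixed-size sliding window keeps the check cheap and makes a sudden drop visible quickly. A production version would likely also rate-limit alerts so that one incident does not fire on every request.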
## Related Concepts
- [[cache-hit-ratio|Cache Hit Ratio]]
- [[prompt-caching|Prompt Caching]]
- [[cache-invalidation|Cache Invalidation]]