---
title: "SGLang"
created: 2026-06-24
updated: 2026-06-24
type: concept
tags: ["inference-engine", "llm-serving"]
sources:
  - "[[unlimited-ocr-works-2026]]"
---
# SGLang
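SGLang is an efficient LLM inference engine. Unlimited OCR implemented KV cache management support and optimizations for R-SWA in it, so that the model runs on SGLang with constant TPS and constant GPU memory usage.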
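
This note does not describe how R-SWA works internally, so the sketch below is only a generic illustration of why a windowed KV cache keeps memory constant. It assumes R-SWA is a sliding-window attention variant in which only the most recent entries are retained. The class `SlidingWindowKVCache` and its `window` parameter are hypothetical names for this example, not part of SGLang or Unlimited OCR.

```python
from collections import deque


class SlidingWindowKVCache:
    """Toy KV cache that keeps only the most recent `window` entries.

    Hypothetical illustration: not SGLang's API and not the R-SWA implementation.
    """

    def __init__(self, window: int):
        if window <= 0:
            raise ValueError("window must be positive")
        # A deque with maxlen drops the oldest entry automatically on append,
        # so the cache never holds more than `window` (key, value) pairs.
        self.entries = deque(maxlen=window)

    def append(self, key, value):
        self.entries.append((key, value))

    def __len__(self):
        return len(self.entries)


cache = SlidingWindowKVCache(window=4)
sizes = []
for step in range(10):
    # One (key, value) pair is added per decoded token.
    cache.append(f"k{step}", f"v{step}")
    sizes.append(len(cache))

# The size grows until the window is full, then stays flat.
print(sizes)
```

Because the cache size stops growing once the window is full, the work per decoding step and the memory held per sequence are both bounded, which is the intuition behind the constant TPS and GPU memory claim.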
## References
- [[unlimited-ocr-works-2026]]