---
title: "QLoRA (Quantized Low-Rank Adaptation)"
created: 2025-06-02
updated: 2025-06-02
type: concept
tags: [qlora, fine-tuning, quantization, placeholder]
sources: []
---
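# QLoRA

> Quantized Low-Rank Adaptation (Dettmers et al., NeurIPS 2023) combines [[lora|LoRA]] with 4-bit quantization: LoRA adapters are trained on top of a frozen, 4-bit-quantized base model, which substantially reduces the memory needed to fine-tune LLMs.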
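## Core mechanisms

- **4-bit NormalFloat (NF4)** quantization: a 4-bit data type designed for normally distributed weights
- **Double quantization**: quantizes the quantization constants themselves to compress them further
- **Paged optimizers**: page optimizer state between GPU and CPU memory to absorb memory spikes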
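
The first two mechanisms can be illustrated with a minimal, standard-library-only sketch. It is not the bitsandbytes implementation: the codebook below is a simplified stand-in built from equal-probability quantiles of a standard normal (the real NF4 table is constructed differently and includes an exact zero), and all function names are hypothetical. The overhead figures assume the setup reported in the QLoRA paper: 32-bit constants per 64-weight block, or 8-bit constants plus a second level of 32-bit constants per 256 blocks when double quantization is used.

```python
from statistics import NormalDist


def normal_codebook(levels: int = 16) -> list[float]:
    """Simplified NF4-style codebook: normal quantiles scaled into [-1, 1]."""
    nd = NormalDist()
    quantiles = [nd.inv_cdf((i + 0.5) / levels) for i in range(levels)]
    top = max(abs(q) for q in quantiles)
    return [q / top for q in quantiles]


def quantize_block(block, codebook):
    """Absmax-normalize one block and map each weight to its nearest code index."""
    absmax = max(abs(w) for w in block) or 1.0  # avoid dividing by zero
    codes = [
        min(range(len(codebook)), key=lambda i: abs(codebook[i] - w / absmax))
        for w in block
    ]
    return codes, absmax


def dequantize_block(codes, absmax, codebook):
    """Recover approximate weights from code indices and the block constant."""
    return [codebook[c] * absmax for c in codes]


def constant_overhead_bits(block_size=64, double_quant=False, second_block=256):
    """Extra bits per weight spent on storing quantization constants."""
    if not double_quant:
        return 32 / block_size
    return 8 / block_size + 32 / (block_size * second_block)


codebook = normal_codebook()
block = [0.12, -0.5, 0.03, 0.8, -0.25]  # toy weights, not real model data
codes, absmax = quantize_block(block, codebook)
restored = dequantize_block(codes, absmax, codebook)

print(codes, absmax)
print([round(w, 3) for w in restored])
print(constant_overhead_bits())                   # 0.5 bits per weight
print(constant_overhead_bits(double_quant=True))  # about 0.127 bits per weight
```

Each weight is stored as a 4-bit index plus a share of its block's scaling constant, so the constants are the only per-block overhead; double quantization shrinks that share from 0.5 to roughly 0.127 bits per weight.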
## Use in One-Pass to Reason

[[goru-one-pass-to-reason-2025]] runs its experiments with QLoRA on the Qwen-3 series (4B/8B/32B), using rank=32, α=64, and 4-bit NF4 quantization.
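
To make these hyperparameters concrete, here is a rough back-of-the-envelope sketch. It assumes the standard LoRA convention in which the low-rank update is scaled by α / rank, treats the model labels as nominal parameter counts (4e9, 8e9, 32e9), and counts only base-weight storage, ignoring quantization constants, activations, and optimizer state. The helper names and the 4096 × 4096 layer shape are hypothetical and are not taken from the paper or from Qwen-3's actual architecture.

```python
def lora_scale(rank: int, alpha: float) -> float:
    """Scaling factor applied to the low-rank update B @ A in standard LoRA."""
    return alpha / rank


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added for one adapted d_out x d_in weight matrix."""
    return rank * (d_in + d_out)


def weight_gib(n_params: float, bits_per_param: float) -> float:
    """Storage for the base weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


print(lora_scale(rank=32, alpha=64))  # 2.0

# Hypothetical 4096 x 4096 projection, for illustration only.
print(lora_params(d_in=4096, d_out=4096, rank=32))

# Nominal sizes; real checkpoints differ somewhat from these round numbers.
for name, n_params in [("4B", 4e9), ("8B", 8e9), ("32B", 32e9)]:
    fp16 = weight_gib(n_params, 16)
    nf4 = weight_gib(n_params, 4)
    print(f"{name}: 16-bit {fp16:.1f} GiB, 4-bit {nf4:.1f} GiB")
```

With rank=32 and α=64 the adapter update is scaled by 2, and storing the base weights in 4 bits takes a quarter of the space of 16-bit storage before any overhead is counted.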
## Related

- [[lora]]
- [[goru-one-pass-to-reason-2025|One-Pass to Reason]]
- [[llama-factory]]