| title | created | updated | type | tags | sources |
|---|---|---|---|---|---|
| Cramér-Rao Lower Bound (CRLB) | 2026-04-17 | 2026-04-17 | concept | | |
Cramér-Rao Lower Bound (CRLB)
Definition
The Cramér-Rao Lower Bound (CRLB) states that for any unbiased estimator of a population parameter \theta, the lowest possible variance is the reciprocal of the Fisher Information I(\theta):
\text{Var}(\hat{\theta}) \geq \frac{1}{I(\theta)}
It represents a fundamental limit in statistical estimation: no matter how clever the estimation method is, no unbiased estimator can have a variance below this bound (under the usual regularity conditions).
Key Concepts
1. The Score Function
The score g(\theta; \mathbf{x}) is the derivative of the log-likelihood with respect to the parameter:
g(\theta; \mathbf{x}) = \frac{\partial}{\partial \theta} \log f(\mathbf{x} \mid \theta)
- It measures the "force" the data exerts on the parameter estimate.
- Crucial property: \mathbb{E}[g(\theta; \mathbf{x})] = 0 (under regularity conditions).
2. Fisher Information
Fisher Information I(\theta) is the variance of the score function:
I(\theta) = \text{Var}(g(\theta; \mathbf{x})) = \mathbb{E}\left[ \left( \frac{\partial}{\partial \theta} \log f(\mathbf{x} \mid \theta) \right)^2 \right]
Alternative expression (via curvature):
I(\theta) = -\mathbb{E}\left[ \frac{\partial^2}{\partial \theta^2} \log f(\mathbf{x} \mid \theta) \right]
This connects information directly to the curvature of the log-likelihood function. A sharper peak (higher curvature) means higher information and a tighter bound.
Properties:
- I(\theta) is proportional to the sample size n for i.i.d. observations (I_n = n \cdot I_1).
- Higher variance in the data means lower information per data point.
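The two expressions for Fisher Information, and the zero-mean property of the score, can be checked numerically. The sketch below assumes a single Bernoulli trial with success probability p (a model chosen here for illustration, not taken from the text above) and computes every expectation exactly by enumerating x in {0, 1}. The function names are hypothetical.

```python
def bernoulli_score(x, p):
    """Derivative of log f(x | p) = x*log(p) + (1 - x)*log(1 - p) w.r.t. p."""
    return x / p - (1 - x) / (1 - p)


def bernoulli_curvature(x, p):
    """Second derivative of the same log-likelihood w.r.t. p."""
    return -x / p**2 - (1 - x) / (1 - p) ** 2


def expectation(func, p):
    """Exact expectation of func(x, p) over x in {0, 1}."""
    return (1 - p) * func(0, p) + p * func(1, p)


p = 0.3  # illustrative parameter value (assumption)

mean_score = expectation(bernoulli_score, p)
info_from_variance = expectation(lambda x, q: bernoulli_score(x, q) ** 2, p)
info_from_curvature = -expectation(bernoulli_curvature, p)
closed_form = 1 / (p * (1 - p))

# For n i.i.d. trials the information adds up: I_n = n * I_1.
n = 50
info_n = n * closed_form

print(mean_score, info_from_variance, info_from_curvature, closed_form, info_n)
```

Because the score has mean zero, its variance equals its expected square, which is why the variance form and the curvature form agree here.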
3. Observed vs. Expected Information
- Expected Information: Uses the true parameter and expectation over all possible data. Formula-based.
- Observed Information: Uses the actual observed data and the estimated parameter \hat{\theta}. Computed from the negative Hessian of the log-likelihood at \hat{\theta}.
- In practice (especially in MLE), standard errors are calculated using the observed information.
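As a rough illustration of observed information, the sketch below approximates the negative second derivative of a log-likelihood at the MLE with a central finite difference. The exponential model, the data values, and the helper names are assumptions made for this example only.

```python
import math


def log_likelihood(rate, data):
    """Log-likelihood of i.i.d. exponential data with the given rate."""
    return len(data) * math.log(rate) - rate * sum(data)


def observed_information(loglik, theta_hat, data, h=1e-4):
    """Negative second derivative at theta_hat via central differences."""
    second = (
        loglik(theta_hat + h, data)
        - 2 * loglik(theta_hat, data)
        + loglik(theta_hat - h, data)
    ) / h**2
    return -second


data = [0.8, 1.9, 0.3, 2.4, 1.1, 0.6, 3.2, 0.9]  # hypothetical observations

rate_hat = len(data) / sum(data)              # MLE of the exponential rate
obs_info = observed_information(log_likelihood, rate_hat, data)
analytic_info = len(data) / rate_hat**2       # n / rate^2 evaluated at the MLE
standard_error = 1 / math.sqrt(obs_info)

print(rate_hat, obs_info, analytic_info, standard_error)
```

For this particular model the observed information at the MLE coincides with the expected information evaluated at the MLE; in other models the two can differ for a given data set.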
Classic Examples
Normal Distribution (Mean Estimation)
- Parameter: \mu
- Score: g(\mu) = \frac{n}{\sigma^2}(\bar{x} - \mu)
- Fisher Information: I = \frac{n}{\sigma^2}
- CRLB: \frac{\sigma^2}{n}
- Conclusion: The sample mean \bar{x} is the "best" unbiased estimator, as its variance exactly hits the bound.
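A small Monte Carlo run can make the normal example concrete. The sketch below uses assumed values for \mu, \sigma, and n, and compares the empirical variance of the sample mean with \sigma^2 / n. The sample median is included as a second unbiased estimator of \mu that does not reach the bound; that comparison is an addition for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

mu, sigma, n = 5.0, 2.0, 25  # hypothetical true values and sample size
replications = 5000

means, medians = [], []
for _ in range(replications):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    means.append(statistics.fmean(sample))
    medians.append(statistics.median(sample))

crlb = sigma**2 / n                        # reciprocal of I = n / sigma^2
var_mean = statistics.pvariance(means)     # should sit close to the bound
var_median = statistics.pvariance(medians) # should sit clearly above it

print(f"CRLB={crlb:.4f}  Var(mean)={var_mean:.4f}  Var(median)={var_median:.4f}")
```

The simulated variance of the mean fluctuates around the bound from run to run, so small deviations in either direction are sampling noise rather than a violation of the CRLB.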
Binomial Distribution (Proportion Estimation)
- Parameter: \pi
- Score: g(\pi) = \frac{k}{\pi} - \frac{n-k}{1-\pi}
- Fisher Information: I = \frac{n}{\pi(1-\pi)}
- CRLB: \frac{\pi(1-\pi)}{n}
- Conclusion: The sample proportion \hat{\pi} = k/n is the optimal unbiased estimator.
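The binomial case can be verified without simulation, because the support of k is finite. The sketch below picks illustrative values of n and \pi (written `p` in the code) and sums over the probability mass function to obtain the Fisher Information as the expected squared score, together with the exact variance of k/n.

```python
import math


def binomial_pmf(k, n, p):
    """P(K = k) for K ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def score(k, n, p):
    """Score of the binomial log-likelihood: k/p - (n - k)/(1 - p)."""
    return k / p - (n - k) / (1 - p)


n, p = 20, 0.3  # illustrative values (assumption)
support = range(n + 1)

mean_estimate = sum(binomial_pmf(k, n, p) * (k / n) for k in support)
var_estimate = sum(
    binomial_pmf(k, n, p) * (k / n - mean_estimate) ** 2 for k in support
)
fisher_info = sum(binomial_pmf(k, n, p) * score(k, n, p) ** 2 for k in support)
crlb = 1 / fisher_info

print(mean_estimate, var_estimate, fisher_info, crlb)
```

Up to floating-point rounding, the estimator is unbiased and its variance matches the bound, which is the sense in which k/n is optimal here.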
Connection to Maximum Likelihood Estimation (MLE)
- MLE is consistent and asymptotically efficient.
- As sample size n \to \infty, the variance of the MLE approaches the CRLB: \text{Var}(\hat{\theta}_{\text{MLE}}) \approx 1/I(\theta).
- This is why standard errors reported by MLE software are calculated as 1/\sqrt{I_{\text{observed}}}.
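The asymptotic statement can be illustrated by simulation. The sketch below assumes an exponential model with rate \lambda, for which the MLE is n / \sum x_i and the CRLB is \lambda^2 / n; the model, the sample sizes, and the helper name are choices made for this example. At small n the MLE is biased and its variance sits well above the bound, while at larger n the ratio moves toward 1.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

true_rate = 2.0         # hypothetical true parameter
replications = 4000


def variance_ratio(n):
    """Empirical Var(MLE of the rate) divided by the CRLB rate^2 / n."""
    estimates = []
    for _ in range(replications):
        sample = [rng.expovariate(true_rate) for _ in range(n)]
        estimates.append(n / sum(sample))  # MLE of the exponential rate
    crlb = true_rate**2 / n
    return statistics.pvariance(estimates) / crlb


ratio_small = variance_ratio(5)    # small sample: far from the bound
ratio_large = variance_ratio(200)  # larger sample: close to the bound

print(f"n=5: {ratio_small:.2f}   n=200: {ratio_large:.2f}")
```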
Role in Computerized Adaptive Testing (CAT)
In CAT, the CRLB dictates the theoretical limit of measurement precision.
- Each question contributes a certain amount of Fisher Information I_i(\theta).
- The test continues until the accumulated information I(\theta) = \sum I_i(\theta) is large enough that 1/I(\theta) (the minimum possible variance) is below a predefined threshold.
- Item selection strategy: Choosing the item with the maximum I_i(\theta) at the current ability estimate \hat{\theta} is equivalent to driving the CRLB down as fast as possible.
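The selection rule and stopping rule above can be sketched as a greedy loop. This example assumes a two-parameter logistic (2PL) item response model, where an item with discrimination a and difficulty b has information a^2 P (1 - P); the text above does not name a model, so that choice, the item bank, and the threshold are all hypothetical. For simplicity the ability estimate is held fixed, whereas a real CAT would re-estimate \hat{\theta} after each response.

```python
import math


def item_information(theta, a, b):
    """2PL item information a^2 * P * (1 - P) at ability theta (assumed model)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def select_items(theta_hat, bank, variance_target):
    """Greedily administer the most informative remaining item until the
    variance bound 1 / (accumulated information) drops to the target."""
    remaining = dict(bank)
    administered = []
    total_info = 0.0
    while remaining and (total_info == 0.0 or 1.0 / total_info > variance_target):
        best = max(
            remaining,
            key=lambda name: item_information(theta_hat, *remaining[name]),
        )
        total_info += item_information(theta_hat, *remaining.pop(best))
        administered.append(best)
    return administered, total_info


# Hypothetical item bank: name -> (discrimination a, difficulty b)
bank = {
    "item_a": (1.2, -1.0),
    "item_b": (0.8, 0.0),
    "item_c": (1.5, 0.2),
    "item_d": (2.0, 1.5),
    "item_e": (1.0, 0.5),
    "item_f": (1.7, -0.1),
}

administered, total_info = select_items(theta_hat=0.0, bank=bank, variance_target=0.7)
print(administered, total_info, 1.0 / total_info)
```

Items with high discrimination and difficulty close to \hat{\theta} are picked first, since those contribute the most information at the current estimate.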
Multidimensional Extension (Information Matrix)
For a vector of parameters \boldsymbol{\theta}, the Fisher Information becomes a matrix \mathbf{I}(\boldsymbol{\theta}). The CRLB states that the covariance matrix of any unbiased estimator satisfies:
\text{Cov}(\hat{\boldsymbol{\theta}}) \succeq \mathbf{I}(\boldsymbol{\theta})^{-1}
(where A \succeq B means that A - B is positive semi-definite).
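A two-parameter case shows what the matrix inequality means in practice. The sketch below assumes a normal model with both \mu and \sigma^2 unknown, for which the information matrix is diagonal with entries n/\sigma^2 and n/(2\sigma^4). It compares the inverse with the covariance of the unbiased pair (\bar{x}, s^2), using the known results that these two are independent under normality and that \text{Var}(s^2) = 2\sigma^4/(n-1). The numeric values and helper names are illustrative.

```python
def fisher_information_normal(n, sigma2):
    """Information matrix for (mu, sigma^2) from n i.i.d. normal observations."""
    return [[n / sigma2, 0.0], [0.0, n / (2.0 * sigma2**2)]]


def invert_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def is_psd_2x2(m, tol=1e-12):
    """A symmetric 2x2 matrix is PSD iff its diagonal and determinant are >= 0."""
    (a, b), (c, d) = m
    return a >= -tol and d >= -tol and (a * d - b * c) >= -tol


n, sigma2 = 30, 4.0  # hypothetical sample size and true variance

crlb_matrix = invert_2x2(fisher_information_normal(n, sigma2))

# Covariance of (sample mean, unbiased sample variance) under normality.
cov_estimators = [[sigma2 / n, 0.0], [0.0, 2.0 * sigma2**2 / (n - 1)]]

gap = [
    [cov_estimators[i][j] - crlb_matrix[i][j] for j in range(2)]
    for i in range(2)
]

print(crlb_matrix, gap, is_psd_2x2(gap))
```

In this example the sample mean attains its component of the bound, while the unbiased sample variance sits slightly above its component, so the difference matrix is positive semi-definite but not zero.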
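Related Concepts
- computerized-adaptive-testing: The core goal of CAT is to minimize the variance of the ability estimate. The CRLB provides the theoretical lower bound, and item selection strategies essentially maximize Fisher Information to approach that bound quickly.
- eml-universal-operator: Gradient-based optimization of EML trees relies on estimating the curvature of the parameter space, which is mathematically akin to the role of Fisher Information as the curvature of the log-likelihood in the CRLB.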