Estimation

Consider the linear model below, where x_i' denotes the i-th row vector of X, and E(ε) = 0, Cov(ε) = σ²I:

\begin{align} Y = X\beta + \epsilon, \qquad y_i = x_i'\beta + \epsilon_i \end{align}

Identifiability and Estimability

Identifiable

The property that, if we had infinitely many observations from the model, we could recover the true values of the model's underlying parameters.

A general linear model Y = Xβ + ε, E(ε) = 0, is a parameterization of the mean vector: E(Y) = Xβ.

The parameter β is identifiable if for any β₁ and β₂, Xβ₁ = Xβ₂ implies β₁ = β₂. If β is identifiable, we say that the parameterization is identifiable. Moreover, a vector-valued function f(β) is identifiable if Xβ₁ = Xβ₂ implies f(β₁) = f(β₂).

For regression models, for which r(X) = p (full column rank), the parameters are identifiable: X'X is nonsingular, so if Xβ₁ = Xβ₂, then β₁ = (X'X)⁻¹X'Xβ₁ = (X'X)⁻¹X'Xβ₂ = β₂.

A function f(β) is identifiable if and only if it is a function of Xβ.

Estimable

The results in the last section suggest that some linear combinations of β will not be estimable in the less-than-full-rank case.

The linear parametric function λ'β is an estimable function if there exists a vector ρ such that E(ρ'Y) = λ'β for all β, i.e., λ' = ρ'X.

A vector-valued linear function of β, Λ'β, is estimable if Λ' = P'X for some matrix P; in other words, Λ'β is estimable if E(P'Y) = Λ'β.

Clearly, if λ'β is estimable, it is identifiable and therefore a reasonable thing to estimate.

  • estimable ⇒ identifiable

For an estimable function Λ'β, although the matrix P with Λ' = P'X need not be unique, its perpendicular projection (columnwise) onto C(X) is unique: let P₁ and P₂ be matrices with Λ' = P₁'X = P₂'X; then MP₁ = MP₂, where M is the perpendicular projection operator onto C(X).
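A one-line check of this uniqueness claim (my added sketch, with M = X(X'X)⁻X' as introduced below):

\begin{align} P_1'X = P_2'X \;\Rightarrow\; X'P_1 = X'P_2 \;\Rightarrow\; MP_1 = X(X'X)^{-}X'P_1 = X(X'X)^{-}X'P_2 = MP_2 . \end{align}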

  • Example 2.1.4 and 2.1.5

An estimate f(Y) of λ'β is unbiased if E[f(Y)] = λ'β.

f(Y) is a linear estimate of λ'β if f(Y) = a₀ + a'Y for some scalar a₀ and vector a.

A linear estimate a₀ + a'Y is unbiased if a₀ + a'Xβ = λ'β for all β; equivalently, if a₀ = 0 and a'X = λ'.

λ'β is estimable ⇔ there exists ρ such that E(ρ'Y) = λ'β for any β.
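As a small illustration (my own, not from the text), consider a two-group means model y_ij = μ + α_i + ε_ij with two observations per group:

\begin{align} X = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 1 \end{pmatrix}, \qquad \beta = (\mu, \alpha_1, \alpha_2)' . \end{align}

Any ρ'X has the form (a + b, a, b) with a = ρ₁ + ρ₂ and b = ρ₃ + ρ₄, so μ + α₁ = (1, 1, 0)β and α₁ − α₂ = (0, 1, −1)β are estimable (take ρ' = (1/2, 1/2, 0, 0) and ρ' = (1/2, 1/2, −1/2, −1/2), respectively), while μ = (1, 0, 0)β is not, since no ρ satisfies ρ'X = (1, 0, 0).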


Estimation: Least Squares

Estimating Xβ amounts to taking the vector in C(X) that is closest to Y, i.e., minimizing (Y − Xβ)'(Y − Xβ) over β;

any β̂ achieving this minimum is a Least Squares Estimate (LSE) of β, and for any LSE β̂, Λ'β̂ is called a LSE of an estimable Λ'β (e.g., λ'β̂ is a LSE of λ'β).

  • Theorem 2.2.1

β̂ is a LSE of β if and only if Xβ̂ = MY, where M = X(X'X)⁻X' is the perpendicular projection operator onto C(X).
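A short sketch of why this holds (my addition): since Y − MY ⊥ C(X) and MY − Xβ ∈ C(X),

\begin{align} (Y - X\beta)'(Y - X\beta) = (Y - MY)'(Y - MY) + (MY - X\beta)'(MY - X\beta), \end{align}

and the first term does not depend on β, so the minimum is attained exactly when Xβ̂ = MY.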

  • Corollary 2.2.2

  • Corollary 2.2.3

The unique LSE of an estimable λ'β = ρ'Xβ is λ'β̂ = ρ'MY, for any LSE β̂ (in particular, ρ'MY is the unique LSE of ρ'Xβ).

※ Note: similarly, Λ'β̂ = P'MY is the unique LSE of an estimable Λ'β = P'Xβ.

  • Theorem 2.2.4

The LSE λ'β̂ of λ'β is the same for every LSE β̂ only if λ'β is estimable: i.e., if λ ∈ C(X'), so that λ' = ρ'X for some ρ.

※ Note: When β is not identifiable, we need side conditions imposed on the parameters in order to estimate the nonidentifiable parameters.

※ Note: With r(X) < p (an overparameterized model), we need side conditions on the individual parameters to identify and estimate them.
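For instance (my own illustration): in the one-way ANOVA model y_ij = μ + α_i + ε_ij, the parameter vectors (μ, α₁, …, α_a) and (μ + c, α₁ − c, …, α_a − c) give the same mean vector, so the individual parameters are not identifiable; a common side condition is

\begin{align} \sum_{i=1}^{a} \alpha_i = 0 \qquad \text{(or, alternatively, } \alpha_a = 0 \text{)}, \end{align}

which picks out a single representative from each such equivalence class.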

  • Proposition 2.2.5

If λ'β is estimable with λ' = ρ'X, then the LSE is unbiased: E(λ'β̂) = E(ρ'MY) = ρ'MXβ = ρ'Xβ = λ'β.

Let's decompose Y = MY + (I − M)Y, where MY = Xβ̂ is the vector of fitted values and (I − M)Y is the vector of residuals, with MY ⊥ (I − M)Y.

  • Theorem 2.2.6

Let r(X) = r and let n be the number of observations. In the formula below, the denominator n − r is the degrees of freedom for error.

Then an unbiased estimate (UE) of σ², the MSE, is as below.

\begin{align} MSE = \frac{Y'(I - M)Y}{n - r} = \frac{\hat\epsilon\,'\hat\epsilon}{n - r} \end{align}
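A quick sketch of why MSE is unbiased (my added derivation), using the expectation formula for quadratic forms E(Y'AY) = σ² tr(A) + β'X'AXβ with A = I − M:

\begin{align} E[Y'(I-M)Y] &= \sigma^2 \, tr(I-M) + \beta'X'(I-M)X\beta \\ &= \sigma^2 (n - r) \qquad (\because \; (I-M)X = 0, \;\; tr(I-M) = n - r(X) = n - r), \end{align}

so dividing by n − r gives E(MSE) = σ².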


Estimation: Best Linear Unbiased

  • Definition 2.3.1

a'Y is a Best Linear Unbiased Estimate (BLUE) of λ'β if a'Y is unbiased, i.e., E(a'Y) = λ'β, and if for any other linear unbiased estimate b'Y of λ'β, Var(a'Y) ≤ Var(b'Y).

  • Theorem 2.3.2: Gauss-Markov thm

Consider Y = Xβ + ε with E(ε) = 0, Cov(ε) = σ²I. Let λ'β be estimable.

Then the LSE λ'β̂ of λ'β is a BLUE of λ'β.
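A sketch of the usual argument (my addition, with λ' = ρ'X and M as above): any linear unbiased a'Y satisfies a'X = λ' = ρ'X, hence Ma = Mρ, and writing a'Y = ρ'MY + (a − Mρ)'Y,

\begin{align} Cov\big(\rho'MY, \,(a - M\rho)'Y\big) &= \sigma^2 \rho'M(a - M\rho) = \sigma^2(\rho'Ma - \rho'M\rho) = 0, \\ Var(a'Y) &= Var(\rho'MY) + Var\big((a - M\rho)'Y\big) \;\ge\; Var(\lambda'\hat\beta), \end{align}

with equality only when (a − Mρ)'Y has zero variance, which is what gives uniqueness of the BLUE.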

  • Corollary 2.3.3

Let E(ε) = 0 and Cov(ε) = σ²I. Then there exists a unique BLUE for any estimable function λ'β.


Estimation: Maximum Likelihood

Assume that Y ~ N(Xβ, σ²I). Then the Maximum Likelihood Estimates (MLEs) of β and σ² are obtained by maximizing the log of the likelihood

\begin{align} \ell(\beta, \sigma^2) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}(Y - X\beta)'(Y - X\beta), \end{align}

so that the MLEs of β are exactly the LSEs β̂ (equivalently Xβ̂ = MY) and the MLE of σ² is σ̂² = Y'(I − M)Y / n.
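Writing out the maximization (my added sketch): for any fixed σ², maximizing ℓ over β means minimizing (Y − Xβ)'(Y − Xβ), which gives the LSE; substituting Xβ̂ = MY and setting the σ² derivative to zero,

\begin{align} \frac{\partial \ell}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{Y'(I-M)Y}{2\sigma^4} = 0 \quad\Rightarrow\quad \hat\sigma^2_{MLE} = \frac{Y'(I-M)Y}{n}, \end{align}

which differs from the MSE only in the divisor (n rather than n − r(X)).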


Estimation: Minimum Variance Unbiased

Assume that Y ~ N(Xβ, σ²I).

A vector-valued sufficient statistic T(Y) is said to be complete if E[h(T(Y))] = 0 for all θ implies Pr[h(T(Y)) = 0] = 1 for all θ.

If T(Y) is a complete sufficient statistic, then f(T(Y)) is a Minimum Variance Unbiased Estimate (MVUE) of E[f(T(Y))].

  • Theorem 2.5.3

Let θ = (θ₁, …, θ_k)' and let Y be a random vector with pdf as below. Then T(Y) = (T₁(Y), …, T_k(Y))' is a complete sufficient statistic provided that neither the θ_i's nor the T_i's satisfy any linear constraints.

\begin{align} f(y \mid \theta) = c(\theta)\, h(y)\, \exp\!\Big( \sum_{i=1}^{k} \theta_i T_i(y) \Big) \end{align}

  • Theorem 2.5.4

MSE is a MVUE of σ², and λ'β̂ is a MVUE of λ'β whenever λ'β is estimable.
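To connect this with Theorem 2.5.3 (my added sketch, under the normality assumption above): the N(Xβ, σ²I) density can be written as

\begin{align} f(y \mid \beta, \sigma^2) = (2\pi\sigma^2)^{-n/2} \exp\!\Big( \frac{1}{\sigma^2}\beta'X'y - \frac{1}{2\sigma^2} y'y - \frac{1}{2\sigma^2}\beta'X'X\beta \Big), \end{align}

an exponential family with statistics (X'Y, Y'Y). These are complete sufficient, and both λ'β̂ and MSE are unbiased functions of them, which is how the theorem applies.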


Sampling Distributions of Estimates

Assume that Y ~ N(Xβ, σ²I) and that Λ'β is estimable with Λ' = P'X. Then the sampling distributions of the estimates are as follows.

\begin{align} \Lambda'\hat\beta = P'MY &\sim N\big(\Lambda'\beta,\; \sigma^2 P'MP\big) = N\big(\Lambda'\beta,\; \sigma^2 \Lambda'(X'X)^{-}\Lambda\big) && (\because\; M = X(X'X)^{-}X') \\ \hat Y = MY &\sim N\big(X\beta,\; \sigma^2 M\big) \\ \hat\beta = (X'X)^{-1}X'Y &\sim N\big(\beta,\; \sigma^2 (X'X)^{-1}\big) && (\text{if } X \text{ is of full rank}) \end{align}
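The covariance in the first line uses the fact that M is symmetric and idempotent (my added check):

\begin{align} Cov(P'MY) = P'M(\sigma^2 I)M'P = \sigma^2 P'MP = \sigma^2 P'X(X'X)^{-}X'P = \sigma^2 \Lambda'(X'X)^{-}\Lambda . \end{align}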

Do Exercise 2.1.


Generalized Least Squares(GLS)

Assume that, for some known positive definite matrix Σ,

\begin{alignat}{3} Y &= X \beta &&+ \epsilon && \; \; \; \; \; \; \; \; \; \; && E(\epsilon)&&=0, \; \; &&\; Cov(\epsilon) &&= \sigma^2 \Sigma \tag{1} \\ \Sigma^{-\tfrac{1}{2}}Y &= \Sigma^{-\tfrac{1}{2}} X \beta &&+ \Sigma^{-\tfrac{1}{2}} \epsilon && \; \; \; \; \; \; \; \; \; \; && E(\Sigma^{-\tfrac{1}{2}} \epsilon)&&=0, &&\; Cov(\Sigma^{-\tfrac{1}{2}} \epsilon) &&= \sigma^2 I \tag{2, by SVD} \\ Y_\ast &= X_\ast \beta &&+ \epsilon_\ast && \; \; \; \; \; \; \; \; \; \; && E( \epsilon_\ast)&&=0, &&\; Cov( \epsilon_\ast) &&= \sigma^2 I \end{alignat}
  • Theorem 2.7.1
  1. λ'β is estimable in model (1) if and only if it is estimable in model (2).
  2. β̂ is a GLSE of β if and only if X'Σ⁻¹Xβ̂ = X'Σ⁻¹Y, which is the Normal Equation of GLS (see the sketch after this list).
  • For any estimable function, there exists a unique GLSE.
  1. The GLSE of an estimable λ'β, λ'β̂, is a BLUE of λ'β.
  2. Let Y ~ N(Xβ, σ²Σ). Then the GLSE λ'β̂ of an estimable λ'β is a MVUE.
  3. Let Y ~ N(Xβ, σ²Σ). Then the MSE defined further below is a MVUE of σ².
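The normal equation in item 2 is just ordinary least squares applied to the transformed model (a short sketch, in the notation of model (2)):

\begin{align} X_\ast' X_\ast \beta = X_\ast' Y_\ast \;\;\Longleftrightarrow\;\; (\Sigma^{-\tfrac{1}{2}}X)'(\Sigma^{-\tfrac{1}{2}}X)\beta = (\Sigma^{-\tfrac{1}{2}}X)'(\Sigma^{-\tfrac{1}{2}}Y) \;\;\Longleftrightarrow\;\; X'\Sigma^{-1}X\beta = X'\Sigma^{-1}Y . \end{align}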

The Normal Equation of GLS can be rewritten as Xβ̂ = AY, where

\begin{align} A = X(X'\Sigma^{-1}X)^{-}X'\Sigma^{-1} \end{align}

is a projection operator onto C(X).
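A quick check (my addition) that A is a projection onto C(X): its columns lie in C(X) by construction, and using the standard generalized-inverse identity X(X'Σ⁻¹X)⁻X'Σ⁻¹X = X (which can be verified by writing X = Σ^{1/2}(Σ^{-1/2}X)),

\begin{align} AX = X(X'\Sigma^{-1}X)^{-}X'\Sigma^{-1}X = X, \qquad A^2 = (AX)(X'\Sigma^{-1}X)^{-}X'\Sigma^{-1} = A , \end{align}

so A fixes C(X) and is idempotent. In general it is an oblique projection with respect to the usual inner product, though Exercise 2.5 below shows it is perpendicular with respect to the Σ⁻¹ inner product.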

Let λ'β be estimable with λ' = ρ'X. Then the unique GLSE of λ'β is λ'β̂ = ρ'AY.

  • Note: (I − A)Y is the residual vector of the GLS fit, and an unbiased estimate of σ² is

\begin{align} MSE = \frac{Y'(I-A)'\Sigma^{-1}(I-A)Y}{n - r(X)}, \end{align}

where the denominator is n − r(X), the degrees of freedom for error.

Let Σ be nonsingular and C(ΣX) ⊂ C(X). Then least squares estimates are BLUEs.

  • Note: for diagonal Σ, GLS is referred to as Weighted Least Squares (WLS).

  • Exercise 2.5.

Show that A = X(X'Σ⁻¹X)⁻X'Σ⁻¹ is the perpendicular projection operator onto C(X) when the inner product between two vectors x and y is defined as x'Σ⁻¹y.