Comparison of Several MV Means (wk5)
Paired Comparison
Recall:
for the univariate case, let $X_i - Y_i = D_i \sim N(\delta, \sigma_d^2)$, $i = 1, \cdots, n$.
Then for $H_0: \delta = 0$, the test statistic is $t = \dfrac{\bar{D}}{S_d/\sqrt{n}} \overset{H_0}{\sim} t_{n-1}$.
Assume independent random vectors $D_1, \cdots, D_n \sim N_p(\delta, \Sigma_d)$.
Then the test statistic is $T^2 = n(\bar{D} - \delta)' S_d^{-1} (\bar{D} - \delta) \sim \dfrac{(n-1)p}{n-p} F_{p,\,n-p}$.
Under $H_0: \delta = 0$,
$$T^2 = n\,\bar{D}' S_d^{-1} \bar{D} \overset{H_0}{\sim} \dfrac{(n-1)p}{n-p} F_{p,\,n-p},$$
so reject $H_0$ if $T^2 > \dfrac{(n-1)p}{n-p} F_{p,\,n-p}(\alpha)$.
The $100(1-\alpha)\%$ confidence region for $\delta$ is
$$(\bar{D} - \delta)' S_d^{-1} (\bar{D} - \delta) \le \frac{1}{n}\,\frac{(n-1)p}{n-p} F_{p,\,n-p}(\alpha).$$
The $100(1-\alpha)\%$ simultaneous CIs for the individual $\delta_i$ are
$$\bar{d}_i \pm \sqrt{\frac{(n-1)p}{n-p} F_{p,\,n-p}(\alpha)}\,\sqrt{\frac{S_{d_i}^2}{n}}, \tag{2}$$
and Bonferroni's $100(1-\alpha)\%$ simultaneous CIs for the individual $\delta_i$ are
$$\bar{d}_i \pm t_{n-1}\!\left(\frac{\alpha}{2p}\right)\sqrt{\frac{S_{d_i}^2}{n}}. \tag{3}$$
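Below is a minimal numerical sketch of the paired $T^2$ test and the two interval families above, using numpy and scipy; the function name and the simulated data are illustrative only.

```python
import numpy as np
from scipy import stats

def paired_hotelling_t2(X, Y, alpha=0.05):
    """Paired Hotelling's T^2 test of H0: delta = 0 for D_i = X_i - Y_i."""
    D = X - Y                      # n x p matrix of paired differences
    n, p = D.shape
    d_bar = D.mean(axis=0)         # mean difference vector
    S_d = np.cov(D, rowvar=False)  # sample covariance of the differences

    T2 = n * d_bar @ np.linalg.solve(S_d, d_bar)
    crit = (n - 1) * p / (n - p) * stats.f.ppf(1 - alpha, p, n - p)

    # half-widths of the T^2-based (2) and Bonferroni (3) intervals for each delta_i
    half_t2 = np.sqrt(crit) * np.sqrt(np.diag(S_d) / n)
    half_bon = stats.t.ppf(1 - alpha / (2 * p), n - 1) * np.sqrt(np.diag(S_d) / n)
    return T2, crit, d_bar, half_t2, half_bon

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3)) + [0.5, 0.0, 0.2]   # simulated "before" measurements
Y = rng.normal(size=(30, 3))                     # simulated "after" measurements
T2, crit, d_bar, half_t2, half_bon = paired_hotelling_t2(X, Y)
print(f"T2 = {T2:.3f}, critical value = {crit:.3f}, reject H0: {T2 > crit}")
```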
====
Different Approach
let $X = [x_{11}, \cdots, x_{1p}, x_{21}, \cdots, x_{2p}]'_{1 \times 2p} \sim N_{2p}(\mu, \Sigma)$.
Then $D = CX$, where
$$C = \begin{bmatrix} 1 & & & -1 & & \\ & \ddots & & & \ddots & \\ & & 1 & & & -1 \end{bmatrix}_{p \times 2p} = [\, I_p \;\; -I_p \,].$$
Here,
$$\begin{aligned}
E(D) &= E(CX) = C\mu = \delta \\
\mathrm{Cov}(D) &= \mathrm{Cov}(CX) = C\Sigma C' = \Sigma_d \\
D &= CX \sim N_p(C\mu,\, C\Sigma C').
\end{aligned}$$
Therefore, for $H_0: C\mu = 0$, the test statistic is
$$T^2 = n\,(C\bar{X})'(CSC')^{-1}(C\bar{X}) \overset{H_0}{\sim} \frac{(n-1)p}{n-p} F_{p,\,n-p}.$$
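A quick numerical check (with made-up data) that the contrast-matrix formulation gives exactly the same $T^2$ as working with the differences $D_i$ directly:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 30, 3
X_pairs = rng.normal(size=(n, 2 * p))    # each row stacks the two paired p-vectors
C = np.hstack([np.eye(p), -np.eye(p)])   # p x 2p contrast matrix [I_p  -I_p]

x_bar = X_pairs.mean(axis=0)
S = np.cov(X_pairs, rowvar=False)        # 2p x 2p sample covariance

T2_contrast = n * (C @ x_bar) @ np.linalg.solve(C @ S @ C.T, C @ x_bar)

D = X_pairs @ C.T                        # n x p matrix of differences
T2_diff = n * D.mean(0) @ np.linalg.solve(np.cov(D, rowvar=False), D.mean(0))
print(np.isclose(T2_contrast, T2_diff))  # True: the two statistics coincide
```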
Comparing Mean Vectors from Two Populations
Recall:
In the univariate case, $t = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \overset{H_0}{\sim} t_{n_1+n_2-2}$.
For the multivariate case, assume the following, where $(X_{11}, \cdots, X_{1n_1})$ and $(X_{21}, \cdots, X_{2n_2})$ are independent:
$$X_{11}, \cdots, X_{1n_1} \sim N_p(\mu_1, \Sigma_1)$$
$$X_{21}, \cdots, X_{2n_2} \sim N_p(\mu_2, \Sigma_2)$$
We test $H_0: \mu_1 - \mu_2 = 0$.
Case 1: $\Sigma_1 = \Sigma_2 = \Sigma$
(Most of what follows is about vectors.)
$\bar{X}_i$ estimates $\mu_i$, $i = 1, 2$.
$S_p$ estimates $\Sigma$, where $S_p = \dfrac{(n_1-1)S_1 + (n_2-1)S_2}{(n_1-1) + (n_2-1)}$.
The test statistic is Hotelling's
$$T^2 = \frac{n_1 n_2}{n_1 + n_2}\,(\bar{X}_1 - \bar{X}_2)' S_p^{-1} (\bar{X}_1 - \bar{X}_2),$$
where
$$\frac{(n_1-1)+(n_2-1)-(p-1)}{p\,[(n_1-1)+(n_2-1)]}\,T^2 = \frac{n_1+n_2-p-1}{p\,(n_1+n_2-2)}\,T^2 \overset{H_0}{\sim} F_{p,\,n_1+n_2-p-1}. \quad \text{(p.285 for proof)}$$
$$\Pr\!\left[\big((\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)\big)'\left[\Big(\tfrac{1}{n_1} + \tfrac{1}{n_2}\Big) S_p\right]^{-1}\big((\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)\big) \le c^2\right] = 1 - \alpha,$$
where $c^2 = \dfrac{p\,(n_1+n_2-2)}{n_1+n_2-p-1} F_{p,\,n_1+n_2-p-1}(\alpha)$.
Notice that this constant is the reciprocal of the scaling factor used for the test statistic above.
The equality will define the boundary of a region.
The region is an ellipsoid centered at $(\bar{X}_1 - \bar{X}_2)$.
Example) Testing $H_0: \mu_1 - \mu_2 = 0$ at $\alpha = 0.05$ is equivalent to checking whether $0$ falls within the confidence region.
Axes of the confidence region
Let $\lambda_1, \cdots, \lambda_p$ be the eigenvalues of $S_p$,
and $e_1, \cdots, e_p$ the corresponding eigenvectors of $S_p$.
Then the $e_i$'s give the directions of the axes of the confidence region,
and $\sqrt{\lambda_i}\sqrt{\left(\tfrac{1}{n_1} + \tfrac{1}{n_2}\right) c^2}$ are the corresponding half-lengths.
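A short sketch of how the axes come out of the eigen-decomposition of $S_p$; the pooled covariance and sample sizes below are made up for illustration.

```python
import numpy as np
from scipy import stats

n1, n2, p, alpha = 40, 35, 2, 0.05
S_p = np.array([[4.0, 1.5],
                [1.5, 3.0]])                 # illustrative pooled covariance

c2 = p * (n1 + n2 - 2) / (n1 + n2 - p - 1) * stats.f.ppf(1 - alpha, p, n1 + n2 - p - 1)
eigvals, eigvecs = np.linalg.eigh(S_p)       # lambda_i and e_i of S_p

half_lengths = np.sqrt(eigvals) * np.sqrt((1 / n1 + 1 / n2) * c2)
for lam, length, e in zip(eigvals, half_lengths, eigvecs.T):
    print(f"direction {e}, half-length {length:.3f}")
```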
The $100(1-\alpha)\%$ simultaneous CIs for $a'(\mu_1 - \mu_2)$, for all $a$, are
$$a'(\bar{X}_1 - \bar{X}_2) \pm c\,\sqrt{a'\Big(\tfrac{1}{n_1} + \tfrac{1}{n_2}\Big) S_p\, a},$$
with $c^2 = \dfrac{p\,(n_1+n_2-2)}{n_1+n_2-p-1} F_{p,\,n_1+n_2-p-1}(\alpha)$ as above.
Example) Simultaneous CIs for $(\mu_{1i} - \mu_{2i})$, $i = 1, \cdots, p$.
Let $a' = [0, \cdots, 0, 1, 0, \cdots, 0]$. When $a'$ has a single 1 and the rest 0, this projects onto one particular coordinate axis.
Write $\mu_1 - \mu_2 = [\mu_{1i} - \mu_{2i}]_{i=1,\cdots,p}$.
Then $a'(\bar{X}_1 - \bar{X}_2) = \bar{X}_{1i} - \bar{X}_{2i}$ and $a'\Big(\tfrac{1}{n_1} + \tfrac{1}{n_2}\Big) S_p\, a = \Big(\tfrac{1}{n_1} + \tfrac{1}{n_2}\Big) S_{p,ii}$.
$S_{p,ii}$ is the pooled sample variance of the $i$-th variable; the notation parallels the pooled variance (sample standard error) from the univariate case (ch1).
Bonferroni's $100(1-\alpha)\%$ simultaneous CIs for $(\mu_{1i} - \mu_{2i})$ are $(\bar{X}_{1i} - \bar{X}_{2i}) \pm t_{n_1+n_2-2}\!\left(\tfrac{\alpha}{2p}\right)\sqrt{\Big(\tfrac{1}{n_1} + \tfrac{1}{n_2}\Big) S_{p,ii}}$.
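A sketch of Case 1 on simulated data: the pooled-covariance $T^2$ test, the $T^2$-based simultaneous half-widths, and the Bonferroni half-widths (the function name and data are illustrative).

```python
import numpy as np
from scipy import stats

def two_sample_hotelling_t2(X1, X2, alpha=0.05):
    """Two-sample Hotelling's T^2 test of H0: mu_1 = mu_2 under Sigma_1 = Sigma_2."""
    (n1, p), (n2, _) = X1.shape, X2.shape
    diff = X1.mean(0) - X2.mean(0)
    S_p = ((n1 - 1) * np.cov(X1, rowvar=False) +
           (n2 - 1) * np.cov(X2, rowvar=False)) / (n1 + n2 - 2)

    scale = 1 / n1 + 1 / n2
    T2 = diff @ np.linalg.solve(scale * S_p, diff)
    c2 = p * (n1 + n2 - 2) / (n1 + n2 - p - 1) * stats.f.ppf(1 - alpha, p, n1 + n2 - p - 1)

    # componentwise half-widths: T^2-based simultaneous and Bonferroni intervals
    half_sim = np.sqrt(c2) * np.sqrt(scale * np.diag(S_p))
    half_bon = stats.t.ppf(1 - alpha / (2 * p), n1 + n2 - 2) * np.sqrt(scale * np.diag(S_p))
    return T2, c2, diff, half_sim, half_bon

rng = np.random.default_rng(2)
X1 = rng.normal(size=(40, 3)) + [0.8, 0.0, 0.3]
X2 = rng.normal(size=(35, 3))
T2, c2, diff, half_sim, half_bon = two_sample_hotelling_t2(X1, X2)
print(f"T2 = {T2:.2f}, critical value = {c2:.2f}, reject H0: {T2 > c2}")
```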
Case 2: $\Sigma_1 \neq \Sigma_2$
Assume $n_1 - p$ and $n_2 - p$ are large.
For $H_0: \mu_1 - \mu_2 = 0$, the test statistic becomes
$$T^2 = (\bar{X}_1 - \bar{X}_2)'\left[\tfrac{1}{n_1} S_1 + \tfrac{1}{n_2} S_2\right]^{-1}(\bar{X}_1 - \bar{X}_2) \overset{H_0}{\sim} \chi^2_p \ \text{(approximately)}.$$
Note:
$$E(\bar{X}_1 - \bar{X}_2) = \mu_1 - \mu_2$$
$$\mathrm{Cov}(\bar{X}_1 - \bar{X}_2) = \mathrm{Cov}(\bar{X}_1) + \mathrm{Cov}(\bar{X}_2) - 2\,\mathrm{Cov}(\bar{X}_1, \bar{X}_2) = \tfrac{1}{n_1}\Sigma_1 + \tfrac{1}{n_2}\Sigma_2 - 0$$
(The cross-covariance is 0 because the two samples are independent.)
$$\bar{X}_1 - \bar{X}_2 \;\dot\sim\; N_p\!\left(\mu_1 - \mu_2,\ \tfrac{1}{n_1}\Sigma_1 + \tfrac{1}{n_2}\Sigma_2\right) \quad (\because \text{CLT})$$
Under $H_0$,
$$S_1 \overset{p}{\to} \Sigma_1, \quad S_2 \overset{p}{\to} \Sigma_2 \quad (\because \text{WLLN})$$
$$(\bar{X}_1 - \bar{X}_2)'\left[\tfrac{1}{n_1} S_1 + \tfrac{1}{n_2} S_2\right]^{-1}(\bar{X}_1 - \bar{X}_2) \overset{\text{approx.}}{\sim} \chi^2_p \quad (\because \text{Slutsky's theorem})$$
i.e. reject $H_0$ if $T^2 > \chi^2_p(\alpha)$.
The confidence region becomes
$$\Pr\!\left[\big((\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)\big)'\left[\tfrac{1}{n_1} S_1 + \tfrac{1}{n_2} S_2\right]^{-1}\big((\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)\big) \le \chi^2_p(\alpha)\right] = 1 - \alpha.$$
The difference from Case 1 is that $\big(\tfrac{1}{n_1} + \tfrac{1}{n_2}\big) S_p$ is replaced by $\tfrac{1}{n_1} S_1 + \tfrac{1}{n_2} S_2$ and the scaled $F$ cutoff by the $\chi^2$ cutoff.
Remark: if $n_1 = n_2 = n$,
$$\frac{1}{n_1}S_1 + \frac{1}{n_2}S_2 = \frac{1}{n}(S_1 + S_2) = \frac{1}{n}\left[\frac{1}{n-1}\sum_{i=1}^{n}(X_{1i} - \bar{X}_1)(X_{1i} - \bar{X}_1)' + \frac{1}{n-1}\sum_{i=1}^{n}(X_{2i} - \bar{X}_2)(X_{2i} - \bar{X}_2)'\right] = \frac{1}{n}\cdot\frac{1}{n-1}\cdot S_p \cdot 2(n-1) = \frac{2}{n} S_p,$$
i.e. Case 1 and Case 2 use the same statistic when the sample sizes are equal, so for large, equal sample sizes the two procedures coincide.
The $100(1-\alpha)\%$ simultaneous CIs for $a'(\mu_1 - \mu_2)$, for all $a$, are
$$a'(\bar{X}_1 - \bar{X}_2) \pm \sqrt{\chi^2_p(\alpha)}\,\sqrt{a'\Big(\tfrac{1}{n_1} S_1 + \tfrac{1}{n_2} S_2\Big) a}.$$
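A sketch of the large-sample Case 2 test with unequal covariance matrices, again on simulated data; the simultaneous half-widths below correspond to taking $a$ as the unit coordinate vectors.

```python
import numpy as np
from scipy import stats

def large_sample_t2(X1, X2, alpha=0.05):
    """Large-sample test of H0: mu_1 = mu_2 without assuming Sigma_1 = Sigma_2."""
    (n1, p), (n2, _) = X1.shape, X2.shape
    diff = X1.mean(0) - X2.mean(0)
    V = np.cov(X1, rowvar=False) / n1 + np.cov(X2, rowvar=False) / n2  # (1/n1)S1 + (1/n2)S2

    T2 = diff @ np.linalg.solve(V, diff)
    crit = stats.chi2.ppf(1 - alpha, p)

    half = np.sqrt(crit) * np.sqrt(np.diag(V))  # intervals for each mean difference
    return T2, crit, diff, half

rng = np.random.default_rng(3)
X1 = rng.multivariate_normal([0.5, 0.0], [[2.0, 0.5], [0.5, 1.0]], size=120)
X2 = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.2], [0.2, 3.0]], size=150)
T2, crit, diff, half = large_sample_t2(X1, X2)
print(f"T2 = {T2:.2f}, chi2 cutoff = {crit:.2f}, reject H0: {T2 > crit}")
```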
Other Statistics for Testing two Mean Vectors
Wilks' Lambda: $\Lambda^* = \dfrac{|W|}{|B + W|}$
Lawley-Hotelling's Trace: $\mathrm{tr}(BW^{-1})$
Pillai's Trace: $\mathrm{tr}\!\left[B(B+W)^{-1}\right]$
Testing Equality of Covariance Matrices
$H_0: \Sigma_1 = \Sigma_2$
Let $S_p = \dfrac{1}{n_1+n_2-2}\left[(n_1-1)S_1 + (n_2-1)S_2\right]$.
$$\begin{aligned}
M &= (n_1+n_2-2)\ln|S_p| - (n_1-1)\ln|S_1| - (n_2-1)\ln|S_2| && \text{(test statistic)} \\
C^{-1} &= 1 - \frac{2p^2 + 3p - 1}{6(p+1)}\left(\frac{n_1+n_2-2}{(n_1-1)(n_2-1)} - \frac{1}{n_1+n_2-2}\right) && \text{(scale factor)} \\
MC^{-1} &\sim \chi^2_v, \qquad v = \frac{p(p+1)}{2}
\end{aligned}$$
Reject $H_0$ if $MC^{-1} > \chi^2_v(\alpha)$.
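A sketch of Box's M test for two groups following the formulas above; data are simulated, and the second sample is scaled so the test should typically reject equal covariances.

```python
import numpy as np
from scipy import stats

def box_m_two_groups(X1, X2, alpha=0.05):
    """Box's M test of H0: Sigma_1 = Sigma_2 for two groups."""
    (n1, p), (n2, _) = X1.shape, X2.shape
    S1, S2 = np.cov(X1, rowvar=False), np.cov(X2, rowvar=False)
    S_p = ((n1 - 1) * S1 + (n2 - 1) * S2) / (n1 + n2 - 2)

    M = ((n1 + n2 - 2) * np.linalg.slogdet(S_p)[1]
         - (n1 - 1) * np.linalg.slogdet(S1)[1]
         - (n2 - 1) * np.linalg.slogdet(S2)[1])
    # note 1/(n1-1) + 1/(n2-1) = (n1+n2-2)/((n1-1)(n2-1))
    C_inv = 1 - (2 * p**2 + 3 * p - 1) / (6 * (p + 1)) * (
        1 / (n1 - 1) + 1 / (n2 - 1) - 1 / (n1 + n2 - 2))

    v = p * (p + 1) / 2
    return M * C_inv, stats.chi2.ppf(1 - alpha, v)

rng = np.random.default_rng(4)
X1 = rng.normal(size=(50, 3))
X2 = rng.normal(size=(60, 3)) * 1.5   # inflated scale -> unequal covariances
stat, crit = box_m_two_groups(X1, X2)
print(f"M C^-1 = {stat:.2f}, cutoff = {crit:.2f}, reject H0: {stat > crit}")
```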
Profile Analysis (for g = 2 )
Recall:
$H_0: \mu_1 = \mu_2$, when $\Sigma_1 = \Sigma_2 = \Sigma$:
$$T^2 = (\bar{X}_1 - \bar{X}_2)'\left[\Big(\tfrac{1}{n_1} + \tfrac{1}{n_2}\Big) S_p\right]^{-1}(\bar{X}_1 - \bar{X}_2) \overset{H_0}{\sim} \frac{(n_1+n_2-2)\,p}{n_1+n_2-p-1} F_{p,\,n_1+n_2-p-1}$$
Now consider $H_0: C\mu_1 = C\mu_2$, when $\Sigma_1 = \Sigma_2 = \Sigma$, where $C$ is $q \times p$, $q \le p$, and $\mathrm{rank}(C) = q$:
$$T^2 = (\bar{X}_1 - \bar{X}_2)' C'\left[\Big(\tfrac{1}{n_1} + \tfrac{1}{n_2}\Big) C S_p C'\right]^{-1} C(\bar{X}_1 - \bar{X}_2) \overset{H_0}{\sim} \frac{(n_1+n_2-2)\,q}{n_1+n_2-q-1} F_{q,\,n_1+n_2-q-1}$$
Profiles are constructed for each group.
Consider two groups. Questions:
1. Parallel Profiles
Are the profiles parallel?
$$\begin{aligned}
&\iff H_0: \mu_{11} - \mu_{12} = \mu_{21} - \mu_{22},\ \mu_{12} - \mu_{13} = \mu_{22} - \mu_{23},\ \mu_{13} - \mu_{14} = \mu_{23} - \mu_{24},\ \cdots,\ \mu_{1,p-1} - \mu_{1,p} = \mu_{2,p-1} - \mu_{2,p} \\
&\iff H_0: \mu_{11} - \mu_{21} = \mu_{12} - \mu_{22} = \cdots = \mu_{1p} - \mu_{2p} \\
&\iff H_0: C\mu_1 = C\mu_2 \ \text{for the successive-difference contrast matrix } C_{(p-1)\times p}
\end{aligned}$$
This is equivalent to testing equality of the mean vectors of the transformed data $CX_1$ and $CX_2$.
Population 1: $CX_{11}, \cdots, CX_{1n_1} \sim N_{p-1}(C\mu_1, C\Sigma C')$
Population 2: $CX_{21}, \cdots, CX_{2n_2} \sim N_{p-1}(C\mu_2, C\Sigma C')$
Reject $H_0: C\mu_1 = C\mu_2$ (i.e. parallel profiles) if
$$T^2 = (\bar{X}_1 - \bar{X}_2)' C'\left[\Big(\tfrac{1}{n_1} + \tfrac{1}{n_2}\Big) C S_p C'\right]^{-1} C(\bar{X}_1 - \bar{X}_2) > d^2 = \frac{(n_1+n_2-2)(p-1)}{n_1+n_2-p} F_{p-1,\,n_1+n_2-p}(\alpha).$$
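A sketch of the parallelism test on simulated data whose two group profiles differ by a constant shift (so rejection should usually not occur); the successive-difference contrast matrix is built explicitly.

```python
import numpy as np
from scipy import stats

def parallel_profiles_test(X1, X2, alpha=0.05):
    """Test of parallel profiles: H0: C mu_1 = C mu_2 with successive-difference contrasts."""
    (n1, p), (n2, _) = X1.shape, X2.shape
    C = np.eye(p - 1, p) - np.eye(p - 1, p, k=1)   # rows like (1, -1, 0, ..., 0)

    diff = C @ (X1.mean(0) - X2.mean(0))
    S_p = ((n1 - 1) * np.cov(X1, rowvar=False) +
           (n2 - 1) * np.cov(X2, rowvar=False)) / (n1 + n2 - 2)
    V = (1 / n1 + 1 / n2) * (C @ S_p @ C.T)

    T2 = diff @ np.linalg.solve(V, diff)
    d2 = (n1 + n2 - 2) * (p - 1) / (n1 + n2 - p) * stats.f.ppf(1 - alpha, p - 1, n1 + n2 - p)
    return T2, d2

rng = np.random.default_rng(5)
X1 = rng.normal(size=(25, 4)) + [1.0, 1.5, 2.0, 2.5]   # rising profile
X2 = rng.normal(size=(30, 4)) + [0.0, 0.5, 1.0, 1.5]   # parallel profile, shifted down
T2, d2 = parallel_profiles_test(X1, X2)
print(f"T2 = {T2:.2f}, cutoff = {d2:.2f}, reject parallelism: {T2 > d2}")
```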
2. Coincident Profiles
Assuming that the profiles are parallel, are the profiles coincident?
$$H_0: \mu_{1i} = \mu_{2i},\ i = 1, \cdots, p \iff H_0: \mathbf{1}'\mu_1 = \mathbf{1}'\mu_2 \ \text{(given parallel profiles)}$$
This is the previous test with $C$ replaced by $\mathbf{1}'$.
Reject $H_0$ if
$$T^2 = \mathbf{1}'(\bar{X}_1 - \bar{X}_2)\left[\Big(\tfrac{1}{n_1} + \tfrac{1}{n_2}\Big)\mathbf{1}' S_p \mathbf{1}\right]^{-1}\mathbf{1}'(\bar{X}_1 - \bar{X}_2) = \frac{\big(\mathbf{1}'(\bar{X}_1 - \bar{X}_2)\big)^2}{\Big(\tfrac{1}{n_1} + \tfrac{1}{n_2}\Big)\mathbf{1}' S_p \mathbf{1}} > F_{1,\,n_1+n_2-2}(\alpha).$$
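The coincidence test is just a squared pooled two-sample $t$ statistic on the totals $\mathbf{1}'X$; a sketch under the same assumptions (the simulated data above could be passed in directly):

```python
import numpy as np
from scipy import stats

def coincident_profiles_test(X1, X2, alpha=0.05):
    """Coincidence test H0: 1' mu_1 = 1' mu_2 (assumes the profiles are parallel)."""
    (n1, p), (n2, _) = X1.shape, X2.shape
    ones = np.ones(p)
    S_p = ((n1 - 1) * np.cov(X1, rowvar=False) +
           (n2 - 1) * np.cov(X2, rowvar=False)) / (n1 + n2 - 2)

    num = (ones @ (X1.mean(0) - X2.mean(0))) ** 2
    den = (1 / n1 + 1 / n2) * (ones @ S_p @ ones)
    return num / den, stats.f.ppf(1 - alpha, 1, n1 + n2 - 2)
```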
3. Flat Profiles
Assuming that the profiles are coincident, are the profiles level (flat)?
$$H_0: \mu_{11} = \mu_{12} = \cdots = \mu_{1p} = \mu_{21} = \mu_{22} = \cdots = \mu_{2p}$$
By steps 1 and 2, we can collapse the two groups into one:
$$X_{11}, \cdots, X_{1n_1}, X_{21}, \cdots, X_{2n_2} \sim N_p(\mu, \Sigma),$$
so this is a one-population problem.
With the $(p-1)\times p$ contrast matrix $C$, the hypothesis is $H_0: C\mu = 0$.
Reject $H_0$ if
$$T^2 = (n_1+n_2)\,\bar{X}' C'\left[C S C'\right]^{-1} C\bar{X} > d^2 = \frac{(n_1+n_2-1)(p-1)}{n_1+n_2-p+1} F_{p-1,\,n_1+n_2-p+1}(\alpha).$$
Note that the degrees of freedom of the $F$ distribution have changed from those in step 1.
Here $\bar{X} = \dfrac{1}{n_1+n_2}\left(\sum_{j=1}^{n_1} X_{1j} + \sum_{j=1}^{n_2} X_{2j}\right)$
and $S$ is the sample covariance matrix computed from all $n_1 + n_2$ observations combined.
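Assuming coincidence, a sketch of the flatness test that pools the two samples and reuses the successive-difference contrasts:

```python
import numpy as np
from scipy import stats

def flat_profile_test(X1, X2, alpha=0.05):
    """Flatness test H0: C mu = 0 after pooling the two (coincident) groups."""
    X = np.vstack([X1, X2])                        # all n1 + n2 observations together
    n, p = X.shape
    C = np.eye(p - 1, p) - np.eye(p - 1, p, k=1)   # successive-difference contrasts
    x_bar = X.mean(0)
    S = np.cov(X, rowvar=False)                    # covariance of the combined sample

    T2 = n * (C @ x_bar) @ np.linalg.solve(C @ S @ C.T, C @ x_bar)
    d2 = (n - 1) * (p - 1) / (n - p + 1) * stats.f.ppf(1 - alpha, p - 1, n - p + 1)
    return T2, d2
```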
Comparing Several Multivariate Population Means
Recall:
In the univariate case, the two-sample t-test extends to Analysis of Variance (ANOVA):
$$H_0: \mu_1 = \cdots = \mu_g, \qquad F^* = \frac{\mathrm{SSR}/df_1}{\mathrm{SSE}/df_2} \overset{H_0}{\sim} F_{df_1,\,df_2},$$
where
SSR: regression (between-groups) sum of squares,
SSE: error (within-groups) sum of squares,
SST: total sum of squares,
$df_1 = g - 1$, $df_2 = N - g$, $N = \sum_{i=1}^g n_i$.
Assume $g$ populations or treatment groups, with the groups independent of one another. Each population has the same covariance matrix and the same number of variables, but the numbers of observations and the population mean vectors may differ.
Populations $1, \ldots, g$: $X_{i1}, \cdots, X_{in_i} \sim N_p(\mu_i, \Sigma)$.
$$X_{ij} = \mu_i + \epsilon_{ij}, \quad i = 1, \cdots, g, \quad j = 1, \cdots, n_i$$
$$H_0: \mu_1 = \cdots = \mu_g$$
where
$$X_{ij} = \begin{bmatrix} X_{ij1} \\ X_{ij2} \\ \vdots \\ X_{ijp} \end{bmatrix}_{p\times 1}, \quad \mu_i = \begin{bmatrix} \mu_{i1} \\ \mu_{i2} \\ \vdots \\ \mu_{ip} \end{bmatrix}_{p\times 1}, \quad \epsilon_{ij} = \begin{bmatrix} \epsilon_{ij1} \\ \epsilon_{ij2} \\ \vdots \\ \epsilon_{ijp} \end{bmatrix}_{p\times 1}.$$
Assumptions
The random samples from different populations are independent.
All populations have a common covariance matrix Σ .
Each population is Multivariate Normal. This assumption can be relaxed by C.L.T., when the sample sizes n 1 , ⋯ , n g are large.
One-Way MANOVA
The quantities SSR, SSE and SST become matrices in MANOVA.
$$\begin{aligned}
B &= \sum_{i=1}^{g} n_i (\bar{X}_i - \bar{X})(\bar{X}_i - \bar{X})' && \text{(SSR)} \\
W &= \sum_{i=1}^{g}\sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)(X_{ij} - \bar{X}_i)' = (n_1-1)S_1 + \cdots + (n_g-1)S_g && \text{(SSE)}
\end{aligned}$$
The decomposition follows from $(X_{ij} - \bar{X}) = (\bar{X}_i - \bar{X}) + (X_{ij} - \bar{X}_i)$:
$$(X_{ij} - \bar{X})(X_{ij} - \bar{X})' = (\bar{X}_i - \bar{X})(\bar{X}_i - \bar{X})' + (\bar{X}_i - \bar{X})(X_{ij} - \bar{X}_i)' + (X_{ij} - \bar{X}_i)(\bar{X}_i - \bar{X})' + (X_{ij} - \bar{X}_i)(X_{ij} - \bar{X}_i)',$$
and summing over $i$ and $j$ kills the cross-product terms (since $\sum_{j=1}^{n_i}(X_{ij} - \bar{X}_i) = 0$), giving
$$T = \sum_{i=1}^{g}\sum_{j=1}^{n_i}(X_{ij} - \bar{X})(X_{ij} - \bar{X})' = \sum_{i=1}^{g} n_i(\bar{X}_i - \bar{X})(\bar{X}_i - \bar{X})' + \sum_{i=1}^{g}\sum_{j=1}^{n_i}(X_{ij} - \bar{X}_i)(X_{ij} - \bar{X}_i)' = B + W.$$
B: Between Sum of Squares
W: Within Sum of Squares
Any test statistic will be a function of $B$ and $W$. Popular test statistics use the eigenvalues of $BW^{-1}$.
Let $\lambda_1, \cdots, \lambda_r$ be the eigenvalues of $BW^{-1}$, where $r$ is the number of non-zero eigenvalues.
Wilks' Lambda (LRT)
$$\Lambda^* = \frac{|W|}{|B + W|} = \frac{1}{|I + BW^{-1}|} = \prod_{i=1}^{r}(1 + \lambda_i)^{-1}$$
Pillai’s Trace
$$V = \mathrm{tr}\!\left[B(B+W)^{-1}\right] = \mathrm{tr}\!\left[B\big(B(I + B^{-1}W)\big)^{-1}\right] = \mathrm{tr}\!\left[B(I + B^{-1}W)^{-1}B^{-1}\right] = \mathrm{tr}\!\left[B^{-1}B(I + B^{-1}W)^{-1}\right] = \mathrm{tr}\!\left[(I + B^{-1}W)^{-1}\right] = \sum_{i=1}^{r}\frac{\lambda_i}{1 + \lambda_i},$$
since the eigenvalues of $B^{-1}W$ are the reciprocals $1/\lambda_i$ of the eigenvalues of $BW^{-1}$.
Lawley-Hotelling’s Trace
$$T = \mathrm{tr}(BW^{-1}) = \sum_{i=1}^{r}\lambda_i$$
Roy’s Largest Root
$$U = \max_{i = 1, \cdots, r}\{\lambda_i\}$$
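A sketch that builds $B$ and $W$ from a list of group samples and computes the four statistics from the eigenvalues of $BW^{-1}$ (simulated data; the small-eigenvalue cutoff is an arbitrary numerical tolerance):

```python
import numpy as np

def manova_statistics(groups):
    """Compute B, W and the four MANOVA statistics from the eigenvalues of B W^{-1}.

    `groups` is a list of (n_i x p) data matrices, one per population."""
    grand_mean = np.vstack(groups).mean(axis=0)
    p = grand_mean.size
    B, W = np.zeros((p, p)), np.zeros((p, p))
    for X in groups:
        n_i = X.shape[0]
        d = X.mean(axis=0) - grand_mean
        B += n_i * np.outer(d, d)                  # between-groups SSCP
        W += (n_i - 1) * np.cov(X, rowvar=False)   # within-groups SSCP

    lam = np.linalg.eigvals(B @ np.linalg.inv(W)).real
    lam = lam[lam > 1e-12]                         # keep the non-zero eigenvalues

    return B, W, {
        "Wilks":            np.prod(1.0 / (1.0 + lam)),   # |W| / |B + W|
        "Pillai":           np.sum(lam / (1.0 + lam)),    # tr[B (B+W)^{-1}]
        "Lawley-Hotelling": np.sum(lam),                  # tr(B W^{-1})
        "Roy":              lam.max(),                    # largest root
    }

rng = np.random.default_rng(6)
groups = [rng.normal(size=(30, 3)) + shift
          for shift in ([0, 0, 0], [0.5, 0, 0], [0, 0.8, 0])]
B, W, stat = manova_statistics(groups)
print(stat)
```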
Sampling Distribution of Wilks' Lambda
$$\begin{aligned}
p = 1,\ g \ge 2 &: \left(\frac{\sum_{i=1}^{g} n_i - g}{g - 1}\right)\frac{1 - \Lambda^*}{\Lambda^*} \overset{H_0}{\sim} F_{g-1,\ \sum_{i=1}^{g} n_i - g} \\
p = 2,\ g \ge 2 &: \left(\frac{\sum_{i=1}^{g} n_i - g - 1}{g - 1}\right)\frac{1 - \sqrt{\Lambda^*}}{\sqrt{\Lambda^*}} \overset{H_0}{\sim} F_{2(g-1),\ 2(\sum_{i=1}^{g} n_i - g - 1)} \\
p \ge 1,\ g = 2 &: \left(\frac{n_1 + n_2 - p - 1}{p}\right)\frac{1 - \Lambda^*}{\Lambda^*} \overset{H_0}{\sim} F_{p,\ n_1 + n_2 - p - 1} \\
p \ge 1,\ g = 3 &: \left(\frac{\sum_{i=1}^{3} n_i - p - 2}{p}\right)\frac{1 - \sqrt{\Lambda^*}}{\sqrt{\Lambda^*}} \overset{H_0}{\sim} F_{2p,\ 2(\sum_{i=1}^{3} n_i - p - 2)} \\
\text{large sample sizes} &: -\left(\sum_{i=1}^{g} n_i - 1 - \frac{p + g}{2}\right)\ln\Lambda^* \overset{H_0}{\sim} \chi^2_{p(g-1)}
\end{aligned}$$
(The last line is Bartlett's correction to the large-sample likelihood-ratio $\chi^2$ approximation; the degrees of freedom $p(g-1)$ count the mean parameters constrained under $H_0$.)