Estimating Statistics
1. Estimator
1.1 Notations
Let $n \in \mathbb{N}^*$
Let $D(s)$ be some distribution depending on $s$
Let $X_1, \dots, X_n \sim D(s)$ be random variables
Let $A$ be a vector space of real continuous random variables. Where products of random variables are needed, $A$ is additionally assumed to be an associative algebra
1.2 Definition
An estimator of $s$, denoted $\hat{s}$ (when there is no confusion), is a function $A^n \to A$
Informally, this function estimates s from the observed data.
The estimator is said to be unbiased if:
$$\forall X_1, \dots, X_n \in A, \quad E[\hat{s}(X_1, \dots, X_n)] = s$$
2. Estimating mean
The trivial mean estimator is:
$$\hat{\mu}(X_1, \dots, X_n) = \frac{1}{n} \sum_{i=1}^{n} X_i$$
We have:
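$$E[\hat{\mu}(X_1, \dots, X_n)] = E\left[\frac{1}{n} \sum_{i=1}^{n} X_i\right] = \frac{1}{n} \sum_{i=1}^{n} E[X_i] = \mu$$
where $\mu = E[X_i]$ is the common mean of the $X_i$. So $\hat{\mu}$ is unbiased. Furthermore: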
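$$V[\hat{\mu}(X_1, \dots, X_n)] = \mathrm{Cov}\left[\frac{1}{n} \sum_{i=1}^{n} X_i, \frac{1}{n} \sum_{i=1}^{n} X_i\right] = \frac{1}{n^2} \sum_{1 \le i, j \le n} \mathrm{Cov}[X_i, X_j]$$
Assuming that $X_1, \dots, X_n$ are independent, we have: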
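$$V[\hat{\mu}(X_1, \dots, X_n)] = \frac{1}{n^2} \sum_{i=1}^{n} V[X_i] = \frac{\sigma^2}{n}$$
where $\sigma^2 = V[X_i]$ is the common variance of the $X_i$.
3. Estimating variance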
Assumption: $X_1, \dots, X_n$ are independent.
3.1 Trivial estimator $\widehat{\sigma^2}$
$$\widehat{\sigma^2}(X_1, \dots, X_n) = \frac{1}{n} \sum_{i=1}^{n} \left(X_i - \hat{\mu}(X_1, \dots, X_n)\right)^2$$
The mean of this estimator is:
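Writing $\hat{\mu}$ for $\hat{\mu}(X_1, \dots, X_n)$:
$$\begin{aligned}
E[\widehat{\sigma^2}(X_1, \dots, X_n)] &= E\left[\frac{1}{n} \sum_{i=1}^{n} (X_i - \hat{\mu})^2\right] \\
&= \frac{1}{n} \sum_{i=1}^{n} E\left[(X_i - \hat{\mu})^2\right] \\
&= \frac{1}{n} \sum_{i=1}^{n} \left(E[X_i^2] - 2E[X_i \hat{\mu}] + E[\hat{\mu}^2]\right) \\
&= \frac{1}{n} \sum_{i=1}^{n} \left(V[X_i] + E[X_i]^2 - 2\,\mathrm{Cov}[X_i, \hat{\mu}] - 2E[X_i]E[\hat{\mu}] + V[\hat{\mu}] + E[\hat{\mu}]^2\right) \\
&= \frac{1}{n} \sum_{i=1}^{n} \left(\sigma^2 + \mu^2 - \frac{2\sigma^2}{n} - 2\mu^2 + \frac{\sigma^2}{n} + \mu^2\right) \\
&= \frac{1}{n} \sum_{i=1}^{n} \frac{n-1}{n} \sigma^2 \\
&= \frac{n-1}{n} \sigma^2
\end{aligned}$$
The fifth line uses $\mathrm{Cov}[X_i, \hat{\mu}] = \frac{1}{n} \sum_{j=1}^{n} \mathrm{Cov}[X_i, X_j] = \frac{\sigma^2}{n}$, which follows from independence, together with $V[\hat{\mu}] = \frac{\sigma^2}{n}$.
Thus this estimator is biased.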
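The bias factor $\frac{n-1}{n}$ can be checked exactly on a toy case. The sketch below is an illustration only: it assumes a fair coin on $\{0, 1\}$ (so $\sigma^2 = 1/4$) instead of a continuous distribution, which is acceptable here because the derivation only uses the mean, the variance and independence. The function names are hypothetical. It enumerates every possible sample of size $n$ and averages the trivial estimator with exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product


def biased_variance(xs):
    """Trivial estimator: mean squared deviation from the sample mean."""
    n = len(xs)
    mean = sum(xs, Fraction(0)) / n
    return sum((x - mean) ** 2 for x in xs) / n


def expected_biased_variance(values, n):
    """Exact expectation over all n-samples drawn i.i.d. uniformly from `values`."""
    samples = list(product(values, repeat=n))
    return sum(biased_variance(s) for s in samples) / len(samples)


# Assumed toy distribution: fair coin on {0, 1}, so mu = 1/2 and sigma^2 = 1/4.
values = [Fraction(0), Fraction(1)]
sigma2 = Fraction(1, 4)
n = 3

exact = expected_biased_variance(values, n)
predicted = Fraction(n - 1, n) * sigma2
print(exact, predicted)  # both values are the same fraction
```

Because the arithmetic is exact, the two printed fractions agree exactly rather than approximately.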
3.2 Bessel's Correction: $\hat{\sigma}^{*2}$
This is an unbiased estimator of the variance:
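$$\hat{\sigma}^{*2}(X_1, \dots, X_n) = \frac{n}{n-1} \widehat{\sigma^2}(X_1, \dots, X_n) = \frac{1}{n-1} \sum_{i=1}^{n} \left(X_i - \hat{\mu}(X_1, \dots, X_n)\right)^2$$
3.3 God's Estimator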
This is an estimator that relies on prior knowledge of the true mean $\mu$:
$$\widehat{\sigma^2_G}(X_1, \dots, X_n) = \frac{1}{n} \sum_{i=1}^{n} (X_i - \mu)^2$$
The expected value of this estimator is:
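$$\begin{aligned}
E[\widehat{\sigma^2_G}(X_1, \dots, X_n)] &= E\left[\frac{1}{n} \sum_{i=1}^{n} (X_i - \mu)^2\right] \\
&= \frac{1}{n} \sum_{i=1}^{n} \left(E[X_i^2] - 2E[\mu X_i] + E[\mu^2]\right) \\
&= \frac{1}{n} \sum_{i=1}^{n} \left(V[X_i] + E[X_i]^2 - 2\mu E[X_i] + \mu^2\right) \\
&= \frac{1}{n} \sum_{i=1}^{n} \sigma^2 \\
&= \sigma^2
\end{aligned}$$
Thus this estimator is unbiased.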
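The three variance estimators of this section can be compared empirically. The following is a minimal Monte Carlo sketch, not a proof: the Gaussian distribution, the parameter values (`mu=2.0`, `sigma=3.0`, `n=5`), the seed and the function names are all assumptions chosen for illustration. On average, the trivial estimator should land near $\frac{n-1}{n}\sigma^2 = 7.2$, while Bessel's correction and the known-mean estimator should both land near $\sigma^2 = 9$.

```python
import random


def var_trivial(xs):
    """Trivial estimator: (1/n) * sum of squared deviations from the sample mean."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def var_bessel(xs):
    """Bessel-corrected estimator: n / (n - 1) times the trivial estimator."""
    n = len(xs)
    return n / (n - 1) * var_trivial(xs)


def var_known_mean(xs, mu):
    """Estimator using the true mean mu instead of the sample mean."""
    return sum((x - mu) ** 2 for x in xs) / len(xs)


def average_estimates(mu=2.0, sigma=3.0, n=5, trials=40_000, seed=0):
    """Average each estimator over many independent Gaussian samples of size n."""
    rng = random.Random(seed)
    totals = [0.0, 0.0, 0.0]
    for _ in range(trials):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        totals[0] += var_trivial(xs)
        totals[1] += var_bessel(xs)
        totals[2] += var_known_mean(xs, mu)
    return [t / trials for t in totals]


trivial, bessel, known = average_estimates()
print(f"trivial: {trivial:.2f}  bessel: {bessel:.2f}  known mean: {known:.2f}")
```

The printed averages fluctuate with the seed and the number of trials, so only their approximate values are meaningful.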
4. Estimating Covariance
Let $Z_1, \dots, Z_n$ be $n$ independent and identically distributed continuous real random vectors with mean $\mu$ and covariance matrix $C$
4.1 Naive Estimator
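$$\widehat{\mathrm{Cov}}(Z_1, \dots, Z_n) = \frac{1}{n} \sum_{i=1}^{n} \left(Z_i - \hat{\mu}(Z_1, \dots, Z_n)\right)\left(Z_i - \hat{\mu}(Z_1, \dots, Z_n)\right)^T$$
The expected value of this estimator is: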
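Writing $\hat{\mu}$ for $\hat{\mu}(Z_1, \dots, Z_n)$:
$$\begin{aligned}
E[\widehat{\mathrm{Cov}}(Z_1, \dots, Z_n)] &= \frac{1}{n} \sum_{i=1}^{n} E\left[(Z_i - \hat{\mu})(Z_i - \hat{\mu})^T\right] \\
&= \frac{1}{n} \sum_{i=1}^{n} \left(E[Z_i Z_i^T] - E[Z_i \hat{\mu}^T] - E[\hat{\mu} Z_i^T] + E[\hat{\mu} \hat{\mu}^T]\right) \\
&= \frac{1}{n} \left(\sum_{i=1}^{n} E[Z_i Z_i^T]\right) - E[\hat{\mu} \hat{\mu}^T] \\
&= \frac{1}{n} \left(\sum_{i=1}^{n} \mathrm{Cov}[Z_i, Z_i] + E[Z_i]E[Z_i]^T\right) - E\left[\frac{1}{n^2} \sum_{1 \le i, j \le n} Z_i Z_j^T\right] \\
&= \frac{1}{n} \left(\sum_{i=1}^{n} C + \mu\mu^T\right) - \frac{1}{n^2} \sum_{1 \le i, j \le n} E[Z_i Z_j^T] \\
&= \frac{1}{n} \left(\sum_{i=1}^{n} C + \mu\mu^T\right) - \frac{1}{n^2} \sum_{1 \le i \le n} E[Z_i Z_i^T] - \frac{1}{n^2} \sum_{1 \le i \ne j \le n} E[Z_i]E[Z_j]^T \\
&= \frac{1}{n} \sum_{i=1}^{n} \left(C + \mu\mu^T\right) - \frac{1}{n^2} \sum_{1 \le i \le n} \left(C + \mu\mu^T\right) - \frac{1}{n^2} \sum_{1 \le i \ne j \le n} \mu\mu^T \\
&= C + \mu\mu^T - \frac{C + \mu\mu^T}{n} - \frac{n-1}{n} \mu\mu^T \\
&= \frac{n-1}{n} C
\end{aligned}$$
The third line uses $\frac{1}{n} \sum_{i=1}^{n} Z_i = \hat{\mu}$, so each of the two cross terms sums to $E[\hat{\mu}\hat{\mu}^T]$. The sixth line uses independence: $E[Z_i Z_j^T] = E[Z_i]E[Z_j]^T$ for $i \ne j$.
Thus this estimator is biased.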
4.2 Bessel's Correction
This is the same correction as the sample variance's correction:
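$$\widehat{\mathrm{Cov}}^*(Z_1, \dots, Z_n) = \frac{n}{n-1} \widehat{\mathrm{Cov}}(Z_1, \dots, Z_n) = \frac{1}{n-1} \sum_{i=1}^{n} \left(Z_i - \hat{\mu}(Z_1, \dots, Z_n)\right)\left(Z_i - \hat{\mu}(Z_1, \dots, Z_n)\right)^T$$
This estimator is an unbiased estimator of the covariance matrix.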
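Both covariance results can be verified exactly on a small case. The sketch below is illustrative and rests on an assumption: the vectors are drawn uniformly from three hypothetical points of the plane rather than from a continuous distribution, which does not affect the identities since they only involve first and second moments. All helper names are made up for this example. Every sample of size $n = 3$ is enumerated and the estimators are averaged with exact fractions.

```python
from fractions import Fraction
from itertools import product


def mean_vector(zs):
    """Component-wise sample mean of a sequence of vectors."""
    n, d = len(zs), len(zs[0])
    return [sum(z[k] for z in zs) / n for k in range(d)]


def scatter(zs, center):
    """Sum of the outer products (z - center)(z - center)^T, as a nested list."""
    d = len(center)
    total = [[Fraction(0)] * d for _ in range(d)]
    for z in zs:
        dev = [z[k] - center[k] for k in range(d)]
        for a in range(d):
            for b in range(d):
                total[a][b] += dev[a] * dev[b]
    return total


def scale(matrix, factor):
    """Multiply every entry of a nested-list matrix by factor."""
    return [[factor * entry for entry in row] for row in matrix]


def cov_naive(zs):
    """Naive estimator: scatter around the sample mean divided by n."""
    return scale(scatter(zs, mean_vector(zs)), Fraction(1, len(zs)))


def cov_bessel(zs):
    """Bessel-corrected estimator: scatter around the sample mean divided by n - 1."""
    return scale(scatter(zs, mean_vector(zs)), Fraction(1, len(zs) - 1))


def expectation(estimator, support, n):
    """Exact expectation of estimator over all n-samples drawn uniformly from support."""
    samples = list(product(support, repeat=n))
    d = len(support[0])
    total = [[Fraction(0)] * d for _ in range(d)]
    for s in samples:
        est = estimator(s)
        for a in range(d):
            for b in range(d):
                total[a][b] += est[a][b]
    return scale(total, Fraction(1, len(samples)))


# Assumed toy distribution: uniform over three points of the plane.
support = [(Fraction(0), Fraction(0)), (Fraction(1), Fraction(0)), (Fraction(1), Fraction(1))]
# Applied to the whole support, the naive formula gives the true covariance matrix C.
C = cov_naive(support)
n = 3

naive_matches = expectation(cov_naive, support, n) == scale(C, Fraction(n - 1, n))
bessel_matches = expectation(cov_bessel, support, n) == C
print(naive_matches, bessel_matches)
```

Both comparisons are exact equalities between matrices of fractions, so no tolerance is involved.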
4.3 God's Estimator
This is an estimator that relies on prior knowledge of the true mean $\mu$:
$$\widehat{\mathrm{Cov}}_G(Z_1, \dots, Z_n) = \frac{1}{n} \sum_{i=1}^{n} (Z_i - \mu)(Z_i - \mu)^T$$
Its expected value is:
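$$\begin{aligned}
E[\widehat{\mathrm{Cov}}_G(Z_1, \dots, Z_n)] &= \frac{1}{n} \sum_{i=1}^{n} E\left[(Z_i - \mu)(Z_i - \mu)^T\right] \\
&= \frac{1}{n} \sum_{i=1}^{n} \left(E[Z_i Z_i^T] - E[Z_i \mu^T] - E[\mu Z_i^T] + E[\mu\mu^T]\right) \\
&= \frac{1}{n} \sum_{i=1}^{n} \left(C + \mu\mu^T - E[Z_i]\mu^T - \mu E[Z_i]^T + \mu\mu^T\right) \\
&= \frac{1}{n} \sum_{i=1}^{n} \left(C + \mu\mu^T - \mu\mu^T - \mu\mu^T + \mu\mu^T\right) \\
&= \frac{1}{n} \sum_{i=1}^{n} C \\
&= C
\end{aligned}$$
Thus this estimator is unbiased.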
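As a numerical illustration of this last result, the sketch below averages the known-mean covariance estimator over many simulated samples. Everything specific in it is an assumption made for the example: the 2-D Gaussian model $Z = \mu + Lg$ with a hypothetical matrix $L = \begin{pmatrix} 1 & 0 \\ 0.5 & 2 \end{pmatrix}$ (so $C = LL^T$), the mean `(1.0, -2.0)`, the sample size, the seed and the function names.

```python
import random


def cov_known_mean(zs, mu):
    """(1/n) * sum of (z - mu)(z - mu)^T for 2-D points, as a 2x2 nested list."""
    n = len(zs)
    total = [[0.0, 0.0], [0.0, 0.0]]
    for z in zs:
        dev = (z[0] - mu[0], z[1] - mu[1])
        for a in range(2):
            for b in range(2):
                total[a][b] += dev[a] * dev[b] / n
    return total


def draw(rng, mu):
    """One 2-D Gaussian vector Z = mu + L g, with the assumed L = [[1, 0], [0.5, 2]]."""
    g1, g2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return (mu[0] + g1, mu[1] + 0.5 * g1 + 2.0 * g2)


def average_estimate(mu=(1.0, -2.0), n=4, trials=30_000, seed=0):
    """Average the known-mean estimator over many independent samples of size n."""
    rng = random.Random(seed)
    avg = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(trials):
        est = cov_known_mean([draw(rng, mu) for _ in range(n)], mu)
        for a in range(2):
            for b in range(2):
                avg[a][b] += est[a][b] / trials
    return avg


C = [[1.0, 0.5], [0.5, 4.25]]  # L L^T for the L used in draw()
avg = average_estimate()
for row_avg, row_true in zip(avg, C):
    print([round(v, 2) for v in row_avg], row_true)
```

The averaged matrix should be close to $C$ entry by entry, with small deviations that depend on the seed and the number of trials.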