Fundamentals of statistical inference
Basic concepts
- underlying probability space: `(Omega, ccF, P)`
- sampling probability space: `(bbX, ccx, ccP^x)`
- statistics: `(RR^k, B(RR^k))`
Parametric vs non-parametric probability models:
- if `ccP^x = {P_theta, theta in Theta sub RR^d, d in NN^+}` then `ccP^x` is parametric: it is indexed by a finite-dimensional parameter, and `Theta` is called the parameter space
- `ccP^x` is non-parametric if it can't be indexed by a finite-dimensional parameter
D5.1.1 (Exponential family). A parametric family of pms `{P_theta}_(theta in Theta)` dominated by a `sigma`-finite measure `nu` on `(bbX, ccx)` is called an exponential family iff `(dP_theta)/(dnu)(omega) = exp[ (eta(theta))^T T(omega) - zeta(theta)] h(omega)`, `omega in bbX`, where:
- `T: bbX -> RR^p` is a random vector, where `p in NN^+`
- `eta: Theta -> RR^p`
- `h: (bbX, ccx) -> RR` is non-negative and measurable
- `zeta: Theta -> RR` is the log-normalising constant that makes the RHS a probability density, i.e. `zeta(theta) = log int exp[(eta(theta))^T T(omega)] h(omega) dnu(omega)`
Remarks:
- often `nu` is counting or Lebesgue measure
- the representation is not unique: for any invertible `p xx p` matrix `D`, the pair `tilde eta(theta) = D eta(theta)`, `tilde T = (D^T)^(-1) T` gives another representation
- natural exponential family: taking `eta = eta(theta)` as the parameter (the natural parameter) and `Xi = {eta(theta) | theta in Theta}` as the natural parameter space, the density becomes `f_eta(x) = (dP_eta)/(dnu)(x) = exp{eta^T T(x) - xi(eta)} h(x)`. If `Xi` contains an open set, the family is said to be of full rank
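As an illustration of the natural form, the following sketch writes the `N(mu, sigma^2)` density as `exp{eta^T T(x) - xi(eta)} h(x)` wrt Lebesgue measure and compares it with the usual formula. The choices `T(x) = (x, x^2)`, `h = 1` and all function names are assumptions made for this example, not part of the notes.

```python
import math
from statistics import NormalDist

def natural_params(mu, sigma):
    """Map (mu, sigma) to the natural parameter eta = (mu/sigma^2, -1/(2 sigma^2))."""
    return (mu / sigma**2, -1.0 / (2.0 * sigma**2))

def log_partition(eta):
    """xi(eta) for the normal family with T(x) = (x, x^2) and h(x) = 1."""
    eta1, eta2 = eta
    return -eta1**2 / (4.0 * eta2) + 0.5 * math.log(-math.pi / eta2)

def density_natural(x, eta):
    """exp{eta^T T(x) - xi(eta)} h(x) with T(x) = (x, x^2), h(x) = 1."""
    eta1, eta2 = eta
    return math.exp(eta1 * x + eta2 * x**2 - log_partition(eta))

mu, sigma = 1.5, 2.0
eta = natural_params(mu, sigma)
for x in (-3.0, 0.0, 1.5, 4.2):
    # Both columns should agree up to floating-point error.
    print(x, density_natural(x, eta), NormalDist(mu, sigma).pdf(x))
```

In this example `Xi = RR xx (-oo, 0)` is itself open, so the family is of full rank.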
D5.1.2 (Location-scale family). Let `P` be a known pm on `(RR^k, B(RR^k))`, `nu sub RR^k` (here a set of locations, not the dominating measure) and `M_k = {k xx k " positive definite matrices"}`. The family of pms `{P_((mu, Sigma)): mu in nu, Sigma in M_k}`, where `P_((mu, Sigma))(A) = P(Sigma^(-1/2)(A - mu))` for `A in B(RR^k)`, is called a location-scale family on `(RR^k, B(RR^k))`
- `{ P_((mu, I_k)), mu in nu}` is called a location family
- `{ P_((0, Sigma)), Sigma in M_k}` is called a scale family
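For `k = 1` a location-scale family member assigns to a set the base probability of the standardised set, so its CDF is the base CDF evaluated at `(x - mu)/sigma`. A minimal sketch, assuming the standard logistic distribution as the known base `P` (an arbitrary choice for illustration; the function names are hypothetical):

```python
import math

def base_cdf(z):
    """CDF of the base measure P: standard logistic (assumed for illustration)."""
    return 1.0 / (1.0 + math.exp(-z))

def location_scale_cdf(x, mu, sigma):
    """CDF of the family member with location mu and scale sigma > 0."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return base_cdf((x - mu) / sigma)

def transform(z, mu, sigma):
    """If Z has distribution P, then mu + sigma * Z has the shifted and scaled law."""
    return mu + sigma * z

mu, sigma = 2.0, 3.0
# The base median 0 is mapped to mu, which is the median of the new law.
print(location_scale_cdf(transform(0.0, mu, sigma), mu, sigma))
```

Fixing `sigma = 1` and varying `mu` gives the location family; fixing `mu = 0` and varying `sigma` gives the scale family.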
Sufficiency and completeness
D5.2.1 Given a random observable `X`, a measurable function `T: bbX -> RR^d, d in NN^+` is called a statistic if `T(X)` is known whenever `X` is. In particular `sigma(T(X)) sub sigma(X)`
D5.2.2 Let `(bbbX, sfX, sfP)` be an ops of `X`, and let `sfG sub sfX` be a sub-`sigma`-field. `sfG` is sufficient for `sfP` if, for all `A in sfX`, `P(A|sfG) = E_P(I_A | sfG)` does not depend on `P in sfP`. That is, the conditional probability of `A` given `sfG` is the same for all `P in sfP`.
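To make the definition concrete, the sketch below takes `X = (X_1, ..., X_n)` iid Bernoulli(`theta`) and `sfG = sigma(T)` with `T(x) = sum_i x_i` (a standard textbook example, assumed here rather than taken from the notes). It computes `P_theta(X = x | T = t)` by enumeration with exact rationals and confirms that the value is `1/C(n,t)`, with `C(n,t)` the binomial coefficient, whatever `theta` is.

```python
from fractions import Fraction
from itertools import product
from math import comb

def joint_pmf(x, theta):
    """P_theta(X = x) for iid Bernoulli(theta) coordinates."""
    t = sum(x)
    return theta**t * (1 - theta)**(len(x) - t)

def conditional_given_sum(x, theta):
    """P_theta(X = x | T = sum(x)), computed by brute-force enumeration."""
    n, t = len(x), sum(x)
    total = sum(joint_pmf(y, theta)
                for y in product((0, 1), repeat=n) if sum(y) == t)
    return joint_pmf(x, theta) / total

x = (1, 0, 1, 0)
# Collect the conditional probability over a grid of theta values in (0, 1).
values = {conditional_given_sum(x, Fraction(k, 10)) for k in range(1, 10)}
# A single value remains: the conditional law is free of theta.
assert values == {Fraction(1, comb(len(x), sum(x)))}
```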
P5.2.1 `sfG` is sufficient for `sfP` iff for any bounded `sfX`-measurable function `f: bbbX -> RR` there exists a `sfG`-measurable function `g` st `g = E_P(f | sfG) quad AA P in sfP`
L5.2.2 Suppose the ops is dominated by a `sigma`-finite measure `lambda`. Then there exists a countable subset `sfP_0 sub sfP` st `sfP << sfP_0`
C5.2.3 If a family `sfP` of pms on `(bbbX, sfX)` is dominated by a `sigma`-finite measure `lambda` then `sfP` is dominated by a pm `Q = sum_i c_i P_i quad c_i >0, sum_i c_i = 1, P_i in sfP`
T5.2.4 (Halmos-Savage) Let `(bbbX, sfX, sfP)` be a dominated ops and `sfB` be a sub-`sigma`-field.
- if there exists a pm `mu` st `sfP << mu` and `dP/dmu` is `sfB`-measurable `AA P in sfP`, then `sfB` is sufficient for `sfP`
- conversely, if `sfB` is sufficient then there exists a pm `mu` st `sfP ~ mu` and `dP/dmu` is `sfB`-measurable `AA P in sfP`
T5.2.5 (Factorisation theorem). Suppose `(bbbX, sfX, sfP)` is an ops with `sfP << lambda`, `lambda` `sigma`-finite. Then `T(X)` is sufficient for `sfP` iff there exist a non-negative measurable function `h: bbbX -> RR` that does not depend on `P in sfP` and, for each `P in sfP`, a non-negative function `g_P` with `g_P(T(*))` measurable wrt `sigma(T(X))`, st `dP/dlambda (x) = g_P(T(x))h(x) quad AA x in bbbX`
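A minimal sketch of the factorisation for an iid Poisson(`theta`) sample of size `n` (an assumed example; the dominating measure is counting measure). The joint pmf splits as `g_theta(T(x)) h(x)` with `T(x) = sum_i x_i`, `g_theta(t) = e^(-n theta) theta^t` and `h(x) = 1 / prod_i x_i!`. The names `g` and `h` follow the theorem; everything else is hypothetical.

```python
import math

def joint_pmf(x, theta):
    """Joint pmf of an iid Poisson(theta) sample x, coordinate by coordinate."""
    return math.prod(math.exp(-theta) * theta**xi / math.factorial(xi) for xi in x)

def g(t, n, theta):
    """Parameter-dependent factor; depends on the data only through t = T(x)."""
    return math.exp(-n * theta) * theta**t

def h(x):
    """Parameter-free factor."""
    return 1.0 / math.prod(math.factorial(xi) for xi in x)

x = (2, 0, 3, 1)
for theta in (0.5, 1.0, 2.7):
    # The two sides of the factorisation agree up to rounding.
    assert math.isclose(joint_pmf(x, theta), g(sum(x), len(x), theta) * h(x),
                        rel_tol=1e-9)
```

A consequence worth checking by hand: for two samples with the same value of `T`, the ratio of joint pmfs equals `h(x)/h(y)` and so does not involve `theta`.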
Sufficiency is very much determined by the structure of `sfP`. If `sfP` is not a proper set of models the discussion of sufficiency is quite hypothetical.
Minimal sufficient
D5.2.3 Let `(bbbX, sfX, sfP)` be an ops and `sfC sub sfX` be a sub-`sigma`-field. `sfC` is necessary for `sfP` if for every sufficient `sigma`-field `sfB` for `sfP` and any `C in sfC` there exists `B in sfB` st `P(B o+ C) = 0 quad AA P in sfP` (`o+` denotes symmetric difference).
- the trivial `sigma`-field `sfC = {O/, bbbX}` is necessary
- If `sfC` is necessary and `sfB` is sufficient then `sfC sub sfB` up to `P`-null sets, in the sense that for each `C in sfC` there is a `B in sfB` with `P(C) = P(C nn B) = P(B) quad AA P in sfP`
D5.2.4 A statistic `T:(bbbX, sfX) -> (RR^d,B(RR^d))` is necessary for `sfP` if for any sufficient statistic `S: (bbbX, sfX) -> (RR^q, B(RR^q))` there exists a measurable `H: RR^q -> RR^d` st `T = H(S) "wp1" P quad AA P in sfP`, where `q in NN^+, d in NN^+ uu {oo}`
A `sigma`-field is minimal sufficient if it is sufficient and necessary.
T5.2.6 Let `(bbbX, sfX, sfP)` be an ops, dominated by `sigma`-finite `lambda`. Then a minimal sufficient `sigma`-field for `sfP` exists.
T5.2.7 Let `(bbbX, sfX, sfP)` be an ops and `sfP_1 sub sfP` st `sfP << sfP_1`. If `T(X)` is sufficient for `sfP` and minimal sufficient for `sfP_1` then `T(X)` is minimal sufficient for `sfP`
T5.2.8 Let `(bbbX, sfX, sfP)` be an ops where `sfP = {P_0, ..., P_k}` is a finite set of pms with densities `{f_0, ..., f_k}` wrt a `sigma`-finite `lambda`. Then `T(x) = ((f_1(x))/(f_0(x)), ..., (f_k(x))/(f_0(x)))I_A(x)` is a minimal sufficient statistic for `sfP`, where `I_A` is the indicator of `A = {x: f_0(x) > 0}`, the set on which the ratios are defined
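As a hedged illustration of T5.2.8, take the finite family `{N(0,1), N(1,1), N(-2,1)}` for an iid sample of size `n` (an example chosen here, not from the notes). All densities are positive everywhere, so the indicator plays no role, and each ratio `f_i / f_0` equals `exp(m_i sum_j x_j - n m_i^2 / 2)`, a function of the sample through `sum_j x_j` only.

```python
import math
from statistics import NormalDist

MEANS = (0.0, 1.0, -2.0)  # means of P_0, P_1, P_2 in the assumed family N(m, 1)

def joint_density(x, mean):
    """Density of an iid N(mean, 1) sample at the point x."""
    dist = NormalDist(mean, 1.0)
    return math.prod(dist.pdf(xi) for xi in x)

def ratio_statistic(x):
    """T(x) = (f_1(x)/f_0(x), ..., f_k(x)/f_0(x)); f_0 > 0 everywhere here."""
    f0 = joint_density(x, MEANS[0])
    return tuple(joint_density(x, m) / f0 for m in MEANS[1:])

x = (0.3, -1.2, 2.0)
y = (1.1, 0.5, -0.5)  # a different sample with the same sum as x
same = all(math.isclose(a, b, rel_tol=1e-9)
           for a, b in zip(ratio_statistic(x), ratio_statistic(y)))
print(same)  # True when the statistic only sees the sample through its sum
```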
Completeness
D5.3.1 Let `(bbbX, sfX, sfP)` be an ops, and `sfB` be a sub-`sigma`-field of `sfX`.
- `sfB` is complete for the family `sfP` if for every `sfB`-measurable function `f` with `E_P |f| < oo quad AA P in sfP`, `int f dP = 0 quad AA P in sfP => f = 0 "wp1" P quad AA P in sfP`
- `sfB` is bounded complete for `sfP` if the above holds for all bounded `sfB`-measurable `f`.
- A statistic `T:bbbX -> RR^d` is (bounded) complete if `sigma(T)` is (bounded) complete
- Let `(bbbY, sfY)` be a measurable space and `sfP^y` be a set of pms on `(bbbY, sfY)`. `sfP^y` is complete if for every `sfY`-measurable real-valued function `f` with `int |f| dP < oo quad AA P in sfP^y`, `int f dP = 0 quad AA P in sfP^y => f = 0 "wp1" P quad AA P in sfP^y`
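Completeness can be checked by hand for the Binomial(`n`, `theta`) family (an assumed example): `E_theta f(T) = 0` for all `theta` is a linear system in the `n+1` unknowns `f(0), ..., f(n)`. The sketch below evaluates the pmf at `n+1` distinct values of `theta` with exact rational arithmetic and checks that the resulting matrix has full rank, so `f = 0` is the only solution. The helper `rank` is a hypothetical utility written for this example.

```python
from fractions import Fraction
from math import comb

def binom_pmf(n, t, theta):
    """P_theta(T = t) for T ~ Binomial(n, theta)."""
    return comb(n, t) * theta**t * (1 - theta)**(n - t)

def rank(matrix):
    """Rank of a matrix of Fractions via exact Gaussian elimination."""
    rows = [row[:] for row in matrix]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

n = 4
thetas = [Fraction(k, n + 2) for k in range(1, n + 2)]  # n+1 distinct points in (0, 1)
# Row i holds the pmf under thetas[i]; M f = 0 encodes E_theta f(T) = 0 on the grid.
M = [[binom_pmf(n, t, th) for t in range(n + 1)] for th in thetas]
full_rank = rank(M) == n + 1
print(full_rank)
```

The grid of `theta` values is only a finite subset of the parameter space, which is enough here because the constraint already forces `f = 0`.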
P5.3.1 Let `Y: bbbX -> bbbY` be measurable, st `sigma(Y)` is complete for `sfP`. Let `sfP^y = {P @ Y^(-1): P in sfP}`; then `sfP^y` is complete.
L5.3.2 Let `sfP = {P_eta: eta in Xi}` be a natural exponential family with density `(dP_eta)/(dnu)(x) = exp(eta^T T(x) - xi(eta))h(x) quad x in bbbX`. Suppose `T(X) = (Y(X), U(X))` and `eta = (theta, phi)` where `Y` and `theta` have the same dimension. Then `Y` has density `f_eta(y) = exp(theta^T y - xi(eta))` wrt a `sigma`-finite measure `lambda` depending on `phi`
If `T=Y` then `T(X)` has a distribution in natural exponential family form.
Given an rv `X` in `RR^k`, its moment generating function (MGF) is defined as `psi_X(t) = E(e^(t^T X))`, `t in RR^k`. It has properties similar to those of the characteristic function, but `psi_X(t)` can take the value `oo`
L5.3.3 Let `X` and `Y` be rvs in `RR^k`. If `psi_X(t) = psi_Y(t) < oo` for all `|t| < delta` and some `delta > 0`, then `X` and `Y` have the same distribution.
T5.3.4 Let `sfP = {P_eta; eta in Xi}` be a natural exponential family of full rank with density `(dP_eta)/(dnu)(x) = exp(eta^T T(x) - xi(eta))h(x)`. Then `T(X)` is complete.
T5.3.5 Let `(bbbX, sfX, sfP)` be an ops. If `S sub sfX` is a bounded complete and sufficient `sigma`-field for `sfP` then `S` is minimal sufficient for `sfP`
D5.3.2 Let `(bbbX, sfX, sfP)` be an ops. A `sigma`-field `sfB sub sfX` is ancillary for `sfP` if for every `sfB`-measurable statistic `V(X)` the distribution of `V(X)` does not depend on `P in sfP`
T5.3.6 (Basu) If `sfB` is complete and sufficient and `sfC` is an ancillary `sigma`-field for `sfP`, then `sfB` and `sfC` are independent under every `P in sfP`.
Let `V(X)` and `T(X)` be two statistics of `X` on `(bbbX, sfX, sfP)`. If `V(X)` is ancillary and `T(X)` is complete and sufficient for `sfP` then `V(X)` and `T(X)` are independent under every `P in sfP`
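A small simulation can make this plausible, under the assumption of an iid `N(theta, 1)` sample (my choice of example): the sample mean is complete and sufficient, the sample variance is ancillary, so the two should be independent for every `theta`. The sketch only checks two necessary consequences: the mean of the sample variance does not move with `theta`, and the empirical correlation between the two statistics is near zero. It is not a proof of independence, and all names are hypothetical.

```python
import random

def sample_mean_and_variance(theta, n, rng):
    """Draw an iid N(theta, 1) sample of size n; return (mean, unbiased variance)."""
    xs = [rng.gauss(theta, 1.0) for _ in range(n)]
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)
    return mean, var

def correlation(pairs):
    """Pearson correlation of a list of (a, b) pairs."""
    m = len(pairs)
    ma = sum(a for a, _ in pairs) / m
    mb = sum(b for _, b in pairs) / m
    cov = sum((a - ma) * (b - mb) for a, b in pairs)
    va = sum((a - ma) ** 2 for a, _ in pairs)
    vb = sum((b - mb) ** 2 for _, b in pairs)
    return cov / (va * vb) ** 0.5

rng = random.Random(0)  # fixed seed so the run is reproducible
summary = {}
for theta in (0.0, 3.0):
    pairs = [sample_mean_and_variance(theta, 5, rng) for _ in range(20000)]
    mean_v = sum(v for _, v in pairs) / len(pairs)
    summary[theta] = (mean_v, correlation(pairs))
    print(theta, round(mean_v, 2), round(summary[theta][1], 2))
```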