Convergence in distribution

Definitions and basic properties

Let `{X_n}_(n>=0)` be a collection of rvs, and let `F_n` denote the cdf of `X_n`. Then `{X_n}_(n>=1)` is said to converge in distribution (or weakly) to `X_0`, written `X_n ->^d X_0`, if:

`lim_(n->oo) F_n(x) = F_0(x) quad AA x in C(F_0)` where `C(F_0) = {x in RR: F_0 "continuous at" x}`, or equivalently
`mu_n(a, b] -> mu_0(a, b] quad AA a, b in C(F_0)`, where `mu_n` is the distribution of `X_n`.

Convergence in distribution does not require that the random variables be defined on a common probability space (PS).
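Why the limit is only required on `C(F_0)`: a minimal sketch, assuming the degenerate rvs `X_n = 1/n` and `X_0 = 0` (an illustrative choice, not taken from these notes). The cdfs converge at every continuity point of `F_0` but not at its jump `x = 0`.

```python
def F_n(x: float, n: int) -> float:
    """cdf of the degenerate rv X_n = 1/n (assumed example)."""
    return 1.0 if x >= 1.0 / n else 0.0


def F_0(x: float) -> float:
    """cdf of the degenerate rv X_0 = 0; its only discontinuity is x = 0."""
    return 1.0 if x >= 0.0 else 0.0


ns = [1, 10, 100, 1000, 10**6]

# At continuity points of F_0 the values F_n(x) eventually equal F_0(x).
at_half = [F_n(0.5, n) for n in ns]        # tends to F_0(0.5) = 1
at_minus_half = [F_n(-0.5, n) for n in ns]  # tends to F_0(-0.5) = 0

# At the discontinuity x = 0: F_n(0) = 0 for every n, although F_0(0) = 1.
at_zero = [F_n(0.0, n) for n in ns]
```

So `X_n ->^d X_0` holds here even though `F_n(0)` does not converge to `F_0(0)`.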

Prop: If `X_n ->^p X_0` then `X_n ->^d X_0`. Converse false in general, but if `X_n ->^d X_0` and `P(X_0 = c) = 1, c in RR`, then `X_n ->^p c`

Prop: If a cdf `F` is continuous on `RR` then it is uniformly continuous on `RR`.

Th: `{X_n}_(n>=0)` a collection of rv, with cdfs `{F_n}_(n>=0)`.
Then `X_n ->^d X_0` iff there exists a dense set `D sub RR` st `lim_(n->oo) F_n(x) = F_0(x) quad AA x in D`.

Polya's th: `{X_n}_(n>=0)` a collection of rvs, with cdfs `{F_n}_(n>=0)`. If `X_n ->^d X_0` and `F_0` is continuous on `RR` then `spr_(x in RR) |F_n(x) - F_0(x)| -> 0` as `n->oo`.
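A numerical sketch of the uniform convergence in Polya's theorem, assuming `F_n` is the `N(1/n, (1 + 1/n)^2)` cdf and `F_0` the `N(0,1)` cdf (an illustrative family). The supremum is approximated on a finite grid, so this is a check, not a proof.

```python
import math


def norm_cdf(x: float, mean: float = 0.0, sd: float = 1.0) -> float:
    """Normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def sup_distance(n: int, grid: list) -> float:
    """Grid approximation to sup_x |F_n(x) - F_0(x)|."""
    return max(
        abs(norm_cdf(x, 1.0 / n, 1.0 + 1.0 / n) - norm_cdf(x)) for x in grid
    )


grid = [i / 100.0 for i in range(-800, 801)]  # x in [-8, 8], step 0.01
distances = [sup_distance(n, grid) for n in (1, 10, 100, 1000)]
```

The entries of `distances` shrink towards 0 as `n` grows, as the theorem predicts for a continuous limit cdf.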

Slutsky's th: `{X_n}_(n>=1), {Y_n}_(n>=1)` sequences of rvs, st `(X_n, Y_n)` is defined on a PS `(Omega_n, ccF_n, P_n)`.
If `X_n ->^d X_0` and `Y_n ->^p a in RR` then

`X_n + Y_n ->^d X_0 + a`
`X_n Y_n ->^d aX_0`
`X_n/Y_n ->^d X_0 / a` provided `a != 0`.
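A simulation sketch of the ratio form of Slutsky's theorem, assuming iid `Exp(1)` data (mean 1, variance 1). Here `X_n = sqrt(n)(bar X - 1)` is approximately `N(0,1)` by the central limit theorem (not stated in these notes), and `Y_n` is the sample standard deviation, which converges in probability to 1. The sample sizes and the seed are arbitrary choices.

```python
import math
import random


def t_statistic(n: int, rng: random.Random) -> float:
    """X_n / Y_n with X_n = sqrt(n) * (mean - 1) and Y_n = sample sd."""
    xs = [rng.expovariate(1.0) for _ in range(n)]
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)
    return math.sqrt(n) * (mean - 1.0) / math.sqrt(var)


rng = random.Random(0)  # fixed seed so the run is reproducible
reps, n = 2000, 200
stats = [t_statistic(n, rng) for _ in range(reps)]

# Empirical P(X_n / Y_n <= 1) versus the N(0,1) value Phi(1).
empirical = sum(t <= 1.0 for t in stats) / reps
target = 0.5 * (1.0 + math.erf(1.0 / math.sqrt(2.0)))
```

For moderate `n` the two numbers should be close but not equal, since the exponential data are skewed.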

Asymptotic normality

A special case of `F_n -> F_0`. A seq of rvs `{X_n}_(n>=1)` is said to be asymptotically normal with asymptotic mean `mu_n` and asymptotic variance `sigma_n^2` if `sigma_n^2 > 0` for sufficiently large `n` (`EE n_0 > 0 "st" AA n >= n_0`) and `(X_n - mu_n)/(sigma_n) ->^d N(0,1) "as" n->oo`. Write `X_n` is `"AN"(mu_n, sigma_n^2)`.

`{mu_n}, {sigma_n^2}` are not necessarily the mean and variance of `X_n` (`X_n` might not have moments)
if `X_n` is `AN(mu_n, sigma_n^2)`, this does not by itself determine what (if anything) `X_n` converges to; eg. if `mu_n = mu = 0` and `sigma_n -> 0`, then `X_n ->^p 0`.
if `X_n` is `AN(mu_n, sigma_n^2)` then `X_n` is `AN(bar mu_n, bar sigma_n^2)` iff `(sigma_n)/(bar sigma_n) -> 1` and `(bar mu_n - mu_n)/(sigma_n) -> 0`
if `X_n` is `AN(mu_n, sigma_n^2)`, the sequences of asymptotic means and variances are not unique
if `X_n` is `AN(mu_n, sigma_n^2)` then `a_n X_n + b_n` is `AN(mu_n, sigma_n^2)` iff `a_n -> 1` and `(mu_n(a_n -1) + b_n)/(sigma_n) -> 0`
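A small arithmetic check of the non-uniqueness criterion above, assuming `X_n` is `AN(1, 1/n)`. Both alternative normalisations are made up for illustration: one satisfies the two conditions, the other shifts the mean by an amount of the same order as `sigma_n` and fails.

```python
import math


def good_alternative(n: int) -> tuple:
    """Return (sigma_n / bar sigma_n, (bar mu_n - mu_n) / sigma_n)
    for bar mu_n = 1 + 1/n and bar sigma_n^2 = 1/(n + 1)."""
    mu, sigma = 1.0, 1.0 / math.sqrt(n)
    mu_bar, sigma_bar = 1.0 + 1.0 / n, 1.0 / math.sqrt(n + 1)
    return sigma / sigma_bar, (mu_bar - mu) / sigma


def bad_alternative_shift(n: int) -> float:
    """(bar mu_n - mu_n) / sigma_n for bar mu_n = 1 + 1/sqrt(n)."""
    mu, sigma = 1.0, 1.0 / math.sqrt(n)
    mu_bar = 1.0 + 1.0 / math.sqrt(n)
    return (mu_bar - mu) / sigma


ratio, shift = good_alternative(10**6)      # ratio near 1, shift near 0
stuck_shift = bad_alternative_shift(10**6)  # stays at 1, does not vanish
```

So `AN(1 + 1/n, 1/(n+1))` is an equally valid description of the same sequence, while `AN(1 + 1/sqrt(n), 1/n)` is not.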

Vague convergence, Helly-Bray theorems and tightness

Bolzano-Weierstrass th: If `A sub [0,1]` is infinite, then `EE {x_n}_(n>=1) sub A` of distinct points st `lim_(n->oo) x_n = x` exists in `[0,1]` (but `x` is not necessarily in `A` unless `A` is closed). There is an analogue of this for sub-probability measures (ie. `mu(RR) <= 1`).

`{mu_n}_(n>=1), mu` sub-probability measures on `(RR, B(RR))`. `mu_n` converges vaguely to `mu`, written `mu_n ->^v mu`, if `EE D sub RR, D "dense"` st `mu_n(a,b] -> mu(a,b] qquad AA a,b in D`. If `{mu_n}` and `mu` are all probability measures, `->^d <=> ->^v`.

Helly's selection th: Let `A` be an infinite collection of sub-probability measures on `(RR, B(RR))`.
Then there exist a sequence `{mu_n}_(n>=1) sub A` and a sub-probability measure `mu` st `mu_n ->^v mu`.

Helly-Bray theorem for vague convergence: `{mu_n}_(n>=1), mu` sub-pms on `(RR, B(RR))`. Then `mu_n ->^v mu` iff `int f dmu_n -> int f dmu quad AA f in C_0(RR)`, where `C_0(RR) = {g | g: RR -> RR " is continuous and " lim_(|x| -> oo) g(x) = 0}`.

Helly-Bray theorem for weak convergence: `{mu_n}_(n>=1), mu` pms on `(RR, B(RR))`. Then `mu_n ->^d mu` iff `int f dmu_n -> int f dmu quad AA f in C_B(RR)`, where `C_B(RR) = {g | g: RR -> RR " is continuous and bounded"}`.
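The gap between `C_0(RR)` and `C_B(RR)` is where mass can escape to infinity. A minimal sketch, assuming `mu_n = delta_n` (point mass at `n`, an illustrative choice): the integrals against a `C_0` function vanish, so the vague limit is the zero measure, but the integral of the constant 1 stays at 1, so there is no weak limit.

```python
import math


def integral_against_delta(f, n: int) -> float:
    """int f d(delta_n) = f(n) for the point mass at n."""
    return f(float(n))


def f_vanishing(x: float) -> float:
    """A function in C_0(RR): continuous with limit 0 at +-infinity."""
    return math.exp(-x * x)


def f_constant(x: float) -> float:
    """A function in C_B(RR) that is not in C_0(RR)."""
    return 1.0


def mass_of_interval(n: int, a: float, b: float) -> float:
    """delta_n((a, b])."""
    return 1.0 if a < n <= b else 0.0


ns = (1, 5, 10, 50)
c0_integrals = [integral_against_delta(f_vanishing, n) for n in ns]
cb_integrals = [integral_against_delta(f_constant, n) for n in ns]
interval_masses = [mass_of_interval(n, -20.0, 20.0) for n in ns]
```

Every fixed interval `(a, b]` eventually has mass 0 under `delta_n`, matching the definition of vague convergence to the zero sub-probability measure.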

Tightness

A sequence `{mu_n}_(n>=1)` of pms on `(RR, B(RR))` is called tight if `AA epsi > 0 EE M_epsi in (0, oo) "st" spr_(n>=1) mu_n([-M_epsi, M_epsi]^c) < epsi`

A sequence of rvs `{X_n}_(n>=1)` is called tight or stochastically bdd if the sequence of probability dists is tight, ie `AA epsi > 0 EE M_epsi in (0, oo) "st" spr_(n>=1) P(|X_n| > M_epsi) < epsi`. Denoted `X_n = O_p(1)`.

`X_n ->^p 0` called stochastically small, `X_n = o_p(1)`
any finite collection of pms is tight
property of tightness analogous to notion of boundedness of sequence of real numbers
a tight sequence may not converge, but will have one or more weakly convergent subsequences
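A numerical sketch of the tightness definition, comparing two assumed families: `N(0, 1 + 1/n)` (tight) and `N(n, 1)` (mass drifts to infinity, not tight). The supremum is taken over `n <= 200` only, so it illustrates the definition rather than proving anything.

```python
import math


def norm_cdf(x: float, mean: float, sd: float) -> float:
    """Normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def tail_mass(mean: float, sd: float, M: float) -> float:
    """mu([-M, M]^c) for mu = N(mean, sd^2)."""
    return 1.0 - (norm_cdf(M, mean, sd) - norm_cdf(-M, mean, sd))


def tight_family(n: int) -> tuple:
    """(mean, sd) of N(0, 1 + 1/n)."""
    return 0.0, math.sqrt(1.0 + 1.0 / n)


def drifting_family(n: int) -> tuple:
    """(mean, sd) of N(n, 1)."""
    return float(n), 1.0


def worst_tail(family, M: float, n_max: int = 200) -> float:
    """max over 1 <= n <= n_max of mu_n([-M, M]^c)."""
    return max(tail_mass(*family(n), M) for n in range(1, n_max + 1))


tight_worst = worst_tail(tight_family, 5.0)        # small for a moderate M
drifting_worst = worst_tail(drifting_family, 5.0)  # close to 1 for any fixed M
```

For the drifting family no choice of `M` helps: enlarging `M` only postpones the `n` at which the tail mass is close to 1.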

In general, given a stochastic quantity `T_n`, the stochastic order of `T_n - E(T_n)` is determined by the order of its variance `sigma_n^2 = "Var"(T_n)`, if it exists: by Chebyshev's inequality, `T_n - E(T_n) = O_p(sigma_n)`.

T1.2.8 `{X_n}, {Y_n}` sequences of rv's. `X_n = O_p(1), Y_n = o_p(1)`.
Then:

`X_n + Y_n = O_p(1)`
`X_n Y_n = o_p(1)`

T1.2.9 Let `{mu_n}_(n>=1)` be pm's.
`{mu_n}` is tight iff it is relatively compact, ie for all subsequences `{mu_(n_i)}_(i>=1)` there exists a further subsequence `{mu_(m_i)}_(i>=1)` of `{mu_(n_i)}_(i>=1)` and pm `mu` on `(RR, B(RR))` st `mu_(m_i) ->^d mu`.

T1.2.10 `{mu_n}_(n>=1), mu`, pm's on `(RR, B(RR))`. Then `mu_n ->^d mu` iff `{mu_n}` is tight and all weakly convergent subsequences converge to `mu`.

Convergence of probability and sub-probability measures on general metric spaces

interior: `A^@ = {x in A | EE epsi >0 "st" B(x, epsi) sub A}`
closure: `barA = {x in S| EE {x_n}_(n>=1) sub A "st" x_n -> x}`
boundary: `del A = barA - A^@`
diameter: `delta(A) = spr_(x,y in A){d(x,y)}`

A set `A` is:

bounded if `delta(A) < oo`
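compact if every open cover of `A` has a finite subcover (in `RR^k` this is equivalent to `A` being closed and bounded, but not in a general metric space)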

A sequence `{x_n}_(n>=1) sub S` is Cauchy if `AA epsi >0 EE N_epsi "st" AA n,m > N_epsi, d(x_n, x_m) < epsi`

A metric space `(S, d)` is:

complete if every Cauchy sequence converges to a point in the space
separable if `EE` a countable dense set `D sub S`
Polish if complete and separable

D1.3.1 Let `{mu_n}, mu` be pms on `(S, ccS)`. We say `mu_n ->^d mu` if `int f dmu_n -> int f dmu quad AA f in C_B(S)`.

L1.3.1 If `F` is closed in `(S, d)` then `AA epsi >0 EE f in C_B(S)` st `f(x) = 1` if `x in F`, `f(x) = 0` if `d(x, F) >= epsi`, and `f(x) in [0,1]` otherwise. Moreover, `f` can be chosen to be uniformly continuous.

T1.3.1 Let `{mu_n}_(n>=1), mu` be pms on `(S, ccS)`. Then the following are equivalent:

`mu_n ->^d mu`
`lim_(n->oo) int f dmu_n = int f dmu AA f "bdd and uniformly continuous"`
`bar lim mu_n(F) <= mu(F) AA F "closed"`
`ul lim mu_n(G) >= mu(G) AA G "open"`
`lim_(n->oo) mu_n(B) = mu(B) AA B in ccS "st" mu(del B)=0`
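A concrete check of the closed-set, open-set and boundary conditions, assuming `mu_n = delta_(1/n)` and `mu = delta_0` on `RR` (an illustrative choice). It shows the inequalities can be strict, and that `mu_n(B) -> mu(B)` can fail when `mu(del B) > 0`.

```python
def delta_mass(point: float, indicator) -> float:
    """Mass that the point mass at `point` gives to the set {x : indicator(x)}."""
    return 1.0 if indicator(point) else 0.0


def in_closed_F(x: float) -> bool:
    """F = {0}, a closed set."""
    return x == 0.0


def in_open_G(x: float) -> bool:
    """G = (0, 2), an open set."""
    return 0.0 < x < 2.0


def in_B(x: float) -> bool:
    """B = (0, 1], whose boundary {0, 1} has mu-mass 1."""
    return 0.0 < x <= 1.0


ns = range(1, 1001)
mu_n_F = [delta_mass(1.0 / n, in_closed_F) for n in ns]  # all 0
mu_n_G = [delta_mass(1.0 / n, in_open_G) for n in ns]    # all 1
mu_n_B = [delta_mass(1.0 / n, in_B) for n in ns]         # all 1

mu_F = delta_mass(0.0, in_closed_F)  # 1
mu_G = delta_mass(0.0, in_open_G)    # 0
mu_B = delta_mass(0.0, in_B)         # 0
```

Here the limsup over closed `F` is 0 while `mu(F) = 1`, the liminf over open `G` is 1 while `mu(G) = 0`, and `mu_n(B) = 1` does not approach `mu(B) = 0`, which is allowed because `mu(del B) = 1`.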

D1.3.2

A pm on `(S, ccS)` is tight if `AA epsi > 0` there exists a compact `K` st `mu(K) > 1-epsi`.
Let `{mu_n}_(n>=1)` be a sequence of pms on `(S, ccS)`, `{mu_n}` is tight if `AA epsi > 0 quad EE "a compact" K "st" inf_(n>=1) mu_n(K) > 1- epsi`
A sequence of random variables is tight if the associated sequence of probability measures is tight

D1.3.3 A family of probability measures `Pi` on `(S, ccS)` is relatively compact if every sequence of pms in `Pi` contains a weakly convergent subsequence, ie. there exist a subsequence `{mu_(n_i)}_(i>=1)` and a pm `mu` (not necessarily in `Pi`) st `mu_(n_i) ->^d mu`.

T1.3.3 (Prohorov direct half) For a family of pms, tightness `=>` relative compactness.

T1.3.4 (Prohorov converse half) If `(S, d)` is Polish, then relative compactness `=>` tightness.

Skorokhod's construction and continuous mapping theorems

Let `F` be a df on `RR`. For any `0 < p < 1` we define the quantile function `F^(-1)(p) = inf{x | F(x)>=p} = spr{x | F(x) < p}`.

L1.4.1 Let `F` be a df, then `F^(-1)` is non-decreasing and left-continuous, also satisfying:

`F^(-1)(F(x)) <= x quad AA x in RR`
`F(F^(-1)(t)) >= t quad AA t in (0,1)`
`F(x) >= t iff x >= F^(-1)(t)`
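A finite check of the three quantile-function properties, assuming a discrete df with atoms at 0, 1, 2 and masses 0.2, 0.5, 0.3 (an arbitrary example). The middle property is checked in the form `F(F^(-1)(t)) >= t`; equality occurs at `t = 0.2`, so a strict inequality cannot hold for every `t`.

```python
SUPPORT = (0.0, 1.0, 2.0)   # atoms of the assumed discrete distribution
CUM = (0.2, 0.7, 1.0)       # cumulative probabilities at the atoms


def F(x: float) -> float:
    """Right-continuous step df."""
    value = 0.0
    for point, cum in zip(SUPPORT, CUM):
        if x >= point:
            value = cum
    return value


def F_inv(p: float) -> float:
    """Generalised inverse F^{-1}(p) = inf{x : F(x) >= p}, for 0 < p <= 1."""
    for point, cum in zip(SUPPORT, CUM):
        if cum >= p:
            return point
    raise ValueError("p must lie in (0, 1]")


ts = [i / 20.0 for i in range(1, 20)]         # t in (0, 1)
xs_all = [i / 4.0 for i in range(-4, 13)]     # x in [-1, 3]
xs_pos = [x for x in xs_all if F(x) > 0.0]    # points where F^{-1}(F(x)) is defined

prop1 = all(F_inv(F(x)) <= x for x in xs_pos)
prop2 = all(F(F_inv(t)) >= t for t in ts)
prop3 = all((F(x) >= t) == (x >= F_inv(t)) for x in xs_all for t in ts)
equality_occurs = any(F(F_inv(t)) == t for t in ts)
```

All of `prop1`, `prop2`, `prop3` and `equality_occurs` evaluate to `True` for this df.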

L1.4.2 If `F_n -> F` (at all continuity points of `F`) then the set `D = {t in (0,1) | F_n^(-1)(t) !-> F^(-1)(t)}` is at most countable.

T1.4.3 (Skorokhod). Let `{X_n}_(n>=1)` and `X` be rv's on `(RR, B(RR))` st `X_n ->^d X`.
Then there exist rvs `{Y_n}_(n>=1)` and `Y` on `((0,1), B(0,1), "LM")` (`"LM"` denoting Lebesgue measure) st `X_n =^d Y_n`, `X =^d Y` and `Y_n ->^(wp1) Y`

valid for more general space
for a df `F`, if `U ~ U(0,1)` then `F^(-1)(U)` is an rv with `F` as its df
we know `X_n ->^(wp1) X => X_n ->^p X => X_n ->^d X`, T1.4.3 is a converse of this in a sense
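A sketch of the construction behind T1.4.3, assuming `X_n ~ Exp(1 + 1/n)` and `X ~ Exp(1)` (an illustrative family with explicit quantile functions). Setting `Y_n = F_n^(-1)(U)` and `Y = F^(-1)(U)` for the same `U` places every variable on `(0,1)` and gives pointwise convergence.

```python
import math


def quantile_exp(u: float, rate: float) -> float:
    """F^{-1}(u) = -log(1 - u) / rate for the Exp(rate) df, 0 < u < 1."""
    return -math.log(1.0 - u) / rate


def Y_n(u: float, n: int) -> float:
    """Y_n(u) = F_n^{-1}(u), which has the same distribution as X_n."""
    return quantile_exp(u, 1.0 + 1.0 / n)


def Y(u: float) -> float:
    """Y(u) = F^{-1}(u), which has the same distribution as X."""
    return quantile_exp(u, 1.0)


us = [0.05, 0.25, 0.5, 0.75, 0.95]  # a few sample points u in (0, 1)
gaps = {n: max(abs(Y_n(u, n) - Y(u)) for u in us) for n in (1, 10, 100, 1000)}
```

For each fixed `u` the gap equals `Y(u)/(n + 1)`, so `Y_n(u) -> Y(u)` at every `u in (0,1)`.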

Continuous mapping theorems

`f: RR -> RR`, Borel measurable st `P(X in D_f) = 0`, where `D_f` is the set of discontinuity points of `f`.

P1.4.4 If `X_n ->_("or p")^(wp1) X` then `f(X_n) ->_("or p")^(wp1) f(X)` respectively.

T1.4.5 If `X_n ->^d X` then `f(X_n) ->^d f(X)`.

Convergence of moments

`X_n ->^d X iff Ef(X_n) -> Ef(X) quad AA f in C_B(RR)`. However to ensure `E|X_n|^k -> E|X|^k` we need extra conditions.

D1.5.1 A sequence of random variables `{X_n}_(n>=1)` is uniformly integrable (u.i.) if `lim_(A->oo) spr_n E(|X_n| I(|X_n| > A)) = 0`, ie. `lim_(A->oo) int_(|X_n| > A) |X_n| dP = 0` uniformly over `n`.

L1.5.1 A sequence of random variables is u.i. iff:

`spr_n E|X_n| < oo` and
`AA epsi >0 EE delta_epsi >0 " st " AA E in ccF P(E) < delta => int_E |X_n| dP < epsi AA n`

L1.5.2 If `EE epsi > 0 "st" spr_n E|X_n|^(1+epsi) < oo` then `{X_n}` is u.i.
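A standard counterexample, sketched with exact arithmetic: `X_n = n I(U < 1/n)` with `U ~ U(0,1)` (an assumed example, not from these notes). Here `spr_n E|X_n| = 1 < oo`, yet the truncated means do not vanish, so the sequence is not u.i.; correspondingly `X_n ->^p 0` while `EX_n = 1` for every `n`.

```python
from fractions import Fraction


def first_moment(n: int) -> Fraction:
    """E|X_n| for X_n = n * 1{U < 1/n}: value n with probability 1/n."""
    return Fraction(n) * Fraction(1, n)


def truncated_mean(n: int, A: int) -> Fraction:
    """E(|X_n| I(|X_n| > A)): the atom at n contributes only when n > A."""
    return first_moment(n) if n > A else Fraction(0)


def sup_truncated_mean(A: int, n_max: int) -> Fraction:
    """max over 1 <= n <= n_max of the truncated mean (a finite stand-in for sup_n)."""
    return max(truncated_mean(n, A) for n in range(1, n_max + 1))


first_moments = [first_moment(n) for n in (1, 10, 100)]
sups = [sup_truncated_mean(A, n_max=10 * A) for A in (1, 10, 100, 1000)]
```

Every entry of `sups` equals 1 however large `A` is. The moment condition of L1.5.2 fails too, since `E|X_n|^(1+epsi) = n^epsi` is unbounded.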

T1.5.3 If `X_n ->^d X "in" (RR, B(RR))` and `{|X_n|^r}` is u.i. for some `r > 0` then:

`E|X|^r < oo`
`EX_n^r -> EX^r`
`E|X_n|^r -> E|X|^r`

T1.5.4 If `X_n ->^d X` and `E|X_n|^r -> E|X|^r < oo, r > 0` then `{|X_n|^r}` is u.i.

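T1.5.5 (Fréchet-Shohat). Let `{X_n}` be a sequence of random variables st `EX_n^k -> m_k in RR` as `n->oo` for every `k = 1, 2, ...`. If `{m_k}` are the moments of a rv `X` whose distribution is uniquely determined by them, then `X_n ->^d X`.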

Sufficient conditions for convergence

`C_B^oo = {f | f "has bdd derivatives of all orders on" RR}`.
`n_h(x) = 1/(sqrt(2pi)h) e^(-x^2/(2h^2))`
`f_h(x) = int^oo_(-oo) f(x-y) n_h(y) dy = int_(-oo)^oo n_h(x-y) f(y) dy`

T2.5.1 `f in C_B => AA h > 0, f_h in C_B^oo`. `f in C_(BU) => f_h -> f` uniformly in `RR` as `h -> 0`, where `C_(BU)` denotes the bounded uniformly continuous functions.
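A numerical sketch of the uniform convergence `f_h -> f`, assuming the test function `f(x) = min(|x|, 1)` (bounded and uniformly continuous, chosen for illustration). The convolution with the `N(0, h^2)` density is approximated by a midpoint rule, and the supremum is taken over a grid only.

```python
import math


def gaussian_kernel(y: float, h: float) -> float:
    """n_h(y): the N(0, h^2) density."""
    return math.exp(-y * y / (2.0 * h * h)) / (math.sqrt(2.0 * math.pi) * h)


def f(x: float) -> float:
    """Assumed test function: bounded, uniformly continuous, not differentiable."""
    return min(abs(x), 1.0)


def smooth(x: float, h: float, steps: int = 2000, width: float = 8.0) -> float:
    """Midpoint-rule approximation to f_h(x) = int f(x - y) n_h(y) dy,
    integrating over |y| <= width * h (the Gaussian tail beyond is negligible)."""
    dy = 2.0 * width * h / steps
    total = 0.0
    for i in range(steps):
        y = -width * h + (i + 0.5) * dy
        total += f(x - y) * gaussian_kernel(y, h)
    return total * dy


grid = [i / 10.0 for i in range(-20, 21)]  # x in [-2, 2]
errors = {h: max(abs(smooth(x, h) - f(x)) for x in grid) for h in (0.5, 0.1, 0.02)}
```

The grid errors shrink roughly in proportion to `h`, with the largest discrepancy at the kink `x = 0`.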

T2.5.2 Let `{mu_n}` and `mu` be pms on `(RR, B(RR))`. If `int f dmu_n -> int f dmu quad AA f in C_B^oo` then `mu_n ->^d mu`.