Single parameter models.
Conjugate priors
Let `F` be a class of sampling distributions and `P` a class of prior distributions. Then `P` is conjugate for `F` if `p(theta) in P` and `p(y|theta) in F` imply `p(theta | y) in P`. Exponential family sampling distributions have natural conjugate priors: if the likelihood of `n` observations has the form `p(y | theta) prop g(theta)^n exp(phi(theta)^T t(y))` and the prior is `p(theta) prop g(theta)^eta exp(phi(theta)^T nu)`, then the posterior is also of exponential form, `p(theta | y) prop g(theta)^(n+eta) exp(phi(theta)^T (t(y) + nu))`.
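As a concrete sketch of the definition, take the Beta-binomial pair from the standard conjugate table (the hyperparameters and data below are made up for illustration). The closed-form conjugate update can be checked against a brute-force grid approximation of the posterior:

```python
def binom_lik(theta, n, y):
    # binomial likelihood up to a constant: theta^y (1 - theta)^(n - y)
    return theta ** y * (1 - theta) ** (n - y)

def beta_unnorm(theta, a, b):
    # Beta(a, b) density up to its normalising constant
    return theta ** (a - 1) * (1 - theta) ** (b - 1)

a, b = 2.0, 2.0   # hypothetical prior hyperparameters
n, y = 20, 14     # hypothetical data: y successes in n trials

# Grid approximation of the posterior mean (midpoint rule on (0, 1)).
grid = [(i + 0.5) / 1000 for i in range(1000)]
weights = [binom_lik(t, n, y) * beta_unnorm(t, a, b) for t in grid]
grid_mean = sum(t * w for t, w in zip(grid, weights)) / sum(weights)

# Conjugate update in closed form: posterior is Beta(a + y, b + n - y).
exact_mean = (a + y) / (a + b + n)
print(grid_mean, exact_mean)  # the two should agree closely
```

The same pattern (add sufficient statistics to the prior hyperparameters) is exactly the `t(y) + nu` update in the exponential-family form above.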
Table of standard conjugate priors to go here:
Non-informative priors
Minimal effect on posterior - lets the data speak for themselves. Conjugate priors can be almost non-informative.
Uniform prior is the natural non-informative prior for location parameters
- it is improper: `int p(theta) dtheta = int dtheta = oo`
- but posterior may still be proper
- not invariant to one-to-one transformations
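A minimal numerical sketch of the "improper prior, proper posterior" point, assuming normal data with known `sigma` (the data values below are made up): the flat prior does not integrate, but the resulting posterior is the proper density `N(bar y, sigma^2 / n)`:

```python
import math

sigma = 1.0
ys = [1.2, 0.8, 1.5, 1.0]     # hypothetical data
n = len(ys)
ybar = sum(ys) / n

# Unnormalised posterior = flat prior (constant) * normal likelihood.
def unnorm_post(theta):
    return math.exp(-sum((y - theta) ** 2 for y in ys) / (2 * sigma ** 2))

# Grid integration: the posterior normalises to a finite constant even
# though the prior itself has int p(theta) dtheta = oo.
grid = [-10 + 20 * i / 20000 for i in range(20001)]
w = [unnorm_post(t) for t in grid]
Z = sum(w)
mean = sum(t * wi for t, wi in zip(grid, w)) / Z
var = sum((t - mean) ** 2 * wi for t, wi in zip(grid, w)) / Z

print(mean, ybar)           # posterior mean = ybar
print(var, sigma ** 2 / n)  # posterior variance = sigma^2 / n
```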
If the density of `y` has the form `p(y - theta | theta)`, then it is a location density with location parameter `theta`
- consider the transformation `u = y + c` and `eta = theta + c`; if `p(y|theta)` is a location density, then so is `p(u | eta)`
- let `theta` and `eta` have priors `pi` and `pi^**`; the two problems have identical structure, so they should have the same non-informative distribution
- since `eta = theta + c`, we have `P_(pi^**)(eta in A) = P_pi(theta in A - c)`; requiring `pi^** = pi` gives `P_pi(A) = P_pi(A - c) AA c in RR`
- so `int_A pi(theta) dtheta = int_(A-c) pi(theta) dtheta = int_A pi(theta - c) dtheta`, ie `pi(theta) = pi(theta - c)`, a constant function
A scale density is of the form `p(y | sigma) prop sigma^(-1) p(y/sigma)`. Arguing similarly to the location case, the non-informative prior must have the form `pi(sigma) prop sigma^(-1)`.
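A quick numerical check (with an arbitrary rescaling factor, chosen here for illustration) that `pi(sigma) prop sigma^(-1)` is the scale-invariant choice: it assigns the same mass to an interval `[a, b]` as to its rescaling `[ca, cb]`, since both integrate to `log(b/a)`:

```python
import math

def mass(a, b, steps=100000):
    # midpoint-rule integral of 1/sigma over [a, b]
    h = (b - a) / steps
    return sum(h / (a + (i + 0.5) * h) for i in range(steps))

a, b, c = 1.0, 2.0, 7.3   # arbitrary interval and rescaling factor
print(mass(a, b), mass(c * a, c * b), math.log(b / a))  # all equal
```

This is the analogue of the translation-invariance argument for location parameters: a change of units `sigma -> c sigma` leaves the prior mass unchanged.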
Jeffreys prior
Non-informative, and invariant to one-to-one transformations. `p(theta) prop [I(theta)]^(1/2)`, where `I(theta) = -E((del^2)/(del theta^2) log p(y|theta))` is the Fisher information. Jeffreys priors are locally uniform (ie. uniformly distributed in every small region) and therefore non-informative. However, a Jeffreys prior sometimes violates the likelihood principle.
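A sketch of the computation for the Bernoulli model, using sympy (the symbol names are mine): the Fisher information works out to `1/(theta(1 - theta))`, so the Jeffreys prior is proportional to `theta^(-1/2) (1 - theta)^(-1/2)`, ie a `Beta(1/2, 1/2)` density up to normalisation:

```python
import sympy as sp

theta = sp.symbols('theta', positive=True)
y = sp.symbols('y')

# Bernoulli log-likelihood: log p(y|theta) = y log theta + (1-y) log(1-theta)
loglik = y * sp.log(theta) + (1 - y) * sp.log(1 - theta)

# Fisher information I(theta) = -E[ d^2/dtheta^2 log p(y|theta) ],
# using E[y] = theta for the Bernoulli model.
d2 = sp.diff(loglik, theta, 2)
I = sp.simplify(-d2.subs(y, theta))   # equals 1/(theta*(1 - theta))

# Jeffreys prior: p(theta) prop sqrt(I(theta)).
jeffreys = sp.sqrt(I)
print(I)
print(jeffreys)
```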