Properties of maximum likelihood estimation

Nice things:

- consistent: $\hat{\theta} \to \theta$ as $n \to \infty$ (under regularity conditions)
- asymptotically normal, with variance attaining the Cramér–Rao lower bound
- invariant to reparameterisation of the model and to transformation of the observations (see below)

Not-so-nice things:

- can be biased in finite samples (e.g. the MLE of a normal variance uses divisor $n$, not $n-1$)
- often has no closed form, so the likelihood must be maximised numerically
- can behave badly when the model is misspecified or regularity conditions fail

It is usually easier to maximise the log-likelihood $\ell(\theta) = \log L(\theta)$: the log turns the product over observations into a sum, and since $\log$ is increasing, both are maximised at the same $\hat{\theta}$.
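As a minimal sketch, here is a numerical maximisation of the log-likelihood for an exponential model; the sample `x` is hypothetical, and the crude grid search stands in for a proper optimiser.

```python
import math

x = [0.5, 1.2, 0.3, 2.1, 0.8]  # hypothetical sample, assumed Exponential(rate lam)

def log_lik(lam):
    # log L(lam) = sum_i log f(x_i; lam) = n log(lam) - lam * sum(x)
    return len(x) * math.log(lam) - lam * sum(x)

# crude grid maximisation of the log-likelihood over lam in (0, 5)
grid = [i / 1000 for i in range(1, 5000)]
lam_hat = max(grid, key=log_lik)
print(lam_hat)  # close to the closed form n / sum(x) = 1 / mean(x)
```

Here the closed-form MLE $\hat{\lambda} = 1/\bar{x}$ is available, but the same pattern applies when no closed form exists.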

ML estimation does not depend on the parameterisation of the model. If $g$ is a one-to-one function, then the MLE of $g(\theta)$ is $g(\hat{\theta})$, and more generally we define $g(\hat{\theta})$ to be the MLE of $g(\theta)$. This means we can use the most convenient parameterisation (although some parameterisations may have better properties than others).
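This invariance can be checked directly: for an exponential model (hypothetical sample `x` below), maximising in the rate parameterisation $\lambda$ and in the mean parameterisation $\mu = g(\lambda) = 1/\lambda$ gives $\hat{\mu} = 1/\hat{\lambda}$.

```python
import math

x = [0.5, 1.2, 0.3, 2.1, 0.8]  # hypothetical sample, Exponential model

def log_lik_rate(lam):  # parameterised by the rate lambda
    return len(x) * math.log(lam) - lam * sum(x)

def log_lik_mean(mu):   # same model parameterised by the mean mu = 1/lambda
    return -len(x) * math.log(mu) - sum(x) / mu

grid = [i / 1000 for i in range(1, 5000)]
lam_hat = max(grid, key=log_lik_rate)
mu_hat = max(grid, key=log_lik_mean)
print(mu_hat, 1 / lam_hat)  # agree: the MLE of g(lambda) = 1/lambda is g(lam_hat)
```

The two fits describe the same family of densities, so the maximising density is the same; only its label changes.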

ML estimation is invariant to (one-to-one) transformation of the observations. If $Y$ is a one-to-one function of $X$, then $f_Y(y;\theta) = |dx/dy|\, f_X(x;\theta)$, where the Jacobian $|dx/dy|$ does not depend on $\theta$. The two log-likelihoods therefore differ only by a constant in $\theta$, so they are maximised at the same $\hat{\theta}$.
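A small sketch of this, under the assumption $X \sim \text{Exponential}(\lambda)$ with the (hypothetical) sample below observed either directly or as $Y = X^2$, so $x = \sqrt{y}$ and $|dx/dy| = 1/(2\sqrt{y})$:

```python
import math

x = [0.5, 1.2, 0.3, 2.1, 0.8]  # hypothetical Exponential(lam) sample
y = [v * v for v in x]         # suppose we observe Y = X^2 instead

def log_lik_x(lam):
    return sum(math.log(lam) - lam * v for v in x)

def log_lik_y(lam):
    # f_Y(y; lam) = |dx/dy| f_X(sqrt(y); lam), with |dx/dy| = 1 / (2 sqrt(y))
    return sum(math.log(lam) - lam * math.sqrt(v) - math.log(2 * math.sqrt(v))
               for v in y)

grid = [i / 1000 for i in range(1, 5000)]
lam_x = max(grid, key=log_lik_x)
lam_y = max(grid, key=log_lik_y)
print(lam_x, lam_y)  # identical: the Jacobian term does not involve lam
```

The Jacobian contributes only an additive constant to the log-likelihood, so both parameterisations of the data yield the same $\hat{\lambda}$.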