Using auxiliary information in estimation

Sometimes we can measure extra characteristics that have known population totals. We can often use this information to improve the precision of our estimate.

Ratio estimator

Suppose we draw an SRS and obtain $y_1, y_2, \ldots, y_n$ for the primary variable $Y$ and $x_1, x_2, \ldots, x_n$ for some other variable $X$ with known population mean $\mu_X$. If $R = \mu_Y / \mu_X$, then $\mu_Y = R \mu_X$, so an estimate of $R$ gives an estimate of $\mu_Y$.

$\hat{\mu}_Y = \mu_X \frac{\bar{y}}{\bar{x}}$

$\hat{V}(\hat{\mu}_Y) = \frac{1-f}{n} s_r^2$, where $s_r^2 = \frac{\sum (y_i - \bar{r} x_i)^2}{n-1}$ and $\bar{r} = \bar{y} / \bar{x}$ is the sample ratio.

This is called the ratio estimator; in large samples it is approximately normally distributed.
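The ratio estimator and its variance estimate can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `ratio_estimate` and its arguments are hypothetical, and it assumes $f = n/N$ with the population size `N` known.

```python
def ratio_estimate(y, x, mu_x, N):
    """Ratio estimate of the population mean of Y and its estimated variance.

    y, x : paired sample values from an SRS
    mu_x : known population mean of X
    N    : population size (assumed known, so that f = n / N)
    """
    n = len(y)
    y_bar = sum(y) / n
    x_bar = sum(x) / n
    r = y_bar / x_bar                 # sample ratio
    mu_hat = mu_x * r                 # ratio estimate of mu_Y
    # residual variance about the line through the origin, y = r * x
    s_r2 = sum((yi - r * xi) ** 2 for yi, xi in zip(y, x)) / (n - 1)
    f = n / N                         # sampling fraction
    var_hat = (1 - f) / n * s_r2
    return mu_hat, var_hat


mu_hat, var_hat = ratio_estimate([1, 2, 3, 6], [1, 2, 3, 4], mu_x=3, N=40)
print(mu_hat, var_hat)
```

When $y$ is exactly proportional to $x$ in the sample, every residual $y_i - \bar{r} x_i$ is zero and the estimated variance vanishes, which is the intuition for why the estimator helps when $Y$ and $X$ are roughly proportional.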

Regression estimator

This extends the ratio estimator to the more general linear regression case, where the line need not pass through the origin.

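$\hat{\mu}_{LR} = \bar{y} + \hat{\beta}(\mu_X - \bar{x})$

$\hat{V}(\hat{\mu}_{LR}) = \frac{1-f}{n} \hat{\sigma}^2_{Res}$, where $\hat{\sigma}^2_{Res}$ is the usual residual mean square, i.e. the residual sum of squares divided by $n - 2$.

The estimator is approximately normally distributed (remember that the residual variance loses 2 degrees of freedom). Asymptotically its variance is no larger than that of the ratio estimator $\hat{\mu}_R$ or the sample mean $\bar{y}$.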
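As a rough sketch of the regression estimator, the following Python function fits the least-squares slope by hand and plugs it into the formulas above. The name `regression_estimate` and its arguments are hypothetical, and it again assumes $f = n/N$ with `N` known.

```python
def regression_estimate(y, x, mu_x, N):
    """Regression estimate of the population mean of Y and its estimated variance."""
    n = len(y)
    y_bar = sum(y) / n
    x_bar = sum(x) / n
    # least-squares slope of y on x
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta_hat = sxy / sxx
    # adjust the sample mean for the gap between mu_x and x_bar
    mu_hat = y_bar + beta_hat * (mu_x - x_bar)
    # residual mean square: residual sum of squares over n - 2 degrees of freedom
    rss = sum((yi - y_bar - beta_hat * (xi - x_bar)) ** 2 for yi, xi in zip(y, x))
    sigma2_res = rss / (n - 2)
    var_hat = (1 - n / N) / n * sigma2_res
    return mu_hat, var_hat


mu_hat, var_hat = regression_estimate([1, 2, 3, 6], [1, 2, 3, 4], mu_x=3, N=40)
print(mu_hat, var_hat)
```

On this toy sample the regression variance estimate is smaller than the ratio one computed from the same data, because the fitted line is allowed a nonzero intercept.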

Ratio and regression estimators in stratified sampling

Separate ratio estimator: form a ratio estimate for each stratum, then combine them using the stratum weights.

Combined ratio estimator: form $\bar{y}_{st}$ and $\bar{x}_{st}$ and use them to form a single ratio estimate.

Both have $\hat{V}(\hat{\mu}_{RC}) = \sum_l W_l^2 \frac{1-f_l}{n_l} s^2_{l,rs}$, where $s^2_{l,rs} = \frac{\sum_j (y_{lj} - \bar{r} x_{lj})^2}{n_l - 1}$.

The separate estimator is more efficient if the stratum sample sizes are large and the slopes vary from stratum to stratum; the combined estimator is preferable if some of the stratum sample sizes are small.
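The two stratified estimators can be compared with a small Python sketch. Everything named here is hypothetical (the function `stratified_ratio_estimates` and the dictionary keys), and the code makes two assumptions the notes do not spell out: the known mean of $X$ is available per stratum, and $\bar{r}$ in the variance formula is the stratum ratio for the separate estimator and the pooled ratio $\bar{y}_{st} / \bar{x}_{st}$ for the combined one.

```python
def stratified_ratio_estimates(strata):
    """Separate and combined ratio estimates of mu_Y with estimated variances.

    strata: list of dicts with keys
        'y', 'x' : sample values in the stratum
        'W'      : stratum weight N_l / N
        'N'      : stratum population size N_l
        'mu_x'   : known stratum mean of X (an assumption of this sketch)
    """
    def mean(v):
        return sum(v) / len(v)

    def var_term(s, r):
        # W_l^2 * (1 - f_l) / n_l * s^2_{l,rs}, with residuals taken about ratio r
        n_l = len(s['y'])
        s2 = sum((yi - r * xi) ** 2 for yi, xi in zip(s['y'], s['x'])) / (n_l - 1)
        return s['W'] ** 2 * (1 - n_l / s['N']) / n_l * s2

    # separate: one ratio per stratum, combined with the stratum weights
    sep, sep_var = 0.0, 0.0
    for s in strata:
        r_l = mean(s['y']) / mean(s['x'])
        sep += s['W'] * s['mu_x'] * r_l
        sep_var += var_term(s, r_l)

    # combined: one pooled ratio from the stratified means
    y_st = sum(s['W'] * mean(s['y']) for s in strata)
    x_st = sum(s['W'] * mean(s['x']) for s in strata)
    mu_x = sum(s['W'] * s['mu_x'] for s in strata)
    r = y_st / x_st
    comb = mu_x * r
    comb_var = sum(var_term(s, r) for s in strata)

    return {'separate': (sep, sep_var), 'combined': (comb, comb_var)}


# Two strata whose slopes differ (2 versus 3)
strata = [
    {'y': [2, 4, 6], 'x': [1, 2, 3], 'W': 0.5, 'N': 30, 'mu_x': 2},
    {'y': [3, 6, 9], 'x': [1, 2, 3], 'W': 0.5, 'N': 30, 'mu_x': 2},
]
print(stratified_ratio_estimates(strata))
```

In this toy example the slopes differ between strata, so the separate estimator has zero estimated variance while the combined estimator, forced to use one pooled ratio, does not. With only three units per stratum the separate stratum ratios would be unstable in practice, which is the small-sample case where the combined estimator is preferred.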