Multistage sampling

As with cluster sampling, we select $c$ of the $C$ clusters, but instead of observing every unit in each selected cluster, we take a random sample of units within it. Most large surveys are carried out this way.

Advantages:

cost and speed
convenience (only need a list of clusters, plus a list of individuals in each selected cluster)
usually more accurate than cluster sampling for the same total sample size

Disadvantages:

less accurate than SRS of same size (but more accurate for same cost)
further analysis is difficult

Basic results

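\[ \hat{\mu}_R = \frac{\sum_{\text{sample}} M_i \bar{y}_i}{\sum_{\text{sample}} M_i} \]

where the sums run over the $c$ sampled clusters, $M_i$ is the size of cluster $i$ and $\bar{y}_i$ is the sample mean within cluster $i$.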

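Let $f_1 = \frac{c}{C}$ be the first-stage sampling fraction and $f_{2i} = \frac{m_i}{M_i}$ the second-stage sampling fraction in cluster $i$ (where $m_i$ of the $M_i$ units are sampled), and let $\bar{M}$ be the mean cluster size; then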

\[ \hat{V}(\hat{\mu}_R) = \frac{1-f_1}{c} \sum_{\text{sample}} \frac{(M_i / \bar{M})^2 (\bar{y}_i - \hat{\mu}_R)^2}{c-1} + \frac{f_1}{c} \sum_{\text{sample}} \frac{(M_i / \bar{M})^2 (1-f_{2i}) s^2_{2i}}{c m_i} \]

If the number of sampled clusters is reasonably large, then $\hat{\mu}_R$ is approximately normally distributed.

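Note: If $f_1$ is very small, the second term is negligible and $\hat{V}(\hat{\mu}_R) \approx s_1^2 / c$, where $s_1^2 = \sum_{\text{sample}} (M_i / \bar{M})^2 (\bar{y}_i - \hat{\mu}_R)^2 / (c-1)$ is the between-cluster term above. This result holds for more general subsampling schemes than SRS; we only need a scheme that gives an unbiased sample mean within each cluster. Systematic sampling is a common choice.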
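The estimator and its variance estimate can be sketched in a few lines of Python. This is only an illustration: the function name `two_stage_ratio_estimate` and its arguments are hypothetical, $\bar{M}$ is taken to be the mean size of the sampled clusters (an assumption, for when the population mean cluster size is not used), and it needs at least two sampled clusters with at least two observations each.

```python
def two_stage_ratio_estimate(samples, cluster_sizes, C):
    """Ratio estimate of the mean and its estimated variance.

    samples:       one list of observed y-values per sampled cluster
    cluster_sizes: M_i for each sampled cluster, in the same order
    C:             number of clusters in the population
    """
    c = len(samples)
    f1 = c / C
    # Assumption: M-bar is the mean size of the sampled clusters.
    M_bar = sum(cluster_sizes) / c
    ybar = [sum(y) / len(y) for y in samples]

    # Ratio estimator: size-weighted mean of the cluster sample means.
    mu = sum(M * yb for M, yb in zip(cluster_sizes, ybar)) / sum(cluster_sizes)

    # Between-cluster term.
    between = sum(
        (M / M_bar) ** 2 * (yb - mu) ** 2 for M, yb in zip(cluster_sizes, ybar)
    ) / (c - 1)

    # Within-cluster term, using the sample variance s_2i^2 in each cluster.
    within = 0.0
    for M, y, yb in zip(cluster_sizes, samples, ybar):
        m = len(y)
        s2 = sum((v - yb) ** 2 for v in y) / (m - 1)
        within += (M / M_bar) ** 2 * (1 - m / M) * s2 / (c * m)

    var = (1 - f1) / c * between + f1 / c * within
    return mu, var


# Made-up data: 3 clusters sampled out of C = 30.
mu, var = two_stage_ratio_estimate(
    samples=[[1, 2, 3], [4, 6], [5, 5, 5, 5]],
    cluster_sizes=[10, 20, 10],
    C=30,
)
```

With $f_1 = 0.1$ here, almost all of the estimated variance comes from the between-cluster term, in line with the note above.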

Sampling with probability proportional to size

Similar to PPS for cluster sampling: if $f_1$ is small, we can pretend we are sampling clusters with replacement and treat the clusters like individuals.

Performance is similar to SRS subsampling (but we need to know $M_i$ for every cluster). It is intuitively appealing because if we take the same-sized sample from every selected cluster, then every unit has the same chance of being selected.
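A short check of the equal-chance claim, as a sketch under two assumptions: clusters are drawn with probability proportional to $M_i$ and treated as sampled with replacement (so cluster $i$ is expected to be drawn $c M_i / \sum_k M_k$ times), and an SRS of the same size $m$ is taken within each selected cluster. Then, for unit $j$ in cluster $i$,

\[ \Pr(\text{unit } ij \text{ selected}) \approx \frac{c M_i}{\sum_k M_k} \times \frac{m}{M_i} = \frac{cm}{\sum_k M_k} \]

which does not depend on $i$ or $j$: the larger chance of picking a big cluster is exactly offset by the smaller chance of picking any one unit inside it.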

Equal cluster sizes

If $M_i = M$ for all clusters, then both estimators reduce to the mean of the cluster means.

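If $m_i = m$ as well, then the variance estimate reduces to:

\[ \hat{V}(\hat{\mu}_R) = (1-f_1)\frac{s_1^2}{c} + f_1(1-f_2)\frac{s_2^2}{cm} \]

where $s_1^2 = \sum_i \frac{(\bar{y}_{i.} - \bar{y})^2}{c-1}$ is the variance of the cluster sample means, and $s_2^2 = \sum_i \sum_j \frac{(y_{ij} - \bar{y}_{i.})^2}{c(m-1)}$ is the pooled within-cluster variance.

These results can be obtained from a standard one-way ANOVA, where $s_1^2 = s_B^2 / m$ and $s_2^2 = s_W^2$, with $s_B^2$ and $s_W^2$ the between- and within-cluster mean squares.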
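The link to one-way ANOVA can be checked numerically. The sketch below uses made-up data and hypothetical function names, and takes $s_2^2$ to be the pooled within-cluster variance (the ANOVA within mean square); it assumes every sampled cluster has the same size $M$ and the same subsample size $m$.

```python
def equal_size_variance(samples, C, M):
    """Return (ybar, s1_sq, s2_sq, var) for c clusters with m observations each."""
    c, m = len(samples), len(samples[0])
    f1, f2 = c / C, m / M
    means = [sum(y) / m for y in samples]
    ybar = sum(means) / c  # mean of cluster means
    s1_sq = sum((yb - ybar) ** 2 for yb in means) / (c - 1)
    s2_sq = sum(
        (v - yb) ** 2 for y, yb in zip(samples, means) for v in y
    ) / (c * (m - 1))
    var = (1 - f1) * s1_sq / c + f1 * (1 - f2) * s2_sq / (c * m)
    return ybar, s1_sq, s2_sq, var


def anova_mean_squares(samples):
    """Between- and within-group mean squares of a balanced one-way ANOVA."""
    c, m = len(samples), len(samples[0])
    means = [sum(y) / m for y in samples]
    grand = sum(means) / c
    ms_between = m * sum((yb - grand) ** 2 for yb in means) / (c - 1)
    ms_within = sum(
        (v - yb) ** 2 for y, yb in zip(samples, means) for v in y
    ) / (c * (m - 1))
    return ms_between, ms_within


data = [[1, 3], [5, 7], [9, 11]]  # c = 3 clusters, m = 2 units each
ybar, s1_sq, s2_sq, var = equal_size_variance(data, C=30, M=10)
ms_b, ms_w = anova_mean_squares(data)
# s1_sq equals ms_b / m and s2_sq equals ms_w, as stated in the text.
```

In practice the two mean squares could be read straight off any ANOVA table rather than computed by hand.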

Estimating proportions

As with cluster sampling, formulae don’t simplify much. See formula sheet for details.

Optimal sub-sample sizes

For simplicity, we’ll only deal with equal cluster and sample sizes, when all estimators reduce to $\bar{y}$. Suppose cost $= k_1 c + k_2 cm$, where $k_1$ is the cost per sampled cluster and $k_2$ the cost per sampled unit. The variance of $\bar{y}$ is $(1-f_1)\frac{\sigma^2_1}{c} + (1-f_2)\frac{\sigma^2_2}{cm}$. For a fixed cost this is minimised when $m = \sqrt{\frac{k_1}{k_2}}\left( \frac{\sigma_2}{\sigma_u} \right)$, where $\sigma_u^2 = \sigma_1^2 - \sigma_2^2 / M$.
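A minimal sketch of how the optimum might be used in planning, assuming an additive cost $k_1 c + k_2 cm$ and taking $\sigma_u^2 = \sigma_1^2 - \sigma_2^2 / M$. The function names are hypothetical, and the variances would in practice be guesses from a pilot or earlier survey. When $\sigma_u^2 \le 0$ the sketch falls back to $m = M$ (observe whole clusters), which is a design choice made here rather than something stated in the notes.

```python
from math import sqrt


def optimal_subsample_size(k1, k2, sigma1_sq, sigma2_sq, M):
    """Subsample size m minimising variance for fixed cost k1*c + k2*c*m."""
    sigma_u_sq = sigma1_sq - sigma2_sq / M
    if sigma_u_sq <= 0:
        # Assumption: no interior optimum, so take every unit in the cluster.
        return float(M)
    m = sqrt(k1 / k2) * sqrt(sigma2_sq / sigma_u_sq)
    return min(m, float(M))  # cannot sample more units than the cluster holds


def clusters_for_budget(budget, k1, k2, m):
    """Number of clusters c affordable with subsample size m."""
    return budget / (k1 + k2 * m)


m_opt = optimal_subsample_size(k1=4, k2=1, sigma1_sq=5, sigma2_sq=16, M=16)
c_opt = clusters_for_budget(budget=160, k1=4, k2=1, m=m_opt)
```

Both results are real-valued and would need rounding to whole numbers. Note that $m$ depends only on the cost ratio and the variance ratio, not on the budget; the budget then fixes $c$.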

Stratified multistage sampling

In most large surveys the first-stage sample will be stratified. This introduces no new problems: use the results above to estimate the mean and se for each stratum, then take a weighted average to get the overall results.
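The combining step can be sketched as follows, assuming the strata are sampled independently and each stratum is weighted by its known share of the population units (both are assumptions of this illustration; the function name `combine_strata` is hypothetical). The per-stratum means and variances would come from the multistage results above.

```python
from math import sqrt


def combine_strata(estimates, variances, stratum_sizes):
    """Weighted overall mean and its standard error.

    estimates:     estimated mean for each stratum
    variances:     estimated variance of each stratum mean
    stratum_sizes: number of population units in each stratum
    """
    total = sum(stratum_sizes)
    weights = [n / total for n in stratum_sizes]
    mean = sum(w * e for w, e in zip(weights, estimates))
    # Independent strata: variances add, each scaled by its squared weight.
    var = sum(w ** 2 * v for w, v in zip(weights, variances))
    return mean, sqrt(var)


overall_mean, overall_se = combine_strata(
    estimates=[10.0, 20.0],
    variances=[4.0, 28.0],
    stratum_sizes=[300, 100],
)
```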