Repeated measures

Any dataset in which subjects are measured repeatedly over time can be described as repeated measures data. Measurements can be made at pre-determined times or in an uncontrolled fashion, and this determines the types of analysis available.

Fixed effect approaches

Mixed model approaches

Advantages:

There are several ways to use mixed models. The simplest is a random effects model with patient effects fitted as random. This implies a constant correlation between all observations on the same patient, which is often not the case (observations close together in time tend to be more highly correlated). If so, use a covariance pattern or random coefficients model instead.
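To see why a random patient effect forces a constant correlation, the sketch below builds the within-patient covariance matrix implied by the model $y_{ij} = \mu + p_i + e_{ij}$. The function names and the variance values are illustrative assumptions, not taken from any particular package.

```python
def random_intercept_cov(n_times, var_patient, var_resid):
    """Covariance of one patient's observations under y_ij = mu + p_i + e_ij.

    var_patient is the variance of the random patient effect p_i and
    var_resid the residual variance of e_ij (both illustrative inputs).
    """
    return [
        [var_patient + (var_resid if j == k else 0.0) for k in range(n_times)]
        for j in range(n_times)
    ]


def to_correlation(cov):
    """Convert a covariance matrix (list of lists) to a correlation matrix."""
    sd = [cov[j][j] ** 0.5 for j in range(len(cov))]
    return [
        [cov[j][k] / (sd[j] * sd[k]) for k in range(len(cov))]
        for j in range(len(cov))
    ]


cov = random_intercept_cov(n_times=4, var_patient=3.0, var_resid=1.0)
corr = to_correlation(cov)

# Every off-diagonal correlation equals var_patient / (var_patient + var_resid),
# regardless of how far apart in time the two observations are.
for row in corr:
    print([round(r, 3) for r in row])
```

The off-diagonal entries are identical whether the two observations are one visit apart or three, which is exactly the compound symmetry assumption.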

Covariance pattern models

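Define the covariance structure directly rather than through random effects. Observations within each category of a blocking effect (e.g. patient) are assumed to have the same covariance structure, and observations in different categories are uncorrelated. $\mathbf{R}$ is therefore a block diagonal matrix, $\mathbf{R} = \mathrm{diag}(\mathbf{R}_1, \mathbf{R}_2, \ldots)$, with one block per category.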

Covariance patterns

Large selection of covariance patterns available. Most depend on observations being taken at fixed times, and some are more easily justified when observations are evenly spaced. Some patterns take into account exact value of time, and are best used when intervals are irregular.

Some simple patterns are:

If variability in a measurement is different at each time point, we can use the obvious heterogeneous generalisations. Can also fit a separate covariance structure to each treatment group.

If widely separated observations appear to be uncorrelated, can create a banded covariance structure by setting all covariances more than $t$ steps apart to 0. Can be done to any covariance pattern, and reduces number of variance components to be estimated.
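As a minimal sketch of banding, the code below zeroes every covariance more than $t$ steps apart in a given pattern. The first-order autoregressive pattern used as the starting point, the function names and the parameter values are assumptions chosen for illustration. In a real fit, the software re-estimates the remaining parameters under the banded structure rather than truncating an existing matrix.

```python
def ar1_cov(n_times, variance, rho):
    """Illustrative starting pattern: cov(j, k) = variance * rho ** |j - k|."""
    return [
        [variance * rho ** abs(j - k) for k in range(n_times)]
        for j in range(n_times)
    ]


def band(cov, t):
    """Set covariances between observations more than t steps apart to zero."""
    n = len(cov)
    return [
        [cov[j][k] if abs(j - k) <= t else 0.0 for k in range(n)]
        for j in range(n)
    ]


full = ar1_cov(n_times=5, variance=2.0, rho=0.5)
banded = band(full, t=1)

# Only the diagonal and the first off-diagonal band survive.
for row in banded:
    print(row)
```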

Covariance patterns using the exact separation in time also exist (e.g. power, $r_{ijk} = \sigma^2 \rho^{d_{ijk}}$, where $d_{ijk}$ is the distance in time between observations $j$ and $k$ on patient $i$). Useful when time points do not occur at fixed intervals. In most of these, the covariance decreases exponentially with increasing separation.
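The power pattern can be sketched for one patient with irregular visit times as follows. The function name, the visit times and the parameter values are hypothetical.

```python
def power_cov(times, variance, rho):
    """Power pattern for one patient: cov(j, k) = variance * rho ** |t_j - t_k|."""
    return [
        [variance * rho ** abs(tj - tk) for tk in times]
        for tj in times
    ]


# Hypothetical visit times (e.g. weeks since baseline), unevenly spaced.
times = [0.0, 1.0, 3.0, 7.0]
cov = power_cov(times, variance=4.0, rho=0.5)

for row in cov:
    print([round(c, 4) for c in row])
```

With evenly spaced times one unit apart, this reduces to a first-order autoregressive pattern. With irregular times, each pair of observations has its own covariance determined by its actual separation.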

Which covariance pattern should be used?

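Want to choose the covariance pattern that best fits the true covariance pattern, which is not easy. Increasing the number of parameters will improve fit, but will lead to over-fitting. Can test using a likelihood-ratio test (provided models are nested), or by comparing goodness-of-fit statistics adjusted for the number of parameters, e.g. AIC = $\log(L) - q$ or BIC = $\log(L) - \tfrac{q}{2}\log(N - p)$, where $q$ is the number of covariance parameters, $N$ the number of observations and $p$ the number of fixed effect parameters. In this form, larger values indicate a better model.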
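A rough sketch of these comparisons is given below. The log-likelihoods, parameter counts and sample sizes are made-up numbers for illustration. The BIC form used, $\log(L) - \tfrac{q}{2}\log(N - p)$, is an assumption: it is the usual counterpart of the "larger is better" AIC $\log(L) - q$, with $N$ observations and $p$ fixed effect parameters.

```python
import math


def aic(log_lik, q):
    """AIC in the 'larger is better' form: log(L) - q."""
    return log_lik - q


def bic(log_lik, q, n_obs, n_fixed):
    """BIC in the matching form (assumed): log(L) - (q / 2) * log(N - p)."""
    return log_lik - 0.5 * q * math.log(n_obs - n_fixed)


def lr_statistic(log_lik_simple, log_lik_complex):
    """Likelihood-ratio statistic for nested models: 2 * (logL1 - logL0)."""
    return 2.0 * (log_lik_complex - log_lik_simple)


# Hypothetical fits to the same data with 4 time points:
# compound symmetry (q = 2) nested within the general pattern (q = 10).
n_obs, n_fixed = 120, 8
fits = {
    "compound symmetry": {"log_lik": -250.0, "q": 2},
    "general": {"log_lik": -243.0, "q": 10},
}

for name, fit in fits.items():
    print(
        name,
        "AIC:", round(aic(fit["log_lik"], fit["q"]), 2),
        "BIC:", round(bic(fit["log_lik"], fit["q"], n_obs, n_fixed), 2),
    )

stat = lr_statistic(fits["compound symmetry"]["log_lik"], fits["general"]["log_lik"])
df = fits["general"]["q"] - fits["compound symmetry"]["q"]
print("LR statistic:", stat, "on", df, "df")
```

The likelihood-ratio statistic is referred to a chi-squared distribution with degrees of freedom equal to the difference in the number of covariance parameters.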

Which patterns should be considered? Not usually practical to test large numbers. Can either start with the simplest and work up, or fit the general pattern first to get some idea of what the structure should look like. Often the covariance pattern will make little difference to the estimate of the treatment effect and its standard error. In this case compound symmetry is reasonable (roughly check by comparing to general), or you could just use the empirical estimator.

General points

Missing data occur frequently in repeated measures experiments. Less of a problem with a specified covariance structure, as patients with only a few observations still contribute to the estimates through the covariance pattern. Still need to be careful.

Significance testing should be performed using F-tests and Satterthwaite's df. If Satterthwaite's df are not available, use the patient df to compare treatment effects (will be conservative).

Fixed effect standard errors will be biased downward as the covariance parameters are estimated, not known. Can use the robust ‘empirical’ variance estimator as described previously, but this will ignore the model specification and use the covariance pattern observed in the data.

Residuals assumed $\sim N(\mathbf{0}, \mathbf{R})$. Difficult to check formally, but plots of residuals should be sufficient to identify any major outliers or deviations from normality.

Random coefficients models

Random coefficients models fit an explicit relationship between the measurement and time. The most common model is linear in time, with interest in whether the slope differs between treatment groups. Usually fit time and treatment.time as fixed effects, and patient and patient.time as random effects, to allow each patient's intercept and slope to vary randomly around the treatment mean.
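A linear random coefficients model implies a particular within-patient covariance structure, which the sketch below computes directly. It assumes a random intercept and slope for each patient with their own variances and a covariance between them, plus independent residuals. The function name and all numerical values are illustrative.

```python
def random_coef_cov(times, var_int, var_slope, cov_int_slope, var_resid):
    """Covariance of one patient's observations under a random intercept and slope.

    For y_j = (a + A) + (b + B) * t_j + e_j, with random deviations A and B:
    cov(y_j, y_k) = var(A) + (t_j + t_k) * cov(A, B) + t_j * t_k * var(B),
    plus var(e) on the diagonal.
    """
    cov = []
    for j, tj in enumerate(times):
        row = []
        for k, tk in enumerate(times):
            c = var_int + (tj + tk) * cov_int_slope + tj * tk * var_slope
            if j == k:
                c += var_resid
            row.append(c)
        cov.append(row)
    return cov


times = [0.0, 1.0, 2.0, 3.0]
cov = random_coef_cov(
    times, var_int=4.0, var_slope=1.0, cov_int_slope=0.5, var_resid=2.0
)

# Variances (diagonal) and covariances both change with time,
# unlike the constant pattern implied by a random patient effect alone.
for row in cov:
    print(row)
```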

Fitting polynomial models to the data can be accomplished by successively adding polynomials of higher order (as fixed and random effects) until the variance component becomes negative (for random effects) or effect becomes non-significant (fixed effects).

General points

If a negative variance component estimate is obtained, refit the model without that component. However, not all software will produce negative variance components. In SAS, non-convergence or a non-positive-definite G matrix are signs of a negative variance component. In this case, remove components in order of complexity until the problem is resolved.

Sample size estimation

Sample size is often calculated as for a simple between-patient trial. Because of the correlation between repeated measures, fewer patients are actually required. The covariance pattern will not be known in advance, but a compound symmetry pattern will probably be adequate. If no estimate of the within-patient correlation is available, a conservative value could be used.

$Var(\sum_j y_{ij}) = m \sigma^2_p (1 + (m-1)\rho)$

$Var(\bar{y}_i) = \sigma^2_p (1 + (m-1)\rho)/m$

$SE(t_i - t_j) = \sqrt{2 Var(\bar{y}_i)/n}$

with $m$ repeated measurements, $n$ patients in each group, $\sigma^2_p$ between-patient variance, $\rho$ correlation between measurements.
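These quantities are easy to evaluate for candidate designs. The sketch below assumes compound symmetry and takes the standard error of the difference between two independent group means of $n$ patients each as $\sqrt{2 Var(\bar{y}_i)/n}$. The function names and input values are hypothetical.

```python
import math


def var_patient_mean(m, var_p, rho):
    """Variance of a patient's mean over m measurements under compound symmetry."""
    return var_p * (1 + (m - 1) * rho) / m


def se_treatment_difference(m, n, var_p, rho):
    """SE of the difference between two treatment means, n patients per group."""
    return math.sqrt(2 * var_patient_mean(m, var_p, rho) / n)


# Hypothetical planning values: variance 10, correlation 0.5, 20 patients per group.
var_p, rho, n = 10.0, 0.5, 20
for m in (1, 2, 4, 8):
    se = se_treatment_difference(m, n, var_p, rho)
    print(f"m = {m}: SE of treatment difference = {se:.4f}")
```

With $m = 1$ this is the simple between-patient calculation. Increasing $m$ reduces the standard error, but with diminishing returns: as $m$ grows, $Var(\bar{y}_i)$ approaches $\sigma^2_p \rho$ rather than zero.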