Useful SAS code
v = variable, k = keyword, o = options (see the SAS user's guide)
Univariate statistics
PROC UNIVARIATE [o /];
VAR v; (variables to process)
BY v's; (sort by)
CLASS v-1 v-2; (classify by)
FREQ v; (variable represents frequencies)
HIST [v's]; (produce histogram)
ID v-1 ...; (observation identifier)
OUTPUT [OUT = data set] keyword-1=name-1 ...; (save to data set)
PROBPLOT v's; (produce probability plot)
QQPLOT v's; (duh)
RUN;
Will produce everything you could ever possibly want
Options:
- NORMAL; for normality tests (use Shapiro-Wilk for n<2000, Kolmogorov-Smirnov for n >2000)
- PLOT; for crappy text graphics (stem and leaf, box plot, normal prob plot)
- FREQ; for frequency tables
Methods for continuous data
Student’s t-test is commonly used for detecting differences in means between two groups.
Unpaired: PROC TTEST [o]; VAR v’s; (variables to test) CLASS variable; (classification variable) BY v’s; (group by) FREQ v; (frequency variable) RUN;
Paired: PROC TTEST [o]; PAIRED pairlist; (variables to compare) BY v’s; FREQ v; RUN;
Pairlist:
- (v’s)*(v’s) = compare every v in the first list with every v in the second
- (v’s):(v’s) = compare the first v in the first list with the first in the second, etc.
- pairlist pairlist
Output: variables, n, mean, SD, SE of mean, t-tests, and folded F-test for equality of variances.
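As a sanity check on the unpaired case, the pooled-variance t statistic can be computed by hand; a minimal Python sketch with made-up data (names and numbers are illustrative, not SAS output):

```python
# Hedged sketch: pooled-variance two-sample t statistic (data made up).
from math import sqrt

def pooled_t(a, b):
    """Unpaired t statistic assuming equal variances."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)  # sample variance of a
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)  # sample variance of b
    sp2 = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)  # pooled variance
    return (ma - mb) / sqrt(sp2 * (1 / na + 1 / nb))

t = pooled_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
```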
Wilcoxon rank sum
PROC NPAR1WAY [o]; VAR v’s; CLASS variable; BY v’s; FREQ v; EXACT statistic-groups; (compute exact p values) RUN;
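The rank-sum statistic underlying PROC NPAR1WAY can be computed directly; a minimal sketch, with average ranks handling ties (data illustrative):

```python
# Hedged sketch of the Wilcoxon rank-sum statistic (data made up).
def rank_sum(a, b):
    """Sum of ranks of group a in the combined sample (average ranks for ties)."""
    combined = sorted(a + b)
    ranks = {}
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j] == combined[i]:
            j += 1
        # tied values at positions i..j-1 share the mean of ranks i+1..j
        ranks[combined[i]] = (i + 1 + j) / 2
        i = j
    return sum(ranks[x] for x in a)

w = rank_sum([1, 2, 3], [4, 5, 6])  # group a holds the three smallest ranks
```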
One-way ANOVA
Same assumptions as t-test. Non-parametric test is Kruskal-Wallis test.
PROC GLM [o]; CLASS v’s; MODEL dependents=independents [/ o]; (model formula) RUN;
ABSORB v's; (absorb out variable effects)
BY v's; FREQ v; ID v; WEIGHT v;
CONTRAST 'label' effect values ... ; (contrasts)
ESTIMATE 'label' effect values ...; (estimate linear combs of parameters)
LSMEANS effects; (essential for ANOVA!)
MANOVA ...;
MEANS effects;
OUTPUT ...;
RANDOM effects;
REPEATED;
TEST;
Prints multivariate equivalents of the t-test outputs.
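The one-way ANOVA F statistic (between-group mean square over within-group mean square) can be checked by hand; a Python sketch with illustrative data:

```python
# Hedged sketch: one-way ANOVA F statistic computed from first principles.
def oneway_f(groups):
    """F = (between-group SS / (k-1)) / (within-group SS / (n-k))."""
    n = sum(len(g) for g in groups)
    k = len(groups)
    grand = sum(sum(g) for g in groups) / n
    ssb = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ssw = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    return (ssb / (k - 1)) / (ssw / (n - k))

f = oneway_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```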
Scatter plots
PROC PLOT [o]; BY v’s; PLOT y * x; (vertical * horizontal) RUN;
Correlation
PROC CORR [o]; VAR v’s; (variables to correlate) PARTIAL v’s; (variables to calculate partial correlation coefficients for) RUN;
BY v's;
FREQ v;
WEIGHT v;
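Pearson and partial correlation coefficients, as PROC CORR reports them, can be sketched directly; the partial correlation here uses the standard single-covariate formula (data illustrative):

```python
# Hedged sketch: Pearson correlation and single-covariate partial correlation.
from math import sqrt

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))
```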
Regression
PROC REG [o]; MODEL y = x1 x2 x3 x4 / SELECTION = stepwise; RUN;
BY v's; FREQ v; ID v's; VAR v's; WEIGHT w;
ADD v's; DELETE v's; (interactively add and delete variables from model)
MTEST... ; (test linear combination hypotheses)
OUTPUT...;
PAINT ...; PLOT ...; (generate scatter plots)
PRINT...;
RESTRICT...; (restrict parameters)
Produces typical shitload of output.
Introduction
What is a mixed model?
Generalisation of linear models where observations are not independent. Mixed models model the covariance structure of data.
Three basic types:
- random effects: certain effects arise from a distribution and add an additional source of variation
- random coefficients: a covariate varies randomly; used for repeated measures to model the relationship with time; different from random effects because estimates are correlated
- covariance pattern: directly models covariance structure
Why use mixed models?
- more appropriate fixed effect estimates (covariance pattern)
- to make inference over a wider population
- to deal with missing data
- fixed effect estimates are shrunken, so extreme estimates are less likely with small samples
- can account for heteroscedasticity between treatment groups
Useful definitions
Containment
Basically the same as nesting (?).
Balance
When a design is balanced, estimated effects will equal raw means. Balance occurs for a fixed effect when:
- within each category of the fixed effect, observations have equal proportions in categories of every other effect fitted at the same level
- if a fixed effect is contained within a random effect, there are an equal number of observations at each category of the random effect
It’s also important to identify when fixed effect means will differ depending on whether a fixed or mixed model is used. They will be the same provided:
- within each category of the effect, observations have equal proportions in categories of every other random effect fitted at the same level
- if the effect is contained within a random effect, there are an equal number of observations at each category of the random effect
- if there are an equal number of observations in every cell and no continuous covariates, then all fixed effects are balanced
- missing observations give imbalance
- continuous effects give imbalance (balance across random effects may still occur)
Error strata
An error stratum is defined by each random effect and by the residuals. The containment stratum of a fixed effect is the error stratum that contains the effect – the residual stratum, unless the effect is contained within a random effect. Usually an effect has only one containment stratum, but it can have more in more complicated situations.
Higher level strata are defined by random effects themselves contained within another random effect. If higher level strata are present and data are imbalanced across random effects, fixed effects will be estimated using information from the higher level strata as well.
Pricing by simulation
Instead of tricky integrals, price contingent claims by simulating asset price paths.
If we generate independent random variates $X_1, ..., X_n$ with the distribution of $X / S_0(T)$, then $\bar{X} = \frac{1}{n} \sum X_i$ estimates $\pi_X (0)$ with the usual CLT confidence intervals. To generate the $X_i$, we need to simulate asset price paths $(S_1(t))_{0 \le t \le T}$. With a Black-Scholes model $S_1(t) = S_1(0) \exp((r - \frac{1}{2}\sigma^2)t + \sigma \tilde{W}_t)$, so we need to simulate Brownian motion.
What to do next depends on the type of claim. For example, with European options we only need the final value, so we only need $\tilde{W}_T \sim N(0, T)$. But in general we will need the whole price path. We discretise into $m$ steps of size $\Delta t = T / m$, then fill in other values via linear interpolation. This works well for complicated options. It doesn’t work well for American options because it doesn’t provide the optimal exercise strategy.
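A minimal sketch of the recipe above for a European call, where only the terminal value $\tilde{W}_T \sim N(0, T)$ is needed; all parameter values are illustrative, and the analytic Black-Scholes price is included only for comparison:

```python
# Hedged sketch: Monte Carlo price of a European call (parameters made up).
from math import exp, sqrt, log, erf
import random

def mc_call(s0, k, r, sigma, t, n, seed=0):
    """Average of discounted payoffs over n simulated terminal prices."""
    rng = random.Random(seed)
    disc = exp(-r * t)
    total = 0.0
    for _ in range(n):
        w = rng.gauss(0.0, sqrt(t))                       # W_T ~ N(0, T)
        st = s0 * exp((r - 0.5 * sigma ** 2) * t + sigma * w)
        total += disc * max(st - k, 0.0)
    return total / n

def bs_call(s0, k, r, sigma, t):
    """Analytic Black-Scholes price, for comparison with the estimate."""
    phi = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))
    d1 = (log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * sqrt(t))
    d2 = d1 - sigma * sqrt(t)
    return s0 * phi(d1) - k * exp(-r * t) * phi(d2)
```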
Variance reduction techniques
High accuracy is required for financial applications; usually we want the error to be less than 1 part per thousand. Standard errors only decrease as $O(1/\sqrt{n})$, so it is useful to use variance reduction techniques.
Antithetic variates
The simplest of these is antithetic variates – if $(W_t)$ is a Brownian motion, then so is $(-W_t)$. Let $S^{(i)}$ and $S^{(i')}$ denote the paths generated from $W$ and $-W$, and set $X_i = \frac{1}{2} \left( f(S^{(i)}(T)) + f(S^{(i')}(T)) \right)$. Variance will be reduced because prices from positive and negative paths are likely to be negatively correlated.
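A sketch of the antithetic pairing for a discounted call payoff (illustrative parameters; here $T = 1$ so $W_T \sim N(0, 1)$):

```python
# Hedged sketch: antithetic variates for a discounted call payoff.
from math import exp
from statistics import mean, stdev
import random

def call_payoff(w, s0=100.0, k=100.0, r=0.05, sigma=0.2, t=1.0):
    """Discounted call payoff f(S(T)) for terminal Brownian value w."""
    st = s0 * exp((r - 0.5 * sigma ** 2) * t + sigma * w)
    return exp(-r * t) * max(st - k, 0.0)

def plain_samples(n, seed=1):
    rng = random.Random(seed)
    return [call_payoff(rng.gauss(0.0, 1.0)) for _ in range(n)]

def antithetic_samples(n, seed=0):
    """Each draw w is paired with -w and the two payoffs are averaged."""
    rng = random.Random(seed)
    return [0.5 * (call_payoff(w) + call_payoff(-w))
            for w in (rng.gauss(0.0, 1.0) for _ in range(n))]
```

For a payoff monotone in $W_T$ the two halves of each pair are negatively correlated, so the paired samples have visibly smaller spread than plain ones.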
Control variates
Idea: Use a known “model payoff” $Y$, similar to $X$, with $E[Y] = \mu_Y$. Then $\frac{1}{n} \sum (X_i + c(Y_i - \mu_Y))$ estimates $E[X]$, since $E[X + c(Y - \mu_Y)] = E[X]$. This will be better than simple sampling if $Var(X + c(Y - \mu_Y)) \lt Var(X)$, ie. $Var(X) + 2c\,Cov(X,Y) + c^2 Var(Y) \lt Var(X)$. A pilot sample may help to estimate the best value $c^{*}$ of $c$. Alternatively could just take $c$ to be $-1$, ie. use $\frac{1}{n} \sum (X_i - Y_i + \mu_Y)$. Relies on $Var(X-Y)$ being smaller than $Var(X)$.
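A sketch with a standard choice of control variate: the discounted asset price $Y = e^{-rT} S_1(T)$, whose mean $\mu_Y = S_1(0)$ is known exactly (all parameters illustrative):

```python
# Hedged sketch: control variates with Y = discounted asset price, E[Y] = S_0.
from math import exp, sqrt
from statistics import mean, stdev
import random

def cv_samples(n, s0=100.0, k=100.0, r=0.05, sigma=0.2, t=1.0, seed=0):
    """Paired samples of X (discounted call payoff) and Y (discounted price)."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        w = rng.gauss(0.0, sqrt(t))
        st = s0 * exp((r - 0.5 * sigma ** 2) * t + sigma * w)
        xs.append(exp(-r * t) * max(st - k, 0.0))  # X
        ys.append(exp(-r * t) * st)                # Y, with E[Y] = s0 exactly
    return xs, ys

def control_variate_estimate(n=20000, c=-1.0, mu_y=100.0):
    """Mean and spread of X + c(Y - mu_y), plus the spread of plain X."""
    xs, ys = cv_samples(n)
    adjusted = [x + c * (y - mu_y) for x, y in zip(xs, ys)]
    return mean(adjusted), stdev(adjusted), stdev(xs)
```

Since the call payoff and the asset price are strongly positively correlated, even the crude choice $c = -1$ cuts the sample standard deviation noticeably.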
Another approach uses several model payoffs $Y^{(1)}, Y^{(2)}, ...$, with $E[Y^{(j)}] = \mu_j$. Then estimate $E[X]$ using $\frac{1}{n} \sum (X_i + \sum_j c_j(Y^{(j)}_i - \mu_j))$. This works as above. You can find the $c_j$’s by regressing $X_i$ on $Y^{(1)}_i, ...$.

Importance sampling
Idea: Use $E_R [X \frac{dQ}{dR}]$, ie. simulate $X \frac{dQ}{dR}$ under $R$ rather than $X$ under $Q$. Hopefully $Var_R (X \frac{dQ}{dR})$ will be smaller than $Var_Q (X)$.
\[ Var_R(X \frac{dQ}{dR}) = E_R [ X^2 {(\frac{dQ}{dR})}^2] - \mu^2 \]
We want $\frac{dQ}{dR}$ to be small where $X$ is large. (Note: $\frac{dQ}{dR} \gt 0$ and $E_R [\frac{dQ}{dR}] = 1$.) Can do this with GBM (geometric Brownian motion). If $S_1(t) = S_1(0) \exp((r-\frac{1}{2}\sigma^2)t + \sigma \tilde{W}_t)$, we can find an equivalent measure $R$ s.t. $\frac{dR}{dQ} = \exp(c\tilde{W}_T - \frac{1}{2}c^2 T)$, under which $\bar{W}_t = \tilde{W}_t - ct$ is a Brownian motion. So simulating the asset price under $R$ is just a matter of simulating a GBM with shifted drift.
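A sketch for a deep out-of-the-money call: simulate $\tilde{W}_T$ with drift $c$ (ie. under $R$) and reweight each payoff by $\frac{dQ}{dR}$; the choice of $c$ below is a rough heuristic and all parameters are illustrative:

```python
# Hedged sketch: importance sampling for a deep out-of-the-money call.
from math import exp, sqrt, log
from statistics import mean, stdev
import random

def is_samples(n, c, s0=100.0, k=150.0, r=0.05, sigma=0.2, t=1.0, seed=0):
    """Discounted payoffs reweighted by dQ/dR; c = 0 gives plain sampling."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        w = rng.gauss(c * t, sqrt(t))            # under R, W_T ~ N(ct, t)
        st = s0 * exp((r - 0.5 * sigma ** 2) * t + sigma * w)
        lr = exp(-c * w + 0.5 * c ** 2 * t)      # dQ/dR at this sample
        out.append(exp(-r * t) * max(st - k, 0.0) * lr)
    return out

# rough heuristic: drift the paths so they finish near the strike
c_star = log(150.0 / 100.0) / 0.2
```

Under the shifted measure most paths land where the payoff is nonzero, so the weighted samples have far smaller spread than plain sampling, which wastes most draws on zero payoffs.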
The corresponding likelihood ratio is
\[ \frac{dQ}{dR} = \exp(-c\tilde{W}_T + \frac{1}{2} c^2 T) \]

Repeated measures
Any dataset in which subjects are measured repeatedly over time can be described as repeated measures data. Measurements can be made at pre-determined times or in an uncontrolled fashion. This will determine the types of analysis available.
Fixed effect approaches
- Analyse mean response over time: satisfactory if the overall treatment effect is of interest and there are no missing data. Does not give any information on the treatment.time interaction.
- Separate analyses at each time point: treatment se correctly estimated at the between-patient level. Multiple comparisons may lead to spurious significance, and tests may be correlated. Treatment se less accurate because only observations at one time point are used.
- Analyse response factors: can generate summary values for each patient; must be careful with missing values and multiple testing.
- Analyse raw data with fixed patient effects: gives the same result as a mixed model if there are no missing data. Treatment se hard to calculate.
Mixed model approaches
Advantages:
- a single model can estimate overall treatment effects and the effect at each time point
- treatment se at individual timepoints use all information available
- no problems with MAR missing data
- covariance pattern can be determined and taken account of
There are several ways to use mixed models. The simplest is to use a random effects model with patient effects fitted as random. This allows for constant correlation between all observations on same patient – but this is often not the case. Use a covariance pattern or random coefficients model instead.
Covariance pattern models
Define the covariance structure directly rather than using random effects. Observations within each category of a blocking effect are assumed to have the same covariance structure, so $\mathbf{R}$ has block diagonal form, with one block per category of the blocking effect (eg. patient).
Covariance patterns
Large selection of covariance patterns available. Most depend on observations being taken at fixed times, and some are more easily justified when observations are evenly spaced. Some patterns take into account exact value of time, and are best used when intervals are irregular.
Some simple patterns are:
- general: no restrictions
- Toeplitz: constant variance; measurements $t$ steps apart have covariance $\theta_t$
- first order autoregressive: constant variance; measurements $t$ steps apart have correlation $\rho^t$
- compound symmetry: all variances = $\sigma^2$, all covariances = $\theta$ (equivalent to random effects model)
If variability in a measurement is different at each time point, we can make the obvious heterogeneous generalisations. Can also fit a separate covariance structure to each treatment group.
If widely separated observations appear to be uncorrelated, can create a banded covariance structure by setting all covariances more than $t$ steps apart to 0. Can be done to any covariance pattern, and reduces number of variance components to be estimated.
Covariance patterns using the exact separation in time also exist (eg. power: $r_{ijk} = \sigma^2 \rho^{d_{ijk}}$). Useful when time points do not occur at fixed intervals. Most cause covariance to decrease exponentially with increasing separation.
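For example, compound symmetry and AR(1) blocks for a single patient can be built directly (the $\sigma^2$, $\theta$ and $\rho$ values below are illustrative):

```python
# Hedged sketch: single-patient R blocks for two of the patterns above.
def cs_block(m, sigma2, theta):
    """Compound symmetry: sigma2 on the diagonal, theta everywhere else."""
    return [[sigma2 if i == j else theta for j in range(m)] for i in range(m)]

def ar1_block(m, sigma2, rho):
    """First-order autoregressive: covariance sigma2 * rho**|i-j|."""
    return [[sigma2 * rho ** abs(i - j) for j in range(m)] for i in range(m)]
```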
Which covariance pattern should be used?
Want to choose the covariance pattern that best fits the true covariance pattern – not easy! Increasing the number of parameters will improve fit, but will lead to over-fitting. Can test using a likelihood-ratio test (provided models are nested), or by comparing goodness of fit statistics adjusted for the number of covariance parameters $q$ (eg. AIC = $\log(L) - q$ or BIC = $\log(L) - \frac{1}{2} q \log(N)$).
Which patterns should be considered? Not usually practical to test large numbers. Can either start with simplest and work up, or use general to get some idea of what it should look like. Often covariance pattern will make little difference to estimate of treatment effect and standard errors, in this case compound symmetry is reasonable (roughly check by comparing to general) or you could just use the empirical estimator.
General points
Missing data occur frequently in repeated measures experiments. Less of a problem with a specified covariance structure, as patients with only a few observations still influence the analysis through the covariance pattern. Still need to be careful.
Significance testing should be performed using F-tests and Satterthwaite’s df. If Satterthwaite’s method is not available, use patient df to compare treatment effects (will be conservative).
Fixed effect standard error estimates will be biased downward because the covariance parameters are estimated, not known. Can use the robust ‘empirical’ variance estimator as described previously, but this will ignore the model specification and use the covariance pattern from the data.
Residuals are assumed $\sim N(\mathbf{0}, \mathbf{R})$. Difficult to check formally, but plots of residuals should be sufficient to identify any major outliers or deviations from normality.
Random coefficients models
Random coefficient models develop an explicit relationship between the measurement and time. The most common model is linear in time, with interest in whether the slope differs between treatment groups. Usually fit time and treatment.time as fixed effects, and patient and patient.time as random, to allow patients to vary randomly around the treatment mean.
Fitting polynomial models to the data can be accomplished by successively adding polynomials of higher order (as fixed and random effects) until the variance component becomes negative (for random effects) or effect becomes non-significant (fixed effects).
General points
If a negative variance component estimate is obtained, refit the model without that component. However, not all software will produce negative variance components. In SAS, non-convergence or a non-positive-semi-definite G matrix are signs of a negative variance component. In this case, remove components in order of complexity until the problem is resolved.
- baseline measurement can be specified as a fixed effect (covariate) or as the first repeated measurement at time 0
- patient estimates of slope and intercept can be calculated and will be “shrunken” towards population values
- significance testing proceeds as for covariance pattern models
- check residuals are normal by plotting against predicted values
- difficult to check assumption that random coefficients are distributed MVN, but usual plots should pick up any major departures
Sample size estimation
Often calculated as for a simple between-patient trial; because of the correlation between repeated measures, fewer patients are actually required. Obviously won’t know the covariance pattern in advance, but a compound symmetry pattern will probably be adequate. If no estimate of the within-patient correlation is known, a conservative prediction of the correlation could be used.
$Var(\sum_j y_{ij}) = m \sigma^2_p (1 + (m-1)\rho)$, $Var(\bar{y}_i) = \sigma^2_p (1 + (m-1)\rho)/m$, $SE(t_i - t_j) = \sqrt{2 Var(\bar{y}_i) / n}$, with $m$ repeated measurements, $n$ patients in each group, $\sigma^2_p$ the between-patient variance, and $\rho$ the correlation between measurements.
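These formulas can be evaluated directly; a sketch with illustrative values for $m$, $n$, $\sigma^2_p$ and $\rho$:

```python
# Hedged sketch: SE of a treatment difference in patient means (values made up).
from math import sqrt

def se_treatment_diff(m, n, sigma2_p, rho):
    """SE(t_i - t_j) with n patients per group and m measures per patient."""
    var_mean = sigma2_p * (1 + (m - 1) * rho) / m  # Var of a patient mean
    return sqrt(2 * var_mean / n)

se = se_treatment_diff(m=4, n=20, sigma2_p=4.0, rho=0.5)
```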
Multicentre trials
A multicentre trial is carried out at several centres to increase patient numbers or to assess treatments in different settings. Centres sometimes increase variability in treatment effects due to centre differences (most likely in surgical trials); these can be included in the model as centre and centre.treatment effects. Need to decide whether these effects should be fitted as fixed or random, based on what population you want to make inference about.
Important to check for any outlying centres – may indicate a lapse of protocol. In a fixed effects model, spurious outliers may be caused by small sample variation – a random effects model does not have this problem.
Both fixed
Treatment effects: estimated with equal weight given to each centre. Cannot be estimated unless every treatment is received at every centre.
Treatment se: $Var(t_i - t_j) = \sigma^2_r ( 1 / n_i + 1 / n_j)$. Always $\le$ variance with centre.treatment effects random.
Outlying centres: determine from centre and centre.treatment means. Small centres may appear outlying because of random variation.
Inference: only to centres in the trial.

Centre effects fixed, centre.treatment effects omitted
Often used when fixed centre.treatment effect non-significant, but can be difficult to detect small centre.treatment effects, and variability in treatment effect within centres is often ignored.
Treatment estimates: take account of differing centre sizes.
Treatment se: as above.
Outlying centres: as above.
Inference: as above, but extrapolation to other centres seems more reasonable if there is no centre.treatment effect.
Both random
Centre.treatment effects are retained, even if non-significant, provided $\sigma^2_{ct} \gt 0$.
Treatment estimates: take account of differing centre sizes. Estimated from the centre.treatment and centre error strata. Estimable even if some treatments are not received in all centres.
Treatment se: based on centre.treatment variation. If an equal number of patients receive each treatment at each centre, $Var(t_i - t_j) = \frac{2}{c}(\sigma^2_r / r + \sigma^2_{ct})$. Not easily specified if centre sizes are unequal.
Outlying centres: determined using shrunken centre and centre.treatment estimates.
Inference: over the population of possible centres.
Centre effects random, centre.treatment effects omitted
Useful when local estimates are required, because smaller treatment se are obtained.
Treatment estimates: take account of differing centre sizes. Additional information from the centre stratum.
Treatment se: based on residual (within-centre) variation. If balanced, the se has the same formula as in the fixed effects case; otherwise expected to be less.
Outlying centres: as above.
Inference: related to the sampled centres only – only under the strong assumption that no centre.treatment effect exists can inference be extended to the entire population of centres.
Practical considerations
Plausibility of centre.treatment interaction
One approach to analysing multicentre drug trials is to assume that there is no centre.treatment interaction, and that if there is, it is a fault of the study design. However, a drug effect can vary between centres because of different underlying populations, so centre.treatment effects are always plausible.
Generalisation
Strictly speaking, shouldn’t generalise unless centres have been drawn at random from a population of interest. Rarely the case in practice (as with random selection of patients) – but it is usually done anyway, along with subjective judgements. Generally, multicentre trials are more generalisable than single-centre trials.
Number of centres
Accuracy of variance components depends on the number of centres used – if $\lt 5$ centres are used, $\sigma^2_{ct}$ and the treatment effect se may not be accurate. Main conclusions should then be based on local results from a fixed effect model – usually omitting the centre.treatment term.

Centre size
Sometimes centres pull out after collecting only a little information. Small centres are often combined into one in fixed effects analyses, but this makes little difference with random effects models.
Negative variance components
As for the general case, negative variance components can be removed or set to 0 (preferred). If both centre and centre.treatment are negative, can drop both and analyse as a simple between-patient trial.
Balance
If the centre.treatment term is included, balance will only occur if the same number of patients received each treatment at every centre (unlikely!) – in this case treatment estimates will equal raw means. In the more usual (unbalanced) situation, means will differ between fixed and random models.
If the centre.treatment term is omitted, balance will occur when the same proportion of patients receive each treatment at each centre.
Sample size estimation
($c \times r \times t$ patients in total)
In a balanced trial $Var(t_i - t_j) = \frac{2}{c} (\sigma^2_r / r + \sigma^2_{ct})$, and estimates for $c$ and $r$ can be obtained from the usual sample size estimation equation: $\Delta = (t_{\alpha /2} + t_{1-\beta}) SE(t_i - t_j)$. $DF = (c-1)(t-1)$ is not always known, but can use the normal distribution instead of t to get an initial value, then iterate.
One difficulty is that estimates of both $\sigma^2_r$ and $\sigma^2_{ct}$ are required. Can use results from a previous (or pilot) study, or make an educated (conservative) guess.
If only the relative cost $g$ of centres to patients is known, the total cost will be $crt + cg$, which is minimised when $r = \sqrt{ \frac{ g\sigma^2_r}{t \sigma^2_{ct}}}$, which can then be substituted into the above formula.
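A sketch evaluating the cost-optimal $r$ and the resulting variance of a treatment difference (all inputs illustrative):

```python
# Hedged sketch: cost-optimal patients per treatment per centre (values made up).
from math import sqrt

def optimal_r(g, t, sigma2_r, sigma2_ct):
    """r minimising total cost c*r*t + c*g for given variance components."""
    return sqrt(g * sigma2_r / (t * sigma2_ct))

def var_diff(c, r, sigma2_r, sigma2_ct):
    """Var(t_i - t_j) in a balanced trial with c centres."""
    return 2.0 / c * (sigma2_r / r + sigma2_ct)

r = optimal_r(g=8.0, t=2, sigma2_r=4.0, sigma2_ct=1.0)
```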
Other applications
Repeated measurements within visits
Often done when data collection methods are inaccurate/imprecise. Could just use summary statistics from each visit, but this does not deal well with missing data or testing interactions. Same basic approach as normal repeated measurements: covariance pattern or random coefficients model.
Covariance pattern models
Several ways to model the covariance; five reasonable models are presented below:
- constant covariances: constant correlation between all observations on the same patient
- extra covariance for observations at the same visit: would probably expect measurements taken on the same visit to be more correlated, so fit separate covariances for observations on the same visit and observations on different visits
- covariance pattern structured by visits: correlation between observations different for each pair of visits
- covariance pattern structured by reps: correlation between observations differs depending on rep number
- extra covariance for observations on same reps
Many options – how to choose? If interested in pattern itself, start with simplest and build up. If interested in estimates of mean and standard errors, then second option probably adequate.
Random coefficients models
Again, more appropriate if interested in relationship with time, but now have two time scales, within and between visits, and can fit slopes to either or both of these.
Visit time: take reps as categorical fixed effects; not usually used for cross-over data.
Rep time:
Both: not usually used for cross-over data.
Matched case-control studies
Match a group of subjects with disease to those without on age, sex, etc., and determine which other factors differ between the groups. Usually take candidate risk factors as outcome variables, and matched set as a random effect. Similar to a cross-over trial.
Will be the same as the fixed effect model when there is the same number of controls for every case and a positive matched set variance component. In the fixed effects model, information from matched sets that are all cases or all controls is completely lost. Another fixed effects approach is to fit the matching variables as covariates but otherwise ignore matching – this can cause bias if the matching variables are associated with group.
With binary variables
If a variable is binary, then it is likely that several matched sets will have uniform values. This may cause bias in the random effects model, which should then be reparameterised as a covariance pattern model with compound symmetry. Preferable to conditional logistic regression, where information is lost on all tied matching sets.
Different variances for patient groups
In a simple between-patient trial, treatment groups sometimes have different variances – allowing for this will give more appropriate standard errors, and the variance estimates may be useful for understanding treatment mechanisms. Easy to do – just allow $\mathbf{R}$ to have different variances for each group (use the REPEATED statement in SAS, eg. repeated int / subject=patient group=treat).
Estimating variance components
Sometimes interested in just estimating variance components so that they can be used to help design future trials. Set $\Delta = (Z_{\alpha/2} + Z_{1-\beta}) SE(t_i - t_j)$ and then use the standard variance formula $Var(t_i - t_j) = \sum_{\text{error strata}} \sigma^2 / n$. Will usually need to determine sample sizes via a grid search.

Small sample sizes
Mixed models are useful because their shrinkage properties reduce small sample variability.
Meta analysis
Useful/important when:
- many studies with insufficient sample sizes
- summarising results of many trials
- to help guide researchers planning new trials
Most meta-analyses in the medical literature are of clinical trials; recently epidemiological studies have been reanalysed as well.
Main purposes:
- increase statistical power
- resolve uncertainty between conflicting reports
- improve estimates of size of effect
- answer new questions not posed at start of original trials
- to balance “overflow of enthusiasm” after introduction of a new clinical therapy
Practical steps:
- identify studies of a topic
- define criteria for inclusion
- abstract data from eligible studies
- analyse data
If original data are available then analyse like a multicentre trial, with trials replacing centres. Trial.treatment variance is larger than centre.treatment variance because of different protocols. Taking trial effects as random can increase the accuracy of treatment estimates.
Meta-analysis is usually done on binomial data – with normal data, individual trials usually achieve the desired conclusion on their own.
Generalised linear mixed models
Many situations arise where residuals are not normally distributed – presence/absence data, count data, etc. Generalised linear models and generalised linear mixed models relax the assumption of normally distributed residuals (to exponential-family distributions) for fixed effect and mixed models respectively.
Generalised linear models
A GLM is derived from the simple fixed effects model by setting $g(\mu) = X \alpha$ instead of $\mu = X \alpha$, where $g$ is the link function. Can also think of the link function as a way of mapping data from their scale to the reals (the scale of normally distributed data).
Distributions
Binomial: parameter of interest is the proportion of successes, $f(y; n) = \frac{n!}{y!(n-y)!} \mu^y (1-\mu)^{n-y}$. Poisson: count data, $f(y) = \mu^y e^{-\mu}/y!$. Poisson with offset: counts per unit scale $t$, $f(y; t) = (\mu t)^y e^{-\mu t}/y!$.
All are members of the exponential family of distributions: $f(y; \theta, \phi) = \exp\{ (y\theta - b(\theta))/a(\phi) + c(y, \phi) \}$. $\theta$, a function of the original parameters, is known as the canonical parameter, and the link for which $g(\mu) = \theta$ is the canonical link.
Model selection and measure of fit
Fit is expressed by the deviance $= 2(\log L_{\text{full}} - \log L_{\text{model}})$ or the generalised Pearson $\chi^2 = \sum (y_i - \hat{\mu}_i)^2 / V(\hat{\mu}_i)$, both asymptotically chi-squared. Pearson and deviance residuals are defined as the contributions of individual observations. A common problem with Poisson and binomial data is over-dispersion, where the variance is greater than the assumed distribution implies. Can use a quasi-likelihood model where we model the variance independently of the underlying model.
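The Poisson deviance can be computed directly from its definition; a sketch with illustrative data (the $y \log(y/\mu)$ term is taken as 0 when $y = 0$):

```python
# Hedged sketch: Poisson deviance from observed counts y and fitted means mu.
from math import log

def poisson_deviance(y, mu):
    """2 * sum(y*log(y/mu) - (y - mu)); y*log(y/mu) is 0 when y = 0."""
    total = 0.0
    for yi, mi in zip(y, mu):
        term = yi * log(yi / mi) if yi > 0 else 0.0
        total += term - (yi - mi)
    return 2.0 * total

d = poisson_deviance([2, 3, 0], [2.0, 3.0, 0.5])  # perfect fit terms vanish
```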
PROC MIXED
Uses REML and Newton-Raphson; the first iteration uses Fisher scoring.
Syntax
PROC MIXED [o];
CLASS ...;
MODEL ...;
RANDOM ...;
REPEATED ...;
LSMEANS ...;
RUN;
Convergence
Check the model has converged and the criterion is close to 0. If the -2 Res Log Like value is very large and negative, it is likely that the covariance matrix is singular: results will be invalid; respecify the model omitting random effects. If the Hessian matrix is not positive definite, may have reached a local maximum and will need to try a grid search.
Asymptotic covariance matrix of estimates
Gives a rough indication of the correlation between variance components. Only asymptotically normal and only accurate with large DF. Using the EMPIRICAL option gives empirical estimates instead.
MODEL statement
MODEL dependent = fixed effects / options;
Use * for interactions. Need to define non-linear terms in a DATA step.
Useful options:
- DDFM=SATTERTHWAITE: use Satterthwaite approximation for calculating degrees of freedom
- SOLUTION: output fixed effects estimates
- CL: give confidence intervals for fixed effect estimates
RANDOM statement
Used to specify random effects/coefficients $\mathbf{\beta}$, and the form of the variance matrix $\mathbf{G}$.
RANDOM random effects / options;
Useful options:
- SOLUTION: print random effects solution
- G: print $\mathbf{G}$ matrix
- V: print $\mathbf{V}$ matrix
- GROUP=effect: estimate a separate $\mathbf{G}$ matrix for each group
- SUBJECT=blocking effect TYPE=covariance pattern: specify block diagonal form for the $\mathbf{G}$ matrix (eg. for a random coefficients model: SUBJECT=patient TYPE=UN)
REPEATED statement
Specifies the correlation structure of the residual matrix $\mathbf{R}$, used for covariance pattern models.
REPEATED repeated effect / SUBJECT=blocking effect TYPE=pattern options;
Covariance types:
- UN: unrestricted
- TOEP: Toeplitz
- AR(1): first-order autoregressive
- CS: compound symmetry
- SP(sptype): spatial structure based on a Euclidean distance function
Useful options:
- GROUP: estimate a separate covariance pattern for each group
- R=subjects: print $\mathbf{R}$ matrix for subjects
- RCORR=subjects: print correlation matrix
If the REPEATED statement is used alone, the residual DF is incorrect for between-patient comparisons of fixed effects. If a compound symmetry structure is required, can fit subjects (nested in treatment) as RANDOM effects.
LSMEANS statement
Calculates least squares mean estimates of specified fixed effects.
LSMEANS fixed effects / options;
Useful options:
- CL: give confidence intervals
- DIFF: print difference estimates
- PDIFF: print p-values for difference estimates