Repeated Measures

Objectives

This chapter explores repeated measures, which allow us to relax the independence assumption.

Understand when repeated measures are appropriate.
Learn about correlation structures, how to visualize them, and select the appropriate structure.
Explore generalized least squares (GLS) and weighted least squares (WLS).
Follow a repeated measures workflow to structure analysis.

What is Repeated Measures?

Family Tree of Linear Models

flowchart TD
  A["Linear models<br>(constant variance,<br>normality assumptions)"] --> B["Errors are<br>independent"]
  B --> D["*t*-tests"]
  B --> E["ANOVA"]
  B --> F["SLR"]
  A --> C["Errors are<br>correlated"]
  C --> G["Time series"]
  C --> H["Repeated measures"]

  %% Arrow styling
  linkStyle default stroke:#333,stroke-width:1.5px;

  %% Node styling
  style A fill:#d1c4e9,stroke:#512da8,stroke-width:1.25px,color:#000
  style B fill:#bbdefb,stroke:#1976d2,stroke-width:1.25px,color:#000
  style C fill:#ffcdd2,stroke:#c62828,stroke-width:1.25px,color:#000

  style D fill:#e3f2fd,stroke:#64b5f6,stroke-width:1.25px,color:#000
  style E fill:#e3f2fd,stroke:#64b5f6,stroke-width:1.25px,color:#000
  style F fill:#e3f2fd,stroke:#64b5f6,stroke-width:1.25px,color:#000

  style G fill:#ffebee,stroke:#ef5350,stroke-width:1.25px,color:#000
  style H fill:#ffebee,stroke:#ef5350,stroke-width:1.25px,color:#000

Repeated Measures

Linear models assume constant variance and normality with independent errors.
- Examples: t-tests, ANOVA, simple linear regression (SLR)
When errors are correlated:
- Examples: time series, repeated measures
- Ignoring correlation can produce misleading p-values and confidence intervals.
- We need to model the correlation among errors (i.e., include additional parameters in the model) for valid inference and better predictions.
Repeated measures refers to multiple measurements on the same subject (i.e., dependent observations).
- For example, if a subject starts with high (or conversely low) values, their measurements tend to remain high (or low) over time.

Common Hypotheses in Repeated Measures:

Compare every time point to a baseline (i.e., control), for each group (e.g., asymptomatic vs. symptomatic).
Compare each time point between groups. (Baseline may not differ, but other time points might.)

Regression Perspective

Treat time as a categorical variable because measurements are made at discrete time points:

\[ Y = \text{Time} + \text{Status} + \text{Time} \times \text{Status} + \varepsilon \]

Include an interaction since trends depend on status.
Regression and contrasts help compare groups—we just need to model correlated errors.

Identifying Repeated Measures

Ask how the data were collected.
Repeated measures can apply to:
- Repeated measures one-way ANOVA
- Repeated measures MLR

Correlation Structures

Assessing Error Dependence

Some methods include residual diagnostic tools to visualize correlated errors.
Similar tools exist for repeated measures, but they aren’t as commonly used.
The key idea is to explain how residuals are correlated using a correlation structure.

Correlogram (pseudo-code approach)

Obtain residuals from a regular MLR (some from the same subject).
For residuals that are one unit apart in time:
- Create a scatterplot (earlier time point on the \(x\)-axis, later time point on the \(y\)-axis).
- Compute the correlation, and store the result.
Repeat for residuals that are two, three, four, \(\dots, k\) units apart.
Plot the correlation values against the time lag.

Takeaway

Correlation tends to decrease as the distance between residuals increases and eventually levels off. Large correlations at short time lags are a clear indication of correlated errors.

Correlogram Interpretation

Observations closer in time are more similar (i.e., more correlated).
Observations farther apart in time may still be mildly correlated.

Common Correlation Structures

Correlation structures are theoretical models that describe the expected trend in the correlogram.

Compound symmetry (CS)

Correlation is constant (i.e., flat, horizontal line) regardless of time.
Often used when time ordering is not meaningful (e.g., students within the same school).

Autoregressive (AR(1))

High correlation for nearby time points.
Correlation decreases as time lag increases.
Eventually approaches zero (like independent errors).

Gaussian

Similar to AR(1), but the drop-off in correlation is even faster.
Often used when time is continuous.

Generalized Least Squares (GLS)

GLS generalizes OLS technique for MLR to handle correlated errors.
You specify:
- A model with response and predictors
- A correlation structure (parameters estimated by software)
GLS updates both:
- Regression coefficients
- Standard errors
These estimates are more reliable than standard MLR if the chosen correlation structure is approximately correct.

Estimating Correlation in Practice

Variograms and semivariograms (often used in spatial models) can be used as visual alternatives to correlograms.
In practice, repeated measures datasets may be too small to visualize correlation clearly through the high variability.
Analysts typically:
- Use theoretical justification for choosing a structure.
- Fit multiple structures and compare using AIC (like feature selection).
- Use correlograms as a visual guide.
Instructor note (Turner’s experience):
- Compound symmetry works well for biological or human-based data.
- Use structures with rapid decay (e.g., Gaussian) sparingly unless time points are equally spaced and numerous.

Repeated Measures Workflow

Identify that you are in a repeated measures setting.
Perform EDA based on the equivalent MLR framework (determine the general theme: ANOVA, SLR, or MLR):
- Use boxplots or mean plots in ANOVA-style settings.
- Use scatterplots of predictor vs. response in SLR-style settings.
- Combine approaches for general MLR settings.
- Apply feature selection tools from MLR to assess model complexity and the bias-variance trade-off.
Residual diagnostics:
- Check for normality and constant variance using residual plots.
Update MLR to GLS:
- Try multiple correlation structures.
- Use AIC to select the best-fitting model.
- Optionally use a correlogram to guide your choice.
Conduct inference on regression coefficients:
- Report p-values and confidence intervals.
- Results will be more trustworthy when correlation is accounted for.

Additional Notes on Correlation Structures

Some correlation structures require a time component:
- Autoregressive (AR; exponential decay)
- Gaussian
- Linear
- Spherical
- Matérn (A general class that includes AR and Gaussian as special cases; more flexible but may overfit.)
Some correlation structures do not require a time component:
- Compound symmetry (CS)
- Variance components
Examples that don’t involve time:
- Texas STAR exam: multiple test scores from the same schools/school districts
- Hereditary studies: family studies where siblings or parents/offspring are clustered

Technical Details

Matrix Representation of MLR

\[ Y = X\beta + \varepsilon \]

where:

\(Y\) is the response vector.
\(X\) is the design matrix.
\(\beta\) is the regression coefficient vector.
\(\varepsilon\) is the error vector.

The model assumes properties of the errors:

\[ \varepsilon \sim \mathcal{N}(0, \sigma^2 I) \]

that is:

\(\varepsilon\) is multivariate normal
where \(\mathcal{N}\) denotes the multivariate normal distribution (MVN)
Mean vector = \(0\), a \(n x 1\) matrix of \(0\)s
Covariance matrix = \(\sigma^2 I\)

The covariance matrix:

\[ \Sigma = \begin{bmatrix} \sigma^2 & 0 & 0 \\ 0 & \sigma^2 & 0 \\ 0 & 0 & \sigma^2 \\ \end{bmatrix} = \sigma^2I \]

Interpretation:

Each individual error has the same variance \(\sigma^2\)
Traditional multiple linear regression (MLR) assumptions:
- Errors are independent, which implies they are also uncorrelated, so all off-diagonal covariances are 0.

Generalized Least Squares (GLS) and Correlated Errors

In repeated measures models, we no longer assume that errors are uncorrelated.

To account for correlated errors, we extend the standard MLR assumptions using generalized least squares (GLS). This involves specifying a more flexible error structure.

We now assume:

\[ \varepsilon \sim \mathcal{N}(0, \sigma^2 R) \]

where:

\(\varepsilon\) is multivariate normal with mean zero.
\(\sigma^2 R\) is the variance–covariance matrix of the errors.
\(R\) is the correlation matrix, which specifies how errors are related (e.g., through time, subject grouping, etc.).
When \(R = I\), we recover the standard MLR case with independent errors.

This updated formulation allows us to model within-subject correlation (e.g., repeated observations) by directly encoding the assumed pattern of correlation into \(R\).

Visualizing the Correlation Structure in \(\sigma^2 R\)

We can visualize the error term in repeated measures models as:

\[ Y = X\beta + \varepsilon, \quad \varepsilon \sim \mathcal{N}(0, \sigma^2 R) \]

Each row in \(\varepsilon\) corresponds to a specific subject and time point:

Subject	Time	Error Term
1	T1	\(\varepsilon_1\)
1	T2	\(\varepsilon_2\)
1	T3	\(\varepsilon_3\)
2	T1	\(\varepsilon_4\)
2	T2	\(\varepsilon_5\)
2	T3	\(\varepsilon_6\)

The corresponding variance–covariance matrix \(\sigma^2 R\) has a block-diagonal structure and might look like this:

\[ \sigma^2 R = \sigma^2 \begin{bmatrix} 1 & \rho_{12} & \rho_{13} & 0 & 0 & 0 \\ \rho_{12} & 1 & \rho_{23} & 0 & 0 & 0 \\ \rho_{13} & \rho_{23} & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & \rho_{45} & \rho_{46} \\ 0 & 0 & 0 & \rho_{45} & 1 & \rho_{56} \\ 0 & 0 & 0 & \rho_{46} & \rho_{56} & 1 \\ \end{bmatrix} \]

Key points:

The upper-left 3×3 block corresponds to Subject 1 and correlations among \(\varepsilon_1\), \(\varepsilon_2\), \(\varepsilon_3\).
The lower-right 3×3 block corresponds to Subject 2 and correlations among \(\varepsilon_4\), \(\varepsilon_5\), \(\varepsilon_6\)
Off-diagonal correlations exist within subjects (e.g., \(\rho_{12}, \rho_{13}, \rho_{23}\) for subject 1).
Between-subject correlations are zero, indicated by the block-diagonal structure.
The values of the \(\rho\) terms are determined by the assumed correlation structure (e.g., compound symmetry, AR(1), etc.).

This block structure is how repeated measures models encode dependency within subjects but maintain independence between subjects.

Correlation Structure Matrices

Different correlation structures determine the form of the matrix \(R\) in the repeated measures model.

Autoregressive (AR(1))

This structure assumes correlation decreases with increasing time lag:

\[ \sigma^2 \begin{bmatrix} 1 & \rho & \rho^2 & \rho^3 \\ \rho & 1 & \rho & \rho^2 \\ \rho^2 & \rho & 1 & \rho \\ \rho^3 & \rho^2 & \rho & 1 \\ \end{bmatrix} \]

High correlation for adjacent time points
Correlation decays exponentially with time separation

Compound Symmetry (CS)

This structure assumes constant correlation across all time points:

\[ \sigma^2 \begin{bmatrix} 1 & \rho & \rho & \rho \\ \rho & 1 & \rho & \rho \\ \rho & \rho & 1 & \rho \\ \rho & \rho & \rho & 1 \\ \end{bmatrix} \]

Often used when time ordering is less meaningful or measurements are equally spaced
Simpler structure, fewer parameters

Gaussian

Similar to AR(1), but correlation drops off more quickly
Often used when time is continuous or spacing between time points varies

Spherical Power

Same mathematical form as AR(1), just different context or terminology
Useful when modeling spatial or continuous time-based correlation

Model Fitting

Fitting repeated measures models typically uses restricted maximum likelihood (REML), a technique that improves estimation of variance components.

During model fitting, the software estimates:

\(\sigma^2 R\): the variance–covariance matrix of the errors
\(\beta\): the regression coefficient vector

The estimate of \(\beta\) under GLS is:

\[ \hat{\beta} = \left(X^T R^{-1} X\right)^{-1} X^T R^{-1} Y \]

This formula generalizes the OLS estimator
When \(R = I\), it reduces to the standard OLS solution

Variance of the GLS Estimator

In addition to estimating the coefficients \(\hat{\beta}\), we can also estimate their variances using the expression:

\[ \widehat{\text{Var}}(\hat{\beta}) = \hat{\sigma}^2 \left( X^T R^{-1} X \right)^{-1} \]

This is the generalized version of the variance formula used in OLS.

When \(R = I\), this reduces to the traditional multiple linear regression (MLR) variance estimator:

\[ \widehat{\text{Var}}(\hat{\beta}) = \hat{\sigma}^2 \left( X^T X \right)^{-1} \]

Thus, the MLR estimator is a special case of GLS where the correlation structure is identity (i.e., no correlation among errors).

Summary: MLR as a Special Case of GLS

The general linear model framework used in GLS relaxes the independence assumption of traditional MLR and allows for arbitrary correlation structures among errors (via the matrix \(R\)).

MLR is a special case of GLS when the correlation structure is \(R = I\).
The structure of \(R\) determines how residuals are related (e.g., repeated measurements, clustered observations).
Under GLS, we estimate:
- Regression coefficients \(\hat{\beta}\)
- Standard errors of \(\hat{\beta}\)
- The correlation structure (via estimated parameters in \(R\))

Key idea: Accounting for correlation leads to valid statistical inference and improved prediction.

Weighted Least Squares (WLS)

GLS can also be applied when independence holds, but the assumption of constant variance is violated (i.e., heteroscedasticity).

In this case, the error covariance matrix no longer looks like \(\sigma^2 I\). Instead, each observation has its own variance:

\[ \Sigma = \begin{bmatrix} \sigma_1^2 & 0 & 0 \\ 0 & \sigma_2^2 & 0 \\ 0 & 0 & \sigma_3^2 \\ \end{bmatrix} \]

Each variance term \(\sigma_i^2\) is unique.
Independence still holds (covariances are 0), but the variances are not constant across observations.

We can re-express each unique variance in terms of a common variance component and a weight:

\[ \sigma_1^2 = \frac{1}{w_1} \sigma^2,\quad \sigma_2^2 = \frac{1}{w_2} \sigma^2,\quad \sigma_3^2 = \frac{1}{w_3} \sigma^2 \]

Each \(w_i\) is a weight term that serves as a multiplicative factor determining how different the variance of each observation is from the baseline \(\sigma^2\).

Weighted Error Structure in Matrix Form

Using the weights, the variance–covariance matrix of the errors can be written as:

\[ \Sigma = \begin{bmatrix} \sigma_1^2 & 0 & 0 \\ 0 & \sigma_2^2 & 0 \\ 0 & 0 & \sigma_3^2 \\ \end{bmatrix} = \begin{bmatrix} \frac{\sigma^2}{w_1} & 0 & 0 \\ 0 & \frac{\sigma^2}{w_2} & 0 \\ 0 & 0 & \frac{\sigma^2}{w_3} \\ \end{bmatrix} = \sigma^2 \begin{bmatrix} \frac{1}{w_1} & 0 & 0 \\ 0 & \frac{1}{w_2} & 0 \\ 0 & 0 & \frac{1}{w_3} \\ \end{bmatrix} = \sigma^2 R \]

This is a special case of GLS where the correlation matrix \(R\) is diagonal.
WLS is GLS with unequal variances but uncorrelated errors.

We can use GLS techniques to obtain good estimates of the regression coefficients and their standard errors under this structure.

Estimating Weights in Practice

You can only apply WLS if you know (or can estimate) the weights. There are numerous strategies for estimating the weights, but the basic approach follows a common pattern:

Fit a regular regression model.
Plot the absolute values of your residuals against:
- One of your predictors that you suspect may explain the changing variance, or
- The predicted values from your model
  Look for a pattern or trend in the spread of the residuals.
Fit a regression model to the absolute value of the residuals to estimate the mean behavior (i.e., trend) of the variability.
Use the predicted values from that model to compute weights:
\[ w_i = \frac{1}{(\widehat{\text{predicted}})^2} \]

Code

weight.model = lm(abs(model1$residuals) ~ data1$x)
wts = 1 / fitted(weight.model)^2
wls.model = lm(y ~ x, data1, weights = wts)

Interpretation

Observations with higher variance are down-weighted.
The fitted line will more closely follow observations with lower variance.
The biggest difference is in the standard error estimates of the regression coefficients.

When to Use WLS

When transformations do not fix the constant variance problem.
When an exotic transformation works but makes the model hard to interpret.

Additional Notes

Raw residual vs. fitted plots may look the same for OLS and WLS.
Use studentized residuals to check whether the variance has been stabilized.