Differences between revisions 30 and 31

Longitudinal Statistics

This page describes ways of analyzing longitudinal data after processing it with the longitudinal stream in FreeSurfer.

Longitudinal data are more complex than cross-sectional data because repeated measures are correlated within each subject, and the strength of this correlation depends on the time separation between scans. In addition, extra care must be taken when the data exhibit significant between-subject variation in the number of time points and in the between-scan intervals (imperfect timing). A statistical analysis should take these data features into account in order to obtain valid statistical inferences.

FreeSurfer currently comes with (at least) three different frameworks for the analysis of longitudinal data:

Simplified repeated measures ANOVA (ignores correlation and timing of the measurement occasions)
Direct analysis of atrophy rates or percent changes (ignores correlation and single time points)
Linear mixed effects models (recommended, but more complex)

Simplified Repeated Measures ANOVA

This method can be used to check for differences between individual time points or compare time point differences across groups. For two time points it simplifies to a PairedAnalysis.
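As an illustration of the two time point case, the sketch below computes a paired t statistic on per-subject differences using only the Python standard library. It is not the mri_glmfit implementation: the function name `paired_t` and the volumes are hypothetical and were chosen only to make the arithmetic visible.

```python
import math
from statistics import mean, stdev

def paired_t(tp1, tp2):
    """Return (mean difference, t statistic, degrees of freedom) for paired data.

    tp1 and tp2 hold one measurement per subject, in the same subject order.
    """
    if len(tp1) != len(tp2) or len(tp1) < 2:
        raise ValueError("need two equally long lists with at least two subjects")
    # Within-subject differences remove the between-subject offset.
    diffs = [b - a for a, b in zip(tp1, tp2)]
    standard_error = stdev(diffs) / math.sqrt(len(diffs))
    return mean(diffs), mean(diffs) / standard_error, len(diffs) - 1

# Hypothetical volumes (mm^3) for four subjects at two time points.
tp1 = [4000.0, 4200.0, 3900.0, 4100.0]
tp2 = [3950.0, 4120.0, 3880.0, 4030.0]
mean_diff, t_stat, dof = paired_t(tp1, tp2)
print(f"mean difference = {mean_diff:.1f}, t = {t_stat:.3f}, dof = {dof}")
```

The t statistic would then be compared against a t distribution with `dof` degrees of freedom. Note that the scan dates never enter the computation, which is the timing limitation listed under the disadvantages.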

Advantages:

Does not assume any specific trend in the mean response over time and thus can capture complex trajectories.
Can make use of different multiple comparisons methods that come with mri_glmfit.

Disadvantages:

Does NOT consider the correlation among the repeated measures, and thus, there is a significant reduction in statistical power.
Does NOT consider the timing of the measurement occasions which may result in a further reduction in power.
Can only be applied to balanced data (all subjects have their scans acquired at the same set of measurement occasions) with a small number of repeated measures (<=3).

For details see: RepeatedMeasuresAnova

Analysis of Rates or Percent Changes

To analyze, e.g. annualized percent change or atrophy rates for 2 or more time points, one can run a two stage model. This avoids dealing with the longitudinal correlation. The two stages are:

First, simplify the statistic to a single number for each subject (the difference of two time points, the slope of the fitted line, the annualized percent change, etc.).
Then analyze the obtained summary measure across subjects or groups with a standard GLM.

This model is quite simple and can be a good choice if all subjects have the same number of time points. Linear fits to each subject's data are often meaningful, as longitudinal change is almost linear within a short time frame of a few years.
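The two stages can be sketched in a few lines of Python. This is a minimal illustration and not the code behind long_mris_slopes or long_stats_slopes: the subject IDs, times, and thickness values are made up, and the symmetrized percent change is assumed here to be the fitted slope divided by the subject's mean value over time, times 100. Check the script documentation for the exact definitions it uses.

```python
from statistics import mean

def fit_line(times, values):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    t_bar, v_bar = mean(times), mean(values)
    sxx = sum((t - t_bar) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("need at least two distinct time points")
    slope = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values)) / sxx
    return slope, v_bar - slope * t_bar

def summarize(times, values):
    """Stage 1: reduce one subject's time series to summary numbers."""
    slope, _ = fit_line(times, values)
    return {
        "rate": slope,                       # change per year
        "spc": 100.0 * slope / mean(values)  # assumed definition, see lead-in
    }

# Hypothetical data: (time from baseline in years, mean thickness in mm).
subjects = {
    "subj01": ([0.0, 1.0, 2.0], [2.50, 2.45, 2.40]),
    "subj02": ([0.0, 1.5], [2.60, 2.57]),
}
stage1 = {sid: summarize(t, v) for sid, (t, v) in subjects.items()}

# Stage 2 (simplest possible GLM): the group mean of the per-subject rates.
group_mean_rate = mean(s["rate"] for s in stage1.values())
print(stage1, group_mean_rate)
```

In this sketch subj02 contributes a slope based on only two scans with the same weight as subj01, which has three. That is the unequal certainty of within-subject slopes listed among the disadvantages.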

Advantages:

Modeling the correlation structure can be avoided.
Can deal with differently spaced time points.
Works on ROI stats (e.g. aseg.stats or aparc.stats) and on cortical maps (e.g. thickness).
The second stage can be performed with QDEC (simple GUI) or directly with mri_glmfit.
The second stage analysis can make use of different multiple comparisons methods that come with mri_glmfit.
Scripts are available (long_mris_slopes and long_stats_slopes), no Matlab needed.
For the simple case of two time points and when looking at simple differences this model simplifies to a paired analysis, but can additionally compute (symmetrized) percent changes.
Includes code for intersecting cortex labels (across time and across subjects) to make sure that all non-cortex measures are excluded.

Disadvantages:

Does NOT model the correlation among the repeated measures, and thus, there is a significant reduction in statistical power.
Does NOT account for the different certainty of within-subject slopes, which depends on the number of time points, and therefore has the highest propensity for false positives (type I family-wise error in the mass-univariate setting).
Difficult to model non-linear temporal behaviour.
Difficult to deal with time-varying covariates (slopes would need to be fit to those for each subject to reduce them to a single number).
Cannot include information from subjects with only a single time point and thus the results are likely to be biased. This also results in a further reduction in statistical power.

The linear mixed effects model overcomes these limitations and should be used if subjects have different numbers of time points (or for more complex modeling).

For details see: LongitudinalTwoStageModel

Linear Mixed Effects Model

A Linear Mixed Effects (LME) model is the most powerful and principled approach.
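For orientation, the general textbook form of a linear mixed effects model can be written as follows. The notation is the standard one from the statistics literature and is not taken from this page, so the symbols used by the FreeSurfer LME tools may differ.

```latex
% Y_i     : vector of the n_i repeated measures of subject i
% X_i     : design matrix of the fixed effects (e.g. time, group, time-by-group)
% \beta   : population-level (fixed) coefficients shared by all subjects
% Z_i     : design matrix of the random effects (e.g. intercept, time)
% b_i     : subject-specific (random) deviations from the population trend
% e_i     : measurement error
\[
  Y_i = X_i \beta + Z_i b_i + e_i, \qquad
  b_i \sim \mathcal{N}(0, D), \qquad
  e_i \sim \mathcal{N}(0, \sigma^2 I_{n_i})
\]
```

Because n_i and the time column of X_i are specific to each subject, the model places no requirement of equal numbers of scans or equal spacing, and the shared random effects b_i induce the within-subject correlation among the repeated measures.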

Advantages:

Works for both stats (univariate) and surface analysis (mass-univariate).
Can handle imperfect timing and different number of time points across subjects (missing data).
Even subjects with only a single time point can be included in these models (make sure they are also run through the longitudinal stream, available with version FS 5.2, to avoid a bias due to different processing).
Appropriately models the temporal correlation.
Can model different variances across measurement occasions.
Our mass-univariate method can deal very well with the spatial correlation among measurements on the cortex and is very fast by working with spatial regions.
Can be used to model more complex longitudinal behavior (e.g. quadratic, or piecewise linear trajectories) and time-varying covariates.
It must be kept in mind that because longitudinal mixed-effects model tools are now publicly available it is likely that journal reviewers will demand those appropriate statistical models for your longitudinal studies.

Disadvantages:

More complicated to use (e.g. requires distinguishing mixed effects, fixed effects ...).
Currently our implementation is in Matlab.
Only offers FDR for multiple comparisons correction.

For details see: LinearMixedEffectsModels

MartinReuter

-  ⇤ ← Revision 30 as of 2012-12-05 01:40:57 → 
  Size: 5774
  Editor: jbernal
  Comment:
+   ← Revision 31 as of 2012-12-05 01:48:25 → ⇥
  Size: 6135
  Editor: jbernal
  Comment:
-Deletions are marked like this.
+Additions are marked like this.
 Line 20:
- * does not assume any specific trend in the mean response over time and thus can capture complex trajectories.
 * can make use of different multiple comparison methods that come with mri_glmfit.
+ * Does not assume any specific trend in the mean response over time and thus can capture complex trajectories.
 * Can make use of different multiple comparisons methods that come with mri_glmfit.
 Line 25:
- * does NOT consider the correlation among the repeated measures, and thus, there is a significant reduction in statistical power.
 * does NOT consider the timing of the measurement occasions which may result in a further reduction in power. 
 * can only be applied to balanced data (all subject have their scans acquired at the same set of measurement occasions) with a small number of repeated measures (<=3).
+ * Does NOT consider the correlation among the repeated measures, and thus, there is a significant reduction in statistical power.
 * Does NOT consider the timing of the measurement occasions which may result in a further reduction in power. 
 * Can only be applied to balanced data (all subject have their scans acquired at the same set of measurement occasions) with a small number of repeated measures (<=3.
 Line 36:
-. first, simplify the statistic to a single number for each subject (the difference of two time points, or the slope of the fitting line, or the annualized percent change, etc.)
 2. then analyze the obtained summary measure across subjects or groups with a standard GLM.
+. First, simplify the statistic to a single number for each subject (the difference of two time points, or the slope of the fitting line, or the annualized percent change, etc...).
 2. Then analyze the obtained summary measure across subjects or groups with a standard GLM.
 Line 42:
- * modeling the correlation structure can be avoided (at the cost of significant reduction in power). 
 * can deal with differently spaced time points
 * works on ROI stat (e.g. aseg.stats or aparc.stats) and on cortical maps (e.g. thickness)
 * the second stage can be performed with QDEC (simple GUI) or directly with mri_glmfit
 * the second stage analysis can make use of different multiple comparison methods that come with mri_glmfit
 * scripts are available ( long_mris_slopes and long_stats_slopes ), no matlab needed
 * for the simple case of two time points and when looking at simple differences this model simplifies to a paired analysis, but can additionally compute (symmetrized) percent changes 
 * includes code for intersecting cortex labels (across time and across subjects) to make sure that all non-cortex measures are excluded
+ * Modeling the correlation structure can be avoided. 
 * Can deal with differently spaced time points.
 * Works on ROI stat (e.g. aseg.stats or aparc.stats) and on cortical maps (e.g. thickness).
 * The second stage can be performed with QDEC (simple GUI) or directly with mri_glmfit.
 * The second stage analysis can make use of different multiple comparisons methods that come with mri_glmfit.
 * Scripts are available ( long_mris_slopes and long_stats_slopes ), no matlab needed.
 * For the simple case of two time points and when looking at simple differences this model simplifies to a paired analysis, but can additionally compute (symmetrized) percent changes. 
 * Includes code for intersecting cortex labels (across time and across subjects) to make sure that all non-cortex measures are excluded.
 Line 52:
- * does NOT model the correlation among the repeated measures, and thus, there is a significant reduction in statistical power.
 * does NOT account for different certainty of within subject slopes depending on the number of time points and therefore it has the highest  propensity to false positives (type I family wise error in the mass-univariate setting).
 * difficult to model non-linear temporal behaviour.
 * difficult to deal with time varying co-variates (slopes would need to be fit into those for each subject to reduce these to a single number).
 * cannot include information from subjects with only a single time point and thus the results are likely to be biased. This also results in a further reduction in statistical power.
+ * Does NOT model the correlation among the repeated measures, and thus, there is a significant reduction in statistical power.
 * Does NOT account for different certainty of within subject slopes depending on the number of time points and therefore it has the highest  propensity to false positives (type I family wise error in the mass-univariate setting).
 * Difficult to model non-linear temporal behaviour.
 * Difficult to deal with time varying co-variates (slopes would need to be fit into those for each subject to reduce these to a single number).
 * Cannot include information from subjects with only a single time point and thus the results are likely to be biased. This also results in a further reduction in statistical power.
 Line 66:
-A Linear Mixed Effects (LME) model is the most powerful approach
+A Linear Mixed Effects (LME) model is the most powerful and principled approach.
 Line 69:
- * can deal well with differently many time points
 * even subjects with only a single time point can be included into these models (make sure they also run through the longitudinal stream, available with version FS 5.2, to avoid a bias due to different processing)
 * considers the temporal correlation and works for stats (univariate) or surface analysis (mass-univariate)
 * our mass-univariate method can deal very well with the spacial correlation of measures on the cortex and is very fast by working with spacial regions
 * can be used to model more complex longitudinal behavior (e.g. quadratic, or piecewise linear trajectories) and time-varying covariates
+ * Works for both stats (univariate) and surface analysis (mass-univariate).
 * Can handle imperfect timing and different number of time points across subjects (missing data).
 * Even subjects with only a single time point can be included into these models (make sure they also run through the longitudinal stream, available with version FS 5.2, to avoid a bias due to different processing) 
 * Appropriately models the temporal correlation.
 * Can model different variances across measurement occasions.
 * Our mass-univariate method can deal very well with the spatial correlation among measurements on the cortex and is very fast by working with spatial regions.
 * Can be used to model more complex longitudinal behavior (e.g. quadratic, or piecewise linear trajectories) and time-varying covariates. 
 * It must be kept in mind that because longitudinal mixed-effects model tools are now publicly available it is likely that journal reviewers will demand those appropriate statistical models for your longitudinal studies.
-Line 76:
+Line 80:
- * more complicated use (distinguish mixed effects, fixed effects ...)
 * currently our implementation is in Matlab
 * and only offers FDR for multiple comparision correction.
+ * More complicated use (eg. requires distinguishing mixed effects, fixed effects ...).
 * Currently our implementation is in Matlab.
 * Only offers FDR for multiple comparisons correction.