Longitudinal Statistics

This page describes ways of analyzing longitudinal data after it has been processed with the longitudinal stream in FreeSurfer.

Longitudinal data are more complex than cross-sectional data, as repeated measures are correlated within each subject. The strength of this correlation will depend on the time separation between scans. In addition, extra care must be taken when the data exhibit significant between-subject variation in number of time points and between-scan intervals (imperfect timing). A statistical analysis should then consider these data features in order to obtain valid statistical inferences.

FreeSurfer currently comes with (at least) three different frameworks for the analysis of longitudinal data:

  1. Simplified repeated measures ANOVA (ignores correlation and timing of the measurement occasions)

  2. Direct analysis of atrophy rates or percent changes (ignores correlation and single time points)

  3. Linear mixed effects models <-- recommended (but more complex)


Simplified Repeated Measures ANOVA

This method can be used to check for differences between individual time points or compare time point differences across groups. For two time points it simplifies to a PairedAnalysis.

Advantages:

Disadvantages:

For details see: RepeatedMeasuresAnova


Analysis of Rates or Percent Changes

To analyze, for example, annualized percent change or atrophy rates over two or more time points, one can run a two-stage model. This avoids dealing with the longitudinal correlation. The two stages are:

  1. First, reduce each subject's data to a single number (the difference of two time points, the slope of a fitted line, the annualized percent change, etc.).
  2. Then analyze the obtained summary measure across subjects or groups with a standard GLM.

This model is quite simple and can be an option if all subjects have the same number of time points, approximately equally spaced. A linear fit to each subject's data is often meaningful, because in several applications longitudinal change can be assumed to be almost linear within a short time frame.
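
As a rough illustration of stage one, the following Python sketch fits a least-squares line to one subject's measurements and derives a few summary numbers. The function name subject_summary and the sample values are made up for illustration. The two percent-change definitions used here (rate relative to the first measurement, and rate relative to the temporal average) are assumptions of this sketch and are not guaranteed to match what long_mris_slopes computes internally.

```python
def subject_summary(times, values):
    """Stage one of a two-stage model for a single subject (illustrative only).

    times  -- time of each scan, e.g. years from baseline
    values -- the measurement at each scan, e.g. mean thickness in mm
    """
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two matching time points")
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, values))
    rate = sxy / sxx  # ordinary least-squares slope, in units per time unit
    return {
        "rate": rate,
        "avg": y_mean,                    # value of the fitted line at the mean time
        "pc1": 100.0 * rate / values[0],  # assumed: percent change w.r.t. first scan
        "spc": 100.0 * rate / y_mean,     # assumed: percent change w.r.t. temporal average
    }


# Hypothetical thickness values (mm) at 0, 1 and 2 years.
summary = subject_summary([0.0, 1.0, 2.0], [3.0, 2.9, 2.8])
print(summary)
```

One such summary per subject would then form the input to the second stage, a standard GLM across subjects or groups.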

Advantages:

Disadvantages:

The linear mixed effects model overcomes these limitations and should be used if subjects have different numbers of time points (or for more complex modeling).

For details see: LongitudinalTwoStageModel


Linear Mixed Effects Model

A Linear Mixed Effects (LME) model is the most powerful and principled of the three approaches, and it is the one we recommend.
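
For orientation, a linear mixed effects model for subject i with n_i time points is commonly written in the general form below. The notation is the usual textbook one and is not taken from this page.

```latex
Y_i = X_i \beta + Z_i b_i + e_i, \qquad
b_i \sim \mathcal{N}(0, D), \qquad
e_i \sim \mathcal{N}(0, \sigma^2 I_{n_i})
```

Here Y_i is the vector of the n_i measurements of subject i, X_i is the design matrix for the fixed effects \beta (shared by all subjects), Z_i is the design matrix for the subject-specific random effects b_i, and e_i is the measurement error. Because n_i and the time entries in X_i and Z_i may differ from subject to subject, varying numbers of time points and unequal scan intervals fit into the same model.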

Advantages:

Disadvantages:

LinearMixedEffectsModels allow ROI analysis as well as advanced longitudinal analysis of cortical maps. Here we only discuss how to prepare your data for that analysis. The analysis itself is performed in MATLAB.

As in regular (cross-sectional) processing, ROI statistics are stored in stats files (cf. the ROI tutorial). You could, e.g., open the stats text files in each tpN.long.templateID/stats/ directory; they contain statistics such as the volume of subcortical structures or thickness averages for cortical regions. These statistics can be fed into any external statistical package to run whatever analysis you are interested in. Two helpful commands for collecting the data from all subjects and time points into a single table are asegstats2table and aparcstats2table.

For example, to create a table with the subcortical ROIs from all subjects and all time points, you would run:

asegstats2table --qdec-long long.qdec.table.dat --stats aseg.stats --tablefile aseg.table.txt

This will automatically grab the stats from the longitudinal directories (tpN.long.templateID/stats/) and create a table (rows: subject/time points, columns: structures). Similarly you can use aparcstats2table for surface ROI analysis.
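
Once the table exists, it can be read into any analysis environment. The Python sketch below parses such a table into one dictionary per row. It assumes that the file is tab-delimited, has a header line, and holds the row identifier in the first column; check this against your own output before relying on it. The function name read_stats_table, the row identifiers, and the numbers in the sample are made up.

```python
import csv
import io


def read_stats_table(handle, delimiter="\t"):
    """Parse a stats table: one row per subject/time point, one column per structure."""
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    rows = []
    for fields in reader:
        if not fields:  # skip blank lines
            continue
        row = {"id": fields[0]}
        for name, value in zip(header[1:], fields[1:]):
            row[name] = float(value)
        rows.append(row)
    return rows


# In-memory stand-in for a table file; all identifiers and values are hypothetical.
sample = io.StringIO(
    "Measure:volume\tLeft-Hippocampus\tRight-Hippocampus\n"
    "scanA\t4000.0\t4100.0\n"
    "scanB\t3950.0\t4080.0\n"
)
table = read_stats_table(sample)
print(table[0]["id"], table[0]["Left-Hippocampus"])
```

For a real file, pass an open file handle instead of the in-memory sample, for example one obtained with open("aseg.table.txt", newline="").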

To run LinearMixedEffectsModels on surface maps, you need to map all the data to a template (usually fsaverage) and smooth the data:

mris_preproc --qdec-long long.qdec.table.dat --target fsaverage --hemi lh --meas thickness --out lh.thickness.stack.mgh
mri_surf2surf --hemi lh --s fsaverage --sval lh.thickness.stack.mgh --tval lh.thickness.stack.fwhm10.mgh --fwhm-trg 10 --cortex --noreshape

For details see: LinearMixedEffectsModels


Longitudinal QDEC Table

QDEC is a graphical program for simple statistical analysis of cross-sectional data. A QDEC table is a plain-text table that contains subject IDs (one subject per row) and one covariate per column (e.g. age, gender, diagnosis, …). The first row is a header: the first column header is fsid and the remaining columns are named according to their content. The format is described in the QDEC Group Analysis tutorial. Note that QDEC currently cannot perform longitudinal statistics directly!

For the analysis of longitudinal data, several command-line tools require a 'longitudinal QDEC table'. This table is based on the QDEC table format, with an additional second column, fsid-base, that assigns each time point to its subject.

To get the longitudinal data ready for statistical analysis (LongitudinalTwoStageModel or LinearMixedEffectsModels) you need to create a table (space separated as a text file) in the following format:

fsid            fsid-base   years   ...
OAS2_0001_MR1   OAS2_0001   0
OAS2_0001_MR2   OAS2_0001   1.25
OAS2_0004_MR1   OAS2_0004   0
OAS2_0004_MR2   OAS2_0004   1.47
...

where the first column is called fsid (containing all time points of all subjects) and the second column is fsid-base, containing the name of the within-subject template (= base), which groups time points within each subject. You can have many more columns, such as gender, age, group, etc. Make sure there is a column containing an accurate time variable (ideally measured in years if you are interested in annualized change), such as age or the duration of the study (time from initial scan). Here we use years to measure the time from the baseline scan (= tp1). You can see in the table that the two subjects OAS2_0001 and OAS2_0004 each have two time points (MR1, MR2) that are not equally spaced (approximately 15 and 18 months apart).
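
If scan dates are available, the years column can be derived rather than typed by hand. The Python sketch below builds a space-separated table in the format above. The function name build_long_qdec, the subject IDs, and the scan dates are all hypothetical, and dividing by 365.25 days per year is a convention chosen for this sketch.

```python
from datetime import date


def build_long_qdec(scans):
    """Build a longitudinal QDEC table from (fsid, fsid_base, scan_date) tuples."""
    # The earliest scan of each subject defines that subject's baseline (tp1).
    baseline = {}
    for fsid, base, when in scans:
        if base not in baseline or when < baseline[base]:
            baseline[base] = when

    lines = ["fsid fsid-base years"]
    # Sort by subject, then by date, so time points of one subject stay together.
    for fsid, base, when in sorted(scans, key=lambda s: (s[1], s[2])):
        years = (when - baseline[base]).days / 365.25
        lines.append(f"{fsid} {base} {years:.2f}")
    return "\n".join(lines) + "\n"


# Hypothetical subjects and scan dates, deliberately given out of order.
scans = [
    ("sub01_tp1", "sub01", date(2010, 1, 1)),
    ("sub02_tp2", "sub02", date(2012, 6, 15)),
    ("sub01_tp2", "sub01", date(2011, 4, 2)),
    ("sub02_tp1", "sub02", date(2011, 1, 1)),
]
qdec_text = build_long_qdec(scans)
print(qdec_text)
```

Further covariates (gender, group, etc.) would be appended as extra columns in the same way.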

Note that the fsid column contains the original subject/time point IDs, not the longitudinal names. The command-line scripts know that this is a longitudinal table (because of the parameter, usually --qdec-long, and the presence of the fsid-base column) and will automatically process the data from the longitudinal directories.
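
A quick sanity check before running the command-line tools can catch common table mistakes, such as accidentally listing longitudinal directory names in the fsid column. The Python sketch below is one possible check. The function name check_long_qdec and the sample IDs are hypothetical, and the duplicate and column-count checks are extra sanity assumptions of this sketch, not documented requirements of the tools.

```python
def check_long_qdec(text):
    """Return a list of problems found in a longitudinal QDEC table (empty if none)."""
    problems = []
    # Split every non-blank line on whitespace; row numbers count non-blank lines.
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return ["table is empty"]
    header = lines[0]
    if header[:2] != ["fsid", "fsid-base"]:
        problems.append("first two columns must be fsid and fsid-base")
    seen = set()
    for n, fields in enumerate(lines[1:], start=2):
        if len(fields) != len(header):
            problems.append(f"row {n}: expected {len(header)} columns, got {len(fields)}")
            continue
        fsid = fields[0]
        if ".long." in fsid:
            problems.append(f"row {n}: fsid looks like a longitudinal directory name")
        if fsid in seen:
            problems.append(f"row {n}: duplicate fsid {fsid}")
        seen.add(fsid)
    return problems


good = "fsid fsid-base years\nsub01_tp1 sub01 0\nsub01_tp2 sub01 1.25\n"
bad = "fsid fsid-base years\nsub01_tp1.long.sub01 sub01 0\n"
print(check_long_qdec(good))
print(check_long_qdec(bad))
```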

For example:

long_mris_slopes --qdec ./qdec/long.qdec.table.dat --meas thickness --hemi lh --do-avg --do-rate --do-pc1 --do-spc --do-stack --do-label --time years --qcache fsaverage

is a tool for the LongitudinalTwoStageModel and expects a longitudinal QDEC table (even though the flag is only --qdec). It will automatically grab the data from the longitudinal subN_tp1.long.subNtemplate directories and compute within subject atrophy rates etc.

Also:

mris_preproc --qdec-long qdec.table.dat --target study_average --hemi lh --meas thickness --out lh.thickness.mgh

is a tool usually used to prepare data for mri_glmfit. It iterates over the subjects in the QDEC table, maps them to the study_average (usually fsaverage) and stacks them into a single file. In this example it is given a longitudinal QDEC table (--qdec-long), so it takes the data from the longitudinal directories, then maps and stacks them to get them ready for the LinearMixedEffectsModels. Usually you would run a smoothing step with mri_surf2surf after mris_preproc has finished (see the LME description).


MartinReuter

LongitudinalStatistics (last edited 2018-07-25 12:06:32 by MorganFogarty)