

Please refer to this site/make edits here for the most updated information: https://partnershealthcare.sharepoint.com/sites/LCN/SitePages/Morpho-Optimization-Project.aspx

Parent MorphoOptimizationProject

SEGVs are usually caused by:

  1. Indexing out of the bounds of an array
  2. Using an uninitialized pointer
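
A cheap way to catch the first cause close to where it happens is to route suspect index computations through a checked accessor. The following is only a minimal sketch: index_in_bounds, checked_index and CHECKED_INDEX are hypothetical names made up for this illustration, not existing FreeSurfer functions.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: 1 if index is a valid subscript for an array of
   count elements, 0 otherwise. */
static int index_in_bounds(long index, long count) {
    return index >= 0 && index < count;
}

/* Hypothetical helper: returns index unchanged when it is valid, otherwise
   reports the offending source location and aborts, so the failure shows up
   at the bad subscript instead of as a SEGV somewhere later. */
static long checked_index(long index, long count, const char *file, int line) {
    if (!index_in_bounds(index, count)) {
        fprintf(stderr, "%s:%d: index %ld out of bounds [0, %ld)\n",
                file, line, index, count);
        abort();
    }
    return index;
}

/* Records the call site automatically. */
#define CHECKED_INDEX(index, count) \
    checked_index((index), (count), __FILE__, __LINE__)
```

A subscript such as values[i] then becomes values[CHECKED_INDEX(i, nvalues)] in the code under suspicion, where values and nvalues stand for whatever array and element count are being checked.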

It is good practice, though one that is often not followed, to set a pointer to NULL as soon as the object it points to is freed. For this reason it is better to use one of the following than a bare call to free.

  1. freeAndNULL(&pointer)

  2. free(pointer); pointer = NULL;

  3. { type* pointer = (type*)malloc(...); ...; ...; free(pointer); } which limits the pointer's scope so that it cannot be used after the free

We need a function like freeAndNULL but do not have one available, so we either write option 2 instead, which makes for tedious reading, or leave dangerous dangling pointers lying around.
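
As an illustration of what the missing helper could look like, here is a minimal sketch written as a macro. It is an assumption about one possible shape, not existing FreeSurfer code; demo_free_and_null is a hypothetical function that exists only to show the call. A macro is used because a function taking void** cannot portably accept the address of an int* or a double*.

```c
#include <stdlib.h>

/* Hypothetical helper: frees the object that *pp points to, then sets the
   caller's pointer to NULL. Note that pp is evaluated twice, so do not pass
   an expression with side effects. */
#define freeAndNULL(pp)   \
    do {                  \
        free(*(pp));      \
        *(pp) = NULL;     \
    } while (0)

/* Hypothetical usage: returns 1 if the pointer is NULL after the free,
   -1 if the allocation failed. */
int demo_free_and_null(void) {
    int *counts = malloc(16 * sizeof *counts);
    if (counts == NULL) return -1;
    counts[0] = 42;
    freeAndNULL(&counts);   /* counts no longer holds the stale address */
    return counts == NULL;
}
```

Because free(NULL) is a no-op, calling freeAndNULL twice on the same pointer is harmless, which is exactly the double-free protection that a bare free lacks.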

Parallel programming adds the further risk of multiple threads changing the array bounds, realloc'ing pointers, and other sharing problems.

The hardest problems to debug involve heap and other data structure corruptions that do not show up until a long time after the corruption happens, and which disappear or move between runs, when data structures are changed to add checking code, or when the code is run serially.

It is best to avoid such problems by having a design that clearly decomposes the mesh into independent pieces. However, the concepts of FACE and EDGE make this hard to do. Code executed in parallel should, where possible, partition the mesh into "mine, ours, yours" submeshes. "I" should not read "your" values, because you might be changing them. We can both read "ours", but if either of us is writing shared data, then either

  1. #pragma omp critical
  2. omp_set_lock et al.

is needed. Fortunately, OpenMP reductions can often provide the synchronization implicitly.
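
The difference between explicit locking and a reduction can be sketched as follows. This is a generic illustration, not code from the mesh routines; sum_with_critical and sum_with_reduction are hypothetical names. Both functions return the same total. The first serializes every update to the shared variable; the second lets each thread accumulate privately and leaves the combining step to OpenMP.

```c
#include <stddef.h>

/* Every thread updates the shared total, so each update must be protected.
   Correct, but the threads spend most of their time waiting on each other. */
long sum_with_critical(const int *values, size_t n) {
    long total = 0;
    #pragma omp parallel for
    for (long i = 0; i < (long)n; i++) {
        #pragma omp critical
        total += values[i];
    }
    return total;
}

/* Each thread gets a private copy of total; OpenMP combines the copies
   when the loop ends, so no explicit lock appears in the source. */
long sum_with_reduction(const int *values, size_t n) {
    long total = 0;
    #pragma omp parallel for reduction(+:total)
    for (long i = 0; i < (long)n; i++) {
        total += values[i];
    }
    return total;
}
```

Compiled without OpenMP support, the pragmas are ignored and both loops simply run serially. That makes it easy to compare a serial run against a parallel one when chasing a sharing bug.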

When they occur

The first level of checking is to use Linux mcheck. ./configure --enable-mcheck is all that is needed to do this (not yet pulled into the FreeSurfer dev branch at time of writing). Sadly, mcheck is not flawless: I have seen it report problems on valid programs, and there are a few changes in the source code that work around these complaints.

The next level is to use a malloc that we have more visibility into, which is why I have created utils/mgh_malloc.c. While not as fully capable or as fast as the real malloc, it does allow us to examine every aspect of the memory allocation, and to add checks to match our needs and suspicions.
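
To give a flavor of the kind of check a malloc under our own control makes possible, here is a minimal sketch of guard words placed around each allocation. It does not show the contents of utils/mgh_malloc.c; guarded_malloc, guarded_check, guarded_free and the GUARD value are all hypothetical names chosen for this illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD 0xDEADBEEFu   /* arbitrary pattern, unlikely to occur by chance */

/* Bookkeeping stored in front of each block. The union pads the header to
   the strictest alignment so the user pointer stays properly aligned. */
typedef union {
    struct { size_t size; unsigned guard; } info;
    max_align_t align;
} Header;

/* Allocates size bytes with a guard word before and after the user region. */
void *guarded_malloc(size_t size) {
    if (size > SIZE_MAX - sizeof(Header) - sizeof(unsigned)) return NULL;
    Header *h = malloc(sizeof(Header) + size + sizeof(unsigned));
    if (h == NULL) return NULL;
    h->info.size = size;
    h->info.guard = GUARD;
    unsigned tail = GUARD;
    memcpy((char *)(h + 1) + size, &tail, sizeof tail);  /* may be unaligned */
    return h + 1;
}

/* Returns 1 if both guard words are intact, 0 if either was overwritten. */
int guarded_check(const void *p) {
    const Header *h = (const Header *)p - 1;
    unsigned tail;
    if (h->info.guard != GUARD) return 0;
    memcpy(&tail, (const char *)p + h->info.size, sizeof tail);
    return tail == GUARD;
}

/* Checks the guards, releases the block, and reports what the check found. */
int guarded_free(void *p) {
    if (p == NULL) return 1;
    int intact = guarded_check(p);
    free((Header *)p - 1);
    return intact;
}
```

Calling guarded_check on suspect blocks at chosen points in the program narrows down when an overrun happened, rather than only discovering it at the eventual crash.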

Finally, Intel Corporation's Parallel Studio Inspector might be able to detect problems that the above miss.

MorphoOptimizationProject_DebuggingSegvs (last edited 2021-09-22 09:51:31 by DevaniCordero)