Deletions are marked like this. | Additions are marked like this. |
Line 17: | Line 17: |
* the opteron was a seychelles node (node0355), running CentOS4.8 | * the opteron was a 'seychelles' cluster node (node0355), running CentOS4.8 |
Line 19: | Line 19: |
* the 3GHz intel was a launchpad node, running Centos5 | * the 3GHz intel was a 'launchpad' cluster node, running Centos5 |
Line 25: | Line 25: |
||3GHz Intel Xeon E5472 (Core)||3.4.6||-O3 -msse2 -mfpmath=sse||NA|| hours, minutes|| | ||3GHz Intel Xeon E5472 (Core)||3.4.6||-O3 -msse2 -mfpmath=sse||NA|| 5 hours, 46 minutes|| |
Line 40: | Line 40: |
* nehalem architecture makes a huge difference (compared to amd opteron 200 series) | * nehalem architecture makes a difference (compared to amd opteron 200 series) |
mri_ca_register timing info
- tests conducted by NJS from 17-20 march 2012
- using subject 'ernie'
- using 'dev' build(s)
- commandline:
mri_ca_register \ -nobigventricles \ -T transforms/talairach.lta \ -align-after \ -mask brainmask.mgz \ norm.mgz \ /autofs/cluster/freesurfer/centos6_x86_64/dev/average/RB_all_2008-03-26.gca \ transforms/talairach.m3z
- the opteron was a 'seychelles' cluster node (node0355), running CentOS4.8
- the 2.66GHz intel was machine 'namic', running Centos6.2
- the 3GHz intel was a 'launchpad' cluster node, running Centos5
- the 3.3GHz intel was machine 'monster', which has 8 processors, running Centos6.0
processor |
gcc v |
flags |
OMP threads |
mri_ca_register runtime |
2GHz AMD Opteron 246 |
3.4.6 |
-O3 -msse2 -mfpmath=sse |
NA |
12 hours, 46 minutes |
2.66GHz Intel Xeon E5430 (Core) |
3.4.6 |
-O3 -msse2 -mfpmath=sse |
NA |
6 hours, 3 minutes |
3GHz Intel Xeon E5472 (Core) |
3.4.6 |
-O3 -msse2 -mfpmath=sse |
NA |
5 hours, 46 minutes |
3.3GHz Intel Xeon W5590 (Nehalem) |
3.4.6 |
-O3 -msse2 -mfpmath=sse |
NA |
3 hours, 8 minutes |
3.3GHz Intel Xeon W5590 (Nehalem) |
4.1.2 |
-O3 -msse2 -mfpmath=sse |
NA |
3 hours, 10 minutes |
3.3GHz Intel Xeon W5590 (Nehalem) |
4.4.5 |
-O3 -msse2 -mfpmath=sse |
NA |
1 hours, 56 minutes |
3.3GHz Intel Xeon W5590 (Nehalem) |
4.4.5 |
-fopenmp -O3 -ftree-vectorize -msse4.1 -mfpmath=sse |
1 |
1 hours, 58 minutes |
3.3GHz Intel Xeon W5590 (Nehalem) |
4.4.5 |
-fopenmp -O3 -ftree-vectorize -msse4.1 -mfpmath=sse |
2 |
1 hours, 14 minutes |
3.3GHz Intel Xeon W5590 (Nehalem) |
4.4.5 |
-fopenmp -O3 -ftree-vectorize -msse4.1 -mfpmath=sse |
3 |
0 hours, 57 minutes |
3.3GHz Intel Xeon W5590 (Nehalem) |
4.4.5 |
-fopenmp -O3 -ftree-vectorize -msse4.1 -mfpmath=sse |
4 |
0 hours, 50 minutes |
3.3GHz Intel Xeon W5590 (Nehalem) |
4.4.5 |
-fopenmp -O3 -ftree-vectorize -msse4.1 -mfpmath=sse |
5 |
0 hours, 44 minutes |
3.3GHz Intel Xeon W5590 (Nehalem) |
4.4.5 |
-fopenmp -O3 -ftree-vectorize -msse4.1 -mfpmath=sse |
6 |
0 hours, 41 minutes |
3.3GHz Intel Xeon W5590 (Nehalem) |
4.4.5 |
-fopenmp -O3 -ftree-vectorize -msse4.1 -mfpmath=sse |
7 |
0 hours, 40 minutes |
3.3GHz Intel Xeon W5590 (Nehalem) |
4.4.5 |
-fopenmp -O3 -ftree-vectorize -msse4.1 -mfpmath=sse |
8 |
0 hours, 38 minutes |
observations
- asegstatsdiff comparisons show minimal differences in results
- nehalem architecture makes a difference (compared to amd opteron 200 series)
- gcc 4.4.5 alone drops 1 hour of time
- -ftree-vectorize -msse4.1 flags does not make any difference over -msse2
- omp threads plot: