MD Benchmarks for Amber, CHARMM and NAMD
Motivation for this benchmark
The benchmarking data below is intended to help assess
single-processor performance and parallel scaling for
'typical' molecular dynamics simulations on a range of computer
architectures found in researchers' laboratories and at national
supercomputing centers. We use this data in our ongoing development
of Amber and CHARMM to help identify and eliminate key performance
bottlenecks. The data assembled here represents a snapshot of the
performance of three widely used molecular dynamics programs currently
available to the scientific community: CHARMM, Amber and NAMD.
While
such information can be useful in guiding researchers as to which
programs are best suited for calculations on particular hardware
platforms, other factors, including the flexibility, functionality and
availability of a particular program, enter into consideration as
well. At the national supercomputer sites, researchers are interested
in getting the most out of their allocation and getting to their
answers in the least amount of time. Thus, system size, the likely
number of nodes available for a given calculation (often determined by
queuing priorities and availability of cycles) and overall time to
solution are key to choosing the best solution. For example, a choice
based on simple scaling comparisons is not optimal when parallel
efficiency is low, or when the overall time to solution is longer than
with a less scalable code.
Our benchmark represents a
typical simulation of a bio-macromolecule in water, of the type and
size that is most commonly studied at the present time. We have
tabulated performance data for three major MD programs: CHARMM, AMBER,
and NAMD on a range of representative computer architectures. All
input files are provided so that the timings can be reproduced and so
that other programs can be tested with the same MD parameters.
Description of benchmark
The Joint Amber-CHARMM (JAC) benchmark considers a protein (dihydrofolate reductase, DHFR)
in an explicit water bath with cubic periodic boundary conditions. Details of system size and simulation conditions are:
- 23,558 atoms
- protein: 159 residues, 2489 atoms
- water: 7023 molecules TIP3P, 21,069 atoms
- Cubic periodic box, 62.23 Å dimension
- 9 Å nonbond cutoff with 2 Å buffer, i.e., list with 11 Å cutoff
- 1 fs timestep, 1000 steps
- NVE ensemble (constant energy, constant volume); CHARMM was run with NVT using Nosé-Hoover
- bonds to hydrogen constrained (SHAKE)
- PME electrostatics with 64x64x64 grid
- equilibration temperature was 300 K
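The system parameters listed above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the distributed benchmark files; the variable names are illustrative, and the only outside assumption is that TIP3P is a three-site water model. It confirms that the atom counts add up, and derives the PME grid spacing and the total simulated time from the listed values.

```python
# Consistency check of the JAC benchmark parameters listed above.
# All numbers are taken from the benchmark description; names are illustrative.
protein_atoms = 2489
water_molecules = 7023
atoms_per_tip3p = 3            # assumption: TIP3P has three sites per water
box_length_angstrom = 62.23
pme_grid_points = 64           # per box edge (64x64x64 grid)
timestep_fs = 1.0
n_steps = 1000

water_atoms = water_molecules * atoms_per_tip3p
total_atoms = protein_atoms + water_atoms
grid_spacing = box_length_angstrom / pme_grid_points   # in Angstrom
simulated_ps = timestep_fs * n_steps / 1000.0          # fs -> ps

print(water_atoms)              # 21069
print(total_atoms)              # 23558
print(round(grid_spacing, 3))   # 0.972
print(simulated_ps)             # 1.0
```

The PME grid spacing therefore comes out just under 1 Å, and each benchmark run covers 1 ps of simulated time.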
The input files for
all three programs can be found at: TSRI ftp site.
Directions for running the benchmark.
Program information
The most recent distributed versions of the programs were used
for this benchmark in June, 2002.
- CHARMM version c29b1
- Amber version 7
- NAMD version 2.4, except on the Beowulf cluster, where the beta release version 2.5 was used
Timings for this benchmark with NAMD 2.5b are also available
for some other machines (NAMD Benchmark page).
These indicate that NAMD 2.5b
is faster than NAMD 2.4 on a single processor, but has about the same
performance on larger numbers of processors.
Blue Horizon at SDSC
| processors | CHARMM c29b1 | AMBER 7 | NAMD 2.4 |
|     1 | 2643 | 2538 | 2986 |
|     2 | 1371 (1.9) | 1229 (2.1) | 1809 (1.7) |
|     4 |   724 (3.7) |   658 (3.9) |   930 (3.2) |
|     8 |   414 (6.4) |   375 (6.8) |   516 (5.8) |
|   16 |   255 (10.4) |   228 (11.1) |   308 (9.7) |
|   32 |   195 (13.6) |   162 (15.7) |   202 (14.8) |
|   64 |   182 (14.5) |   124 (20.5) |   124 (24.1) |
| 128 |   |   |   126 (23.7) |
Specifications:
Graphs of performance
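The numbers in parentheses in the tables appear to be speedups relative to the single-processor timing. The sketch below, which is illustrative and not part of the benchmark distribution, recomputes them for the CHARMM column of the Blue Horizon table and adds the parallel efficiency (speedup divided by processor count). The function and variable names are hypothetical.

```python
def speedup_table(timings):
    """Map processor count -> (speedup, efficiency), relative to the 1-processor timing."""
    t1 = timings[1]
    return {p: (t1 / t, t1 / t / p) for p, t in sorted(timings.items())}

# CHARMM c29b1 timings on Blue Horizon, copied from the table above.
charmm_blue_horizon = {1: 2643, 2: 1371, 4: 724, 8: 414, 16: 255, 32: 195, 64: 182}

for p, (speedup, efficiency) in speedup_table(charmm_blue_horizon).items():
    print(f"{p:4d}  speedup {speedup:5.1f}  efficiency {efficiency:5.0%}")
```

For this column the speedup at 64 processors is about 14.5, which corresponds to a parallel efficiency below 25%.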
Lemieux at PSC
| processors | CHARMM c29b1 | AMBER 7 | NAMD 2.4 |
|     1 | 1332 | 1020 | 1385 |
|     2 |   685 (1.9) |   460 (2.2) |   750 (1.8) |
|     4 |   356 (3.7) |   250 (4.1) |   390 (3.6) |
|     8 |   217 (6.1) |   150 (6.8) |   198 (7.0) |
|   16 |   142 (9.4) |   100 (10.2) |   105 (13.2) |
|   32 |   108 (12.3) |     70 (14.6) |     61 (22.7) |
|   64 |     98 (13.6) |     60 (17.0) |     40 (34.6) |
| 128 |   104 (12.8) |   |     23 (60.2) |
Specifications:
Graphs of performance
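The Lemieux data illustrate why time to solution, rather than speedup alone, should guide the choice of program. At 8 processors NAMD shows the larger speedup (7.0 versus 6.8) while AMBER still has the shorter timing (150 versus 198). The following sketch, with hypothetical names, picks the program with the shortest timing at each processor count from the table above.

```python
# Timings on Lemieux, copied from the table above (blank cells omitted).
lemieux = {
    "CHARMM c29b1": {1: 1332, 2: 685, 4: 356, 8: 217, 16: 142, 32: 108, 64: 98, 128: 104},
    "AMBER 7":      {1: 1020, 2: 460, 4: 250, 8: 150, 16: 100, 32: 70, 64: 60},
    "NAMD 2.4":     {1: 1385, 2: 750, 4: 390, 8: 198, 16: 105, 32: 61, 64: 40, 128: 23},
}

def fastest_by_processor_count(table):
    """Return {processors: program with the shortest timing at that count}."""
    counts = sorted({p for timings in table.values() for p in timings})
    result = {}
    for p in counts:
        # Only compare programs that have a timing at this processor count.
        available = {name: t[p] for name, t in table.items() if p in t}
        result[p] = min(available, key=available.get)
    return result

fastest = fastest_by_processor_count(lemieux)
for p, name in fastest.items():
    print(f"{p:4d} processors: {name} ({lemieux[name][p]})")
```

On these numbers AMBER has the shortest timing up to 16 processors and NAMD from 32 processors onward, so the preferred program depends on how many processors a job is likely to get.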
Bluefish SGI Origin at TSRI
| processors | CHARMM c29b1 | AMBER 7 | NAMD 2.4 |
|   1 | 1543 | 1397 | 2320 |
|   2 |   817 (1.9) |   686 (2.0) | 1259 (1.8) |
|   4 |   436 (3.5) |   374 (3.7) |   625 (3.7) |
|   8 |   244 (6.3) |   206 (6.8) |   312 (7.4) |
| 16 |   149 (10.4) |   126 (11.1) |   163 (14.2) |
| 32 |   115 (13.4) |     93 (15.0) |     87 (26.7) |
| 64 |   114 (13.5) |     78 (17.9) |     52 (44.6) |
Specifications:
- SGI Origin 3800
- 500 MHz R14K
- MPI through shared memory
- f90
- TSRI SGI home
Graphs of performance
Troll at The Scripps Research Institute
| processors | CHARMM c29b1 | AMBER 7 | NAMD 2.5 |
|   1 | 1627 | 2100 | 1799 |
|   2 |   904 (1.8) | 1130 (1.9) | 1088 (1.7) |
|   4 |   503 (3.2) |   850 (2.5) |   594 (3.0) |
|   8 |   287 (5.7) |   570 (3.7) |   326 (5.5) |
| 16 |   181 (9.0) |   |   245 (7.3) |
Specifications:
- PC AMD Athlon
- 1.2 GHz Palomino
- 100baseT ethernet
- pgf77
- LAM MPI
- Troll home