LAMMPS version 29Oct20

description

LAMMPS is a classical molecular dynamics code with a focus on materials modeling. Its name is an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator ( https://lammps.sandia.gov/index.html ).

software version

29Oct2020 (stable release)

prepare software

To build LAMMPS version 29Oct20, log in to tcr.cent.uw.edu.pl.

Then open an interactive session on any compute node with:

srun -n16 -N1 --pty bash -l

Once the interactive session has started, go through the LAMMPS installation process with the commands below. It takes about 14 minutes.

#folder for source files
mkdir -p ~/downloads/lammps_29Oct20
#folder for compiled binaries
mkdir -p ~/soft/lammps_29Oct20

cd ~/downloads/lammps_29Oct20
wget https://github.com/lammps/lammps/archive/stable_29Oct2020.tar.gz
tar xvzf stable_29Oct2020.tar.gz
cd lammps-stable_29Oct2020

module load mpi/openmpi-x86_64
module load compilers/gcc-9.3.0

mkdir build
cd build
cmake3 -D CMAKE_INSTALL_PREFIX=~/soft/lammps_29Oct20 -D LAMMPS_MACHINE=mpi ../cmake
make -j${SLURM_NTASKS}
make install
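Optionally, sanity-check the freshly installed binary before leaving the interactive session (a quick check, assuming the install prefix used above):

# print the LAMMPS usage text to confirm the binary starts
~/soft/lammps_29Oct20/bin/lmp_mpi -h | head -n 20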

Remember to end the interactive session with the exit command.

If no errors occurred, the compiled LAMMPS binary is available in /home/users/${USER}/soft/lammps_29Oct20/bin .
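For convenience, you can put the binaries on your PATH for the current shell (optional; assumes the install prefix used above):

# make lmp_mpi callable without the full path
export PATH="/home/users/${USER}/soft/lammps_29Oct20/bin:${PATH}"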

example file

It is recommended that you run this example before you try to run your own LAMMPS problems.

The example is a Lennard-Jones melt in a 3D box. The Lennard-Jones interaction is cut off at r = 2.5 sigma, where sigma is the distance at which the interparticle potential is zero. The system contains 32,000 atoms and is run for 20,000 time steps.

Create a new folder for the example file:

 mkdir ~/lammps_29Oct20_example

The example is stored in the file in.lj, and reads as follows:

# 3d Lennard-Jones melt
variable x index 1
variable y index 1
variable z index 1
variable t index 20000

variable xx equal 20*$x
variable yy equal 20*$y
variable zz equal 20*$z

units         lj
atom_style    atomic

lattice       fcc 0.8442
region        box block 0 ${xx} 0 ${yy} 0 ${zz}
create_box    1 box
create_atoms  1 box
mass     1 1.0

velocity all create 1.44 87287 loop geom

pair_style    lj/cut 2.5
pair_coeff    1 1 1.0 1.0 2.5

neighbor 0.3 bin
neigh_modify  delay 0 every 20 check no

fix      1 all nve
thermo   1000
run      $t
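Before submitting through Slurm, you can smoke-test the input interactively (a minimal sketch, assuming the listing above was saved as ~/lammps_29Oct20_example/in.lj and that you are inside an srun session like the one used for the build):

module load mpi/openmpi-x86_64
module load compilers/gcc-9.3.0

cd ~/lammps_29Oct20_example
# run the Lennard-Jones melt on all tasks of the interactive allocation
mpirun -np ${SLURM_NTASKS} ~/soft/lammps_29Oct20/bin/lmp_mpi -in in.lj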

sbatch example

Use LAMMPS 29Oct20. This description assumes that the binaries are in /home/users/${USER}/soft/lammps_29Oct20/bin and that the example file in.lj is in /home/users/${USER}/lammps_29Oct20_example. Use the lammps_29Oct20-test.sbatch file below to run the computation.

#!/bin/bash -l
#SBATCH --job-name="lammps example N2_n32"
#SBATCH --nodes=2                   # number of compute nodes
#SBATCH --ntasks=32                 # number of tasks (16 per node)
#SBATCH --mem-per-cpu=2G
#SBATCH --partition=short
#SBATCH --constraint=intel
#SBATCH --exclusive
#SBATCH --time=2:00:00

WORKDIR="/home/users/${USER}/lammps_29Oct20_example"
cd ${WORKDIR}

export BIN_DIR="/home/users/${USER}/soft/lammps_29Oct20/bin"
export PATH=${BIN_DIR}:$PATH
export TMP_DIR="/tmp"
   
module load mpi/openmpi-x86_64
module load compilers/gcc-9.3.0

T1=`date +%s`
  
mpirun -np ${SLURM_NTASKS} lmp_mpi -in in.lj
  
T2=`date +%s`
echo -e "stop ${T2}\t start ${T1}\t ${SLURM_NNODES}"
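Submit the job from the example directory and monitor it with the standard Slurm commands (output goes to a slurm-<jobid>.out file in the working directory):

cd ~/lammps_29Oct20_example
sbatch lammps_29Oct20-test.sbatch
# check the job state
squeue -u ${USER}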


performance tests

The results below show computation time as a function of the resources used (scalability) for this specific computational task run with the lmp_mpi binary.

Assigning a larger number of compute nodes does not always lead to an (efficient) reduction in computing time (the job's wall time). To find the most appropriate number of nodes for a specific type of job, it is essential to run one's own benchmarks. In general, parallel jobs should scale to at least 70% efficiency for the sake of other TCR users: for example, at 70% efficiency, doubling the node count must cut the wall time by a factor of at least 1.4. One user using twice the resources to squeeze out 10% more performance may be keeping other users from working at all.

The results below should be considered results of this specific computational task on this specific hardware (the TCR cluster), not an overall benchmark of LAMMPS.

nodes   min [s]   avg [s]   median [s]   max [s]   efficiency [%]
1       1862      2065.75   2126         2149      100.00
2       1105      1192.75   1157.5       1351      84.25
3       763       770       768.5        780       81.35
4       798       1026.25   868.5        1570      58.33
5       571       589.75    574          640       65.22
6       488       536       525.5        605       63.59
7       372       502.25    414.5        808       71.51
8       375       489.75    408.5        767       62.07
9       324       403.75    346.5        598       63.85
10      327       456.25    465.5        567       56.94
12      285       429.4     375          609       54.44
16      228       274.25    281.5        306       51.04

*) values (min, avg, median, max, efficiency) do not include failed runs
*) efficiency computed as t1 / ( N * tN ), where t1 is the minimum computation time on one node and tN is the minimum computation time on N nodes
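As a worked example using the table's minimum times, the 3-node efficiency is 1862 / ( 3 * 763 ) ≈ 81.35%. A one-liner to reproduce it:

# efficiency for N nodes from minimum wall times (values taken from the table above)
awk 'BEGIN { t1 = 1862; N = 3; tN = 763; printf "%.2f%%\n", 100 * t1 / (N * tN) }'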

example source: https://arc.vt.edu/userguide/lammps/
