Using Local Scratch for MD simulation

Contributed by Yeol Kyo Choi, Seonghoon Kim, Shasha Feng
Date: 11/2018

What is scratch space/folder?

Each compute node has its own local scratch filesystem. Users may read from and write to it through their own exclusive directory /scratch/${USER}/${SLURM_JOBID}. The scratch folder is highly recommended for

  • high input/output (I/O) jobs,
  • computation that requires large amounts of temporary space while it is running.

In MD simulation, the scratch folder should be used for both simulation production and analysis (high I/O). The scripts are detailed below; a minimal sketch of the general pattern follows.
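
The general pattern is sketched below (a minimal, simplified csh fragment; the program call is a placeholder, and the complete, working scripts follow in the next sections):

#!/bin/csh
# Minimal sketch: keep inputs in the submit directory, write heavy output to
# node-local scratch, and copy the results back when the run is done.
set LOCAL = /scratch/${USER}/${SLURM_JOBID}   # per-job scratch folder on the compute node
cd $SLURM_SUBMIT_DIR                          # input files are read from here (read only once)
# ... run the MD program so that its output files are written under ${LOCAL} ...
cp ${LOCAL}/* $SLURM_SUBMIT_DIR               # copy results back to the submit directory
sleep 5                                       # give the copy time to finish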

<img src="/Users/shasha/Documents/Shasha_Feng/Report/Directory-explana1.png" width="500px" />


{{wiki:Directory-explana1.png|Usages of different directories}}

Fig 1. Usages of different directories

Running simulation using scratch folder

We use a NAMD script as the main example. Other simulation programs work similarly, so we only list the script contents for OpenMM and GROMACS in the appendices. The file run_namd_scratch.sh is shown below.

All submit scripts are available at /share/Apps/examples/userscripts

#!/bin/csh
#SBATCH --partition=imlab-gpu
#SBATCH --gres=gpu:1
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --export=ALL
#SBATCH -t 48:00:00
#SBATCH --error=job.err
#SBATCH --output=job.out
#SBATCH --job-name=NameIt
#SBATCH --oversubscribe

module load namd/2.12-gpu
set namd2 = `which namd2`

cd $SLURM_SUBMIT_DIR
set LOCAL = /scratch/${USER}/${SLURM_JOBID}

#Set output path in NAMD config script
set infile = step7.1_production.inp
set scratch = `sed -n "/set scratch/s/;//g"p ${infile} | awk '{print $3}'`
sed -i -e "/set scratch/s|\${scratch}|\${LOCAL}|g" ${infile}

set cnt = 1
set cntmax = 100 #Please change to the needed simulation length

#For counting to which ns production has run
while (${cnt} <= ${cntmax})
   if ( ! -e step7.${cnt}_production.coor ) break
   @ cnt = $cnt + 1
end
if ( $cnt > $cntmax ) exit

set i = 0
while ( ${i} <= 16 ) #Change to maximum ns that can be run in 48h
  if ( ${cnt} != 1 ) then
   @ pcnt = ${cnt} - 1
   if ( ! -e step7.${pcnt}_production.coor ) exit
   if ( ${cnt} > ${cntmax} ) exit
   set icnt  = `sed -n "/set cnt/s/;//g"p ${infile} |awk '{print $3}'`
   set ifile = `sed -n "/set inputname/s/;//g"p ${infile} |awk '{print $3}'`
   set jfile = step7.${pcnt}_production
   sed -i -e "/set cnt/s/${icnt}/${cnt}/g" -e "/set inputname/s/${ifile}/${jfile}/g" ${infile}
  endif

  $namd2 +p $SLURM_NPROCS ${infile} | tee ${LOCAL}/step7.${cnt}_production.out > /dev/null
   cp ${LOCAL}/*${cnt}_* $SLURM_SUBMIT_DIR
   rm $SLURM_SUBMIT_DIR/*restart*
   sleep 5

   @ i = ${i} + 1
   @ cnt = ${cnt} + 1
end

#sbatch run_namd_scratch.sh
exit

Note: in `step7.1_production.inp`, the header should be modified as follows:

structure          ../step5_assembly.xplor_ext.psf
coordinates        ../step5_assembly.namd.pdb

set temp           303.15;
set cnt            415;
set scratch        SCRATCH;

outputName         ${scratch}/step7.${cnt}_production;
set inputname      step7.200_production;
binCoordinates     $inputname.coor;    # coordinates from last run (binary)
binVelocities      $inputname.vel;     # velocities from last run (binary)
extendedSystem     $inputname.xsc;     # cell dimensions from last run (binary)
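
For illustration, once the sed commands in run_namd_scratch.sh have run (using the job shown later in the Explanation section, job ID 806121, currently on its 408th nanosecond), these header lines would read:

set cnt            408;
set scratch        /scratch/shf317/806121;

outputName         ${scratch}/step7.${cnt}_production;
set inputname      step7.407_production;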


Explanation

  • cd ${SLURM_SUBMIT_DIR} changes to the submit directory, i.e., the directory from which 'run_namd_scratch.sh' was submitted with sbatch.
  • set LOCAL = /scratch/${USER}/${SLURM_JOBID} sets the path of the scratch folder; each job has its own separate folder. While the simulation is running, you can also check the files there as follows.
[shf317@sol namd]$ squeue -u shf317
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
            806121 imlab-gpu   AR3_50   shf317  R       0:06      1 sol-b512
[shf317@sol namd]$ ssh sol-b512
[shf317@sol-b512 ~]$ cd /scratch/shf317/806121
[shf317@sol-b512 806121]$ ls
step5_assembly.namd.str  step7.407_production.coor  step7.407_production.xsc  toppar
step7.1_production.inp   step7.407_production.vel   step7.408_production.out
  • Modify step7.1_production.inp to direct the output to the scratch folder $LOCAL.
set infile = step7.1_production.inp
set scratch = `sed -n "/set scratch/s/;//g"p ${infile} | awk '{print $3}'`
sed -i -e "/set scratch/s|\${scratch}|\${LOCAL}|g" ${infile}

This block uses sed and awk to modify a line in step7.1_production.inp: the first command reads the current value of scratch (here SCRATCH) into the shell variable $scratch, and the second replaces it with $LOCAL, changing the line from set scratch SCRATCH; to set scratch /scratch/shf317/806121;.
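
For example, running the extraction by hand on the unmodified input file prints the placeholder value that the second command then replaces:

[shf317@sol namd]$ sed -n "/set scratch/s/;//g"p step7.1_production.inp | awk '{print $3}'
SCRATCH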

  • Count how many nanoseconds of production have run so far. NAMD produces a ".coor" file only after a run (usually 1 ns) has finished successfully, so we use the existence of the ".coor" file as a flag for job status.
while (${cnt} <= ${cntmax})
   if ( ! -e step7.${cnt}_production.coor ) break
   @ cnt = $cnt + 1
end
if ( $cnt > $cntmax ) exit
  • Run NAMD jobs and copy data back:
$namd2 +p $SLURM_NPROCS ${infile} | tee ${LOCAL}/step7.${cnt}_production.out > /dev/null
cp ${LOCAL}/*${cnt}_* $SLURM_SUBMIT_DIR
rm $SLURM_SUBMIT_DIR/*restart*
sleep 5

The tee command is used to redirect the output: it first keeps the output in memory and then writes it to the file, which should be faster than `$namd2 +p $SLURM_NPROCS ${infile} > ${LOCAL}/step7.${cnt}_production.out`. Here `*${cnt}_*` selects the step7.${cnt}_production files to be copied back to the submit directory $SLURM_SUBMIT_DIR. The restart files are deleted because they are no longer needed, and the sleep 5 command gives the copy time to finish.

  • sbatch run_namd_scratch.sh You can uncomment this line so that the script automatically resubmits the job after the current run finishes; a minimal sketch of this chaining is shown after this list.
    • For `run_openmm_scratch` script, please see Appendix I.
    • For `run_gromacs_scratch` script, please see Appendix II.
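
A minimal sketch of this chaining (assuming the rest of the script is unchanged): with the final sbatch line uncommented, each job resubmits the script when it finishes, and the counting loop near the top of the script ends the chain once cntmax nanoseconds are done.

#Last lines of run_namd_scratch.sh with automatic resubmission enabled
sbatch run_namd_scratch.sh
exit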

There are two schemes for using the scratch folder, depending on whether or not we copy all the input files to the scratch folder. For OpenMM and NAMD, we use scheme 1. For GROMACS, we use scheme 2.

<img src="/Users/shasha/Documents/Shasha_Feng/Report/Scheme1.png" width="600px"/>

Fig 2. Scheme 1 keeps all input files in the local work directory and writes only the output to the scratch folder. Because the program reads the input files only once, this is not a heavy load on the nodes.

<img src="/Users/shasha/Documents/Shasha_Feng/Report/Scheme2.png" width="550px"/>

Fig 3. Scheme 2 copies all input files to the scratch folder, runs the job there, and copies the output back afterwards. Before each nanosecond of simulation starts, the .coor, .vel, and .xsc files of the preceding nanosecond need to be copied to the scratch folder.
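
In shell terms, scheme 2 is roughly the following (a simplified excerpt of the GROMACS script in Appendix II; the same pattern applies to other programs):

#Scheme 2 (sketch): stage this run's input to scratch, run there, copy results back
set LOCAL = /scratch/${USER}/${SLURM_JOBID}
cp md_${cnt}.tpr ${LOCAL}            # copy the input for this run to the scratch folder
cd ${LOCAL}                          # run inside the scratch folder
mpiexec -n $ncpu gmx_mpi mdrun -v -ntomp 1 -deffnm md_${cnt}
cp md_${cnt}.* $SLURM_SUBMIT_DIR     # copy the results back to the submit directory
cd $SLURM_SUBMIT_DIR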


Analyze trajectory using scratch folder

File hy-thick.sh runs the hydrophobic thickness analysis.

#!/bin/csh
#SBATCH --job-name=NameIt
#SBATCH --partition=lts
#SBATCH --qos=nogpu
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --export=ALL
#SBATCH --output=job.out
#SBATCH --error=job.err
#SBATCH --time=1:00:00

set charmm = "/home/shf317/bin/charmm"
set LOCAL = /scratch/${USER}/${SLURM_JOBID}

${charmm} scrdir=${LOCAL} < hydro_thick.inp > /dev/null
cp ${LOCAL}/* $SLURM_SUBMIT_DIR
sleep 5

exit
  1. cp ${LOCAL}/* $SLURM_SUBMIT_DIR copies the analysis results back to the submit directory.

  2. The CHARMM analysis script hydro_thick.inp only writes its output to the scratch folder. The toppar, psf, and dcd files stay where they are on Sol; we do not copy them to the scratch folder because they are read only once.

open write unit 51 card name @scrdir/hydro_thick.plo

The value of scrdir is passed in from hy-thick.sh, as

${charmm} scrdir=${LOCAL} < hydro_thick.inp > /dev/null
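
Inside hydro_thick.inp, only the output is directed to @scrdir; the trajectory and structure files are opened from their original location. A minimal sketch (the dcd file name is a placeholder):

! hydro_thick.inp (sketch): read inputs in place, write output to the scratch folder
open read  unit 41 file name step7.1_production.dcd   ! trajectory stays where it is
open write unit 51 card name @scrdir/hydro_thick.plo  ! analysis output goes to scratch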


DCD frequency configuration & manipulation

Upcoming simulations

In the simulation configuration script, reduce the output frequency of the production run, since overly frequent output is rarely useful and takes up a lot of disk space. A NAMD example is listed below; in step7.1_production.inp, set:

restartfreq        5000;                # 5000 steps = every 10 ps
dcdfreq            5000;
dcdUnitCell        yes;
xstFreq            5000;
outputEnergies     5000;
outputTiming       5000;
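
With the 2 fs timestep implied by the original comment (500 steps = 1 ps), 5000 steps correspond to one frame every 10 ps, so a 1 ns run (500,000 steps) writes 100 dcd frames instead of the 500 written with the CHARMM-GUI defaults.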

The original script downloaded from CHARMM-GUI uses much more frequent output:

restartfreq        500;                # 500 steps = every 1ps
dcdfreq           1000;
dcdUnitCell        yes;
xstFreq            500;
outputEnergies     125;
outputTiming       500;

Existing simulations

Two packages are useful for this: CatDCD and DumpDCD.

CatDCD 4.0
   catdcd -o outputfile [-otype <filetype>] [-i indexfile]
      [-stype <filetype>] [-s structurefile]
      [-first firstframe] [-last lastframe] [-stride stride]
      [-<filetype>] inputfile1 [-<filetype>] inputfile2 ...

Usage:

catdcd -o step7.1.dcd -first 5 -stride 5 step7.1_production.dcd


The above line converts a NAMD dcd file into one with fewer frames. -o step7.1.dcd specifies the output file; -first 5 makes the first frame of the new dcd the 5th frame of the old one; -stride 5 keeps every 5th frame, so the 500-frame dcd is converted into a 100-frame dcd. Finally, step7.1_production.dcd is the input dcd file.
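
To thin many production trajectories at once, a small csh loop can be used (a sketch; adjust the file names and counter range to your own setup):

#!/bin/csh
#Sketch: convert every 500-frame production dcd into a 100-frame dcd
set cnt = 1
while ( ${cnt} <= 100 )
    if ( -e step7.${cnt}_production.dcd ) then
        catdcd -o step7.${cnt}.dcd -first 5 -stride 5 step7.${cnt}_production.dcd
    endif
    @ cnt = ${cnt} + 1
end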

How do we check the result? We can use DumpDCD to inspect the dcd files:

Before dcd frequency manipulation:

[shf317@sol namd]$ dumpdcd step7.1_production.dcd
500       #Number of frames in this file
1000      #Number of previous integration steps
1000      #Frequency (integration steps) for saving of frames
500000    #Number of integration steps in the run that created this file
...

After dcd frequency manipulation:

[shf317@sol namd]$ dumpdcd step7.1.dcd
100
0
1
100
...


Software download:

Appendix I: OpenMM

File /share/Apps/examples/userscripts/run_openmm_scratch.sh is shown below.

#!/bin/csh
#SBATCH --partition=imlab-gpu
#SBATCH --gres=gpu:1
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --export=ALL
#SBATCH -t 48:00:00

module load cuda
module load anaconda/python3

setenv OPENMM_PLUGIN_DIR /share/ceph/woi216group/shared/apps/openmm/lib/plugins
setenv PYTHONPATH /share/ceph/woi216group/shared/apps/openmm/lib/python3.5/site-packages
setenv LD_LIBRARY_PATH /share/ceph/woi216group/shared/apps/openmm/lib:$LD_LIBRARY_PATH

set LOCAL = /scratch/${USER}/${SLURM_JOBID}

# Production

set init = step5_charmm2omm
set input = step7_production

set cnt = 1        # first nanosecond to run; adjust if restarting from a later point
set cntmax = 10    # please change to the needed simulation length
set cntmin = ${cnt}

while ( ${cntmin} <= ${cntmax} )

    cd $SLURM_SUBMIT_DIR

    @ pcnt = ${cntmin} - 1
    set istep = step7_${cntmin}
    set pstep = step7_${pcnt}

    if ( ${cntmin} == 1 ) set pstep = step6.6_equilibration

    if ( ! -e ${pstep}.rst ) exit

    python -u openmm_run.py -i ${input}.inp -t toppar.str -p ${init}.psf -c ${init}.crd -irst ${pstep}.rst -orst ${LOCAL}/${istep}.rst -odcd ${LOCAL}/${istep}.dcd | tee ${LOCAL}/${istep}.out > /dev/null
    sleep 2

    if ( ! -e ${LOCAL}/${istep}.rst ) exit

    cp ${LOCAL}/${istep}.* $SLURM_SUBMIT_DIR
    sleep 2

    @ cntmin = ${cntmin} + 1
end

Appendix II: GROMACS

File /share/Apps/examples/userscripts/run_gromacs_scratch.sh is shown below; it is for single-node jobs only.

#!/bin/csh
#SBATCH --partition=eng-gpu
##SBATCH --qos=nogpu
#SBATCH --gres=gpu:1
#SBATCH --nodes=1
##SBATCH --ntasks-per-node=12
#SBATCH --export=ALL
#SBATCH -t 48:00:00
#SBATCH --job-name=pmm50
#SBATCH --oversubscribe

set echo
set LOCAL=/scratch/${USER}/${SLURM_JOBID}

setenv CONV_RSH ssh
module purge
module load gromacs/2016.4-gpu

set cnt = 1
set cntmax = 10

set ncpu = 1 #Please change to the number of MPI ranks requested for mdrun

while ( ${cnt} < ${cntmax} )

  @ pcnt = ${cnt} - 1
  if ( $cnt == 1 ) then
      #Run gromacs commands
      gmx_mpi grompp -f md.mdp -c initial.gro -p topol.top -o md_${cnt}.tpr
  else
      gmx_mpi grompp -f md.mdp -c md_${pcnt}.gro -p topol.top -o md_${cnt}.tpr
  endif

  cp md_${cnt}.tpr ${LOCAL}

  cd ${LOCAL}
  mpiexec -n $ncpu gmx_mpi mdrun -v -ntomp 1 -deffnm md_${cnt}
  sleep 5

  cp md_${cnt}.* $SLURM_SUBMIT_DIR
  sleep 5
  cd $SLURM_SUBMIT_DIR

  if ( ! -e md_${cnt}.gro ) exit
  @ cnt += 1
end