Amber

There are two installations of Amber: one that supports running only on CPUs, and one that also supports running on GPUs (using CUDA). Use the module command to load the required version (note that you cannot load both at the same time):

Load the Amber module
# GPU (CUDA) version
module load chem/Amber/16-foss-2018a-AmberTools-17-CUDA
# CPU-only version
module load chem/Amber/16-intel-2018b-AmberTools-17-patchlevel-10-15


The following job script could be used to submit an Amber workflow to the cluster, using 1 core and 4.8GB of memory for 2 hours. It assumes that the script amber_cpu_example.sh defines an Amber workflow, e.g. minimisation and molecular dynamics:

Example CPU Amber Script
#!/bin/bash
#SBATCH --job-name=amber_cpu_example           # Job name
#SBATCH --account=PROJECT-ACCOUNT-2020         # Your Viking project account code
#SBATCH --partition=nodes                      # Partition for the job
#SBATCH --ntasks=1                             # Run a single task	
#SBATCH --cpus-per-task=1                      # Number of cores per task
#SBATCH --mem=4800MB                           # Job memory request
#SBATCH --time=02:00:00                        # Time limit hrs:min:sec
#SBATCH --output=%x.log                        # Standard output and error log
#SBATCH --mail-type=ALL                        # Events to receive emails about
#SBATCH --mail-user=a.user@york.ac.uk          # Where to send mail	

module load chem/Amber/16-intel-2018b-AmberTools-17-patchlevel-10-15
./amber_cpu_example.sh
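
The contents of amber_cpu_example.sh are entirely up to you. Purely as an illustration, a minimal two-stage workflow (minimisation, then molecular dynamics) might look like the following; the input file names (min.in, md.in, prmtop, inpcrd) are placeholders, not files this page provides:

```shell
#!/bin/bash
# Hypothetical two-stage Amber workflow: energy minimisation, then MD.
# min.in, md.in, prmtop and inpcrd are placeholder input files.

# Stage 1: minimisation with sander (-O overwrites existing outputs)
sander -O -i min.in -o min.out -p prmtop -c inpcrd -r min.rst

# Stage 2: molecular dynamics with pmemd, restarting from the minimised coordinates
pmemd -O -i md.in -o md.out -p prmtop -c min.rst -r md.rst -x md.nc
```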


The following job script could be used to submit an Amber workflow to the GPU partition of the cluster, using 1 core, 4.8GB of memory, and 1 GPU for 2 hours. It assumes that the script amber_gpu_example.sh defines an Amber workflow which makes use of GPUs:

Example GPU Amber Script
#!/bin/bash
#SBATCH --job-name=amber_gpu_example           # Job name
#SBATCH --account=PROJECT-ACCOUNT-2020         # Your Viking project account code
#SBATCH --partition=gpu                        # Partition for the job ('gpu' for the GPU partition)
#SBATCH --ntasks=1                             # Run a single task	
#SBATCH --cpus-per-task=1                      # Number of cores per task
#SBATCH --mem=4800MB                           # Job memory request
#SBATCH --gres=gpu:1                           # Select 1 GPU
#SBATCH --time=02:00:00                        # Time limit hrs:min:sec
#SBATCH --output=%x.log                        # Standard output and error log
#SBATCH --mail-type=END,FAIL                   # Events to receive emails about
#SBATCH --mail-user=a.user@york.ac.uk          # Where to send mail

module load chem/Amber/16-foss-2018a-AmberTools-17-CUDA
./amber_gpu_example.sh
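
As with the CPU case, amber_gpu_example.sh is your own workflow. As an illustrative sketch only, a GPU molecular dynamics run would typically use pmemd.cuda, the CUDA build of pmemd; the input file names below are placeholders:

```shell
#!/bin/bash
# Hypothetical GPU Amber workflow: MD with the CUDA build of pmemd.
# md.in, prmtop and inpcrd are placeholder input files.
pmemd.cuda -O -i md.in -o md.out -p prmtop -c inpcrd -r md.rst -x md.nc
```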

Benchmarks comparing CPU to GPU

Benchmark                                        CPU code (40 cores)       1 x GPU
                                                 ns/day   seconds/ns    ns/day   seconds/ns
JAC_PRODUCTION_NVE - 23,558 atoms, PME, 4fs      108.47      796.56    1078.14       80.14
JAC_PRODUCTION_NPT - 23,558 atoms, PME, 4fs      109.23      791.01     975.57       88.56
JAC_PRODUCTION_NVE - 23,558 atoms, PME, 2fs       59.97     1440.68     556.94      155.13
JAC_PRODUCTION_NPT - 23,558 atoms, PME, 2fs       58.68     1472.50     511.46      168.93
FACTOR_IX_PRODUCTION_NVE - 90,906 atoms, PME      16.06     5379.01     210.94      409.59
FACTOR_IX_PRODUCTION_NPT - 90,906 atoms, PME      15.71     5501.02     192.73      448.30
CELLULOSE_PRODUCTION_NVE - 408,609 atoms, PME      3.38    25546.67      49.44     1747.43
CELLULOSE_PRODUCTION_NPT - 408,609 atoms, PME      3.27    26430.16      45.36     1904.80
STMV_PRODUCTION_NPT - 1,067,095 atoms, PME         1.92    45103.31      29.11     2967.94
MYOGLOBIN_PRODUCTION - 2,492 atoms, GB            34.15     2530.12     844.65      102.29
NUCLEOSOME_PRODUCTION - 25,095 atoms, GB           0.48   181846.68      27.62     3128.65
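
The two figures in each benchmark entry are the same measurement expressed two ways: seconds/ns is simply 86400 (seconds per day) divided by ns/day. For example, for the JAC 4fs CPU result:

```shell
# 86400 seconds per day divided by ns/day gives seconds/ns
awk 'BEGIN { printf "%.2f\n", 86400 / 108.47 }'
```

This prints 796.53; the small difference from the listed 796.56 is just rounding, since the listed value was computed from ns/day before it was rounded to two decimal places.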

Gaussian

The following job script can be used to submit a simple Gaussian job to the cluster, using 10 cores and 10GB of memory for 30 minutes. The following assumes that you have a correctly formatted Gaussian input deck, ethane.gjf, in the working directory for the job:

Gaussian Job Script
#!/bin/bash
#SBATCH --job-name=gaussian_test             # Job name
#SBATCH --account=PROJECT-ACCOUNT-2020       # Your Viking project account code
#SBATCH --partition=nodes                    # Partition for the job
#SBATCH --ntasks=1                           # Run a single task	
#SBATCH --cpus-per-task=10                   # Number of CPU cores per task
#SBATCH --mem=10gb                           # Job memory request
#SBATCH --time=00:30:00                      # Time limit hrs:min:sec
#SBATCH --output=%x.log                      # Standard output and error log
#SBATCH --mail-type=ALL                      # What to be notified of by email
#SBATCH --mail-user=a.user@york.ac.uk        # Who should be notified by email	

module load chem/Gaussian/G16a03
g16 ethane.gjf
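
The contents of the input deck depend entirely on your calculation. Purely as an illustration, a minimal ethane.gjf requesting a geometry optimisation might look like the following; the method, basis set, and Z-matrix are example values only, and %NProcShared/%Mem are chosen to sit within the 10 cores and 10GB requested above:

```
%NProcShared=10
%Mem=9GB
#P B3LYP/6-31G(d) Opt

Ethane geometry optimisation

0 1
C
C 1 1.54
H 1 1.09 2 109.5
H 1 1.09 2 109.5 3 120.0
H 1 1.09 2 109.5 3 -120.0
H 2 1.09 1 109.5 3 180.0
H 2 1.09 1 109.5 6 120.0
H 2 1.09 1 109.5 6 -120.0

```

Note that Gaussian requires a blank line after the molecule specification, and that it is sensible to set %Mem slightly below the Slurm memory request to leave headroom for the process itself.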

As Gaussian is licensed software, you will need to be added to the Gaussian group on Viking in order to use it. If you find that you can't use Gaussian on Viking due to permission errors, please get in touch with Viking support via an email to <itsupport@york.ac.uk>.

MATLAB

Running Interactively

MATLAB can be run interactively both with and without the graphical user interface. When running MATLAB interactively, please ensure that you are doing so inside an interactive cluster session, as opposed to on the Viking login nodes. To start an interactive session, see the Viking job submission documentation.

The following demonstrates how you could run MATLAB interactively without the graphical user interface:

Starting Matlab in Command Line mode
$ srun --ntasks=1 --mem-per-cpu=4800MB --time=00:30:00 --pty bash
$ module load math/MATLAB/2018a
$ matlab -nojvm -nodisplay -nosplash

                            < M A T L A B (R) >
                  Copyright 1984-2018 The MathWorks, Inc.
                   R2018a (9.4.0.813654) 64-bit (glnxa64)
                             February 23, 2018

 
For online documentation, see http://www.mathworks.com/support
For product information, visit www.mathworks.com.
 
>>


To run MATLAB interactively with the graphical user interface, you must first set up a virtual desktop session on the cluster. Please see the accessing Viking documentation to learn how to do that. Once you have set up and connected to your virtual desktop session on Viking, the process for running interactive, graphical MATLAB is very similar to non-graphical, except that you should use the command start-interactive-session.sh to set up your interactive job, as opposed to using srun. start-interactive-session.sh takes the same parameters as srun, so can be used in the same way, but it works around some issues with setting up interactive sessions that need graphical output:

Running MATLAB interactively with GUI
$ start-interactive-session.sh --ntasks=1 --mem-per-cpu=4800MB --time=00:30:00 --pty bash
$ module load math/MATLAB/2018a
$ matlab

In your virtual desktop session, you should now see the MATLAB graphical interface.

Running in batch mode

MATLAB (2019a and newer) can also be run in batch mode, i.e. non-interactively. This model of execution fits nicely with HPC systems like Viking, where work is submitted to the scheduler for execution.

The following job script could be used to submit a MATLAB script to the cluster, using 1 core and 4.8GB of memory for 2 hours. The following assumes that you have a MATLAB script matlab_batch_example.m either in the job's working directory, or in the MATLAB search path:


Example MATLAB batch mode script
#!/bin/bash
#SBATCH --job-name=matlab_batch_example        # Job name
#SBATCH --account=PROJECT-ACCOUNT-2020         # Your Viking project account code
#SBATCH --partition=nodes                      # Partition for the job
#SBATCH --ntasks=1                             # Run a single task 
#SBATCH --cpus-per-task=1                      # Number of cores per task
#SBATCH --mem=4800MB                           # Job memory request
#SBATCH --time=02:00:00                        # Time limit hrs:min:sec
#SBATCH --output=%x.log                        # Standard output and error log
#SBATCH --mail-type=ALL                        # Events to receive emails about
#SBATCH --mail-user=a.user@york.ac.uk          # Where to send mail
 
module load math/MATLAB/2021a
matlab -batch matlab_batch_example

Standalone MATLAB programs

It is possible to create standalone MATLAB programs from your MATLAB projects, and these can be run on Viking. An advantage of doing this is that a standalone program does not check out a licence from the licence server, so other users can still run MATLAB interactively even while your program is running!

You can find documentation about how to create standalone MATLAB programs in the MathWorks help pages, and we recommend using mcc, the MATLAB compiler, as a straightforward way to create standalone programs.

Certain MATLAB features are not available in standalone programs, so it is worth being aware of what these are to avoid trouble when running your program. You can find a list of unsupported features, and comprehensive documentation of the supported ones, in the MathWorks help pages.
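
As a sketch of the mcc route: compiling a top-level function file with -m produces a standalone binary plus a wrapper script that sets up the runtime library paths. The file name mymain.m is a hypothetical example, and using $EBROOTMATLAB as the runtime root is an assumption based on Viking's EasyBuild-style modules:

```shell
$ module load math/MATLAB/2021a
$ mcc -m mymain.m                     # produces 'mymain' and 'run_mymain.sh'
$ ./run_mymain.sh $EBROOTMATLAB arg1  # the wrapper takes the MATLAB/runtime root as its first argument
```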


MongoDB

When using MongoDB, you have to explicitly state the location of the database, or mongod will exit with an error. You should also specify the location of the Unix socket, if used. The directories must exist before mongod is started.

$ module load tools/MongoDB
$ mkdir -p $HOME/scratch/mongod/db    # mongod will not create the dbpath itself
$ mongod --unixSocketPrefix $HOME/scratch/mongod --dbpath $HOME/scratch/mongod/db
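
With that prefix, mongod creates a socket file named mongodb-27017.sock (for the default port) under the prefix directory, and the mongo shell can connect through it; the exact socket name is an assumption based on mongod's default naming:

```shell
$ mongo --host $HOME/scratch/mongod/mongodb-27017.sock
```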


VASP

A large VASP Job

120 core VASP Job, efficiently packed onto 3 nodes
#!/bin/bash
#
#SBATCH --job-name=vasp-test-big               # Job name
#SBATCH --mail-type=ALL                        # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=action.man@york.ac.uk      # Where to send mail
#SBATCH --ntasks=120                           # Number of MPI tasks
#SBATCH --cpus-per-task=1                      # Number of CPU cores per task
#SBATCH --nodes=3                              # Number of nodes
#SBATCH --ntasks-per-node=40                   # How many tasks on each node
#SBATCH --ntasks-per-socket=20                 # How many tasks on each CPU or socket
#SBATCH --distribution=cyclic:cyclic           # Distribute tasks cyclically on nodes and sockets
#SBATCH --mem=128gb                            # Job memory request per node
#SBATCH --time=2:00:00                         # Time limit hrs:min:sec
#SBATCH --output=logs/vasp-test-big_%j.log     # Standard output and error log
#SBATCH --account=ARMY-CATERING-2018

module load phys/VASP/5.4.4-intel-2019a

date
ulimit -s unlimited
mpirun -np 120 vasp_std
date
240 core VASP Job, efficiently packed onto 6 nodes
#!/bin/bash
#
#SBATCH --job-name=vasp-test-big               # Job name
#SBATCH --mail-type=ALL                        # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=action.man@york.ac.uk      # Where to send mail
#SBATCH --ntasks=240                           # Number of MPI tasks
#SBATCH --cpus-per-task=1                      # Number of CPU cores per task
#SBATCH --nodes=6                              # Number of nodes
#SBATCH --ntasks-per-node=40                   # How many tasks on each node
#SBATCH --ntasks-per-socket=20                 # How many tasks on each CPU or socket
#SBATCH --distribution=cyclic:cyclic           # Distribute tasks cyclically on nodes and sockets
#SBATCH --mem=128gb                            # Job memory request per node
#SBATCH --time=2:00:00                         # Time limit hrs:min:sec
#SBATCH --output=logs/vasp-test-big_%j.log     # Standard output and error log
#SBATCH --account=ARMY-CATERING-2018

module load phys/VASP/5.4.4-intel-2019a

date
ulimit -s unlimited
mpirun -np 240 vasp_std
date
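
"Efficiently packed" here means the task counts fill whole nodes and sockets exactly: reading the directives above, each node runs 40 tasks split as 20 per socket (i.e. two 20-core sockets per node, which is how these scripts are laid out), so the totals work out as:

```shell
# total tasks = nodes x tasks-per-node; tasks-per-node = 2 sockets x tasks-per-socket
echo $((3 * 40))   # 120 tasks on 3 nodes
echo $((6 * 40))   # 240 tasks on 6 nodes
echo $((2 * 20))   # 40 tasks per node from the per-socket packing
```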

VOX-FE

Sample Job Script
#!/bin/bash
#SBATCH --job-name=GP-EP-I-vox-fe            # Job name
#SBATCH --mail-type=BEGIN,END,FAIL           # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=my.name@york.ac.uk       # Where to send mail
#SBATCH --ntasks=24                          # Number of MPI tasks
#SBATCH --cpus-per-task=1                    # Number of CPU cores per task
#SBATCH --nodes=1                            # Number of nodes
#SBATCH --ntasks-per-node=24                 # How many tasks on each node
#SBATCH --ntasks-per-socket=12               # How many tasks on each CPU or socket
#SBATCH --mem-per-cpu=1gb                    # Memory per processor
#SBATCH --time=02:00:00                      # Time limit hrs:min:sec
#SBATCH --output=logs/GP-EP-I-vox-fe-node-%j.log  # Standard output and error log
#SBATCH --account=my-account-2018            # Project account
 
echo "Running small-vox-fe on $SLURM_NTASKS CPU cores"
echo "Nodes allocated to job: " $SLURM_JOB_NUM_NODES "(" $SLURM_JOB_NODELIST ")"
echo

cd ~/scratch/VOX-FE/models
module load bio/VOX-FE/1.0-foss-2017b

date
mpirun -np $SLURM_NTASKS PARA_BMU Script-GP-EP-I.txt
date

R

Submitting Simple R Scripts to the Cluster

The following job script will run the R code with the default number of CPUs and amount of memory.

Example Simple R Script - simple.R
args <- commandArgs(trailingOnly = TRUE)
number=as.numeric(args[1])
string=args[2]
print(sprintf("R script called with arguments \'%s\' and \'%s\'", number, string))
Job Script to run simple.R
#!/bin/bash
#SBATCH --job-name=Simple-R                  # Job name
#SBATCH --mail-type=BEGIN,END,FAIL           # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=my.name@york.ac.uk       # Where to send mail
#SBATCH --time=00:02:00                      # Time limit hrs:min:sec
#SBATCH --output=logs/Simple-R-%j.log        # Standard output and error log
#SBATCH --account=my-account-2018            # Project account
 
echo `date`: executing R script simple on host ${HOSTNAME}
echo
Rscript --no-save --no-restore simple.R 93 "The end of the world is not today" 
echo
echo `date`: completed R script simple on host ${HOSTNAME}
Submitting the job
$ sbatch simple.job
Submitted batch job 2929044
$ squeue -u abs4
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
           2929044     nodes Simple-R     abs4 PD       0:00      1 (Priority)
$ more logs/Simple-R-2929044.log 
Tue 18 Jun 15:04:45 BST 2019: executing R script simple on host node058

[1] "R script called with arguments '93' and 'The end of the world is not today'"

Tue 18 Jun 15:04:45 BST 2019: completed R script simple on host node058
$ 

Asking for more Cores and Memory

R jobs that require more memory can use the --mem directive.

R scripts that make use of threading can use the --cpus-per-task directive to ask for more cores.

The following script uses 4 cores and 24GB of memory.

Asking for more memory and cores
#!/bin/bash
#SBATCH --job-name=Simple-R                  # Job name
#SBATCH --mail-type=BEGIN,END,FAIL           # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=andrew.smith@york.ac.uk  # Where to send mail
#SBATCH --ntasks=1                           # Run a single task	
#SBATCH --cpus-per-task=4                    # Number of CPU cores per task
#SBATCH --mem=24gb                           # Job memory request
#SBATCH --time=00:05:00                      # Time limit hrs:min:sec
#SBATCH --output=logs/Sinc2core-%j.log       # Standard output and error log
#SBATCH --account=ITS-SYSTEM-2018            # Project account
  
echo `date`: executing sinc2core R test on host ${HOSTNAME} with $SLURM_CPUS_ON_NODE slots
Rscript --no-save sinc2core.R $SLURM_CPUS_ON_NODE