Amber
There are two installations of Amber, one which only supports running on CPUs, and one which supports running on GPUs (using CUDA). Use the module command to load the required version (note that you can not use both at the same time):
For the CPU-only version:
  module load chem/Amber/16-intel-2018b-AmberTools-17-patchlevel-10-15
For the GPU (CUDA) version:
  module load chem/Amber/16-foss-2018a-AmberTools-17-CUDA
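If you are unsure which Amber builds are currently installed, you can list them before loading one (a quick check; the exact module names and versions returned will depend on the software tree at the time):

$ module avail chem/Amber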
The following job script could be used to submit an Amber workflow to the cluster, using 1 core and 4.8GB of memory for 2 hours. It assumes that you have defined an Amber workflow, e.g. minimisation followed by molecular dynamics, in the script amber_cpu_example.sh:
#!/bin/bash
#SBATCH --job-name=amber_cpu_example       # Job name
#SBATCH --account=PROJECT-ACCOUNT-2020     # Your Viking project account code
#SBATCH --partition=nodes                  # Partition for the job
#SBATCH --ntasks=1                         # Run a single task
#SBATCH --cpus-per-task=1                  # Number of cores per task
#SBATCH --mem=4800MB                       # Job memory request
#SBATCH --time=02:00:00                    # Time limit hrs:min:sec
#SBATCH --output=%x.log                    # Standard output and error log
#SBATCH --mail-type=ALL                    # Events to receive emails about
#SBATCH --mail-user=a.user@york.ac.uk      # Where to send mail

module load chem/Amber/16-intel-2018b-AmberTools-17-patchlevel-10-15

./amber_cpu_example.sh
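For reference, a minimal sketch of what amber_cpu_example.sh might contain is shown below, chaining an energy minimisation and a short MD run with sander. The input, topology, and coordinate file names (min.in, md.in, system.prmtop, system.inpcrd) are placeholders for your own files:

#!/bin/bash
# Hypothetical Amber workflow: energy minimisation followed by MD.
# File names are placeholders; replace them with your own inputs.
set -e

# Energy minimisation
sander -O -i min.in -o min.out -p system.prmtop -c system.inpcrd -r min.rst

# Molecular dynamics, restarting from the minimised structure
sander -O -i md.in -o md.out -p system.prmtop -c min.rst -r md.rst -x md.nc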
The following job script could be used to submit an Amber workflow to the GPU partition of the cluster, using 1 core, 4.8GB of memory, and 1 GPU for 2 hours. It assumes that you have defined an Amber workflow which makes use of GPUs in the script amber_gpu_example.sh:
#!/bin/bash
#SBATCH --job-name=amber_gpu_example       # Job name
#SBATCH --account=PROJECT-ACCOUNT-2020     # Your Viking project account code
#SBATCH --partition=gpu                    # Partition for the job ('gpu' for the GPU partition)
#SBATCH --ntasks=1                         # Run a single task
#SBATCH --cpus-per-task=1                  # Number of cores per task
#SBATCH --mem=4800MB                       # Job memory request
#SBATCH --gres=gpu:1                       # Select 1 GPU
#SBATCH --time=02:00:00                    # Time limit hrs:min:sec
#SBATCH --output=%x.log                    # Standard output and error log
#SBATCH --mail-type=END,FAIL               # Events to receive emails about
#SBATCH --mail-user=a.user@york.ac.uk      # Where to send mail

module load chem/Amber/16-foss-2018a-AmberTools-17-CUDA

./amber_gpu_example.sh
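A GPU version of the workflow script, amber_gpu_example.sh, might look like the sketch below, using pmemd.cuda (the GPU-accelerated engine, which takes the same command-line options as sander). The file names are again placeholders:

#!/bin/bash
# Hypothetical GPU Amber workflow: MD with the CUDA-enabled engine.
# File names are placeholders; replace them with your own inputs.
set -e

pmemd.cuda -O -i md.in -o md.out -p system.prmtop -c system.inpcrd -r md.rst -x md.nc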
Benchmarks comparing CPU to GPU
Benchmark                                        CPU code, 40 cores         1 x GPU
                                                 ns/day    seconds/ns       ns/day    seconds/ns
----------------------------------------------   -------   ----------       -------   ----------
JAC_PRODUCTION_NVE (23,558 atoms, PME, 4 fs)      108.47       796.56       1078.14        80.14
JAC_PRODUCTION_NPT (23,558 atoms, PME, 4 fs)      109.23       791.01        975.57        88.56
JAC_PRODUCTION_NVE (23,558 atoms, PME, 2 fs)       59.97      1440.68        556.94       155.13
JAC_PRODUCTION_NPT (23,558 atoms, PME, 2 fs)       58.68      1472.50        511.46       168.93
FACTOR_IX_PRODUCTION_NVE (90,906 atoms, PME)       16.06      5379.01        210.94       409.59
FACTOR_IX_PRODUCTION_NPT (90,906 atoms, PME)       15.71      5501.02        192.73       448.30
CELLULOSE_PRODUCTION_NVE (408,609 atoms, PME)       3.38     25546.67         49.44      1747.43
CELLULOSE_PRODUCTION_NPT (408,609 atoms, PME)       3.27     26430.16         45.36      1904.80
STMV_PRODUCTION_NPT (1,067,095 atoms, PME)          1.92     45103.31         29.11      2967.94
MYOGLOBIN_PRODUCTION (2,492 atoms, GB)             34.15      2530.12        844.65       102.29
NUCLEOSOME_PRODUCTION (25,095 atoms, GB)            0.48    181846.68         27.62      3128.65
Gaussian
The following job script can be used to submit a simple Gaussian job to the cluster, using 10 cores and 10GB of memory for 30 minutes. It assumes that you have a correctly formatted Gaussian input deck, ethane.gjf, in the job's working directory:
#!/bin/bash
#SBATCH --job-name=gaussian_test           # Job name
#SBATCH --account=PROJECT-ACCOUNT-2020     # Your Viking project account code
#SBATCH --partition=nodes                  # Partition for the job
#SBATCH --ntasks=1                         # Run a single task
#SBATCH --cpus-per-task=10                 # Number of CPU cores per task
#SBATCH --mem=10gb                         # Job memory request
#SBATCH --time=00:30:00                    # Time limit hrs:min:sec
#SBATCH --output=%x.log                    # Standard output and error log
#SBATCH --mail-type=ALL                    # What to be notified of by email
#SBATCH --mail-user=a.user@york.ac.uk      # Who should be notified by email

module load chem/Gaussian/G16a03

g16 ethane.gjf
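If you need a starting point for the input deck, the sketch below creates a minimal ethane.gjf from the shell. This is an illustrative assumption rather than a recommendation: the method and basis set (B3LYP/6-31G(d)) and the approximate geometry are placeholders, %NProcShared matches the 10 cores requested above, and %Mem is set a little below the 10GB Slurm request to leave some headroom:

# Create a minimal, illustrative Gaussian input deck for ethane.
cat > ethane.gjf <<'EOF'
%NProcShared=10
%Mem=8GB
# opt b3lyp/6-31g(d)

Ethane geometry optimisation (illustrative example)

0 1
C    0.000   0.000   0.765
C    0.000   0.000  -0.765
H    1.020   0.000   1.160
H   -0.510   0.883   1.160
H   -0.510  -0.883   1.160
H   -1.020   0.000  -1.160
H    0.510   0.883  -1.160
H    0.510  -0.883  -1.160

EOF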
As Gaussian is licensed software, you will need to be added to the Gaussian group on Viking in order to use it. If you find that you can't use Gaussian on Viking due to permission errors, please get in touch with Viking support via an email to <itsupport@york.ac.uk>.
MATLAB
Running Interactively
MATLAB can be run interactively both with and without the graphical user interface. When running MATLAB interactively, please ensure that you are doing so inside an interactive cluster session, as opposed to on the Viking login nodes. To start an interactive session, see the Viking job submission documentation.
The following demonstrates how you could run MATLAB interactively without the graphical user interface:
$ srun --ntasks=1 --mem-per-cpu=4800MB --time=00:30:00 --pty bash
$ module load math/MATLAB/2018a
$ matlab -nojvm -nodisplay -nosplash

                        < M A T L A B (R) >
              Copyright 1984-2018 The MathWorks, Inc.
               R2018a (9.4.0.813654) 64-bit (glnxa64)
                         February 23, 2018

For online documentation, see http://www.mathworks.com/support
For product information, visit www.mathworks.com.

>>
To run MATLAB interactively with the graphical user interface, you must first set up a virtual desktop session on the cluster; see the accessing Viking documentation to learn how to do that. Once you have connected to your virtual desktop session on Viking, the process is very similar to the non-graphical case, except that you should set up your interactive job with the command start-interactive-session.sh rather than srun. start-interactive-session.sh takes the same parameters as srun, so it can be used in the same way, but it works around some issues with setting up interactive sessions that need graphical output:
$ start-interactive-session.sh --ntasks=1 --mem-per-cpu=4800MB --time=00:30:00 --pty bash
$ module load math/MATLAB/2018a
$ matlab
In your virtual desktop session, you should now see the MATLAB graphical interface.
Running in batch mode
MATLAB (2019a and newer) can also be run in batch mode, i.e. non-interactively. This model of execution fits nicely with HPC systems like Viking, where work is submitted to the scheduler for execution.
The following job script could be used to submit a MATLAB script to the cluster, using 1 core and 4.8GB of memory for 2 hours. It assumes that you have a MATLAB script matlab_batch_example.m either in the job's working directory or in the MATLAB search path:
Note: when calling matlab -batch in the job script, do not include the .m extension that is part of the matlab_batch_example.m filename, as shown below.
#!/bin/bash
#SBATCH --job-name=matlab_batch_example    # Job name
#SBATCH --account=PROJECT-ACCOUNT-2020     # Your Viking project account code
#SBATCH --partition=nodes                  # Partition for the job
#SBATCH --ntasks=1                         # Run a single task
#SBATCH --cpus-per-task=1                  # Number of cores per task
#SBATCH --mem=4800MB                       # Job memory request
#SBATCH --time=02:00:00                    # Time limit hrs:min:sec
#SBATCH --output=%x.log                    # Standard output and error log
#SBATCH --mail-type=ALL                    # Events to receive emails about
#SBATCH --mail-user=a.user@york.ac.uk      # Where to send mail

module load math/MATLAB/2021a

matlab -batch matlab_batch_example
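If your entry point is a function that takes arguments rather than a plain script, matlab -batch accepts a MATLAB statement, so you can call the function directly. The function name and arguments below are hypothetical:

matlab -batch "matlab_batch_example(42, 'some text')"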
Standalone MATLAB programs
It is possible to create standalone MATLAB programs from your MATLAB projects, and these can be run on Viking. An advantage of doing this is that a standalone program does not check out a licence from the licence server, so other users can still run MATLAB interactively even while your program is running.
You can find documentation about how to create standalone MATLAB programs in the MathWorks help pages, and we recommend using mcc, the MATLAB compiler, as a straightforward way to create standalone programs.
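As a rough sketch (the script name is a placeholder), compiling and running a standalone program might look like the following. Alongside the executable, mcc generates a run_*.sh wrapper that sets up the runtime environment and takes the MATLAB (or MATLAB Runtime) installation root as its first argument:

$ module load math/MATLAB/2021a
$ mcc -m matlab_batch_example.m
$ ./run_matlab_batch_example.sh /path/to/matlab-or-runtime-root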
Certain MATLAB features are not available in standalone programs, so it is worth being aware of what these are to avoid trouble when running your program; the MathWorks documentation lists both the ineligible features and those that are fully supported.
MongoDB
When using MongoDB, you have to state the location of the database explicitly, or mongod will error out. You should also specify the location of the Unix socket, if used.
$ module load tools/MongoDB
$ mongod --unixSocketPrefix $HOME/scratch/mongod --dbpath $HOME/scratch/mongod/db
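mongod expects the database directory to exist already, so create the directories first (using the same paths as above):

$ mkdir -p $HOME/scratch/mongod/db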
VASP
A large VASP Job
#!/bin/bash
#
#SBATCH --job-name=vasp-test-big           # Job name
#SBATCH --mail-type=ALL                    # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=action.man@york.ac.uk  # Where to send mail
#SBATCH --ntasks=120                       # Number of MPI tasks
#SBATCH --cpus-per-task=1                  # Number of CPU cores per task
#SBATCH --nodes=3                          # Number of nodes
#SBATCH --ntasks-per-node=40               # How many tasks on each node
#SBATCH --ntasks-per-socket=20             # How many tasks on each CPU socket
#SBATCH --distribution=cyclic:cyclic       # Distribute tasks cyclically on nodes and sockets
#SBATCH --mem=128gb                        # Job memory request per node
#SBATCH --time=2:00:00                     # Time limit hrs:min:sec
#SBATCH --output=logs/vasp-test-big_%j.log # Standard output and error log
#SBATCH --account=ARMY-CATERING-2018       # Project account

module load phys/VASP/5.4.4-intel-2019a

date
ulimit -s unlimited
mpirun -np 120 vasp_std
date
The same job scaled up to 240 MPI tasks across 6 nodes:

#!/bin/bash
#
#SBATCH --job-name=vasp-test-big           # Job name
#SBATCH --mail-type=ALL                    # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=action.man@york.ac.uk  # Where to send mail
#SBATCH --ntasks=240                       # Number of MPI tasks
#SBATCH --cpus-per-task=1                  # Number of CPU cores per task
#SBATCH --nodes=6                          # Number of nodes
#SBATCH --ntasks-per-node=40               # How many tasks on each node
#SBATCH --ntasks-per-socket=20             # How many tasks on each CPU socket
#SBATCH --distribution=cyclic:cyclic       # Distribute tasks cyclically on nodes and sockets
#SBATCH --mem=128gb                        # Job memory request per node
#SBATCH --time=2:00:00                     # Time limit hrs:min:sec
#SBATCH --output=logs/vasp-test-big_%j.log # Standard output and error log
#SBATCH --account=ARMY-CATERING-2018       # Project account

module load phys/VASP/5.4.4-intel-2019a

date
ulimit -s unlimited
mpirun -np 240 vasp_std
date
VOX-FE
#!/bin/bash
#SBATCH --job-name=GP-EP-I-vox-fe          # Job name
#SBATCH --mail-type=BEGIN,END,FAIL         # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=my.name@york.ac.uk     # Where to send mail
#SBATCH --ntasks=24                        # Number of MPI tasks
#SBATCH --cpus-per-task=1                  # Number of CPU cores per task
#SBATCH --nodes=1                          # Number of nodes
#SBATCH --ntasks-per-node=24               # How many tasks on each node
#SBATCH --ntasks-per-socket=12             # How many tasks on each CPU socket
#SBATCH --mem-per-cpu=1gb                  # Memory per processor
#SBATCH --time=02:00:00                    # Time limit hrs:min:sec
#SBATCH --output=logs/GP-EP-I-vox-fe-node-%j.log  # Standard output and error log
#SBATCH --account=my-account-2018          # Project account

echo "Running small-vox-fe on $SLURM_NTASKS CPU cores"
echo "Nodes allocated to job: " $SLURM_JOB_NUM_NODES "(" $SLURM_JOB_NODELIST ")"
echo

cd ~/scratch/VOX-FE/models

module load bio/VOX-FE/1.0-foss-2017b

date
mpirun -np $SLURM_NTASKS PARA_BMU Script-GP-EP-I.txt
date
R
Submitting Simple R Scripts to the Cluster
The R script below (simple.R) prints the arguments it is called with; the job script that follows (simple.job) runs it with the default number of CPUs and memory.
args <- commandArgs(trailingOnly = TRUE)
number <- as.numeric(args[1])
string <- args[2]
print(sprintf("R script called with arguments '%s' and '%s'", number, string))
#!/bin/bash
#SBATCH --job-name=Simple-R                # Job name
#SBATCH --mail-type=BEGIN,END,FAIL         # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=my.name@york.ac.uk     # Where to send mail
#SBATCH --time=00:02:00                    # Time limit hrs:min:sec
#SBATCH --output=logs/Simple-R-%j.log      # Standard output and error log
#SBATCH --account=my-account-2018          # Project account

echo `date`: executing R script simple on host ${HOSTNAME}
echo

Rscript --no-save --no-restore simple.R 93 "The end of the world is not today"

echo
echo `date`: completed R script simple on host ${HOSTNAME}
$ sbatch simple.job
Submitted batch job 2929044
$ squeue -u abs4
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
           2929044     nodes Simple-R     abs4 PD       0:00      1 (Priority)
$ more logs/Simple-R-2929044.log
Tue 18 Jun 15:04:45 BST 2019: executing R script simple on host node058

[1] "R script called with arguments '93' and 'The end of the world is not today'"

Tue 18 Jun 15:04:45 BST 2019: completed R script simple on host node058
$
Asking for more Cores and Memory
R jobs that require more memory can request it with the --mem directive.
R scripts that make use of threading can use the --cpus-per-task directive to ask for more cores.
The following job script requests 4 cores and 24GB of memory.
#!/bin/bash
#SBATCH --job-name=Simple-R                # Job name
#SBATCH --mail-type=BEGIN,END,FAIL         # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=andrew.smith@york.ac.uk # Where to send mail
#SBATCH --ntasks=1                         # Run a single task
#SBATCH --cpus-per-task=4                  # Number of CPU cores per task
#SBATCH --mem=24gb                         # Job memory request
#SBATCH --time=00:05:00                    # Time limit hrs:min:sec
#SBATCH --output=logs/Sinc2core-%j.log     # Standard output and error log
#SBATCH --account=ITS-SYSTEM-2018          # Project account

echo `date`: executing sinc2core R test on host ${HOSTNAME} with $SLURM_CPUS_ON_NODE slots

Rscript --no-save sinc2core.R $SLURM_CPUS_ON_NODE