
CURRENT LEAD TIME FOR USER ACCOUNT CREATION - 1 week after submission

As you will appreciate, we still have a lot of work to do in configuring the cluster, installing software, and supporting users. Because of this, responses to queries may be significantly delayed.

To report any problems or issues you are having with Viking, please use the form:

https://goo.gl/sxbXrF

Viking is operating in a test and development phase. There are no guarantees of service, and the service is subject to disruption with little or no notice.


Introduction

Viking is a large Linux compute cluster aimed at users who require a platform for development and the execution of small or large compute jobs.

Instructions on accessing the system from Linux and Windows can be found here: Accessing the Servers.
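
For example, from a Linux terminal, access is typically via SSH to one of the login nodes. The hostname and username below are illustrative placeholders; use the details given on the Accessing the Servers page:

    # Connect to a Viking login node over SSH
    # (hostname and username are placeholders - see Accessing the Servers)
    ssh abc123@viking.york.ac.uk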

What is a cluster

A cluster consists of many (hundreds to thousands of) rack-mounted computers called nodes. It is accessed via login nodes.

Why use a cluster

  • you do not want to tie up your own computer for many hours or days
  • you want to run many programs (jobs) at the same time
  • you want to use parallelism to obtain your results more quickly
  • you need to access more resources (memory, disc space) than is available on your own computer

Limitations of a cluster

Clusters are not the answer to all research computing problems. Some of their limitations are:

  • they cannot run Microsoft Windows programs
  • they are not suitable for persistent services (web servers, databases)
  • jobs that run for many months require special attention 

Hardware Infrastructure (September 2018)

Cluster Configuration

  • 137 standard nodes
    • Intel Xeon 6138 20-core 2.0 GHz (40 cores)
    • 192 GB RAM
  • 33 high memory nodes
    • Intel Xeon 6138 20-core 2.0 GHz (40 cores)
    • 384 GB RAM
  • 2 large compute nodes
    • Intel Xeon 6130 16-core 2.1 GHz (64 cores)
    • 768 GB RAM
  • 1 very large compute node
    • Intel Xeon Platinum 8160 24-core 2.1 GHz (96 cores)
    • 1.5 TB RAM
  • 2 GPU nodes
    • Intel Xeon 6138 20-core 2.0 GHz (40 cores)
    • 384 GB RAM
    • 4 x NVIDIA Tesla V100 32GB SXM2
  • 2 login nodes
    • Intel Xeon 6138 20-core 2.0 GHz (40 cores)
    • 192 GB RAM
  • High speed interconnect
    • Mellanox EDR 100 Gb Infiniband with 2:1 over-subscription
  • High performance filestore
    • Lustre filestore
      • 2,556 TB usable capacity
      • sustained 12 GB/sec write performance
    • 48 TB of NVME-backed burst-buffer filesystem
      • sustained 18 GB/sec write performance
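
As a rough sketch of how this hardware maps onto job requests, the Slurm script below asks for one of the GPU nodes described above; the GPU resource string and memory figure are assumptions for illustration and the real values on Viking may differ:

    #!/bin/bash
    # Illustrative sketch: requesting resources on a GPU node.
    # The gres string is an assumption - check the local documentation.
    #SBATCH --job-name=gpu-test
    #SBATCH --ntasks=1
    #SBATCH --cpus-per-task=10
    #SBATCH --mem=80G
    #SBATCH --gres=gpu:1          # one of the four Tesla V100s on a GPU node
    #SBATCH --time=02:00:00

    nvidia-smi                    # report the GPU allocated to the job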

Slurm Workload Manager Queues

Queue name | Maximum job time | Nodes available | Total cores available | Node config - cores - memory

In addition to the above, there are a number of departmental queues with a range of job time limits, some up to 80 days in length.

Please note: you should not specify the queue in your job script; the grid engine will select the most appropriate queue for your job.
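
For instance, a minimal job script might look like the sketch below; note that it contains no --partition line, leaving the choice of queue to the scheduler (the executable is a placeholder):

    #!/bin/bash
    # Minimal sketch of a job script - there is deliberately no --partition/-p
    # directive, so the scheduler chooses the most appropriate queue itself.
    #SBATCH --job-name=example
    #SBATCH --ntasks=1
    #SBATCH --cpus-per-task=4
    #SBATCH --mem=8G
    #SBATCH --time=01:00:00

    ./my_program                  # placeholder for your own executable

Such a script would be submitted from a login node with sbatch example.job.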

Please contact andrew.smith@york.ac.uk if you have any comments, or suggestions, on the configuration of the queues.

Software Infrastructure

Operating System - CentOS Linux 7.3 (http://www.centos.org/)

Grid Engine - Slurm Workload Manager 18.08.4 (https://slurm.schedmd.com/)

Installed Software

To be updated
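
In the meantime, if Viking uses an environment-modules setup (a common arrangement on HPC clusters, assumed here rather than confirmed on this page), installed software could be browsed from a login node along these lines:

    # List all available software modules (assumes an environment-modules setup)
    module avail

    # Narrow the listing to a particular package, then load it
    module avail gcc
    module load gcc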

How the Grid Engine Works