Introduction
There are a number of different computing facilities available at the University of York. Have you found any of the following when doing research on your own computer?
- Your workload is taking a long time to run (>6hrs)
- It uses all your machine's resources (compute cores or memory)
- You need lots of memory
- You need GPUs
- You are either using or producing a lot of data
- You think you could cut your job into smaller chunks and process them at the same time
- You know you want to analyse larger datasets in the future.
We have a few different machines to use when you have these problems: individual large machines known as the research and teaching servers, and the Viking compute cluster, a connected group of hundreds of machines. Here we will give you a very brief introduction to accessing these machines.
The research and teaching servers
These servers are also known as the Linux Managed Service, or LMS for short. There are currently four research servers (research0, research1, research2 and research3) and two teaching servers (teaching0 and teaching1). Detailed information on the server specifications can be found here. These machines are similar to the desktop you may have at home or in your office, but with many more compute cores and much more memory. This means that work your local machine is struggling with may run easily on one of them. You can log on to these machines from anywhere on campus, or from off campus if you use the Virtual Private Network (VPN) or SSH gateway service. Some caveats:
- They are shared machines, which means a number of users may be logged on at the same time
- They run Linux so you need a little bit of Linux command line knowledge to get started
- If you are an undergraduate you will only have access to the teaching servers
- They get rebooted on the first Tuesday of every month
Exercise 1 - Logging into the research or teaching servers.
There are different ways to log in to the LMS depending on which operating system you are running. We will break down the different options here.
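On Linux, macOS, or a version of Windows with an OpenSSH client installed, one option is the ssh command. The sketch below only assembles and prints the login command rather than running it. The username abc123 and the full hostname research0.york.ac.uk are assumptions for illustration, so substitute your own username and the server name given in the official documentation.

```shell
# Hypothetical values for illustration: replace with your own details.
USERNAME="abc123"     # your University username (assumed format)
SERVER="research0"    # one of the research or teaching servers
DOMAIN="york.ac.uk"   # assumed domain for the full hostname

# Assemble the login command and print it so it can be checked first.
LOGIN_COMMAND="ssh ${USERNAME}@${SERVER}.${DOMAIN}"
echo "${LOGIN_COMMAND}"
```

Typing the printed command into a terminal should prompt you for your password. From off campus, the VPN or SSH gateway service mentioned earlier is needed first.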
Using the research/teaching servers with the Linux command line interface
Once you have successfully logged into the research or teaching servers, they may look very different to what you are used to, particularly if you are used to using Windows. Please do not let this put you off. The research computing team has helped many people who had never used the Linux command line before to use these computers. It takes a bit of getting used to, but the more you use it the easier and quicker it will become.
The command line, or shell, has been the major interface for the Unix/Linux operating system since it was first conceived in the late 1960s. The shell allows interaction with the operating system through a text-based interface, rather than the graphical interface you are used to. While a graphical interface is easy to learn, and usually makes simple things easy to do, it can make complex things hard, such as operating on large numbers of files or making different tools work together. The shell can be harder to learn, but it is much more powerful and flexible than most graphical interfaces, so it can be very useful for research, where we often want to try new things on large data sets.
In this tutorial, we will only scratch the surface of the shell's features, just to get you started, but we will note some further features at the end of the tutorial that you may want to look into.
The user starts the shell by logging into the computer with a userid and password:
******************************************************************************
*** THE UNIVERSITY OF YORK IT SERVICES ***
*** ***
*** THIS IS A PRIVATE COMPUTER ***
*** UNAUTHORISED ACCESS STRICTLY PROHIBITED ***
******************************************************************************
login: user001
password:
Last login: Mon Sep 8 14:12:44 2014 from gallifrey.york.ac.uk
-bash-4.1$
The last line is a command prompt, which is how the computer tells you that it is ready to accept a command from you. If you do not see the prompt, the computer is probably still executing the last command you typed. The user types commands which take the form:
program [ options ] [ arguments ]
Roughly speaking, program is the name of the program we want to run, arguments are objects we want to process (typically data files or folders), and options modify how the program will run. Options to a command are usually preceded by a '-' or '--' to differentiate them from arguments. The following exercise demonstrates using the echo program with a series of arguments and the ls program with or without options.
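As a minimal sketch of the command form described above, the following lines run echo with arguments and ls with and without options. The directory / is used only because it exists on every Linux machine.

```shell
# Program with arguments only: echo prints its arguments back.
echo hello world

# Program with no options or arguments: list the current directory.
ls

# Program with an option: -l asks ls for a long, detailed listing.
ls -l

# Program with an option and an argument: list everything in /,
# including hidden entries (-a).
ls -a /
```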
Exercise 2 - Running commands in the Linux shell
Displaying and editing the contents of files
There are a variety of different tools to help you display and edit the contents of your files. We will provide some examples below but you may find other ones which you prefer to use in the future.
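The sketch below shows a few commonly available tools for displaying files, applied to a small example file. The file name, its contents and the temporary directory are hypothetical and are created only for this illustration.

```shell
# Create a small example file in a temporary directory (hypothetical data).
WORKDIR="$(mktemp -d)"
printf 'line one\nline two\nline three\nline four\n' > "${WORKDIR}/example.txt"

# cat prints the whole file.
cat "${WORKDIR}/example.txt"

# head shows the first lines; -n sets how many.
head -n 2 "${WORKDIR}/example.txt"

# tail shows the last lines.
tail -n 1 "${WORKDIR}/example.txt"

# wc -l counts the lines in the file.
wc -l < "${WORKDIR}/example.txt"
```

For longer files, an interactive pager such as less lets you scroll through the contents, and a simple editor such as nano can be used to change them, assuming they are installed on the machine you are using.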
Exercise 3 - Displaying the contents of files
Being a good citizen: check for others using the machine before running large jobs:
htop
Running long jobs in the background?
nohup/nice/&?
Ctrl+Z / bg?
(Or just say, if you need to run long-running jobs, use Viking…?)
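If the background options noted above are covered, a minimal sketch might look like the following. The sleep command is a stand-in for a real workload, and the log file location is an assumption made for this illustration.

```shell
# "sleep 2" stands in for a real long-running workload (hypothetical).
LOGFILE="$(mktemp)"

# nohup keeps the job running after you log out, nice -n 10 lowers its
# priority so other users are less affected, and & runs it in the background.
nohup nice -n 10 sleep 2 > "${LOGFILE}" 2>&1 &
JOB_PID=$!
echo "Started background job with process ID ${JOB_PID}"

# kill -0 sends no signal; it only checks that the process still exists.
if kill -0 "${JOB_PID}" 2>/dev/null; then
    echo "Job is still running"
fi

# Wait for the job here only so that this example finishes cleanly.
wait "${JOB_PID}"
JOB_STATUS=$?
echo "Job finished with exit status ${JOB_STATUS}"
```

In an interactive session you would normally not wait for the job; you could log out and check the log file later.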
Copying files and directories remotely
You may need to copy files from your machine at home to one of the research/teaching servers. There are a number of ways to do this.
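One common way, where an OpenSSH client is available, is the scp command. The sketch below only prints the commands rather than running them. The username, hostname, and the file and directory names (results.csv, my_project) are hypothetical placeholders.

```shell
# Hypothetical values for illustration: replace with your own details.
REMOTE="abc123@research0.york.ac.uk"   # assumed username and hostname

# Copy one file into your home directory on the server.
COPY_FILE="scp results.csv ${REMOTE}:~/"

# Copy a whole directory; -r means recursive.
COPY_DIR="scp -r my_project ${REMOTE}:~/"

# Print the commands so they can be checked before being run by hand.
echo "${COPY_FILE}"
echo "${COPY_DIR}"
```

The commands are run on your own machine, not on the server. Swapping the two paths copies in the other direction, from the server back to your machine.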
Exercise 4 - Copying files from your machine to the research machines
More features
This has been a very basic introduction to the command line, just to get you started. You may also want to look up the following features:
- Using pipes (|) to pass the output of one tool as input to the next, allowing you to make new tools by combining existing ones
- Redirecting input and output to files with < and >
- Using wildcard characters such as * to refer to many files or directories at once
- Writing scripts: saving a series of commands to a text file and then running the file as a program
- Variables and options for environment customisation
- Command-line editing
- Command history (quick access to previous commands)
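To give a flavour of some of the features listed above, here is a minimal sketch combining wildcards, pipes, redirection and a variable. The file names and their contents are hypothetical and are created in a temporary directory.

```shell
# Work in a temporary directory with a few hypothetical data files.
WORKDIR="$(mktemp -d)"
cd "${WORKDIR}" || exit 1
printf 'apple\nbanana\napple\n' > fruit1.txt
printf 'cherry\napple\n' > fruit2.txt

# Wildcard: fruit*.txt matches every file whose name starts with "fruit"
# and ends in ".txt".
ls fruit*.txt

# Pipes: pass the combined contents to sort, then count each distinct line.
cat fruit*.txt | sort | uniq -c

# Output redirection: save the sorted, de-duplicated lines to a new file.
cat fruit*.txt | sort -u > unique.list

# Input redirection: wc reads the file through < rather than as an argument.
wc -l < unique.list

# Variable: store a command's output and reuse it later.
COUNT="$(wc -l < unique.list)"
echo "Number of distinct fruits: ${COUNT}"
```

Saving these lines to a text file and running it with a shell is the starting point for writing scripts.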