Welcome to Perimeter’s HPC system “Symmetry”

Symmetries are important in physics. Noether’s theorem states that every continuous symmetry of a physical system corresponds to a conservation law. In honour of this principle, Perimeter’s HPC system is called Symmetry.

Symmetry is intended to serve the needs of Perimeter researchers, filling the gap between personal devices such as laptops and desktops and large national systems such as those offered by Compute Canada. Each node of Symmetry is therefore significantly more powerful than a laptop, but cannot compete with a national system such as Graham or Niagara.

(This documentation is still under construction and should be completed within the next few days. Please report errors, omissions, and suggestions for this documentation to our help desk.)

Contact and help

As usual for all technical systems at Perimeter, the main channel to report issues and ask for assistance is our help desk.

For online discussions, there is a Gitter chat room, Computing at Perimeter. This chat room is not restricted to Symmetry; it is for all topics related to computational physics at Perimeter.

System description

Hardware

Symmetry consists of:

2 head nodes, which can be used for interactive work and to submit jobs. Each head node has 40 Intel Xeon Silver cores and 200 GigaBytes of memory (RAM).
76 compute nodes, which are designed to run compute-intensive applications. Each compute node has 40 Intel Xeon Gold (Skylake) cores and 200 GigaBytes of memory (RAM).
A file server hosting a GPFS file system offering 233 TeraBytes of space.
A high-performance InfiniBand network connecting the nodes and the file server.

There are various additional bits and pieces, primarily for administration, that are mostly invisible to general users.

Available Software

Symmetry provides a wide range of software. If you need additional software, you can request this via the help desk, or you can install it into your home directory.

Pre-installed software:

Ubuntu 16.04 LTS, with many scientific and development packages (e.g. FFTW, GCC compilers, GSL, LLVM compilers, OpenBLAS, OpenMPI, and many more)
Intel Parallel Studio XE, including C, C++, and Fortran compilers, OpenMP, MPI, debugger, profiler, etc.
Julia
Maple
Mathematica
Matlab
Python
Slurm resource manager (our queueing system)

Modules

Some of the software packages use Environment Modules. This means you need to load a module before the package is available. Use module avail to see which modules are available, and module load to load a module. There are also module list, module unload, and module help.
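A typical session might look like the sketch below. The module name slurm is the one used further down this page; any other name is something to look up with module avail first. The helper name module_session and the guard at the top are assumptions made for this example, so that the snippet fails cleanly on a machine without Environment Modules.

```shell
# Sketch of a typical module session, wrapped in a function for reuse.
# Argument: the name of the module to load, e.g. "slurm".
module_session() {
    if ! type module >/dev/null 2>&1; then
        echo "module command not found (are you logged in to Symmetry?)" >&2
        return 1
    fi
    module avail           # list the modules that can be loaded
    module load "$1"       # make the package available in this shell
    module list            # show the modules that are currently loaded
    module unload "$1"     # remove the package again
}
```

On Symmetry, module_session slurm would run the four commands in order. In everyday use you would normally type only the module load line and leave the module loaded.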

Python

Several Python versions are available, documented on this page.

Containers / Docker / Singularity

We provide the Singularity program for running containers on Symmetry.

Using Symmetry

Access

In principle, all researchers at Perimeter have access to Symmetry. Please contact the help desk to enable this access. It is probably a good idea to enable VPN and ssh access to Perimeter at the same time, because Symmetry is located behind Perimeter’s firewall and is not directly accessible from the outside.

There are two ways to access Symmetry: the traditional command-line way using ssh, and via a web browser using JupyterHub.

Access via ssh

To log in, use ssh USERNAME@symmetry. (Replace USERNAME with your user name.) This will ask for your Perimeter password. We recommend generating ssh keys and using an ssh keychain to allow password-less access. (Question: Where is a good tutorial for this?)

On Linux and macOS, ssh is pre-installed. On Windows, you might need to install a client such as PuTTY. (Question: Is there a better alternative to PuTTY?)

Access via JupyterHub

JupyterHub is documented here.

Remote Desktop / Mathematica / Matlab

In order to run graphical desktop applications on the Symmetry head nodes, we have a VNC server set up, described here.

Running jobs

While you can run jobs interactively on the head nodes, you need to be careful when doing so: head nodes are shared between all users on Symmetry. Do this only for tasks that do not need many resources. For example, compiling code or briefly testing a Julia or Mathematica notebook is probably fine. If in doubt, use a compute node instead.

If you overload a head node by using too much memory or too many threads, others might suffer, and an administrator might have to step in and abort your task. (Question: Is there a tutorial explaining how to use top to monitor one’s processes?)
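One way to keep an eye on your own processes is sketched below. It assumes the procps versions of ps and top that ship with Ubuntu; the helper name my_processes is made up for this example.

```shell
# List your own processes, heaviest CPU users first.
# Columns: process ID, %CPU, %memory, number of threads, command name.
# Optional argument: number of output lines including the header (default 11).
my_processes() {
    ps -u "$(id -un)" -o pid,pcpu,pmem,nlwp,comm --sort=-pcpu | head -n "${1:-11}"
}

my_processes        # header line plus the ten busiest processes
```

For a continuously updating view, top -u USERNAME shows similar information interactively; press q to quit. A process that stays near the top of the list with a large thread count or memory share is a candidate for moving to a compute node.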

Generally, we recommend running jobs on Symmetry’s compute nodes, as described below.

Slurm resource manager

To avoid conflicts when accessing the compute nodes, we use the Slurm resource manager (aka “scheduler” or “queueing system”). Slurm keeps track of which compute nodes are currently used by whom. If you want to use a certain number of compute nodes, you have to ask Slurm, and you might have to wait until the nodes are available before you can run your job.

The basic workflow is thus as follows:

You write a batch script (shell script) for your job. (Below are some examples.) This script defines which resources you want (e.g. “4 nodes for 7 days”), and also how to run your job.
You submit this script to Slurm via sbatch (see below for examples).
If the system is busy, your job might have to wait in the queue for some time. Slurm will try to be “fair” to all users (whatever that means). Your job’s priority is determined by several factors, including how much you have used Symmetry recently, and how many resources your job requests.
Slurm will run your job automatically (that’s what batch means). This generally does not work with notebooks (e.g. Mathematica, Jupyter). Instead, you need to write a script with a text editor (see below for examples).
After the job has finished, you examine its output, which was presumably written to a file.
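As a rough illustration of these steps, a minimal batch script could look like the sketch below. The job name, file names, and resource values are placeholders rather than Symmetry recommendations, and the function name describe_job is invented for this example; debugq is the short-job partition described in the Using Slurm section.

```shell
#!/bin/bash
# Minimal Slurm batch script (a sketch with placeholder values).
# Lines starting with #SBATCH are read by Slurm and ignored by the shell.
#SBATCH --job-name=hello
#SBATCH --partition=debugq
#SBATCH --nodes=1
#SBATCH --time=00:10:00
#SBATCH --output=hello-%j.out

# Slurm sets SLURM_JOB_ID inside a job; the fallback value lets the
# script also run outside Slurm for a quick test.
describe_job() {
    echo "job ${SLURM_JOB_ID:-none} on $(hostname)"
}

describe_job
```

Saved as hello.sh (a hypothetical file name), this would be submitted with sbatch hello.sh. Slurm replaces %j in the output file name with the job ID, so the line printed by the script ends up in a file such as hello-12345.out in the directory you submitted from.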

Using Slurm

After module load slurm, you can get an overall view of the system with sinfo. This might output a description like this:

$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
defq*        up 7-00:00:00      6 drain* cn[017,019-020,040,045,056]
defq*        up 7-00:00:00      1  drain cn002
defq*        up 7-00:00:00      1  alloc cn001
defq*        up 7-00:00:00     68   idle cn[003-016,018,021-039,041-044,046-055,057-076]
debugq       up    1:00:00      6 drain* cn[017,019-020,040,045,056]
debugq       up    1:00:00      1  drain cn002
debugq       up    1:00:00      1  alloc cn001
debugq       up    1:00:00     68   idle cn[003-016,018,021-039,041-044,046-055,057-076]

This means there are two partitions (queues) available, called defq (the default queue) and debugq (for debugging and short interactive jobs). defq allows jobs to run for up to 7 days, debugq for up to 1 hour. Not shown here is the fact that jobs in debugq have a much higher priority and will usually start before any jobs waiting in defq.

The description of the compute nodes is (unfortunately) repeated for both queues. Nodes in the drain state are not available; they are either reserved for administrative use (here cn002), or are unresponsive (here cn[017,019-020,040,045,056]; these nodes are presumably either being updated, or might be reporting a hardware issue). Nodes in the alloc state are currently in use, and nodes in the idle state are currently free.

The command squeue shows all jobs that are currently either waiting or running. squeue -u USERNAME (replace USERNAME with your user name) shows only your jobs.
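The sinfo listing gets long once nodes are in many different states. The sketch below condenses it to one line per state for a single partition. It assumes sinfo's default column layout as shown above (node count in the fourth column, state in the fifth); the function name nodes_by_state is made up for this example.

```shell
# Sum the NODES column of sinfo's default output per STATE for one partition.
# Reads the sinfo output on stdin; the partition name is the first argument.
nodes_by_state() {
    awk -v part="$1" '
        NR > 1 {                        # skip the header line
            name = $1
            sub(/\*$/, "", name)        # "defq*" marks the default partition
            if (name == part) count[$5] += $4
        }
        END { for (s in count) print s, count[s] }
    ' | sort
}
```

On Symmetry this would be used as sinfo | nodes_by_state defq. For the listing above it reports 1 alloc, 1 drain, 6 drain* and 68 idle nodes.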

Running jobs interactively

(JupyterHub, srun, reservations, …)

Running batch jobs

Slurm comes with extensive documentation and tutorials.

When running a job on Symmetry, you need to describe how many nodes and cores your job is requesting. Determining this correctly is not always straightforward:

First, you need to know whether your application can run across multiple nodes. Many applications cannot, because this is difficult to implement. Usually you will know whether your application supports this. For example, in Mathematica you need to use remote kernels to enable this. In Fortran, C, or C++, you need to use MPI or a similar mechanism. In Julia or Python programs, you also need to explicitly support using multiple processes.
You also need to know whether your application uses multiple threads. Even if your application does not support this explicitly, it might use a library that uses multi-threading. For example, Mathematica, Julia, and Python are not multi-threaded by default, but if you use linear algebra (e.g. operations on large matrices of floating-point numbers), then they might use multiple threads. In Fortran, C, or C++, you can use OpenMP for multi-threading. (Note that OpenMP and MPI/OpenMPI are very different, despite the very similar names.)
If your application uses multiple nodes, then it most likely will use all the cores on each node efficiently. This gives the highest performance, but is the most difficult to implement.
If your application uses a single node but is multi-threaded, then you should probably run only a single program on each compute node. This is the easiest case.
If your application is not multi-threaded, then it uses only a single core. Each node of Symmetry has many cores (up to 40). It thus makes sense to run multiple copies of your application at the same time on the same node, if there is enough memory available.
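For the last case, the number of copies that fit on a node is limited both by the core count and by memory. The sketch below works this out and starts the copies in the background. The 40 cores and 200 GB come from the hardware description above; the 8 GB per copy, the executable ./my_program, the output file names, and the function names are assumptions made for this example.

```shell
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --time=01:00:00

# How many single-threaded copies fit on one node.
# Arguments: cores per node, node memory (GB), memory per copy (GB).
max_copies() {
    local by_mem=$(( $2 / $3 ))
    if [ "$by_mem" -lt "$1" ]; then echo "$by_mem"; else echo "$1"; fi
}

# Start N copies of a command in the background, passing each copy its
# index (1..N) as the last argument, then wait until all have finished.
run_copies() {
    local n=$1 i=1
    shift
    while [ "$i" -le "$n" ]; do
        "$@" "$i" > "copy-$i.out" 2>&1 &
        i=$(( i + 1 ))
    done
    wait    # without this the batch script would exit and end the job early
}

# Only start the copies inside a Slurm job. ./my_program is a hypothetical
# single-threaded executable that needs about 8 GB of memory per copy.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    run_copies "$(max_copies 40 200 8)" ./my_program
fi
```

With these numbers memory is the limit: 200 / 8 = 25 copies rather than 40. In practice you would leave some memory for the operating system, so a slightly smaller number is safer.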

Here are some examples that might be useful for a quick start:

Note: The Slurm scripts contain path names pointing into my (eschnetter’s) home directory. You need to change this to point into a directory of yours, otherwise you will not see the output.

Running Mathematica on 1 node: A multi-threaded code (using linear algebra). Mathematica script, Slurm script, example output.
Running Mathematica on 1 node: A single-threaded code, running several independent Mathematica scripts simultaneously (e.g. a parameter scan). Mathematica script, Slurm script, example output.
Running Julia on several nodes: A multi-processing code (using Julia’s built-in multi-processing capabilities). Julia script, Slurm script, example output.
Running Julia on 1 node: A multi-threaded code (using linear algebra). Julia script, Slurm script, example output.
Running a C program on several nodes: A multi-processing MPI code. C code, build instructions, Slurm script, example output.
Running a C program on several nodes: A hybrid multi-processing, multi-threaded MPI/OpenMP code. C code, build instructions, Slurm script, example output.
Running a C program on 1 node: A multi-threaded OpenMP code. C code, build instructions, Slurm script, example output.
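The linked scripts are the reference for each case. Purely to illustrate the general shape, a Slurm script for the single-node OpenMP case might look like the sketch below. The resource values, the executable name ./my_openmp_program, and the function name omp_threads are assumptions for this example and are not taken from the linked scripts.

```shell
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=40
#SBATCH --time=01:00:00

# Number of OpenMP threads to use: the cores Slurm allocated to this task,
# or 1 when the script is run outside Slurm.
omp_threads() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

export OMP_NUM_THREADS="$(omp_threads)"
echo "running with $OMP_NUM_THREADS OpenMP threads"

# ./my_openmp_program    # hypothetical executable built with OpenMP support
```

Taking the thread count from SLURM_CPUS_PER_TASK keeps the program in step with the request: changing --cpus-per-task is then the only edit needed to run with fewer threads.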

File systems

(home directory, GPFS)