
Introduction to NERSC/Cori


source: https://docs.nersc.gov/

System Specification

| System Partition | Processor | Clock Rate | Physical Cores Per Node | Threads/Core | Sockets Per Node | Memory Per Node |
|---|---|---|---|---|---|---|
| Login | Intel Xeon Processor E5-2698 v3 | 2.3 GHz | 32 | 2 | 2 | 515 GB |
| Haswell | Intel Xeon Processor E5-2698 v3 | 2.3 GHz | 32 | 2 | 2 | 128 GB |
| KNL | Intel Xeon Phi Processor 7250 | 1.4 GHz | 68 | 4 | 1 | 96 GB (DDR4), 16 GB (MCDRAM) |
| Large Memory | AMD EPYC 7302 | 3.0 GHz | 32 | 2 | 2 | 2 TB |

Node Specifications
Login Nodes


Accessing Cori

ssh -X <user>@cori.nersc.gov

Password prompt: enter your NERSC password followed immediately by your OTP (no space in between).

To set up OTP: follow the MFA setup instructions in the NERSC documentation.
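
If you connect often, a standard OpenSSH config entry saves retyping the options; this is just a convenience sketch, and the alias name "cori" below is arbitrary:

# ~/.ssh/config on your local machine
Host cori
    HostName cori.nersc.gov
    User <user>
    ForwardX11 yes

With this in place, ssh cori behaves like the full command above.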


Data Transfer Nodes


Cori SCRATCH

Cori scratch is a Lustre file system designed for high performance temporary storage of large files. It is intended to support large I/O for jobs that are being actively computed on the Cori system.

Usage

The /global/cscratch1 file system should always be referenced using the environment variable $SCRATCH (which expands to /global/cscratch1/sd/YourUserName). The scratch file system is available from all nodes and is tuned for high performance.
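
For example, a run directory can be created under $SCRATCH like this (the directory name my_run is just an illustration):

echo $SCRATCH              # expands to /global/cscratch1/sd/YourUserName
mkdir -p $SCRATCH/my_run   # create a scratch working directory for a job
cd $SCRATCH/my_run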

Quotas

If your $SCRATCH usage exceeds your quota, you will not be able to submit batch jobs until you reduce your usage. The batch job submit filter checks the usage of /global/cscratch1.

Note that the quota on the Community File System and on Global Common is shared among all members of the project, so showquota/cfsquota will report the aggregate project usage and quota (see the example after the table below).

| File system | Space | Inodes | Purge time | Consequence for Exceeding Quota |
|---|---|---|---|---|
| Community | 20 TB | 20 M | - | No new data can be written |
| Global HOME | 40 GB | 1 M | - | No new data can be written |
| Global common | 10 GB | 1 M | - | No new data can be written |
| Cori SCRATCH | 20 TB | 10 M | 12 weeks | Can’t submit batch jobs |
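
To check your current usage against these limits, the quota commands mentioned above can be run on a login node; the exact arguments and output format may differ from this sketch:

showquota    # usage and quota for your home, scratch, and project spaces
cfsquota     # aggregate usage and quota for your CFS project(s)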

Slurm (Running Jobs)

NERSC uses Slurm for cluster/resource management and job scheduling. Slurm is responsible for allocating resources to users, providing a framework for starting, executing and monitoring work on allocated resources and scheduling work for future execution.

Additional Resources

Submitting jobs

sbatch

sbatch is used to submit a job script for later execution. The script will typically contain one or more srun commands to launch parallel tasks.

When you submit the job, Slurm responds with the job’s ID, which will be used to identify this job in reports from Slurm.

$ sbatch first-job.sh
Submitted batch job 864933 
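
For reference, a minimal sketch of what a script like first-job.sh could contain; the QOS, time limit, and task count are placeholders:

#!/bin/bash -l
#SBATCH --qos=debug              # placeholder QOS
#SBATCH --nodes=1
#SBATCH --constraint=haswell
#SBATCH --time=00:05:00
#SBATCH --account=m2467          # project account used elsewhere on this page

srun -n 32 hostname              # one task per physical Haswell core

After submission, squeue -u $USER (or the job ID printed by sbatch) shows the job's state in the queue.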

salloc

salloc is used to allocate resources for a job in real time as an interactive batch job. Typically this is used to allocate resources and spawn a shell. The shell is then used to execute srun commands to launch parallel tasks.
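
A sketch of a typical salloc session (node count, constraint, and time limit are illustrative):

salloc --nodes=1 --constraint=haswell --qos=interactive --time=00:30:00
# salloc spawns a shell on the allocation; inside that shell:
srun -n 32 hostname    # launch a job step on the allocated node
exit                   # release the allocation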

srun

srun is used to submit a job for execution or initiate job steps in real time. A job can contain multiple job steps executing sequentially or in parallel on independent or shared resources within the job’s node allocation. This command is typically executed within a script which is submitted with sbatch or from an interactive prompt on a compute node obtained via salloc.
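
As an illustration, a batch script can run job steps one after another or side by side within the same allocation (the program names here are placeholders):

#!/bin/bash -l
#SBATCH --nodes=2
#SBATCH --constraint=haswell
#SBATCH --time=00:30:00

srun -N 2 -n 64 ./preprocess    # step 1: uses both nodes
srun -N 1 -n 32 ./solver_a &    # steps 2 and 3: run in parallel,
srun -N 1 -n 32 ./solver_b &    # one node each, inside the allocation
wait                            # wait for the parallel steps to finish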

How to write an sbatch script? Link
How to request an interactive node? Link


Sample scripts (Bharat)

Public Webpage

path: /project/projectdirs/m2467/www/bharat/ (cd into this directory to add files)
webpage: https://portal.nersc.gov/project/m2467/

If your files are not visible on the webpage, make them world-readable by running the following in that directory: chmod 755 *
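
Putting the two together, a hypothetical workflow for publishing a plot (my_plot.png is an example file name):

cp my_plot.png /project/projectdirs/m2467/www/bharat/
cd /project/projectdirs/m2467/www/bharat/
chmod 755 *    # make the files readable by the web server
# the file should then appear under https://portal.nersc.gov/project/m2467/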

Submitting a parallel job

#!/bin/bash -l
#SBATCH --qos=premium
#SBATCH --nodes=6
#SBATCH --ntasks-per-node=32
#SBATCH --time=02:00:00
#SBATCH --licenses=SCRATCH  # specify licenses for the file systems your job needs, e.g. SCRATCH,project
#SBATCH --job-name='calculation of anomalies of GPP for CESM2 First run'
#SBATCH --account=m2467
#SBATCH --output=o.log
#SBATCH --error=e.log
#SBATCH --mail-user=dk28nov@gmail.com
#SBATCH --mail-type=END,FAIL   # without a mail-type, Slurm sends no mail
#SBATCH -C haswell
echo 'Hello world!'

srun -n 192 --mpi=pmi2 python calc_anomalies_ssa_mpi.py -src 0 -var pr

Requesting an interactive node

salloc -N 1 -C haswell -q interactive -t 04:00:00

To avoid HDF5 file-locking errors, run the following in the interactive node's terminal: export HDF5_USE_FILE_LOCKING=FALSE
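
Putting it together, an interactive session for the analysis script above might look like this (the task count and script arguments are copied from the batch example; adjust as needed):

salloc -N 1 -C haswell -q interactive -t 04:00:00
export HDF5_USE_FILE_LOCKING=FALSE
srun -n 32 --mpi=pmi2 python calc_anomalies_ssa_mpi.py -src 0 -var pr
exit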

SCRATCH

cd $SCRATCH (equivalent to cd /global/cscratch1/sd/bharat)

CMIP6 Data

/global/cfs/cdirs/m3522/cmip6/CMIP6

Project directory on the Community File System (CFS); use this to share data with other project members:

/global/cfs/cdirs/m2467
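
For example, input can be browsed in the CMIP6 archive and results staged to the project CFS directory for sharing (anomalies.nc is a hypothetical output file; the CMIP6 subdirectory layout is not shown here):

ls /global/cfs/cdirs/m3522/cmip6/CMIP6/                    # browse the CMIP6 archive
cp $SCRATCH/my_run/anomalies.nc /global/cfs/cdirs/m2467/   # share a result with the project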


Backups

Snapshots

Global homes and Community use a snapshot capability to provide users a seven-day history of their directories. Every directory and sub-directory in global homes contains a “.snapshots” entry.
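
For example, to recover an accidentally deleted file from a home-directory snapshot (the snapshot names and myfile.txt are illustrative; list .snapshots to see what is actually available):

ls ~/.snapshots/                                  # list available snapshots
cp ~/.snapshots/<snapshot-date>/myfile.txt ~/     # restore a hypothetical file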


JupyterHub

JupyterHub provides a multi-user hub for spawning, managing, and proxying multiple instances of single-user Jupyter notebook servers. At NERSC, you authenticate to the JupyterHub instance managed by NERSC using your NERSC credentials and one-time password. NERSC’s JupyterHub service is available at https://jupyter.nersc.gov/