Acquiring a user account
For participants of the workshop, please skip to Accessing Chalawan.
The process below describes how to register for an account. For further information please contact hpc@narit.or.th.
REGISTER
Register for an account via online registration. We will review your information and activate your account within one working day. After the account is activated, you may log in and fill in an application form.
APPLY
Fill in the online application form and submit it. Once completed, you should receive notification of our decision within a week, along with any further queries regarding required software and setup.
LOG IN
Once your account and required setup are ready, we will send you an email with your account details and instructions on how to log in to the Chalawan cluster.
SUBMIT
You can learn how to submit a job, as well as how to upload/download your files to/from the Chalawan cluster, on this page. Now you can launch your first job.
MONITOR
You can monitor the Chalawan cluster status and utilization in the user-login area, where you can also monitor the status of your submitted jobs.
The command-line interface
Our operating system is based on GNU/Linux. Thus, a command-line interface (CLI), or command language interpreter, is the primary means of interaction with our HPC. If you are not familiar with the command-line interface, the free online course at Codecademy is a good place to start.
Accessing the Chalawan
For Microsoft Windows users, see Connect to the Remote Server from Microsoft Windows.
The Chalawan cluster is an isolated system which resides in NARIT's internal network. At present, we have two systems, Castor and Pollux (hereafter the computing systems).
- Castor is the older system, assigned the IP address 192.168.5.100. It contains 16 traditional compute nodes suited for CPU-intensive tasks.
- Pollux is the newer system, assigned the IP address 192.168.5.105. It contains 3 GPU nodes and 3 traditional compute nodes which have been refurbished from Castor.
If you are using the internet inside the NARIT network, you can directly connect to these systems via the secure shell (ssh) command, as shown below.
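For example, connecting directly to Castor from a machine inside the NARIT network looks like this (user is a placeholder for your actual username):

[user@local ~]$ ssh user@192.168.5.100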
Connection from outside NARIT network
However, if you are using the internet outside NARIT, you need to log in to the gateway machine, a.k.a. stargate, first. The gateway machine's IP address and other information are given to you once you are granted permission to access the Chalawan cluster.
Secure shell (ssh) through an intermediate host (the gateway)
This is the easiest method, using the ProxyJump directive. If this method doesn't work for you because you are using a very old version of ssh, please read the next section.
To use ProxyJump, simply add the flag -J followed by user@gateway.ip:port. The example below shows how to connect to Castor (don't forget to replace gateway.ip and port with the information given in the email).
[user@local ~]$ ssh -J user@gateway.ip:port user@192.168.5.100
If this is the first time you connect to our system, the gateway will prompt you to change the given password. Please remember that the passwords on the gateway and the computing systems are not synchronised. After the password on the gateway is changed, please continue to log in to the computing system with the given (old) password. You can then change your password on the computing system with the command passwd.
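A passwd session on the computing system typically looks like the following (the exact prompts may vary with the system):

[user@castor ~]$ passwd
Changing password for user user.
Current password:
New password:
Retype new password:
passwd: all authentication tokens updated successfully.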
If the command is correct, it will ask you for the gateway password and then the computing system's password. A successful connection prints output like this:
Last login: Wed Dec 26 00:00:00 2016 from 192.168.2.245
[user@castor ~]$
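If you connect frequently, you can record the jump host in your OpenSSH client configuration instead of typing -J every time. A minimal sketch of ~/.ssh/config, where the alias castor is our own choice and gateway.ip and port come from the email:

Host castor
    HostName 192.168.5.100
    User user
    ProxyJump user@gateway.ip:port

Afterwards, ssh castor behaves like the full command above.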
The naive method
If you have an older version of ssh, this method always works. First, log in to the gateway machine with the given port number:
[user@local ~]$ ssh -p [port] user@gateway.ip
user@gateway.ip's password:
Welcome to NARIT Remote
Stay ethical. Stay legal. Have fun. :-)
user@RemoteSSH:~$
After that, log in to the computing system (in this case, Castor):
user@RemoteSSH:~$ ssh user@192.168.5.100
Password:
Last login: Wed Apr 1 10:10:10 2015 from 192.168.2.240
[user@castor ~]$
Transferring data
Direct connection
These commands are applicable if you are using the internet inside the NARIT network. Otherwise, please read the next section.
Rsync
Rsync is a file transfer program capable of efficient remote update via a fast differencing algorithm. It is the recommended command for transferring files or directories.
rsync [options] source destination
For example,
[user@research241 ~]$ rsync -avP ./src user@192.168.5.100:~/src
will copy the directory src from the local machine to the remote machine, Castor, at ~/src (-a archive mode, -v verbose, -P progress and partial transfers).
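The transfer also works in the opposite direction; for instance, pulling a directory from Castor back to the local machine (the directory name results is purely illustrative):

[user@research241 ~]$ rsync -avP user@192.168.5.100:~/results ./results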
Secure copy (scp)
scp is a basic command used to transfer files. For more information about scp, run man scp or scp -h. Note that the option -r copies recursively (required for directories).
scp [-r] [option] source destination
Transferring a file from a remote machine (in this case, Castor) to the local machine.
[user@local ~]$ scp user@192.168.5.100:/REMOTE/DIR/FILE /LOCAL/DIR
Transferring multiple files from a remote machine to the local machine.
[user@local ~]$ scp user@192.168.5.100:/REMOTE/DIR/{FILE1,FILE2,...} /LOCAL/DIR
Transferring a file from the local machine to a remote machine.
[user@local ~]$ scp /LOCAL/DIR/FILE user@192.168.5.100:/REMOTE/DIR
Through an intermediate host
Rsync through an intermediate host
This method requires OpenSSH version 7.3 or newer (to check, run ssh -V). If you are using an older version, or for the full details on this method, see Rsync files via intermediate host.
Rsync through the gateway machine is simple. You can use the -J (ProxyJump) option.
rsync -av -e 'ssh -A -J USER@GATEWAY.ADDRESS:PORT' file destination
For example
[user@local ~]$ rsync -avP -e 'ssh -A -J user@gateway.ip.address:port' ./src user@192.168.5.100:~/src
will copy the directory src from the local machine (your laptop or PC) through the gateway machine to the remote machine, Castor, at /home/user/src. Don't forget to replace user@gateway.ip.address:port with the actual information from our email.
SCP through an intermediate host
Like rsync, this method requires OpenSSH version 7.3 or newer (to check, run ssh -V). If you are using an older version, or for the full details on this method, see the rsync section above.
scp -o ProxyJump=USER@GATEWAY.ADDRESS:PORT file destination
For example
[user@local ~]$ scp -o ProxyJump=user@gateway.ip.address:port ./file user@192.168.5.100:~/
Using Module Environments
The Modules package provides dynamic modification of the user's environment via modulefiles. A list of all available modules is displayed when typing the command module avail or module av:
[user@castor ~]$ module av
----------------------------------------- /share/apps/modulefiles ------------------------------------------
astrometry.net   MUSIC   ROOT
To use a module of a library or software, type the module load command; several modules can be loaded at once:

module load name1 name2 ...

To unload a module, use module unload modulefile.
If you would like to know which modules have been loaded, you can use module list. Sometimes it is convenient to remove all the loaded modules, which can be done with module purge:
[user@castor ~]$ MUSIC
-bash: MUSIC: command not found
[user@castor ~]$ module load MUSIC
[user@castor ~]$ MUSIC

 __  __     __  __     ______     __     ______
/\ "-./  \   /\ \/\ \   /\  ___\   /\ \   /\  ___\
\ \ \-./\ \  \ \ \_\ \  \ \___  \  \ \ \  \ \ \____
 \ \_\ \ \_\  \ \_____\  \/\_____\  \ \_\  \ \_____\
  \/_/  \/_/   \/_____/   \/_____/   \/_/   \/_____/

this is music! version 1.53

[user@castor ~]$ module list
Currently Loaded Modulefiles:
  1) rocks-openmpi   2) MUSIC
[user@castor ~]$ module unload MUSIC
[user@castor ~]$ module list
Currently Loaded Modulefiles:
  1) rocks-openmpi
[user@castor ~]$ module purge
[user@castor ~]$ module list
No Modulefiles Currently Loaded.
Some modules may conflict with a currently loaded module, in which case you have to swap them with module swap. For more details, run module --help or module -H.
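A usage sketch (module_a and module_b are placeholders, not actual modules):

[user@castor ~]$ module swap module_a module_b

This unloads module_a and loads module_b in a single step.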
Enhancement of module on Pollux
We have adopted Lmod as the new environment module system. It provides an alternative minimalist command syntax and solves the module hierarchy problem. Let's see the output when running the command module avail or ml av:
[user@pollux ~]$ ml av
------------------------------- /opt/ohpc/pub/moduledeps/gnu8-openmpi3 -------------------------------
   adios/1.13.1    hypre/2.15.1   mumps/5.1.2            petsc/3.10.3     py2-mpi4py/3.0.0   scalasca/2.4         tau/2.28
   boost/1.69.0    imb/2018.1     netcdf-cxx/4.3.0       phdf5/1.10.4     py2-scipy/1.2.1    scorep/4.1           trilinos/12.12.1
   dimemas/5.3.4   libsharp       netcdf-fortran/4.4.5   pkdgrav3/3.0.2   py3-mpi4py/3.0.0   sionlib/1.7.2
   extrae/3.5.2    mfem/3.4       netcdf/4.6.2           pnetcdf/1.11.0   py3-scipy/1.2.1    slepc/3.10.2
   fftw/3.3.8      mpiP/3.4.1     opencoarrays/2.2.0     ptscotch/6.0.6   scalapack/2.0.2    superlu_dist/6.1.1
------------------------------------ /opt/ohpc/pub/moduledeps/gnu8 ------------------------------------
   DTFE/1.1.1    class/2.7.2   healpix_f90/3.31   metis/5.1.0   openblas/0.3.5       py2-numpy/1.15.3   superlu/5.2.1
   R/3.5.2       gsl/2.5       impi/2019.1.144    mpich/3.3     openmpi3/3.1.3 (L)   py3-numpy/1.15.3
   blas/3.8.0    gsl/2.6 (D)   lapack/3.8.0       mvapich2/2.3  pdtoolkit/3.25       scotch/6.0.6
   cgal/4.13.1   hdf5/1.10.4   likwid/4.3.3       ocr/1.0.1     plasma/2.8.0         sextractor/2.25.0
-------------------------------------- /opt/ohpc/pub/modulefiles --------------------------------------
   2LPTic/2018       anaconda/3.7 (D)     cuda/10.0       gnu8/8.3.0 (L)     magpie/2.1/spark/2.4   prun/1.3 (L)
   EasyBuild/3.8.1   autotools (L)        cuda/10.1 (D)   hwloc/2.0.3        mkl/2019.3             python/2.7
   IDL/7.0           charliecloud/0.9.7   gnu/5.4.0       intel/19.0.1.144   ohpc (L)               python/3.6 (D)
   N-GenIC/2003      clustershell/1.8     gnu4/4.8.5      intel/2015 (D)     papi/5.6.0             singularity/3.1.0
   anaconda/2.7      cmake/3.13.4         gnu7/7.3.0      llvm5/5.0.1        pmix/2.2.2             valgrind/3.14.0

  Where:
   D: Default Module
   L: Module is loaded

Use "module spider" to find all possible modules.
Use "module keyword key1 key2 …" to search for all possible modules matching any of the "keys".
General Modulefiles are kept in /opt/ohpc/pub/modulefiles, while OpenHPC's Modulefiles are kept in a hierarchy of directories. Modulefiles in /opt/ohpc/pub/moduledeps/gnu8 appear only when the Modulefile gnu8 is loaded. This directory holds all the software compiled with GNU Compiler Collection (GCC) version 8.
The next directory follows the same pattern. Modulefiles in /opt/ohpc/pub/moduledeps/gnu8-openmpi3 appear only when openmpi3/3.1.3 in /opt/ohpc/pub/moduledeps/gnu8 is loaded. All software contained in this directory is compiled with GCC 8 and OpenMPI 3.
Lmod and OpenHPC organise all Modulefiles this way. It is much easier than scanning a terminal screen trying to find some specific software compiled with a particular compiler. For the details, see the official user guide.
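A sketch of how the hierarchy unfolds in practice, using module names from the listing above:

[user@pollux ~]$ ml gnu8          # modules in /opt/ohpc/pub/moduledeps/gnu8 become visible
[user@pollux ~]$ ml openmpi3      # modules in /opt/ohpc/pub/moduledeps/gnu8-openmpi3 appear as well
[user@pollux ~]$ ml fftw/3.3.8    # load a library built with exactly this compiler/MPI pair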
Introducing a minimal style
Lmod provides an effortless alternative for those who frequently misspell module as "moduel" or "mdoule". On Pollux, Lmod reduces module-related commands to a few characters, while you can still use the old-fashioned ones. For example, you can either call the module list command or simply ml.
Similarly, instead of the command module avail, which prints all available Modulefiles, you can use ml av. Note that this command doesn't display all possible Modulefiles; module spider will do the job (we will explain it in the next topic). Lmod marks a default Modulefile with (D) and an in-use module with (L).
Besides, you may load a Modulefile with the command ml module_a and unload another one with ml -module_b. It is possible to combine these commands into one:
[user@pollux ~]$ ml module_a -module_b
By default, Lmod always loads the ohpc (OpenHPC) module so you can access the Modulefiles contained in /opt/ohpc/pub/moduledeps/. Otherwise, if you run ml purge, you will only see the general Modulefiles.
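A short sketch of this behaviour (the exact Lmod messages may differ by version):

[user@pollux ~]$ ml purge    # unloads everything, including ohpc
[user@pollux ~]$ ml av       # only /opt/ohpc/pub/modulefiles is shown now
[user@pollux ~]$ ml ohpc     # restores the OpenHPC defaults and the moduledeps trees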