Castor uses Grid Engine as its cluster workload management software. Broadly, the software can be divided into two parts: the scheduler and the queuing system. To send a job to our cluster, users must submit a job script via the submission program. The scheduler reads the script, reserves resources according to the user's requirements, and calculates a priority from those requirements. Jobs that require fewer resources or less run time receive higher priority. The job is then placed in the queuing system, and jobs with higher priority are executed first.
Grid Engine can be used in two ways: through a command line interface (hereafter CLI) or a graphical user interface (GUI). In this tutorial, you will learn how to submit jobs via the CLI, request available resources, choose a parallel environment, delete failed or unwanted jobs, and inspect errors when a job fails. The following text is the introduction message displayed when the command `qmon` is run and then Ctrl + A is pressed.
Welcome User,
You are using Open Grid Scheduler/Grid Engine 2011.11p1 in cell ‘default’.
Grid Engine is based on code donated by Sun Microsystems. The copyright is owned by Sun Microsystems and other contributors. It has been made available to the open source community under the SISSL license.
The Open Grid Scheduler project is the maintainer of open source Grid Engine. For further information: http://gridscheduler.sourceforge.net/
Basic Job Submission
The command `qsub` followed by a job submission script is used to send a job to the cluster. The general format is as follows:
```
$ qsub [option] job_script [script-args]
```
The command below shows an example of job submission and the server response after successful submission.
```
$ qsub -V -b y -cwd script.sh
Your job 1 ("script.sh") has been submitted.
```
The `script.sh` is the script containing the commands we wish to run. It can be any kind of standard shell script, e.g. bash or csh. The flags, options marked with a minus (`-`) sign, can be interpreted as:

- `-V` passes all environment variables to the job
- `-b y` allows the command to be a binary file
- `-cwd` runs the job in the current working directory
The following is a qsub command which uses more optional parameters:
```
$ qsub -cwd -w e -N Test -pe smp 16 -o test.out -e test.err jobscript.sh
```
Grid Engine has a better way to deal with commands that use many optional parameters. Options can be embedded in the `script.sh` file itself as special comments starting with `#$`:
```bash
#!/bin/bash
#$ -cwd
#$ -w e
#$ -N Test
#$ -pe smp 16
#$ -o test.out
#$ -e test.err
mpiexec -np $NSLOTS python3 hello.py
```
Once we define the number of processes to use, it is stored in the variable `NSLOTS` (in this case, 16). Here `$NSLOTS` is used to tell `mpiexec` how many processes to start.
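As a small illustration of this mechanism, the sketch below (an assumed demo script, not part of the Castor setup) prints the slot count the scheduler hands the job. Outside Grid Engine `NSLOTS` is unset, so the script defaults it to 1 and can also be tried locally:

```shell
#!/bin/bash
#$ -cwd
#$ -N slots_demo
#$ -pe smp 16
# Under Grid Engine, NSLOTS holds the slot count requested with -pe (16 here).
# Outside the scheduler it is unset, so default to 1 for a local dry run.
slots=${NSLOTS:-1}
echo "Running with $slots slot(s)"
```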
Single Process Job
For task arrays, see Advanced Grid Engine: Task Arrays.
To submit a single-process job, the script should be written without the parallel environment (`-pe`) flag:
```bash
#!/bin/bash
program
```
Another way is to use the `-pe smp` flag but limit the number of processes to 1:
```bash
#!/bin/bash
#$ -pe smp 1
mpirun -n 1 program
```
Parallel Job
Here, we provide a brief explanation of how to submit single-process and multi-process (parallel) jobs. For a full explanation, see Advanced Grid Engine: Job Types. To submit a job with multiple processes, it is necessary to declare the parallel environment (`-pe`) flag followed by the parallel type.
Shared Memory
A shared memory job runs multiple processes that share memory on one machine. We simply define the parallel environment flag `smp` and the number of processes to use.
```bash
#!/bin/bash
#$ -pe smp 12
mpirun -n $NSLOTS program
```
Distributed Memory
For distributed memory, each process has its own memory and does not share it with any other. A distributed memory job can run across compute nodes but requires additional setup to specify how the processes are scattered over them. Suppose we want to run a job with 16 processes that spawns 4 processes on each compute node; we use:
```bash
#!/bin/bash
#$ -pe mpi4ppn 16
mpirun -n $NSLOTS program
```
To list the available parallel environments, run `qconf -spl`. To see a parallel environment's configuration, run `qconf -sp <pe-name>`.
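For illustration, listing the parallel environments with `qconf -spl` and showing one with `qconf -sp` might look roughly like the following (the field values shown are assumptions, not Castor's actual configuration):

```
$ qconf -spl
mpi4ppn
smp
$ qconf -sp smp
pe_name            smp
slots              999
allocation_rule    $pe_slots
```

Here `allocation_rule $pe_slots` is what makes `smp` a single-node environment: all slots of the job must come from one host.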
Hybrid Shared/Distributed Memory
Hybrid parallel jobs use distributed memory across compute nodes; within each compute node, the processes share memory. This is the most complex type of parallel job. First, we need to set the environment variable `OMP_NUM_THREADS`, which specifies how many OpenMP threads each process will spawn on its compute node. Suppose we want to run a hybrid job with 16 processes, where each compute node gets 4 processes that share memory; we write:
```bash
#!/bin/bash
#$ -pe mpi4ppn 16
export OMP_NUM_THREADS=4
mpirun -n $NSLOTS program
```
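To make this layout concrete, the numbers can be checked with a little shell arithmetic. This is a sketch under the assumed 4-processes-per-node placement of `mpi4ppn`:

```shell
# Hybrid layout sketch: 16 MPI processes, 4 per node, 4 OpenMP threads each.
export OMP_NUM_THREADS=4
nslots=16                 # total MPI processes requested with -pe mpi4ppn 16
procs_per_node=4          # processes placed on each compute node
nodes=$((nslots / procs_per_node))
threads_total=$((nslots * OMP_NUM_THREADS))
echo "$nodes nodes, $threads_total threads in total"
```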
Commonly Used Flags
The table below lists commonly used flags to add to the script before submitting a job. Another important flag, `-l`, is used to request available resources on our cluster; it is explained in the next topic.
| Flag | Argument | Description |
|---|---|---|
| `-N` | name | Name of the job. |
| `-V` | | All environment variables active within the qsub utility are exported to the context of the job. This option is activated by default. |
| `-cwd` | | Execute the job from the current working directory. If `-cwd` is not specified, the job runs from the user's home directory. |
| `-b` | y[es]\|n[o] | Gives the user the possibility to indicate explicitly whether the command should be treated as a binary or a script. |
| `-shell` | y[es]\|n[o] | With `-shell n`, no command shell is executed for the job. This option only applies when `-b y` is also used; without `-b y`, `-shell n` has no effect. |
| `-S` | path | Specifies the interpreting shell for the job. |
| `-q` | wc_queue_list | Defines or redefines a list of cluster queues, queue domains, or queue instances which may be used to execute this job. |
| `-r` | y[es]\|n[o] | Indicates whether the job may be rerun. |
| `-e` | path | Defines or redefines the path used for the standard error stream of the job. |
| `-o` | path | Defines or redefines the path used for the standard output stream of the job. |
| `-j` | y[es]\|n[o] | Specifies whether or not the standard error stream of the job is merged into the standard output stream. |
| `-pe` | parallel_environment | Parallel programming environment (PE) to instantiate. |
| `-M` | email_address | Defines or redefines the list of users to which the server that executes the job sends email, if the server sends mail about the job. |
| `-m` | b\|e\|a\|s\|n | Defines or redefines under which circumstances mail is sent to the job owner or to the users defined with the `-M` option above: `b` — at the beginning of the job, `e` — at the end of the job, `a` — when the job is aborted or rescheduled, `s` — when the job is suspended, `n` — no mail is sent. Currently no mail is sent when a job is suspended. |
Other available options and detailed explanations can be read by running the command `man qsub`. You may notice that some users use `.sge` as the file extension (or suffix) of the script and define the shell environment on the first line. This extension helps them distinguish job submission files from other standard shell scripts.
Resource & Queue Configuration
Memory Usage
Another custom configuration is the memory requirement for a job. Users can ensure that enough memory is allocated to the job before it is deployed. By default, memory is limited to 1 gigabyte per process. We allow users to override this by adding the flag `-l h_vmem=<size_of_memory><unit>`:
```bash
#!/bin/sh
#$ -pe smp 8
#$ -l h_vmem=4G
mpirun -n $NSLOTS program
```
This script requests 4 gigabytes of memory per process, i.e. 32 gigabytes in total for the 8 processes. Below is another example:
```bash
#!/bin/bash
#$ -pe smp 8
#$ -l h_vmem=2G
mpirun -n $NSLOTS program
```
This script requests 2 gigabytes per process, for a total of 16 gigabytes across the 8 processes.
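The per-process accounting above can be sketched in plain shell arithmetic (assuming, as in the examples, that `h_vmem` is enforced per slot):

```shell
# Total memory request = per-slot h_vmem x number of slots.
slots=8
h_vmem_gb=2
total_gb=$((slots * h_vmem_gb))
echo "${h_vmem_gb}G per slot x ${slots} slots = ${total_gb}G total"
```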
Querying Queue
After submitting a job, users can check its status and other information by using the command `qstat`. It prints the job details:
```
job-ID  prior    name      user  state  submit/start at      queue                    slots  ja-task-ID
-------------------------------------------------------------------------------------------------------
 17061  0.51214  SMDRbin2  tux   r      01/04/2018 10:39:25  all.q@compute-0-5.local  32
```
The state column indicates the progress of a job. After submission, the system determines the job's priority and puts it on the queue in the waiting (`qw`) state. Once resources become available and the scheduler runs the job, its state changes to running (`r`). The basic states are listed as follows:

- `qw` pending, waiting for resources to become available
- `t` transferring to the compute node(s)
- `r` running on the compute node(s)
There are additional states other than the above, including:

- `d` job being deleted, see job deletion for more details
- `S` queue is suspended
- `E` pending error, see troubleshooting for more details
We can also add the following flags to retrieve more information:

- `-u <username>` shows the job status of a specific user; using this flag with a wildcard, `-u "*"`, shows the jobs of all users.
- `-f` shows the full output for each compute node. If `-u "*"` is added, it shows the full output for all users.
- `-q <queue_name>` prints information on a given queue.
To list all options, use `qstat -help`. The manual for qstat can be accessed by running the command `man qstat`.
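As a small offline exercise, fields such as the job id and state can be pulled out of saved `qstat` output with standard text tools; here the sample line from above is parsed with `awk` (no cluster access needed):

```shell
# Parse one captured line of qstat output (the sample shown earlier).
line=' 17061 0.51214 SMDRbin2 tux r 01/04/2018 10:39:25 all.q@compute-0-5.local 32'
job_id=$(echo "$line" | awk '{print $1}')
state=$(echo "$line" | awk '{print $5}')
echo "job $job_id is in state $state"   # prints: job 17061 is in state r
```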
Job Modification
Users can alter the resource requirements only while a job is pending in the queue (`qw`). The command is `qalter`; its format is as follows:
```
qalter [option] job_task_list
```
Here `job_task_list` is the list of job ids (and task ids) of the jobs to modify. For example, suppose we have a job with job id 17061 and want to change the required memory to 64 gigabytes and merge the `stdout` and `stderr` streams together. We use:
```
qalter -l h_vmem=64G -j y 17061
```
To delete the job, use the command `qdel` followed by the job id.