== Usage of the SLURM scheduler ==
For example, reserving 2 nodes (48 CPU cores) on the NIIFI SC for 30 minutes amounts to 48 * 30 = 1440 core minutes = 24 core hours. Core hours are measured between the start and the end of the jobs.
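For illustration, such a reservation could be requested with the standard SLURM directives below (only the node count and time limit are shown; the values follow the example above):
<pre>
# reserve 2 nodes (48 CPU cores) for 30 minutes
#SBATCH -N 2
#SBATCH --time=00:30:00
</pre>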
'''It is very important to make sure that the application uses the allocated resources as fully as possible. An empty or non-optimal job consumes the allocated core time very quickly. If the account runs out of the allocated time, no new jobs can be submitted until the beginning of the next accounting period. Account limits are regenerated at the beginning of each month.'''
Information about an account can be listed with the following command:
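A minimal sketch, assuming the machine provides the usual NIIF <code>sbalance</code> wrapper:
<code>
sbalance
</code>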
==== Example ====
After executing the command, the following table shows up for Bob. The user can access and run jobs under two different accounts (foobar, barfoo). His own name is marked with * in the table. He shares both accounts with alice (Account column). The core hours consumed by each user are shown in the second column (Usage), and the consumption of the jobs run under the given account in the fourth column. The last two columns show the allocated maximum time (Account Limit) and the time still available on the machine (Available).
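A hedged illustration of the layout of such a table (the numbers are made-up placeholders that match the description above, not real output):
<pre>
Scheduler Account Balance
---------- ----------- + ---------------- ----------- + ------------- -----------
User             Usage |          Account       Usage | Account Limit   Available (CoreHours)
---------- ----------- + ---------------- ----------- + ------------- -----------
alice                0 |           foobar           7 |         1,000         993
bob *                7 |           foobar           7 |         1,000         993
bob *                2 |           barfoo           2 |           100          98
</pre>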
All jobs are inserted into an accounting database. The properties of completed jobs can be retrieved from this database. Detailed statistics can be viewed with the following command:
<code>
sacct -l -j JOBID
</code>
==== Example ====
There are 3 jobs in the queue. The first is an array job waiting for resources (PENDING). The second is an MPI job that has been running on 4 nodes for 25 minutes. The third is an OMP run on a single node that has just started. The NAME of a job can be chosen freely; it is advised to use short, informative names.
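The queue shown above can be listed with the standard SLURM command, for example in long format:
<code>
squeue -l
</code>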
==== Checking licenses ====
The used and still available licenses can be retrieved with the following command:
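A minimal sketch, assuming the NIIF <code>slicenses</code> wrapper is available on the machine:
<code>
slicenses
</code>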
==== MPI jobs ====
For MPI jobs, the number of MPI processes to be started on each node has to be given (<code>#SBATCH --ntasks-per-node=</code>). In the most frequent case this equals the number of CPU cores. Parallel programs should be started with the <code>mpirun</code> command.
===== Example =====
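A minimal sketch of such a job script, assuming two nodes with 24 cores each and illustrative values for the account name and time limit:
<pre>
#!/bin/bash
#SBATCH -A ACCOUNT
#SBATCH --job-name=mpi
#SBATCH -N 2
#SBATCH --ntasks-per-node=24
#SBATCH --time=12:00:00
#SBATCH -o slurm.out

mpirun ./a.out
</pre>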
==== CPU binding ====
Generally, the performance of MPI applications can be improved with CPU core binding. In this case the threads of the parallel program are not migrated between CPU cores by the OS scheduler, and memory locality improves (fewer cache misses). It is advised to use memory binding as well. Tests can be run to find out which binding strategy gives the best performance for a given application. The following settings are valid for the OpenMPI environment. Further information on binding can be retrieved with the <code>--report-bindings</code> MPI option. Along with the commands to run, a few lines of the detailed binding information are shown. It is important not to use the scheduler's task binding!
===== Binding per CPU core =====
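A hedged example with OpenMPI: newer versions use <code>--bind-to core --map-by core</code>, older ones the equivalent <code>--bind-to-core --bycore</code> options.
<pre>
mpirun --bind-to core --map-by core ./a.out
</pre>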
==== Maple Grid jobs ====
Maple can be run - similarly to OMP jobs - on one node. The maple module needs to be loaded to use it. A grid server has to be started, because Maple works in client-server mode (<code>${MAPLE}/toolbox/Grid/bin/startserver</code>). The application requires a license, which has to be requested in the job script (<code>#SBATCH --licenses=maplegrid:1</code>). A Maple job is started with the <code>${MAPLE}/toolbox/Grid/bin/joblauncher</code> command.
A minimal job script could look like the following (the account name, core count and time limit in the header are illustrative placeholders):
<pre>
#!/bin/bash
# placeholder header - adjust the account, resources and time limit to your project
#SBATCH -A ACCOUNT
#SBATCH --job-name=maple
#SBATCH -N 1
#SBATCH --ntasks-per-node=24
#SBATCH --time=06:00:00
#SBATCH -o slurm.out
#SBATCH --licenses=maplegrid:1
 
module load maple
 
${MAPLE}/toolbox/Grid/bin/startserver
${MAPLE}/toolbox/Grid/bin/joblauncher ${MAPLE}/toolbox/Grid/samples/Simple.mpl
</pre>