
Login hosts

The following hosts are available for login:

     

  • cluster-a.math.tu-berlin.de

  • cluster-i.math.tu-berlin.de

  • cluster-g.math.tu-berlin.de

     

The address

     

  • cluster.math.tu-berlin.de

     

points to one of the above login hosts (currently cluster-i).

cluster-a has an AMD processor, cluster-i has an Intel processor. For many tasks and for submitting jobs this is not relevant, but compilers can be instructed to optimize the code for the current architecture.
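
As a general gcc example (not specific to this cluster), requesting optimization for the architecture of the machine the compiler runs on looks like this:

%  gcc -O2 -march=native -o myprog myprog.c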

Please don't compute on the login hosts. Their purpose is editing and submission of jobs, and there are lots of people logged in there simultaneously (see below).

cluster-g is a node with an Intel processor that additionally has 2 Tesla C1060 GPU cards. More about this in the paragraph about the GPU cluster.

You can access the /scratch partitions of the nodes via:

 
/net/nodename.scratch/

 

that is for example

 
%  ls /net/node008.scratch/

 

Batch system

The clusters are operated exclusively via the batch system, i.e. jobs are written as a job script and are then submitted with the command 'qsub'.

A small job script might be generated and submitted like this:

 
%  cat > myjob.job <<EOF

#!/bin/tcsh
#$ -cwd
#$ -N myjob
#$ -o myjob.out
#$ -j y
./myprog < inputfile
EOF
% qsub myjob.job
 

Instead of 'cat' you can of course use your favourite text editor to edit job scripts.

The job starts when the requested resources are available. That can be immediately or, for special requests, after a few days. The user can choose to receive an email when the job starts or ends.

Format of a job script

Once again the last example, this time with line numbers and some more parameters:

 
 1   #!/bin/tcsh
 2   #$ -cwd
 3   #$ -N myjob
 4   #$ -o myjob.out
 5   #$ -j y
 6   #$ -l h_rt=86400
 7   #$ -l mem_free=2G
 8   #$ -m be
 9   #$ -M myself@math.tu-berlin.de
10
11   ./myprog < inputfile

 

Explanation:

2) change into the directory from which the job was submitted

3) job name

4) output file

5) 'join' = yes, i.e. write both error messages and regular output into the output file

6) maximum run time for the job in seconds

7) requested amount of free memory

8) send a mail at the start and the end of the job

9) mail address (please provide this!)

We strongly advise you to provide a run time limit and the memory requirement of your job. The scheduler schedules short jobs first, and if no run time limit is given by the user, the job will automatically be terminated after 12 hours. The maximum run time limit is currently 220 hours (as of 11/2011). Longer run time limits are possible on request, but discouraged, because they tend to complicate the maintenance of the cluster, and job abortions due to power outages or other errors become more probable the longer a job runs.

All job parameters can also be given as arguments to the qsub command; in case of conflicts they override the parameters from the job file:

 
%  qsub -N test -l h_rt=80000 -l mem_free=4G jobscript

 

You can display the available parameters by calling

 
% qconf -sc

 

Jobs that use a node exclusively

Each node has a number of slots, which correspond to the number of processor cores on that node. Usually the batch system assigns one job to each of the slots.

If you want to use a node exclusively, you can add the line

#$ -l exclusive

to your job script.

The job is then executed on a free cluster node (as soon as one is available).
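
A minimal sketch of such a job script, combining the exclusive request with the options shown earlier (the job name, output file and myprog are placeholders):

#!/bin/tcsh
#$ -cwd
#$ -N exclusive_job
#$ -o exclusive_job.out
#$ -j y
#$ -l exclusive

./myprog < inputfile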

Monitoring of the queue and of the cluster usage

The command qstat shows running and queued jobs:

 
%  qstat

job-ID prior name user state submit/start at queue slots ja-task-ID
-------------------------------------------------------------------------------------------------
4636 0.50500 matlab zeppo r 08/31/2006 22:19:48 all.q@node02 2
4531 0.50500 comsol pippo r 08/25/2006 18:05:38 all.q@node04 2
4632 0.50500 meinprog blump r 08/31/2006 10:12:18 all.q@node07 2
4621 0.55500 mpitest npasched qw 09/01/2006 08:44:36 10

 

Options like '-u username' or '-s r' restrict the list. See 'man qstat'.
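
For example, to list only your own jobs, or only the jobs that are currently running (the user name myself is a placeholder):

%  qstat -u myself
%  qstat -s r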

Changing attributes and deletion of jobs

Some of the job parameters can be changed after submission of the job, in some cases even while the job is already running. This can be done with the command qalter. It accepts most of the parameters of qsub and sets these parameters for the given job ID.

 

Jobs that have a wrong setup or that should be removed for some other reason can be deleted with the command qdel, with the job ID as argument.

 

An example:

 

% qstat -u npasched
job-ID prior name user state submit/start at queue slots ja-task-ID
-------------------------------------------------------------------------------------------------
4621 0.55500 mpitest npasched qw 09/01/2006 08:44:36 10
% qalter -m a 4621
modified mail options of job 4621
% qdel 4621
npasched has deleted job 4621

Array jobs

Sometimes a particular job has to be run with a large set of different data sets. The straightforward method of writing n slightly different job scripts and submitting them to the batch system quickly becomes annoying for n > 3.

A more elegant method is so-called job arrays, where one job script is submitted with the instruction to run as n copies. A qsub command to achieve this looks like this:
% qsub -t 10-30:2 jobscript

The command above submits the script jobscript to the batch system and generates 11 copies, each of which is given a so-called task ID from the range [10..30] with step 2, that is 10, 12, 14, 16, ...
In the job script the task ID is available in two places:

1. In the script header:
Here you can for example add the task ID to the name of the output file, so that each copy writes into its own file:

#$ -o job.$TASK_ID.out

2. In the script itself:

Here the task ID is available through the environment variable $SGE_TASK_ID. It can be used by the script itself or by processes started from the script:

#!/bin/tcsh
#$ -cwd
#$ -N matlab_run
#$ -o matlab_run.$TASK_ID.out
#$ -j y
#$ -m be
#$ -M myself@math.tu-berlin.de

matlab -nosplash -nodisplay < input.$SGE_TASK_ID.m

Here is an example of a MATLAB input file that reads the task ID directly:

task_id = str2num( getenv('SGE_TASK_ID') )

x = floor( task_id / 160 )
y = task_id - x * 160

.....

Further environment variables are available (a small usage sketch follows this list):

  • $SGE_TASK_LAST : the last task ID
  • $SGE_TASK_STEPSIZE : the step size
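
A hedged sketch of how these variables can be combined so that each task processes a whole block of inputs; myprog, the input file naming and the step size 10 are only example assumptions:

#!/bin/tcsh
#$ -cwd
#$ -N blockjob
#$ -o blockjob.$TASK_ID.out
#$ -j y
#$ -t 1-100:10

# this task handles the IDs from its own task ID up to the start of the
# next task, capped at the last task ID of the range
set first = $SGE_TASK_ID
@ last = $SGE_TASK_ID + $SGE_TASK_STEPSIZE - 1
if ( $last > $SGE_TASK_LAST ) set last = $SGE_TASK_LAST

foreach i (`seq $first $last`)
    ./myprog < input.$i
end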




Disk space

Each job gets a temporary directory where files generated by the job can be written. You can read the path of this directory from the environment variable $TMPDIR in your job script. It is located on the local hard disk of the node where the job is executed. This directory is deleted at the end of the job, so you should copy data that you need later to another directory before the job finishes.
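
A small sketch of this pattern in a tcsh job script (myprog, inputfile and result.dat are placeholders):

#!/bin/tcsh
#$ -cwd
#$ -N tmpjob
#$ -o tmpjob.out
#$ -j y

# remember the submission directory (the job starts there because of -cwd)
set workdir = `pwd`

# work in the node-local temporary directory provided by the batch system
cd $TMPDIR
$workdir/myprog < $workdir/inputfile

# $TMPDIR is deleted at the end of the job, so copy the results back
cp result.dat $workdir/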

Smaller amounts of data can be written to the home directory. A backup job runs on the home directory each night. The quota for the home directory is, however, somewhat restrictive. While you can request a bigger quota, it will not be possible to store gigabyte-sized files there.

For larger files there is a directory /work/$USER.

Note that the data from this directory are not included in any backup!

Furthermore there is the directory /lustre. It is similar in size to the /work directory but should be faster. When a job generates large amounts of data, they can be written to this directory without generating load on the file servers that serve the /work directory.

Note however that you should not store data on /lustre permanently.

The system administration might delete older data from /lustre from time to time.

Parallel programs with MPI

For parallel programs with MPI one should use a job script similar to the following:

#!/bin/tcsh
#$ -cwd
#$ -pe ompi* 4
#$ -N mpitest
#$ -o mpitest.out
#$ -j y
#$ -m be
#$ -M myself@math.tu-berlin.de

module add ompi-1.2.2

mpirun -np $NSLOTS myprog

The 4 at the end of the -pe line stands for the number of processors (processor cores on multicore systems). It is also possible to request a range of processors:

#$ -pe ompi* 2-8

The request above starts the job with between 2 and 8 processors, depending on how many are available. The allocated number is available in the script through the environment variable $NSLOTS.

The "Parallel Environments" ompi* requested with -pe in the example above are some kind of arrangement of groups for the queues.

They determine how the processes are distributed on the nodes when more then one queue slot is requested. There exist quite a few of these ompi* PEs. The '*' in the request above means: Take anyone that begins with 'ompi'.

The following list shows the pattern for the names of the PEs:

name of PE    cluster    processes per node
mp            *          n
mpi1          *          1
mpi2          *          2
mpi4          *          4
mpi           *          fill
ompi1_1       1          1
ompi1_2       1          2
ompi1_n       1          fill
ompi2_1       2          1
ompi2_2       2          2
ompi2_4       2          4
ompi2_n       2          fill
ompi3_1       3          1
ompi3_2       3          2
ompi3_4       3          4
ompi3_n       3          fill
etc.

The following list shows the number of slots per node for each cluster:

cluster    slots per node
1          2
2          4
3          4
4          4
5          2
6          16
7          8
8          8
9          4
10         8
11         12
12         8

In the PE table above:

n       as stated
fill    a node gets processes until its slots are filled, then the next node is filled
*       anything

The -pe parameter might also look like this:

#$ -pe mpi1 2-8

 

For programs compiled with OpenMPI you should only use the PEs whose names begin with 'ompi'.

The PEs whose names begin with 'mpi' are for programs that use Ethernet-based MPI.

 

A list of host names is available in the file

$PE_HOSTFILE
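
For example, from inside a job script you could inspect which hosts and how many slots were allocated to the job; this sketch assumes the usual Grid Engine hostfile format, in which the first column contains the host name:

cat $PE_HOSTFILE
awk '{print $1}' $PE_HOSTFILE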

Development tools

There are some additional compilers installed for the cluster and the other 64-bit computers; they often achieve better performance than the standard gcc versions.

 

Here is a synopsis:

Available development tools

manufacturer    compiler    programming language        installed versions                             module
GNU             gcc         C89, C99 (1)                4.3.3
                g++         ISO C++ 89                  4.3.3
                g77         Fortran77                   4.3.3
                gfortran    Fortran77, Fortran90        4.3.3
Intel           ifort       Fortran77, Fortran90 (1)    9.0.25, 9.1.36, 10.1.018, 11.0.069, 11.1.064   ifc-*
                icc         C89, C90 (1)                9.0.23, 9.1.42, 10.1.018, 11.0.069, 11.1.064   icc-*
                icpc        ISO C++ 89                  9.0.23, 9.1.42, 10.1.018, 11.0.069, 11.1.064   icc-*
PathScale (2)   pathCC      ISO C++ 89                  2.0, 2.1, 2.2.1, 2.3, 2.3.1, 2.4, 2.5,         pathscale-*
                                                        3.0, 3.1, 3.2
                pathcc      ISO C89, C99
                pathf77     Fortran77
                pathf90     Fortran90
                path95      Fortran95
Portland        pgcc        C89                         8.0-2, 8.0-6, 9.0-1, 10.0                      pgi-*
                pgCC        ISO C++ 89
                pgf77       Fortran77
                pgf90       Fortran90
                pgf95       Fortran95

(1) partially

(2) There are no licences available for the PathScale compilers anymore, but the run time environment can still be used.

Not all compilers are available in the standard search path. If you need a special version, you can set the respective environment variables with the module command. An example for the Intel compiler:

%  module add icc11.0.069

You can take the name of the module from the table above and complete it with the version number. With module avail you get a list of all available modules:

%  module avail

 
------------------------------- /usr/share/modules/modulefiles --------------------------------
dot ifc9.1.36 mvapich-pgi-0.9.8 use.own
g03 lahey8.0 null pgi-7.1-1
gaussian03 module-cvs pathscale-2.0 pgi-7.2-4
gcc402 module-info pathscale-2.1
gcc411 modules pathscale-2.2.1
gridengine mpich-ch-p4 pathscale-2.3
icc9.0 mpich-ch-p4mpd pathscale-2.3.1
icc9.0.23 mvapich-gcc pathscale-2.4
icc9.1 mvapich-gcc-0.9.8 pathscale-2.5
icc9.1.42 mvapich-intel-0.9.8 pgi-6.0.5
ifc9.0 mvapich-pathscale pgi-6.1
ifc9.0.25 mvapich-pathscale-0.9.8 pgi-6.2.5
ifc9.1 mvapich-pgi pgi-7.0-4

Not all of the listed modules are about compilers. You can get information about a particular module with:

%  module help pgi-6.0.5

 
----------- Module Specific Help for 'pgi-6.0.5' ------------------
 
Sets up the paths you need to use the Portland 6.0.5 compiler suite as the default

If you use a certain module very often, you can add the respective module commands to your ~/.cshrc or ~/.bashrc. For example:

%  tail ~/.cshrc

 
if ( ${?MODULESHOME} ) then
module load pathscale-2.4 gcc411
endif
 
#end of .cshrc
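
The corresponding snippet for ~/.bashrc might look like this (a sketch, assuming the modules package also defines $MODULESHOME for bash):

%  tail ~/.bashrc

if [ -n "$MODULESHOME" ]; then
    module load pathscale-2.4 gcc411
fi

# end of .bashrc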

Development tools for MPI programs

To compile MPI programs you should use the MPI compiler wrappers (mpicc, mpif77, ...). These wrappers call one of the compilers mentioned above and link the correct MPI libraries. A default version of OpenMPI is already available in the search path; other versions are available via modules. The relevant modules are:

module            content
ompi-gcc-1.2.2    OpenMPI 1.2.2 for gcc
ompi-pgi-1.2.2    OpenMPI 1.2.2 for Portland
ompi-gcc-1.2.4    OpenMPI 1.2.4 for gcc
ompi-pgi-1.2.4    OpenMPI 1.2.4 for Portland
ompi-gcc-1.3.2    OpenMPI 1.3.2 for gcc
ompi-pgi-1.3.2    OpenMPI 1.3.2 for Portland

 

Compiling an MPI program then works like this:

% mpicc -o myprog myprog.c

% mpif90 -o myprog myprog.f
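
If you want to build against one of the non-default OpenMPI versions from the table above, a plausible sequence is to load the matching module first (a sketch; the module name is taken from the table, the source file is a placeholder):

% module add ompi-gcc-1.3.2
% mpicc -o myprog myprog.c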

 

 

Relevant manual pages

qsub, qstat, qalter, qdel, ...
mpirun, mpicc, pathcc, pgcc ...

Contact information

The clusters are supported by

  • Norbert Paschedag, MA 368, Tel. 314 29264
  • Kai Waßmuß, MA 368, Tel.: 314 29283


For any questions regarding the cluster or problems with its usage, please contact the cluster staff (clust_staff).
