The chdb tutorial

How to use this tutorial?

Each section of this tutorial is dedicated to one use case, from the most general and simple to the most specific and complicated. To understand what happens when chdb is launched on a real problem, we’ll use some very simple bash commands. You should:

Read carefully the few lines before and after each code sample, to understand the point.
Copy and paste each code sample provided here, execute it from the front-end node, and look at the created files using standard Unix tools such as find, ls, less, etc.

Of course, in real life you’ll have to replace the toy command line launched by chdb with your own code, and you must launch chdb through an sbatch command, as usual. Please remember that you cannot use the front-end nodes to run real applications: they are dedicated to file editing and test runs.

Also, in real life you should launch the chdb program with srun, not mpirun. Use mpirun to stay on the front-end node, and srun to run on the compute nodes.
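As a rough sketch of what such a submission could look like, the snippet below writes a Slurm batch script that calls chdb through srun. The file name chdb_job.sh, the job name, and the resource values (one node, 5 tasks, 5 minutes) are illustrative assumptions, as is the reuse of the toy command and of the inp directory from the load-balancing section of this tutorial. Adapt them to your project and to the Olympe submission rules.

```shell
# Write a hypothetical Slurm batch script; all resource values are placeholders.
cat > chdb_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=chdb-tuto
#SBATCH --nodes=1
#SBATCH --ntasks=5
#SBATCH --time=00:05:00

# Same environment as on the front-end node
module load intelmpi chdb/1.0

# Inside a batch job, srun takes the place of mpirun
srun -n 5 chdb --in-dir inp --in-type txt \
  --command-line 'sleep 1; cat %in-dir%/%path% >%out-dir%/%basename%.out 2>%out-dir%/%basename%.err' \
  --report report.txt
EOF

# Submit it from the front-end node with:
#   sbatch chdb_job.sh
```

Keeping the chdb command identical in the test run (mpirun) and in the batch script (srun) makes it easier to check on the front-end node that the command line is correct before paying for compute time.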

Introduction

Before starting...

You must be connected to Olympe before working on this tutorial, and you must initialize your environment to be able to use the commands described here:

module load intelmpi chdb/1.0

Work in an empty temporary directory, to be sure you will not remove important files while running the exercises described in the following sections:

mkdir TUTO-CHDB
cd TUTO-CHDB

Help!

You can ask chdb for help:

chdb --help

Executing the same code on several input files

How many processes for a chdb job?

Specifying the output directory

Executing your code on a subset of files

Executing your code on a hierarchy of files

Managing the errors

Generating an execution report

Improving the load balancing (1/2)

Load balancing is an important feature to be aware of, and possibly to improve. The --report switch gives you some information about it, as explained above.
If the work is well balanced, all the slaves finish their jobs at the same time and no CPU cycles are wasted. Conversely, if some slaves take longer than others, the faster ones have to wait for the latecomers, and you can lose a lot of CPU time.
In the following example, we have 10 files to process and we launch chdb with 4 slaves (hence mpirun -n 5, one process more than the number of slaves). Each file takes about 1 s to process:

rm -r inp inp.out;
mkdir inp;
for i in $(seq 1 10); do echo "coucou $i" > inp/$i.txt; done
mpirun -n 5 chdb --in-dir inp --in-type txt \
 --command-line 'sleep 1; cat %in-dir%/%path% >%out-dir%/%basename%.out \
 2>%out-dir%/%basename%.err' \
 --report report.txt
cat report.txt

Here is the report file:

SLAVE    TIME(s)    STATUS    INPUT PATH
2    1.00965    0    2.txt
1    1.01282    0    1.txt
3    1.01392    0    10.txt
4    1.01529    0    3.txt
2    1.00846    0    4.txt
1    1.00863    0    5.txt
3    1.01006    0    6.txt
4    1.00955    0    7.txt
2    1.0082     0    8.txt
1    1.00727    0    9.txt
----------------------------------------------
SLAVE    N INP    CUMULATED TIME (s)
1    3    3.02872
2    3    3.02631
3    2    2.02398
4    2    2.02484
----------------------------------------------
AVERAGE TIME (s)        = 2.52596
STANDARD DEVIATION (s)  = 0.579144
MIN VALUE (s)           = 2.02398
MAX VALUE (s)           = 3.02872

The report shows that two slaves (3 and 4) each work for about 2 seconds, while the other two (1 and 2) work for about 3 seconds, so slaves 3 and 4 have to wait for slaves 1 and 2. But you reserved 4 processors, so you’ll have to "pay" for 4x3=12 seconds for only 10 useful seconds: 2 of the 12 seconds, about 17% of the CPU time, are wasted. In this very simple example, you can easily correct the problem by using 5 slaves instead of 4:

rm -r inp.out
mpirun -n 6 chdb --in-dir inp --in-type txt --command-line 'sleep 1; cat %in-dir%/%path% >%out-dir%/%basename%.out 2>%out-dir%/%basename%.err' --report report.txt
cat report.txt

The report.txt file shows that this time you will have to "pay" for only 5x2=10 seconds:

SLAVE    TIME(s)    STATUS    INPUT PATH
2    1.01017    0    2.txt
3    1.01334    0    10.txt
1    1.01385    0    1.txt
5    1.01459    0    4.txt
4    1.01562    0    3.txt
2    1.00696    0    5.txt
1    1.0086     0    7.txt
3    1.01       0    6.txt
5    1.01138    0    8.txt
4    1.01139    0    9.txt
----------------------------------------------
SLAVE    N INP    CUMULATED TIME (s)
1    2    2.02245
2    2    2.01713
3    2    2.02334
4    2    2.02701
5    2    2.02597
----------------------------------------------
AVERAGE TIME (s)        = 2.02318
STANDARD DEVIATION (s)  = 0.00386248
MIN VALUE (s)           = 2.01713
MAX VALUE (s)           = 2.02701
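The wasted-time estimate made in this section (4x3=12 seconds paid for 10 useful seconds) can be recomputed from the per-slave summary block of a report file. The helper below is only a sketch: the function name wasted_cpu and the file name sample_report.txt are hypothetical, and the parsing assumes the report layout shown above (a "SLAVE  N INP  CUMULATED TIME (s)" header, one row per slave, then a dashed line). The sample data are the summary rows of the 4-slave report.

```shell
# wasted_cpu REPORT: estimate the share of reserved CPU time left idle.
# paid   = number of slaves x longest cumulated time
# useful = sum of the cumulated times
wasted_cpu() {
  awk '
    /^SLAVE[ \t]+N INP/  { in_summary = 1; next }   # header of the summary block
    in_summary && /^-+$/ { in_summary = 0 }         # dashed line closes the block
    in_summary && NF == 3 {
      n++; sum += $3
      if ($3 > max) max = $3
    }
    END {
      if (n == 0) { print "no summary block found"; exit 1 }
      paid = n * max
      printf "slaves=%d paid=%.2f useful=%.2f wasted=%.1f%%\n", n, paid, sum, 100 * (paid - sum) / paid
    }
  ' "$1"
}

# Summary block copied from the 4-slave report of this section
cat > sample_report.txt <<'EOF'
SLAVE    N INP    CUMULATED TIME (s)
1    3    3.02872
2    3    3.02631
3    2    2.02398
4    2    2.02484
----------------------------------------------
EOF

wasted_cpu sample_report.txt
# prints: slaves=4 paid=12.11 useful=10.10 wasted=16.6%
```

The same function can be pointed at a full report.txt, since the per-file rows and the statistics lines are ignored. On the 5-slave report it gives a wasted share of about 0.2%.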

Improving the load balancing (2/2)

In the following example, we create files of different sizes, then arrange for the toy code to take longer on bigger files:

rm -r inp inp.out
mkdir inp
X='X'
for i in $(seq 1 9); do
  echo "coucou $i" > inp/$i.txt
  X="$X$X"
  for j in $(seq 1 $i); do echo $X >> inp/$i.txt; done
done
mpirun -n 3 chdb --in-dir inp --in-type txt \
 --command-line \
 'sleep $(( $(cat %in-dir%/%path%|wc -c) / 100 + 1)); cat %in-dir%/%path% >%out-dir%/%basename%.out 2>%out-dir%/%basename%.err' \
 --report report.txt

The report shows that the load balancing is very poor:

SLAVE    TIME(s)    STATUS    INPUT PATH
1    2.01111    0    1.txt
2    2.01391    0    2.txt
1    2.01003    0    3.txt
2    2.01012    0    4.txt
1    3.01058    0    5.txt
2    5.00927    0    6.txt
1    11.0105    0    7.txt
2    22.01      0    8.txt
1    48.0117    0    9.txt
----------------------------------------------
SLAVE    N INP    CUMULATED TIME (s)
1    5    66.054
2    4    31.0433
----------------------------------------------
AVERAGE TIME (s)        = 48.5486
STANDARD DEVIATION (s)  = 24.7562
MIN VALUE (s)           = 31.0433
MAX VALUE (s)           = 66.054

Files 8 and 9 are processed last, because by default the files are distributed to the slaves in alphabetical order. Unfortunately, the last file takes a rather long time to process, leading to poor load balancing.
A smarter approach is to start with the long jobs: file 9 is then processed first (by slave 1), leaving the other files to the other slave, hence a much better load balancing:

rm -r inp.out;
mpirun -n 3 chdb --in-dir inp --in-type txt \
 --command-line \
 'sleep $(( $(cat %in-dir%/%path%|wc -c) / 100 + 1 )); \
 cat %in-dir%/%path% >%out-dir%/%basename%.out 2>%out-dir%/%basename%.err' \
 --report report.txt \
 --sort-by-size

The load balancing is dramatically improved:

SLAVE    TIME(s)    STATUS    INPUT PATH
2    22.0131    0    8.txt
2    11.01      0    7.txt
2    5.00993    0    6.txt
2    3.0096     0    5.txt
2    2.00942    0    4.txt
2    2.00944    0    3.txt
2    2.01015    0    2.txt
1    48.0111    0    9.txt
2    2.0096     0    1.txt
----------------------------------------------
SLAVE    N INP    CUMULATED TIME (s)
1    1    48.0111
2    8    49.0813
----------------------------------------------
AVERAGE TIME (s)        = 48.5462
STANDARD DEVIATION (s)  = 0.756772
MIN VALUE (s)           = 48.0111
MAX VALUE (s)           = 49.0813

NOTE - Many codes, but not all, take longer to process bigger files than smaller ones. If this is the case for your code, you can try the --sort-by-size switch; if it is not, the switch will be useless.
