Impact of the January 11, 2022 maintenance on job submission

We are setting up a specific mechanism so that your jobs are not killed during the maintenance on January 11, 2022.
Depending on their duration, some jobs may be held pending until the end of the electrical maintenance.

Explanations:

For the maintenance on January 11, 2022:

[user@olympelogin1 ~]$ scontrol show res
ReservationName=Maintenance_11-01-2022 StartTime=2022-01-11T07:00:00
...

If you submit a job that might finish after StartTime, it will remain in the PENDING state until the maintenance is over. It will then be started automatically.

However, if you think your job is short enough to finish BEFORE the reservation starts, you can adjust its duration accordingly by adding the --time option to your sbatch headers.

HELP

To help you work out how much time is left before the next reservation, you can use the check-timelimit.sh command:

[user@olympelogin1 ~]$  check-timelimit.sh 
**************************************************************************

               MAINTENANCE RESERVATION ACTIVE !

    Reservation : Maintenance_11-01-2022 will start at  2022-01-11T07:00:00

    Remaining time : 10 days 15 hours 48 minutes and 8 seconds 

    If you think your job will end before the reservation starts
    you can adjust its duration with --time option in your sbatch headers 

    Max value for --time option (slurm format) : 10-15:48:00

    Additional information is available here : 
    https://www.calmip.univ-toulouse.fr/spip.php?article782

**************************************************************************

For example:

I know that a maintenance is scheduled soon.
I check how much time is left with check-timelimit.sh, which returns 2-10:34:00 (2 days, 10 hours and 34 minutes before the shutdown).
If my job is too long, I wait until the maintenance is over.
Suppose my job is expected to run for about twenty hours: to give it a chance to run before the shutdown, I add the following line to my sbatch script: --time=01-00:00:00

Warning: in all cases, the requested duration cannot exceed the WallTime limits imposed by the queues: [https://www.calmip.univ-toulouse.fr/spip.php?article608]

To avoid killing running jobs during the Olympe maintenance on January 11, 2022, we are setting up a specific Slurm reservation. This reservation may affect the submission of your jobs until the maintenance begins.

EXPLANATIONS

For the maintenance on January 11, 2022:

[user@olympelogin1 ~]$ scontrol show res
ReservationName=Maintenance_11-01-2022 StartTime=2022-01-11T07:00:00
...

If you run a job that is likely to terminate after StartTime, your job will remain in the PENDING state until the maintenance ends. It will be automatically started after maintenance.

But if you think your job is short enough to end before the reservation starts, you can adjust its duration appropriately by adding the --time option in your sbatch headers.
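As an illustration of the --time header mentioned above, here is a minimal sketch of a batch script. The job name, the node and task counts and the echo placeholder are assumptions made for the example, not values from this page; only the --time option is taken from the text.

```shell
#!/bin/bash
#SBATCH --job-name=short_job   # hypothetical job name
#SBATCH --nodes=1              # hypothetical resource request
#SBATCH --ntasks=1             # hypothetical resource request
#SBATCH --time=01-00:00:00     # wall time limit: 1 day (format D-HH:MM:SS)

# Placeholder for the real workload (replace with your own commands)
echo "Job started on $(hostname) at $(date)"
```

With a limit that ends before StartTime, the scheduler is able to start the job before the reservation instead of leaving it pending.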

HELP

To help you work out how much time is left before the next reservation, you can use the check-timelimit.sh command:

[user@olympelogin1 ~]$ check-timelimit.sh 
**************************************************************************

                MAINTENANCE RESERVATION ACTIVE !

    Reservation : Maintenance_11-01-2022 will start at  2022-01-11T07:00:00

    Remaining time : 10 days 15 hours 48 minutes and 8 seconds 

    If you think your job will end before the reservation starts
    you can adjust its duration with --time option in your sbatch headers 

    Max value for --time option (slurm format) : 10-15:48:00

    Additional information is available here : 
    https://www.calmip.univ-toulouse.fr/spip.php?article782

**************************************************************************
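The arithmetic behind the "Remaining time" line can be sketched as follows. This is not the source of check-timelimit.sh, whose internals are not shown here; it is an illustration that assumes bash and GNU date. The function names seconds_until and seconds_to_slurm are hypothetical, and the "now" timestamp is chosen so that the result matches the output above.

```shell
#!/usr/bin/env bash

# Format a number of seconds as a Slurm duration (D-HH:MM:SS).
seconds_to_slurm() {
    local total=$1
    printf '%d-%02d:%02d:%02d\n' \
        $(( total / 86400 )) \
        $(( total % 86400 / 3600 )) \
        $(( total % 3600 / 60 )) \
        $(( total % 60 ))
}

# Seconds between a reference time ($2) and a reservation start ($1).
# Both are timestamps as printed by scontrol (YYYY-MM-DDTHH:MM:SS).
# Returns 0 if the reservation has already started.
seconds_until() {
    local start_epoch now_epoch
    # Replace the "T" separator with a space and parse both as UTC (GNU date).
    start_epoch=$(date -u -d "${1/T/ }" +%s)
    now_epoch=$(date -u -d "${2/T/ }" +%s)
    echo $(( start_epoch > now_epoch ? start_epoch - now_epoch : 0 ))
}

# StartTime comes from the scontrol output; the second value is a hypothetical "now".
remaining=$(seconds_until "2022-01-11T07:00:00" "2021-12-31T15:11:52")
seconds_to_slurm "$remaining"   # prints 10-15:48:08
```

The tool's own "Max value" line shows 10-15:48:00, that is, the same duration with the seconds dropped.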

For example:

I know that a maintenance is scheduled soon.
I check how much time is left with check-timelimit.sh, which returns 2-10:34:00 (2 days, 10 hours and 34 minutes before the shutdown).
If my job is too long, I wait until the maintenance is over.
Suppose my job is expected to run for about 20 hours: to give it a chance to run before the shutdown, I add the following line to my sbatch script: --time=01-00:00:00
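The comparison made in this example can be sketched in bash. The helper names slurm_to_seconds and fits_before_maintenance are hypothetical, and the sketch assumes durations written as D-HH:MM:SS or HH:MM:SS only (Slurm accepts other forms that are not handled here).

```shell
#!/usr/bin/env bash

# Convert a Slurm duration (D-HH:MM:SS or HH:MM:SS) to seconds.
slurm_to_seconds() {
    local days=0 rest="$1" h m s
    if [[ "$rest" == *-* ]]; then
        days="${rest%%-*}"   # part before the dash
        rest="${rest#*-}"    # part after the dash
    fi
    IFS=: read -r h m s <<< "$rest"
    # The 10# prefix forces base 10 so that values such as "08" are not read as octal.
    echo $(( 10#$days * 86400 + 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

# Succeeds if the requested --time ($1) fits in the remaining time ($2).
fits_before_maintenance() {
    local requested remaining
    requested=$(slurm_to_seconds "$1")
    remaining=$(slurm_to_seconds "$2")
    (( requested <= remaining ))
}

# Values from the example: --time=01-00:00:00, remaining time 2-10:34:00
if fits_before_maintenance "01-00:00:00" "2-10:34:00"; then
    echo "fits"      # the job can be scheduled before the reservation
else
    echo "pending"   # the job would wait until the maintenance is over
fi
```

Here one day (86400 seconds) is less than the 210840 seconds remaining, so the script prints "fits".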

Warning: in all cases, the requested duration cannot exceed the WallTime limits imposed by the queues: [https://www.calmip.univ-toulouse.fr/spip.php?article608]
