High Performance Parallel IO and post-processing @ MdlS [ONLINE]

The training will be fully remote. To participate, you need a personal computer connected to the internet with a working SSH client.

The increase in computational power goes hand in hand with an increase in the size of the data to be managed, both on the input and on the output sides. IO can easily become a bottleneck for large scale architectures. Understanding of parallel file system mechanisms and parallel IO concepts enables users to efficiently use existing high level libraries like HDF5, NetCDF or SIONlib.

With the increasing performance gap between compute and storage, even the best use of IO bandwidth might not be enough. This is especially critical for checkpointing in the context of fault tolerance. This course proposes an introduction to FTI, a library that aims to give computational scientists the means to perform fast and efficient multilevel checkpointing in large scale supercomputers.

To handle the multiplication of libraries, this course will also introduce the PDI Data Interface, a solution to couple simulation codes with libraries for I/O and post-processing (including HDF5, NetCDF, SIONlib, FTI, …) based on simple annotations. PDI improves code quality with a) annotations independent of the library used, b) a choice of IO strategy at runtime in YAML and c) negligible overhead to access the full power of the underlying libraries.

Agenda:

Monday:

  • 9h30 – 11h: Storage@TGCC & Lustre filesystems (Thomas Leibovici – TGCC, CEA)
  • 11h – 12h30: parallel I/O strategies and optimization with a focus on SIONlib (Sebastian Lührs – JSC, FZJ)
  • 14h – 14h30: parallel I/O strategies and optimization with a focus on SIONlib, cont'd (Sebastian Lührs – JSC, FZJ)
  • 14h30 – 17h: Sequential HDF5 (Matthieu Haefele – LMAP, CNRS)

Tuesday:

  • 9h30 – 12h30: Parallel HDF5 (Matthieu Haefele – LMAP, CNRS)
  • 14h – 17h: NetCDF (Olga Abramkina – MdlS/IDRIS, CNRS)

Wednesday:

  • 9h30 – 12h30: the FTI fault-tolerance library (Leonardo Bautista Gomez – BSC)
  • 14h – 17h: the PDI Data Interface (Julien Bigot – MdlS, CEA)

Instructors: Olga Abramkina (MdlS/IDRIS, CNRS), Leonardo Bautista Gomez (BSC), Julien Bigot (MdlS, CEA), Matthieu Haefele (LMAP, CNRS), Thomas Leibovici (TGCC, CEA), Sebastian Lührs (JSC, FZJ)

Learning outcomes: After this course, participants should understand the trade-offs implied by using a parallel file system, and know how to use parallel IO libraries efficiently. Participants will also have a basic understanding of FTI and PDI and some practice with them.

Prerequisites: Knowledge of C or Fortran programming languages, parallel programming with MPI

Inria autumn school « High Performance Numerical Simulation » @ MdlS/Inria

Inria Bordeaux is organizing a school on high performance numerical simulation. It will take place at the Inria Bordeaux research center on November 4-8, 2019.

The school will focus on simulating a physical problem, from its modeling to its implementation in a high performance computing (HPC) framework. The school will offer both plenary courses and hands-on sessions. The physical problem considered will be harmonic wave propagation. The hands-on sessions will be conducted on the Federative Platform for Research in Computer Science and Mathematics (PlaFRIM) machine.

The first day will be dedicated to the modeling of the problem and its discretization using a Discontinuous Galerkin scheme. The following two days will be dedicated to linear algebra for solving large sparse systems. Background on direct, iterative and hybrid methods for sparse linear systems will be discussed, followed by hands-on sessions on related parallel solvers. Next comes a day dedicated to advanced parallelization schemes using task-based paradigms, with hands-on sessions on the StarPU runtime system. Particular attention will be devoted to the use of parallel profiling and debugging tools. The school will close with plenary talks illustrating the usage of such a workflow in an industrial context.

To subscribe to the mailing list of the conference, send us an email.

Tentative Program :

  • Monday afternoon:
    • Class on modeling + assembly
    • Logging in to PlaFRIM, Jupyter, …
  • Tuesday morning: Hands-on Python 1D
  • Tuesday afternoon:
    • Class on 3D modeling + assembly
    • Class 1 on linear solvers: direct solvers
    • Introduction to Guix-HPC
  • Wednesday morning: Hands-on session on deploying houdini on PlaFRIM
  • Wednesday afternoon: Class 2 on linear solvers: Krylov subspace methods, DDM parallel preconditioners, hybrid direct/iterative solvers
  • Thursday morning: Hands-on solver and/or architecture (including usage of solvers over runtime systems, but not directly programming runtime systems)
  • Thursday afternoon:
    • Class on runtime systems
  • Friday morning, in parallel or on demand:
    • Parallelization of Cholesky over StarPU
    • Parallel debugging
  • Friday afternoon: Industrial applications

Link toward the registration page

Introduction to PETSc @ MdlS/Idris

The Portable, Extensible Toolkit for Scientific Computation (PETSc) is a suite of data structures and routines for the scalable (parallel) solution of scientific applications modeled by partial differential equations (www.mcs.anl.gov/petsc/).

It enables researchers to delegate the linear algebra part of their applications to a specialized team, and to test various solution methods. The course will provide the necessary basis to get started with PETSc and give an overview of its possibilities. Presentations will alternate with hands-on sessions (in C or Fortran).

Learning outcomes :

On completion of this course, the participant should
– Be able to build and solve simple PDE examples
– Use and compare different solvers on these examples
– Be familiar with using the on-line documentation
– Be able to easily explore other PETSc possibilities relevant to their application.

Prerequisites :

C or Fortran programming.
Notions of linear algebra, as well as notions of MPI, would be an asset.

Link toward the registration page

Introduction to ScaLAPACK and MAGMA libraries @ MdlS/Idris

The aim of this course is to introduce the basic usage of the ScaLAPACK and MAGMA libraries.

ScaLAPACK :

ScaLAPACK (Scalable Linear Algebra PACKage) is a library of high-performance dense linear algebra routines for distributed-memory message-passing computers. It is mostly based on a subset of LAPACK (Linear Algebra PACKage) and BLAS (Basic Linear Algebra Subprograms) routines redesigned for distributed-memory MIMD parallel computers, where all the MPI communications are handled by routines provided by the BLACS (Basic Linear Algebra Communication Subprograms) library.

The lecture will mostly cover how to use the PBLAS (Parallel BLAS) and ScaLAPACK libraries for linear algebra problems in HPC:
    • General introduction to the PBLAS and ScaLAPACK libraries
    • Main ideas on how to decompose linear algebra problems for parallel programming
    • Examples of basic operations with PBLAS: vector-vector, vector-matrix and matrix-matrix operations
    • Examples of basic operations with ScaLAPACK: inversion and diagonalization
    • Main problem based on calculating an exponentiation of a matrix
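
To give a flavour of the decomposition idea mentioned above: ScaLAPACK stores dense matrices in a two-dimensional block-cyclic layout over a grid of processes. The short Python sketch below is only an illustration and is not taken from the course material. The function names are hypothetical, indices are 0-based, and the first block is assumed to live on process (0, 0).

```python
def owner(i, j, mb, nb, prow, pcol):
    """Return (process row, process column) owning global entry (i, j).

    Assumptions of this sketch: 0-based indices, blocks of mb x nb entries,
    a prow x pcol process grid, first block stored on process (0, 0).
    """
    return (i // mb) % prow, (j // nb) % pcol


def local_index(i, mb, prow):
    """Map a global row index to the local row index on its owning process."""
    block = i // mb              # global block number along this dimension
    local_block = block // prow  # earlier blocks held by the same process
    return local_block * mb + i % mb


if __name__ == "__main__":
    # Ownership map of an 8 x 8 matrix with 2 x 2 blocks on a 2 x 2 grid.
    for i in range(8):
        print(" ".join("{}{}".format(*owner(i, j, 2, 2, 2, 2)) for j in range(8)))
```

Running the script prints, for each matrix entry, the grid coordinates of the process that holds it. This makes the cyclic repetition of blocks visible.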

MAGMA:

In the second part of the course, we present MAGMA (Matrix Algebra on GPU and Multicore Architectures), a dense linear algebra library similar to LAPACK but for hybrid/heterogeneous architectures. We start by presenting basic concepts of GPU architecture and giving an overview of communication schemes between CPUs and GPUs. Then, we briefly present hybrid CPU/GPU programming models using the CUDA language. Finally, we present MAGMA and how it can be used to easily and efficiently accelerate scientific codes, particularly those already using BLAS and LAPACK.

Trainers:

    • Donfack Simplice (MAGMA)
    • Hasnaoui Karim (ScaLAPACK)

Prerequisites :

C or C++ and Fortran programming.
Notions of linear algebra, as well as notions of MPI, would be an asset.

Link toward the registration page

Performance portability for GPU applications using high-level programming approaches with Kokkos @ MdlS/Idris

When developing a numerical simulation code with high performance and efficiency in mind, one is often compelled to accept a trade-off between using a native-hardware programming model (like CUDA or OpenCL), which has become tremendously challenging, and losing some cross-platform portability.

Porting a large existing legacy code to a modern HPC platform and developing a new simulation code are two different tasks that may both benefit from a high-level programming model, which abstracts the low-level hardware details.

This training presents existing high-level programming solutions that can preserve, as far as possible, performance, maintainability and portability across the vast diversity of modern hardware architectures (multicore CPU, manycore, GPU, ARM, …), as well as software development productivity.

We will provide an introduction to the high-level C++ programming model Kokkos (https://github.com/kokkos), and show basic code examples to illustrate the following concepts through hands-on sessions:

  • hardware portability: design an algorithm once and let the Kokkos back-end (OpenMP, CUDA, …) actually derive an efficient low-level implementation;
  • efficient architecture-aware memory containers: what is a Kokkos::View;
  • revisit fundamental parallel patterns with Kokkos: parallel for, reduce, scan, … ;
  • explore some mini-applications.

Several detailed examples in C/C++/Fortran will be used in hands-on sessions on the high-end hardware platform Ouessant (http://www.idris.fr/ouessant/), equipped with Nvidia Pascal GPUs.

Prerequisites:
Some basic knowledge of the CUDA programming model and of C++.

Link toward the registration page

Advanced Cast3M Training

The goal is to acquire in-depth knowledge of the PASAPAS solution procedure and to be able to adapt it to the problem at hand.

      • Presentation of the PASAPAS procedure
      • Architecture of PASAPAS in mechanics and thermal analysis
      • Algorithmic control options of PASAPAS
      • PASAPAS user procedures (PERSO1, CHARTHER, …)

Event website and registration

Optimization @ MdlS/CINES

This training will present basic elements to enable developers to understand when and how to optimize the performance of their codes.

Optimization :

  • Compiler options
  • Vectorization – Data access (cache usage maximization)
  • Theory to upper-bound the expected performance benefit (speedup, efficiency, peak, memory bandwidth, …)
  • Examples of Stencil codes and N-body simulations
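
As a rough illustration of the kind of upper bounds listed above, the Python sketch below evaluates two standard textbook models: Amdahl's law for speedup, and the roofline model for attainable floating-point throughput. It is not taken from the course material. The function names and the example figures are assumptions chosen for illustration only.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Upper bound on speedup when only part of the run time parallelizes."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def efficiency(speedup, workers):
    """Parallel efficiency: achieved speedup divided by the ideal one."""
    return speedup / workers


def roofline_bound(peak_flops, mem_bandwidth, arithmetic_intensity):
    """Attainable flop/s: limited by compute peak or by memory traffic.

    arithmetic_intensity is the number of flops performed per byte moved.
    """
    return min(peak_flops, mem_bandwidth * arithmetic_intensity)


if __name__ == "__main__":
    # Hypothetical figures: 90 % parallel code on 16 cores, and a kernel
    # doing 0.25 flop per byte on a 1 Tflop/s, 100 GB/s machine.
    s = amdahl_speedup(0.9, 16)
    print("speedup bound:", s, "efficiency:", efficiency(s, 16))
    print("roofline bound (flop/s):", roofline_bound(1e12, 100e9, 0.25))
```

Low-arithmetic-intensity kernels such as stencils typically sit on the memory-bandwidth side of the roofline, which is why data access and cache usage appear next to vectorization in the list above.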

Half of the course will consist of hands-on sessions. The hands-on sessions will use the MIPP library (https://github.com/aff3ct/MIPP).

Trainers

  • Bertrand Cirou
  • Mathieu Cloirec
  • Cédric Jourdain
  • Umesh Seth
  • Dorian Midou
  • Naima Alaoui

Learning outcomes 
Ability to understand the main issues for code optimization, knowledge of the main tools and techniques for basic debugging.

Prerequisites 
ID card, ZRR access, basic knowledge of Unix, programming experience in C++, OpenMP

Link toward the registration page

Uncertainty quantification @ MdlS

Uncertainty in computer simulations, deterministic and probabilistic methods for quantifying uncertainty, OpenTURNS software, Uranie software

Content
Uncertainty quantification takes into account the fact that most inputs to a simulation code are only known imperfectly. It seeks to translate this uncertainty of the data to improve the results of the simulation. This training will introduce the main methods and techniques by which this uncertainty propagation can be handled without resorting to an exhaustive exploration of the data space. HPC plays an important role in the subject, as it provides the computing power made necessary by the large number of simulations needed.
The course will present the most important theoretical tools for probability and statistical analysis, and will illustrate the concepts using the OpenTURNS software.
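
To make the idea of uncertainty propagation concrete, here is a minimal Monte Carlo sketch using only the Python standard library. It does not use OpenTURNS or Uranie and is not taken from the course material. The model function, the input distributions and all names are hypothetical stand-ins for a real simulation code and its uncertain inputs.

```python
import random
import statistics


def model(x, y):
    """Hypothetical stand-in for an expensive simulation code."""
    return x * x + 3.0 * y


def propagate(n_samples, seed=0):
    """Sample the uncertain inputs, run the model, summarize the output."""
    rng = random.Random(seed)        # fixed seed for reproducibility
    outputs = []
    for _ in range(n_samples):
        x = rng.gauss(1.0, 0.1)      # assumed: x is normal, mean 1, std 0.1
        y = rng.uniform(0.0, 2.0)    # assumed: y is uniform on [0, 2]
        outputs.append(model(x, y))
    return statistics.mean(outputs), statistics.stdev(outputs)


if __name__ == "__main__":
    mean, std = propagate(20000)
    print("output mean:", mean, "output standard deviation:", std)
```

Each sample is an independent run of the model, so this brute-force approach parallelizes trivially. Its cost grows with the number of runs, which is where HPC resources and the more economical methods covered in the course come in.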

Course Outline

Day 1 : Methodology of Uncertainty Treatment – Basics of Probability and Statistics
•    General Uncertainty Methodology (30’) : A. Dutfoy
•    Probability and Statistics: Basics (45’) : G. Blondet
•    General introduction to OpenTURNS and Uranie (2 * 30’) : G. Blondet, J.B. Blanchard
•    Introduction to Python and Jupyter (45’): practical work on manipulating distributions
Lunch
•    Uncertainty Quantification (45’) : J.B. Blanchard
•    OpenTURNS – Uranie practical work: sections 1, 2 (1h): G. Blondet, J.B. Blanchard, A. Dutfoy
•    Central tendency and Sensitivity analysis (1h): A. Dutfoy

Day 2 : Quantification, Propagation and Ranking of Uncertainties
•    Application to OpenTURNS and Uranie, section 3 (1h): M. Baudin, G. Blondet, F. Gaudier, J.B. Blanchard
•    Estimation of probability of rare events (1h): G. Blondet
•    Application to OpenTURNS and Uranie (1h): M. Baudin, G. Blondet, F. Gaudier, J.B. Blanchard
Lunch
•    Distributed computing (1h) : Uranie (15’, F. Gaudier, J.B. Blanchard), OpenTURNS (15’, G. Blondet), Salome and OpenTURNS (30’, O. Mircescu)
•    Optimisation and Calibration (1h) : J.B. Blanchard, M. Baudin
•    Application to OpenTURNS and Uranie (1h): J.B. Blanchard, M. Baudin

Day 3 : HPC aspects – Meta model
•    HPC aspects specific to the Uncertainty treatment (1h) : K. Delamotte
•    Introduction to Meta models (validation, over-fitting) – Polynomial chaos expansion (1h) : J.B. Blanchard, C. Mai
•    Kriging meta model (1h): C. Mai
Lunch
•    Application to OpenTURNS and Uranie (2h) : C. Mai, G. Blondet, J.B. Blanchard
•    Discussion / Participants' projects

Learning outcomes
Learn to recognize when uncertainty quantification can bring new insight to simulations.
Know the main tools and techniques to investigate uncertainty propagation.
Gain familiarity with modern tools for actually carrying out the computations in an HPC context.

Prerequisites
Basic knowledge of probability will be useful, as will a basic familiarity with Linux.

Link toward the registration page

Introduction to machine learning in Python with Scikit-learn @ MdlS/ICM

The rapid growth of artificial intelligence and data science has made scikit-learn one of the most popular Python libraries. The tutorial will present the main components of scikit-learn, covering aspects such as standard classifiers and regressors, cross-validation, or pipeline construction, with examples from various fields of application. Hands-on sessions will focus on medical applications, such as classification for computer-aided diagnosis or regression for the prediction of clinical scores.

Learning outcomes :

Ability to solve a real-world machine learning problem with scikit-learn

Prerequisites :

  • Basic knowledge of Python (pandas, numpy)
  • Notions of machine learning
  • No prior medical knowledge is required

Link toward the registration page

Advanced Usage on Irene Joliot-Curie: parallelism, optimization, IO, tools

The aim of this course is to give users best practices for improving their use of the newly installed PRACE Irene Joliot-Curie system, and hints for preparing their codes for future architectures.

Topics

  • Introduction: CEA/TGCC, Irene Joliot-Curie supercomputer [CEA]
  • Technology: architectures, KNL/Skylake, AMD Rome, IB/BXI [ATOS/Bull]
  • MPI software: OpenMPI, Portals, InfiniBand, WI4MPI [EOLEN/AS+]
  • User environment: module, collections, flavor/features, toolchains, hands-on [EOLEN/AS+]
  • Vectorisation: OpenMP 4, SIMD directives, tools, optimisation [EOLEN/AS+]
  • Virtualisation: Pcocc, checkpoint, templates, hands-on [CEA/EOLEN]
  • I/O: POSIX, StdC, MPI-IO, HDF5, hands-on [EOLEN/AS+]

Prerequisites

Experience with code development, knowledge of C or F90, MPI, OpenMP

Link toward the registration page

Cast3M Training for Beginners

The goal is to be able to use Cast3M to carry out finite element computations in thermo-mechanics, linear and nonlinear, steady-state and transient.

      • Meshing
      • Definition of models and material properties (linear / nonlinear)
      • Boundary conditions (linear / nonlinear)
      • Thermo-mechanical computations (RESO and PASAPAS)
      • Post-processing.

Training website

ANF Performance evaluation for HPC codes

The code optimization process consists mainly of two phases:

  • Performance measurement
  • Modification of the program in order to improve performance

This Action Nationale de Formation (ANF, a national training action) contributes to the first step, which is generally carried out incompletely because performance evaluation tools are unfortunately still difficult to set up and use. The goal is to get you started and make you as autonomous as possible in using these tools, integrating them into your computational codes, producing performance metrics and drawing up a software development roadmap to optimize your code.

This goal is realistic thanks to the involvement of experts from the European centre of excellence POP, which brings together the best experts in this field at the European level and the developers of the Scalasca and Paraver tools.

Two tracks:

  • "Tool basics" track over 2 days, open to 25-30 people: presentation of the Scalasca and Paraver tools (lectures + hands-on).
  • "HPC diagnosis" track over 3 additional days, open to 10 people: integration of the tools into your codes, production and analysis of performance metrics, and drawing up of an optimization roadmap.

Funding

Courses, accommodation and meals are covered by the Action Nationale de Formation thanks to the financial support of the EoCoE centre of excellence and the CNRS. Transport costs are borne by the participants. CNRS staff and members of associated UMRs can contact the continuing education department of their Délégation Régionale regarding coverage of their travel.

A limited number of places

Your motivation, your project and the computational code you wish to work on, its maturity and its impact on scientific communities are the criteria that will be taken into account when evaluating applications.

Pre-registration deadline: 17/05/2019

Response in mid-June.

Link toward the registration page