Running your first DL study

A first Deep-Learning study¶

This tutorial assumes that the user has gone through Quick Install and Running your first SA study.

Heat-PDE use-case¶

The heat equation is a partial differential equation (PDE) often taught in introductory courses on differential equations. This section demonstrates a Melissa Deep-Learning study involving a parallel MPI simulation using the example of a heat equation solver.

Use case presentation¶

In this example, a finite-difference parallel solver is used to solve the heat equation on a cartesian grid of size $N_x \times N_y$ . The solver input variables are:

$T_0$ the initial temperature across the domain,
$T_1$ , $T_2$ , $T_3$ and $T_4$ , the wall temperatures.

By solving the heat equation for multiple sets of inputs, the purpose of this example is to train a Deep-Surrogate of the solver. By default, the considered network is a multi-layer perceptron with the following architecture:

an input layer of 6 neurons $(T_0, T_1, T_2, T_3, T_4, t_n)$ ,
two hidden layers of 256 neurons each,
an output layer of $N_x \times N_y$ neurons.

Note

The computational load of this use-case can easily be tuned by modifying the client_config dictionary of the configuration file, where the mesh refinement ( $N_x$ , $N_y$ ) and the time discretization can be adjusted by hand with their associated parameters mesh_size and time_discretization.

Running the example¶

If Melissa was installed with the manual installation instructions, the user should remember the Melissa prefix path and update the environment variables of the shell. This will ensure all Melissa executables are found:

source /path/to/melissa/melissa-setup-env.sh

Note

If Melissa was installed with a package manager, there is no need to setup the environment. Loading the API package is enough to adjust the paths as expected.

Next, move to the example folder:

cd /path/to/melissa/examples/heat-pde/

If the user has not previously built the executables, they should be built now. To do this, move to the executables sub-folder and build the executables:

cd executables
mkdir build && cd build
cmake ..
make
cd ../../heat-pde-dl    # go from .../heat-pde/executables/build to .../heat-pde/heat-pde-dl

The third executable is a non-instrumented executable that will be used to generate a reference solution to be compared to the surrogate.

The configuration file config_<scheduler>.json is used to configure the Melissa execution (e.g. parameter sweep, computed statistics, launcher options). It must be edited at least to update the path to the executable:

"client_executable": "path/to/melissa/examples/heat-pde/executables/heatc"

Note

The example can be started with one of several batch schedulers supported by Melissa: OpenMPI, slurm, or OAR. It may be necessary to pass additional arguments directly to the batch scheduler for a successful example run. For example, starting with version 3, OpenMPI refuses to oversubscribe by default (in layman's terms, to start more processes than there are CPUs cores on a system) and requires the `"--oversubscribe" option.

In this tutorial, we use the OpenMPI scheduler and the default config_mpi.json file:

melissa-launcher --config_name /path/to/heat-pde/config_mpi

Note

The heat-pde example is not a computationally challenging problem but simply due to the number of simulation processes and depending on the resources available to the user, the system may end up being oversubscribed. If so, the following launcher option should be added to the config file:

"scheduler-arg": "--oversubscribe"

This will have for effect to submit every mpirun command with this option.

All results, log files, and a copy of the config file will be stored in a dedicated directory called STUDY_OUT. If not specified in the config file, the output directory will by default have the form melissa-YYYYMMDDTHHMMSS where YYYYMMDD and THHMMSS are the current date and local time, respectively, in ISO 8601 basic format. After a successful study, the Melissa server will generate one file model.ckpt containing the trained parameters of the neural networks.

The surrogate can finally be evaluated with the aid of the script plot-results.py. For example, the command below will generate a solution from a new set of inputs and create a gif comparing the exact solution to that inferred with the trained neural network:

python3 plot-results-dl.py /path/to/<result-dir> /path/to/melissa/examples/heat-pde/executables/heat_no_melissac temperature

Lorenz attractor use-case¶

The Lorenz attractor is a set of chaotic solutions of the Lorenz system (cf. Wiki page). In the recent years it has become a famous Deep-Learning problem for the study of chaotic dynamical systems (see Dubois et al. or Chattopadhyay et al. for examples). This section demonstrates a Melissa Deep-Learning study involving a non-parallel MPI simulation using the example of a Lorenz system solver.

Use case presentation¶

Note

This use-case is described in details in this notebook.

In this example, scipy is used to solve the Lorenz system and the solver input variables are:

the system parameter values $(\sigma, \rho, \beta)$ ,
the initial 3D-coordinates of the trajectory $(x_0, y_0, z_0)$ .

By solving the Lorenz system for multiple initial coordinates, the purpose of this example is to train a Deep-Surrogate of the solver i.e. capable of generating the trajectory resulting from any set of initial coordinates and for specific parameter values ( $\rho=30$ , $\beta=2.667$ and $\sigma=10$ ). By default, the considered network is a multi-layer perceptron with the following architecture:

an input layer of 3 neurons $(x_n, y_n, z_n)$ ,
two hidden layers of 512 neurons each,
an output layer of size 3 predicting the time derivative $\frac{x_{n+1}-x_n}{\Delta t}$ of each coordinate ( $\Delta t$ is the time discretization of the data generator).

Note

The use-case is not parallel and its computational load cannot be changed but it can easily be tested at scale even on a local machine.

Running the example¶

For this use-case, the data generator has the following dependencies:

scipy
matplotlib

First, move to the example folder:

cd /path/to/melissa/examples/lorenz

"client_executable": "path/to/melissa/examples/lorenz/lorenz.py"

Note

The example can be started with one of several batch schedulers supported by Melissa: OpenMPI, slurm, or OAR. It may be necessary to pass additional arguments directly to the batch scheduler for a successful example run. For example, starting with version 3, OpenMPI refuses to oversubscribe by default (in layman's terms, to start more processes than there are CPUs cores on a system) and requires the `"--oversubscribe" option.

In this tutorial, we use the OpenMPI scheduler and the default config_mpi.json file:

melissa-launcher --config_name /path/to/lorenz/config_mpi

Note

The Lorenz example is not a computationally challenging problem but simply due to the number of simulation processes and depending on the resources available to the user, the system may end up being oversubscribed. If so, the following launcher option should be added to the config file:

"scheduler-arg": "--oversubscribe"

This will have for effect to submit every mpirun command with this option.

Warning

The Lorenz example exploits the convert_log_to_df feature (see Deeper post-processing) which requires additional dependencies (namely pandas and tensorflow). The user can deactivate it by setting this option to false in config_mpi.json.

The surrogate can finally be evaluated with the aid of the script plot-results.py. For example, the command below will generate several graphs representative of the training quality and of the model preciseness:

python3 plot-results.py /path/to/<result-dir>

Note

If the --coefficients option is used, the script will try to compute two additional evaluation quantities (the Lyapunov exponent and the correlation coefficient) and their corresponding graphs. However, their computation relies on the nolitsa package which must be installed beforehand. Guidelines to do so are available here.