TensorBoard logging for Deep Learning
To use the default TensorBoard logger, either the tensorflow or the torch module must be installed.
Warning
If you are using a different framework, it is preferable to have at least tensorflow-cpu installed locally.
Logging within a server class
Users are encouraged to use the built-in TensorBoard logging feature, which is designed to make monitoring and post-processing deep-learning studies easier.
As exemplified in examples/heat-pde/heatpde_server.py, the TensorBoard logger is available anywhere in the custom server class through the self.tb_logger.log_* methods. The following methods are available:
self.tb_logger.log_scalar("Loss/train", batch_loss, batch_idx)
self.tb_logger.log_scalars("Metrics", metrics_dict, batch_idx)
self.tb_logger.log_figure("Plots/metric", metric_plot_fig, batch_idx)
self.tb_logger.log_histogram("Histograms/dist", dist, batch_idx)
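The calls above share a (tag, value, step) pattern. As a minimal sketch of how they might sit in a training loop, the following uses RecordingLogger, a hypothetical stand-in for self.tb_logger that keeps values in memory instead of writing TensorBoard event files. Only log_scalar and log_scalars are mimicked, and the way log_scalars expands dictionary keys into tags is an assumption, not the behavior of the real logger.

```python
from collections import defaultdict


class RecordingLogger:
    """Hypothetical stand-in for self.tb_logger: stores calls in memory."""

    def __init__(self):
        self.scalars = defaultdict(list)

    def log_scalar(self, tag, value, step):
        self.scalars[tag].append((step, float(value)))

    def log_scalars(self, main_tag, values, step):
        # Assumption: one series per dictionary key, grouped under main_tag.
        for name, value in values.items():
            self.log_scalar(f"{main_tag}/{name}", value, step)


def train(tb_logger, batches):
    """Toy loop: the 'loss' is simply the mean squared value of each batch."""
    for batch_idx, batch in enumerate(batches):
        batch_loss = sum(x * x for x in batch) / len(batch)
        tb_logger.log_scalar("Loss/train", batch_loss, batch_idx)
        metrics_dict = {"min": min(batch), "max": max(batch)}
        tb_logger.log_scalars("Metrics", metrics_dict, batch_idx)


logger = RecordingLogger()
train(logger, [[1.0, 2.0], [0.5, 0.5], [0.0, 1.0]])
print(logger.scalars["Loss/train"])
```

Using the batch index as the step keeps all series aligned on the same horizontal axis in the dashboard.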
Note
If users want more flexibility, they can access the underlying SummaryWriter object through the self.tb_logger.writer attribute.
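As a hedged illustration of this escape hatch, the sketch below writes free-form text through the writer. It assumes the writer behaves like torch.utils.tensorboard.SummaryWriter, whose add_text(tag, text_string, global_step) method is used here; with another backend the available methods may differ. The helper log_run_notes and the FakeWriter test double are hypothetical names introduced only so the example runs without torch installed.

```python
from types import SimpleNamespace


def log_run_notes(tb_logger, notes, step=0):
    """Write free-form text through the underlying writer.

    Assumes tb_logger.writer exposes add_text(tag, text_string, global_step),
    as torch.utils.tensorboard.SummaryWriter does.
    """
    tb_logger.writer.add_text("Run/notes", notes, global_step=step)


class FakeWriter:
    """Hypothetical test double that records add_text calls."""

    def __init__(self):
        self.calls = []

    def add_text(self, tag, text_string, global_step=None):
        self.calls.append((tag, text_string, global_step))


# Stand-in for self.tb_logger inside a server class.
fake_logger = SimpleNamespace(writer=FakeWriter())
log_run_notes(fake_logger, "lr=1e-3, batch_size=10")
print(fake_logger.writer.calls)
```

Inside a server class, the same helper would be called as log_run_notes(self.tb_logger, ...).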
TensorBoard allows you to monitor these values in real time. To start, open a new terminal and run tensorboard --logdir <log_dir>, where <log_dir> stands for the directory that holds the study's TensorBoard event files.
By default, this launches a server at http://localhost:6006. Open that address in a browser to follow the training progress on the TensorBoard dashboard.

Melissa also uses the TensorBoard logger to record a variety of other metrics, including:
| Metric | Description | Scope |
|---|---|---|
| samples_per_second | Average number of samples trained per second | Local to MPI rank |
| buffer_size | Size of the buffer at a given time | Local to MPI rank |
| put_time | Time spent to put each sample into the buffer | Local to MPI rank |
| get_time | Time spent to get each sample from the buffer | Local to MPI rank |
Additionally, a get_buffer_statistics method is implemented in examples/heat-pde/heat-pde-dl/heatpde_dl_server.py to record:
| Metric | Description |
|---|---|
| buffer_std/{param} | Standard deviation of {param} in the buffer |
| buffer_mean/{param} | Mean of {param} in the buffer |
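To make the two tags above concrete, here is a minimal sketch of how per-parameter buffer statistics could be computed. It is not the get_buffer_statistics implementation from the example server: the function name buffer_statistics, the buffer layout (a list of samples, one value per parameter), the parameter names T_init and alpha, and the use of the population standard deviation are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def buffer_statistics(buffer, param_names):
    """Return {tag: value} with the mean and standard deviation per parameter.

    Assumes buffer is a list of samples, each holding one value per entry of
    param_names. Population standard deviation is an arbitrary choice here.
    """
    stats = {}
    for i, name in enumerate(param_names):
        column = [sample[i] for sample in buffer]
        stats[f"buffer_mean/{name}"] = mean(column)
        stats[f"buffer_std/{name}"] = pstdev(column)
    return stats


# Hypothetical buffer content: three samples with two parameters each.
buffer = [(100.0, 1.0), (200.0, 3.0), (300.0, 5.0)]
stats = buffer_statistics(buffer, ["T_init", "alpha"])
for tag, value in stats.items():
    # In a server class this could become:
    # self.tb_logger.log_scalar(tag, value, batch_idx)
    print(tag, value)
```

Tracking both statistics over time shows whether the buffer content drifts toward a narrow region of the parameter space.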
Deeper post-processing
Users can have a pandas dataframe generated automatically from the TensorBoard logs by setting the configuration flag convert_log_to_df, which is unset by default. The dataframe contains all information logged through the self.tb_logger.log_scalar* methods.
The following is an example dl_config for users who wish to generate a dataframe from their TensorBoard logs:
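A minimal sketch of such a fragment, written as a Python dictionary and serialized to JSON, might look like the following. Only the dl_config section and the key name convert_log_to_df come from the text above; the boolean value and the absence of any other keys are assumptions, so the result should be checked against your own configuration file.

```python
import json

# Hypothetical fragment: the boolean type of the flag and its placement
# directly under dl_config are assumptions made for illustration.
config_fragment = {
    "dl_config": {
        "convert_log_to_df": True,
    }
}

print(json.dumps(config_fragment, indent=4))
```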
Warning
This feature requires pandas and tensorflow, both of which can be installed with pip install pandas tensorflow. They are included by default in the deep learning requirements.