Active Sampling Server Classes

melissa.server.deep_learning.active_sampling.active_sampling_server

ExperimentalDeepMelissaActiveSamplingServer

Bases: DeepMelissaServer

Server for active sampling over a predefined number of clients. It resamples clients that have not yet been submitted by the launcher. This server is currently limited to single-rank server environments.

  • Performs periodic resampling to update simulation parameters.
  • Ensures efficient resource use by managing simulation jobs through a job margin.
  • Utilizes threading for asynchronous resampling events.
  • Tracks batches and offsets for resampling processes.

Parameters
  • config_dict (Dict[str, Any]): A dictionary containing configuration settings for initializing the server.

Attributes
  • ac_config (Dict[str, Any]): Configuration dictionary for active sampling parameters.
  • ac_nn_threshold (int): Number of neural network updates before triggering resampling.
  • ac_sim_threshold (int): Minimum number of simulations remaining to allow resampling.
  • ac_skip_nb_batches (int): Number of training batches to skip before triggering resampling.
  • ac_window_size (int): Sliding window size. Resampling is triggered only once this many simulations have finished. The default is -1, in which case this condition is satisfied at every trigger.
  • __resample_event (threading.Event): Event to signal resampling readiness.
  • __resampler_thread (threading.Thread): Thread managing the resampling process.
  • nb_batches_per_resampling (List[int]): List tracking batch counts between resamplings.
  • offset_sim_ids_per_resampling (List[int]): Simulation id offsets per resampling phase.
  • max_breeding_count (int): Limit on the number of simulations bred during resampling.
  • __job_limit (int): Maximum number of jobs the launcher can manage concurrently.
  • __job_margin (int): Margin used for preventing launcher-server inconsistencies.
  • start_sim_id (int): Starting simulation id for the current resampling phase.

job_margin property

An offset used to skip upcoming scripts that could overlap, i.e., to avoid inconsistencies while resampling.

_get_current_starting_id property

Returns the simulation id that serves as the starting point when choosing which scripts to resample.

_on_train_start()

Sets the TensorboardLogger for the active_sampling module.

_on_batch_end(batch_idx)

Triggers the periodic resampling phase.

set_parameter_sampler(sampler_t, **kwargs)

Sets the defined parameter sampler type. This dictates how parameters are sampled for experiments. This sampler type can either be pre-defined or customized by inheriting a pre-defined sampling class.

This method is overridden for active sampling. It first checks whether sampler_t is either ParameterSamplerType.DEFAULT_BREED or a subclass of ExperimentBreeder and, if so, calls active_sampling.make_parameter_sampler. Otherwise, it falls back to the regular (non-breed) samplers. This is especially useful when users want to switch between sampler types without changing their code.

Parameters
  • sampler_t (Union[ParameterSamplerType, Type[ParameterSamplerClass]]):
    • ParameterSamplerType: Enum specifying pre-defined samplers.
    • Type[ParameterSamplerClass]: A class type to instantiate.
  • kwargs (Dict[str, Any]): Dictionary of keyword arguments. Useful for passing custom parameters as well as strict parameters such as l_bounds, u_bounds, apply_pick_freeze, second_order, seed=0, etc.
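
The dispatch described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the classes below are local stand-ins for ParameterSamplerType and ExperimentBreeder, and RANDOM_UNIFORM, CustomBreeder, PlainSampler, and uses_breed_sampler are hypothetical names introduced only for this example.

```python
from enum import Enum, auto
from typing import Any, Type, Union


class ParameterSamplerType(Enum):
    """Local stand-in enum; only DEFAULT_BREED is named in the text."""
    RANDOM_UNIFORM = auto()  # hypothetical non-breed member
    DEFAULT_BREED = auto()


class ExperimentBreeder:
    """Local stand-in for the breeder base class."""


class CustomBreeder(ExperimentBreeder):
    """Hypothetical user-defined breeder."""


class PlainSampler:
    """Hypothetical non-breed sampler class."""


def uses_breed_sampler(
    sampler_t: Union[ParameterSamplerType, Type[Any]]
) -> bool:
    """Return True when the active-sampling (breed) path should be taken."""
    if sampler_t is ParameterSamplerType.DEFAULT_BREED:
        return True
    # Enum members are not classes, so guard issubclass with isinstance.
    return isinstance(sampler_t, type) and issubclass(sampler_t, ExperimentBreeder)
```

With a check of this shape, a user can swap CustomBreeder for PlainSampler in the call to set_parameter_sampler and the server picks the matching path without further code changes.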

checkpoint_state()

Checkpoints the current state of the server.

_restart_from_checkpoint()

Restarts the server object from a checkpoint.

periodic_resampling(batch_idx)

Periodically triggers active resampling given the batch id and configured thresholds.

It performs the following tasks:

  • Ensures resampling starts beyond specified ac_skip_nb_batches batches.
  • Ensures resampling starts only once at least sliding_window_size simulations have finished, so that the fitnesses have the correct length.
  • Triggers resampling when batch_idx is divisible by ac_nn_threshold and the server is ready to resample the next generation.
  • Calculates start_sim_id considering conflicts with currently running simulations.
  • Ensures resampling stops if remaining simulations fall below ac_sim_threshold.
  • Adjusts max_breeding_count to prevent exceeding the available simulations.
  • Launches a new thread to asynchronously trigger resampling and pauses for a second.

Parameters
  • batch_idx (int): The current batch index.
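
The trigger conditions listed above can be condensed into a single predicate. The sketch below is illustrative only: TriggerConfig and should_trigger are hypothetical names, "beyond ac_skip_nb_batches" is assumed to mean a strict inequality, and a non-positive window size is assumed to disable the window check.

```python
from dataclasses import dataclass


@dataclass
class TriggerConfig:
    """Hypothetical container for the thresholds named in the text."""
    ac_skip_nb_batches: int
    ac_nn_threshold: int
    ac_window_size: int = -1  # assumed: non-positive disables the window check


def should_trigger(
    cfg: TriggerConfig,
    batch_idx: int,
    nb_finished_sims: int,
    server_ready: bool,
) -> bool:
    """Return True when every trigger condition holds for this batch."""
    # Skip the first ac_skip_nb_batches training batches.
    if batch_idx <= cfg.ac_skip_nb_batches:
        return False
    # Require a full sliding window of finished simulations, if enabled.
    if cfg.ac_window_size > 0 and nb_finished_sims < cfg.ac_window_size:
        return False
    # Resample only every ac_nn_threshold batches.
    if batch_idx % cfg.ac_nn_threshold != 0:
        return False
    return server_ready
```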

__update_batches_per_resampling()

Updates the record of the number of batches processed since the last resampling phase.

Note: This value is not exact when using the Reservoir buffer type, because batches keep forming as long as data is available in the buffers.

get_offset_sim_ids_per_resampling()

Retrieves a list of offset simulation ids per resampling phase.

Returns
  • List[int]: A list of offsets.

get_batch_count(phase_id=-1)

Calculates the number of batches processed for a resampling phase.

Parameters
  • phase_id (int): Id of the resampling generation/phase. Defaults to -1, i.e., the most recent.

Returns
  • int: Number of batches processed.

get_batch_counts_per_resampling()

Calculates the number of batches processed during each resampling phase.

Returns
  • List[int]: A list containing number of batches per resampling phase.
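
One plausible bookkeeping scheme behind these accessors is sketched below. It is an assumption, not the server's actual code: BatchCountTracker and its update method are hypothetical, and the counts are assumed to be stored as one entry per resampling phase.

```python
from typing import List


class BatchCountTracker:
    """Hypothetical per-phase batch counter, one list entry per resampling."""

    def __init__(self) -> None:
        self.nb_batches_per_resampling: List[int] = []
        self._last_batch_idx = 0

    def update(self, batch_idx: int) -> None:
        """Record the batches processed since the previous resampling."""
        self.nb_batches_per_resampling.append(batch_idx - self._last_batch_idx)
        self._last_batch_idx = batch_idx

    def get_batch_count(self, phase_id: int = -1) -> int:
        """Return the count for one phase; -1 selects the most recent."""
        return self.nb_batches_per_resampling[phase_id]

    def get_batch_counts_per_resampling(self) -> List[int]:
        """Return a copy of all per-phase counts."""
        return list(self.nb_batches_per_resampling)
```

Using a negative default index mirrors Python list semantics, which is why phase_id=-1 naturally means "the latest phase" in this sketch.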

_server_online()

Starts __resampler_thread before running the main server steps.

_server_finalize(exit_=0)

Sets __resample_event so that __resampler_thread can be joined if it was waiting on the event.

Finalizes the server operations.

Parameters
  • exit_ (int, optional): The exit status code indicating the outcome of the server's operations. Defaults to 0, which signifies a successful termination.

__is_resampling_ready()

Checks if the server is ready to trigger the resampling phase.

  • This method determines whether the server is ready for resampling by evaluating if there are enough remaining simulations (above a threshold) to justify initiating the resampling process.
  • The readiness condition depends on the difference between the total submitted simulations and the finished simulations, adjusted by the job limit, and compared to a predefined threshold.

Returns
  • bool: True if the server is ready to resample, False otherwise.
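
The exact formula is not given here, so the following is only one plausible reading of the readiness condition. It assumes that the job limit is subtracted from the unfinished simulations (those jobs may already be in the launcher's hands) and that "above a threshold" is a strict comparison against ac_sim_threshold. The function name and argument names are hypothetical.

```python
def is_resampling_ready(
    nb_submitted: int,
    nb_finished: int,
    job_limit: int,
    ac_sim_threshold: int,
) -> bool:
    """Assumed readiness rule: enough untouched simulations must remain."""
    # Simulations that are neither finished nor possibly held by the launcher.
    remaining = nb_submitted - nb_finished - job_limit
    return remaining > ac_sim_threshold
```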

__update_generator()

Manages the active resampling process for the training server. This method waits for the resampling trigger and handles the entire resampling loop. If the minimum simulation criteria are met, it updates the parameter generator and resubmits the simulation scripts with updated parameters.
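
The wait-and-resample loop, together with the shutdown step described for _server_finalize, follows a common threading.Event pattern. The sketch below uses only the standard library and is not the server's code: ResamplerSketch, its methods, the stop flag, and the done event are hypothetical, and the "resampling" is reduced to appending a generation number.

```python
import threading
from typing import List


class ResamplerSketch:
    """Hypothetical worker that resamples each time an event is set."""

    def __init__(self) -> None:
        self._resample_event = threading.Event()
        self.done = threading.Event()  # set after each completed resampling
        self._stop = False
        self.generations: List[int] = []
        self._thread = threading.Thread(target=self._update_generator, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def trigger(self) -> None:
        """Signal the worker that a resampling phase may begin."""
        self._resample_event.set()

    def _update_generator(self) -> None:
        while True:
            self._resample_event.wait()   # block until triggered
            self._resample_event.clear()  # re-arm for the next trigger
            if self._stop:
                break
            # Placeholder for updating the generator and resubmitting scripts.
            self.generations.append(len(self.generations) + 1)
            self.done.set()

    def finalize(self) -> None:
        """Wake the worker so it can exit, then join it."""
        self._stop = True
        self._resample_event.set()
        self._thread.join()
```

Setting the event during finalization matters because a thread blocked in wait() would otherwise never return, and join() would hang.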

__resubmit_simulations_with_updated_parameters()

Resubmits simulations with updated parameters for active sampling.

  • Determines the range of simulation ids to resubmit.
  • Resubmits a set of simulations up to max_breeding_count.
  • Calls _generate_client_scripts for each simulation id in the determined range to create client scripts with updated parameters.
  • Tracks the starting simulation id for this resampling phase in offset_sim_ids_per_resampling.

Returns
  • Tuple[int, int]: A tuple containing the starting and ending simulation ids of the resubmitted simulations.
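
The steps above can be sketched as a small function. This is an illustration under stated assumptions: resubmit_range, total_sims, and generate_script are hypothetical names (generate_script stands in for _generate_client_scripts), and the returned end id is assumed to be exclusive.

```python
from typing import Callable, List, Tuple


def resubmit_range(
    start_sim_id: int,
    max_breeding_count: int,
    total_sims: int,
    generate_script: Callable[[int], None],
    offsets: List[int],
) -> Tuple[int, int]:
    """Regenerate scripts for at most max_breeding_count simulations."""
    # Never run past the predefined number of simulations.
    end_sim_id = min(start_sim_id + max_breeding_count, total_sims)
    for sim_id in range(start_sim_id, end_sim_id):
        generate_script(sim_id)  # rewrite the script with updated parameters
    offsets.append(start_sim_id)  # remember where this phase started
    return start_sim_id, end_sim_id
```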

__launcher_state_max_id()

Retrieves the maximum simulation id currently in a running state within the launcher.

  • Identifies the highest job id in a "RUNNING" state.
  • Considers the fragmented state of the launcher, e.g. [(4, W), (5, W), (6, R), (7, W), (8, R), || (9, W), (10, W), ...], where each job is either "WAITING" (W) or "RUNNING" (R).
  • Picks the index beyond the last running simulation id.

Returns
  • int: The maximum simulation id that is currently running.
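
Applied to the fragmented state shown above, the selection can be sketched as follows. The function name, the (sim_id, state) tuple layout, and the default argument used when nothing is running are assumptions made for this example.

```python
from typing import List, Tuple


def index_beyond_last_running(jobs: List[Tuple[int, str]], default: int) -> int:
    """Return the id just past the highest job in state "R"."""
    running = [sim_id for sim_id, state in jobs if state == "R"]
    # Waiting jobs below the last running one are skipped as well.
    return max(running) + 1 if running else default


launcher_state = [(4, "W"), (5, "W"), (6, "R"), (7, "W"), (8, "R"), (9, "W"), (10, "W")]
next_id = index_beyond_last_running(launcher_state, default=4)  # 9
```

Job 7 is still waiting in this example, yet it sits below the last running job (8), so it is skipped.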

melissa.server.deep_learning.active_sampling.torch_server

ExperimentalTorchActiveSamplingServer

Bases: TorchServer, ExperimentalDeepMelissaActiveSamplingServer

Extension of TorchServer for active sampling.