Active Sampling Server Classes¶
melissa.server.deep_learning.active_sampling.active_sampling_server¶
ExperimentalDeepMelissaActiveSamplingServer¶
Bases: DeepMelissaServer
Server for active sampling over a predefined number of clients, resampling clients that have not yet been submitted by the launcher. This server is currently limited to single-rank server environments.
- Performs periodic resampling to update simulation parameters.
- Ensures efficient resource use by managing simulation jobs through a job margin.
- Utilizes threading for asynchronous resampling events.
- Tracks batches and offsets for resampling processes.
Parameters¶
- config_dict (
Dict[str, Any]
): A dictionary containing configuration settings for initializing the server.
Attributes¶
- ac_config (
Dict[str, Any]
): Configuration dictionary for active sampling parameters.
- ac_nn_threshold (
int
): Number of neural network updates before triggering resampling.
- ac_sim_threshold (
int
): Minimum number of simulations remaining to allow resampling.
- ac_skip_nb_batches (
int
): Number of training batches to skip before triggering resampling.
- ac_window_size (
int
): Sliding window size. Resampling is triggered only once this many simulations have finished. Defaults to -1, which satisfies the condition at every trigger.
- __resample_event (
threading.Event
): Event to signal resampling readiness.
- __resampler_thread (
threading.Thread
): Thread managing the resampling process.
- nb_batches_per_resampling (
List[int]
): List tracking batch counts between resamplings.
- offset_sim_ids_per_resampling (
List[int]
): Simulation id offsets per resampling phase.
- max_breeding_count (
int
): Limit on the number of simulations bred during resampling.
- __job_limit (
int
): Maximum number of jobs the launcher can manage concurrently.
- __job_margin (
int
): Margin used for preventing launcher-server inconsistencies.
- start_sim_id (
int
): Starting simulation id for the current resampling phase.
job_margin
property
¶
An offset used to skip upcoming client scripts that could overlap, i.e., to avoid inconsistencies while resampling.
_get_current_starting_id
property
¶
Returns the simulation id which is a starting point when choosing resampling scripts.
_on_train_start()
¶
Sets the TensorboardLogger
for the active_sampling
module.
_on_batch_end(batch_idx)
¶
Triggers the periodic resampling phase.
set_parameter_sampler(sampler_t, **kwargs)
¶
Sets the defined parameter sampler type. This dictates how parameters are sampled for experiments. This sampler type can either be pre-defined or customized by inheriting a pre-defined sampling class.
This method is overridden for active sampling. It first checks whether sampler_t is either ParameterSamplerType.DEFAULT_BREED or a subclass of ExperimentBreeder; if so, it calls active_sampling.make_parameter_sampler. Otherwise, it falls back to the regular (non-breed) samplers. This is especially useful when users want to toggle between different sampler types without changing their code.
Parameters¶
- sampler_t (
Union[ParameterSamplerType, Type[ParameterSamplerClass]]
):
  - ParameterSamplerType: Enum specifying pre-defined samplers.
  - Type[ParameterSamplerClass]: A class type to instantiate.
- kwargs (
Dict[str, Any]
): Dictionary of keyword arguments. Useful for passing custom parameters as well as strict parameters such as l_bounds, u_bounds, apply_pick_freeze, second_order, seed=0, etc.
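The sampler check described above can be sketched roughly as follows. This is only an illustration of the decision, not the library's implementation: the `ParameterSamplerType` and `ExperimentBreeder` definitions below are local stand-ins for the real classes, and `OTHER`, `CustomBreeder`, and `uses_breed_sampler` are hypothetical names introduced for the example.

```python
from enum import Enum, auto


class ParameterSamplerType(Enum):
    """Local stand-in for the library enum; only DEFAULT_BREED is named in the text."""
    DEFAULT_BREED = auto()
    OTHER = auto()  # hypothetical placeholder for any non-breed sampler type


class ExperimentBreeder:
    """Local stand-in for the breeder base class mentioned above."""


class CustomBreeder(ExperimentBreeder):
    """Hypothetical user-defined breeder."""


def uses_breed_sampler(sampler_t) -> bool:
    """Return True if sampler_t should take the active sampling (breed) path."""
    if sampler_t is ParameterSamplerType.DEFAULT_BREED:
        return True
    # Enum members are not classes, so guard before calling issubclass.
    return isinstance(sampler_t, type) and issubclass(sampler_t, ExperimentBreeder)
```

Under these assumptions, both `ParameterSamplerType.DEFAULT_BREED` and `CustomBreeder` would select the breed path, while any other enum member or class would fall back to a regular sampler.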
checkpoint_state()
¶
Checkpoints the current state of the server.
_restart_from_checkpoint()
¶
Restarts the server object from a checkpoint.
periodic_resampling(batch_idx)
¶
Periodically triggers active resampling given the batch id and configured thresholds.
It performs the following tasks:
- Ensures resampling starts only after ac_skip_nb_batches batches.
- Ensures resampling starts only when at least sliding_window_size simulations have finished, so that the fitnesses have the correct length.
- Triggers resampling when batch_idx is divisible by ac_nn_threshold and the server is ready to resample the next generation.
- Calculates start_sim_id, accounting for conflicts with currently running simulations.
- Stops resampling if the number of remaining simulations falls below ac_sim_threshold.
- Adjusts max_breeding_count to avoid exceeding the available simulations.
- Launches a new thread to trigger resampling asynchronously, then pauses for one second.
Parameters¶
- batch_idx (
int
): The current batch index.
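The trigger conditions listed for periodic_resampling can be read as a single predicate. The sketch below is a plain-Python illustration under stated assumptions: the function name, its arguments, and the exact boundary handling (strict versus non-strict comparisons) are hypothetical and are not taken from the library, and `nn_threshold` is assumed to be positive.

```python
def should_trigger_resampling(
    batch_idx: int,
    finished_sims: int,
    remaining_sims: int,
    server_ready: bool,
    *,
    skip_nb_batches: int,
    nn_threshold: int,
    window_size: int = -1,
    sim_threshold: int = 0,
) -> bool:
    """Hypothetical predicate combining the documented trigger conditions."""
    # Resampling only starts beyond the first skip_nb_batches batches.
    if batch_idx <= skip_nb_batches:
        return False
    # At least window_size simulations must have finished;
    # the default of -1 makes this condition always hold.
    if finished_sims < window_size:
        return False
    # Trigger only every nn_threshold batches, and only if the server is ready.
    if batch_idx % nn_threshold != 0 or not server_ready:
        return False
    # Stop once too few simulations remain.
    if remaining_sims < sim_threshold:
        return False
    return True


cfg = dict(skip_nb_batches=10, nn_threshold=5, window_size=-1, sim_threshold=4)
print(should_trigger_resampling(20, 3, 50, True, **cfg))
print(should_trigger_resampling(22, 3, 50, True, **cfg))
```

With this configuration the first call satisfies every condition, while the second fails because 22 is not divisible by 5.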
__update_batches_per_resampling()
¶
Updates the record of the number of batches processed since the last resampling phase.
Note: This is not an exact value when using the Reservoir
buffer type as the batches
keep forming as long as there is data available in the buffers.
get_offset_sim_ids_per_resampling()
¶
Retrieves a list of offset simulation ids per resampling phase.
Returns¶
List[int]
: A list of offsets.
get_batch_count(phase_id=-1)
¶
get_batch_counts_per_resampling()
¶
Calculates the number of batches processed during each resampling phase.
Returns¶
List[int]
: A list containing number of batches per resampling phase.
_server_online()
¶
Starts __resampler_thread
before running the main server steps.
_server_finalize(exit_=0)
¶
Sets the __resample_event
to join __resampler_thread
, if it was waiting on it.
Finalizes the server operations.
Parameters¶
- exit_ (
int
, optional): The exit status code indicating the outcome of the server's operations. Defaults to 0, which signifies a successful termination.
__is_resampling_ready()
¶
Checks if the server is ready to trigger the resampling phase.
- This method determines whether the server is ready for resampling by evaluating if there are enough remaining simulations (above a threshold) to justify initiating the resampling process.
- The readiness condition depends on the difference between the total submitted simulations and the finished simulations, adjusted by the job limit, and compared to a predefined threshold.
Returns¶
bool
: True if the server is ready to resample, False otherwise.
__update_generator()
¶
Manages the active resampling process for the training server. This method waits for the resampling trigger and handles the entire resampling loop. If the minimum simulation criteria are met, it updates the parameter generator and resubmits the simulation scripts with updated parameters.
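The wait-for-trigger pattern described here can be illustrated with the standard `threading` module. This is a minimal sketch of an event-driven worker thread, not the server's actual code: the `Resampler` class and all of its members are hypothetical, and the real generator update is replaced by a counter.

```python
import threading


class Resampler:
    """Hypothetical worker that performs one resampling pass per trigger."""

    def __init__(self) -> None:
        self._resample_event = threading.Event()
        self._done = threading.Event()
        self._stopping = False
        self.generations = 0
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def trigger(self) -> None:
        """Signal the worker that a resampling pass should run."""
        self._done.clear()
        self._resample_event.set()

    def wait_done(self, timeout: float = 5.0) -> bool:
        """Block until the current pass finishes or the timeout expires."""
        return self._done.wait(timeout)

    def _run(self) -> None:
        while True:
            self._resample_event.wait()   # sleep until triggered
            self._resample_event.clear()
            if self._stopping:
                break
            self.generations += 1         # placeholder for the real update
            self._done.set()

    def stop(self) -> None:
        """Wake the worker if it is waiting, then join it."""
        self._stopping = True
        self._resample_event.set()
        self._thread.join()


resampler = Resampler()
resampler.start()
resampler.trigger()
resampler.wait_done()
resampler.stop()
```

Setting the event during shutdown is what lets a waiting thread wake up and exit so that it can be joined, which matches the behaviour documented for `_server_finalize`. Note that in this simplified version two triggers issued in quick succession may be coalesced into one pass.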
__resubmit_simulations_with_updated_parameters()
¶
Resubmits simulations with updated parameters for active sampling.
- Determines the range of simulation ids to resubmit.
- Resubmits a set of simulations up to max_breeding_count.
- Calls _generate_client_scripts for each simulation id in the determined range to create client scripts with updated parameters.
- Tracks the starting simulation id for this resampling phase in offset_sim_ids_per_resampling.
Returns¶
Tuple[int, int]
: A tuple containing the starting and ending simulation ids of the resubmitted simulations.
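As a rough illustration of how such a range might be derived, the sketch below clamps the range to the number of available simulations and records the starting id. The function name, its arguments, and the choice of an exclusive end id are assumptions made for the example; the source does not state whether the returned end id is inclusive.

```python
from typing import List, Tuple


def plan_resubmission(
    start_sim_id: int,
    max_breeding_count: int,
    total_sims: int,
    offsets: List[int],
) -> Tuple[int, int]:
    """Hypothetical helper returning (start, end) with an exclusive end id."""
    # Never go past the total number of simulations.
    end_sim_id = min(start_sim_id + max_breeding_count, total_sims)
    # Record where this resampling phase begins.
    offsets.append(start_sim_id)
    return start_sim_id, end_sim_id


offsets: List[int] = []
print(plan_resubmission(9, 5, 12, offsets))
```

Here only three simulations (ids 9, 10, and 11) remain, so the range is cut short even though up to five could be bred.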
__launcher_state_max_id()
¶
Retrieves the maximum simulation id currently in a running state within the launcher.
- Identifies the highest job id in the "RUNNING" state.
- Accounts for the fragmented state of the launcher, e.g. [(4, W), (5, W), (6, R), (7, W), (8, R), || (9, W), (10, W), ...], where each job is either "WAITING" (W) or "RUNNING" (R).
- Picks the index just beyond the last running simulation id.
Returns¶
int
: The maximum simulation id that is currently running.
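Applied to the fragmented example above, the selection can be sketched as below. This assumes the launcher state is available as a list of `(sim_id, state)` pairs with `"W"` and `"R"` markers, and that the result is the id just beyond the last running job (the position of the `||` marker in the example). The function name and the `default` fallback for the case with no running jobs are hypothetical.

```python
from typing import Iterable, Tuple


def index_beyond_last_running(
    jobs: Iterable[Tuple[int, str]], default: int = 0
) -> int:
    """Return the id right after the highest running job, or default if none run."""
    running = [sim_id for sim_id, state in jobs if state == "R"]
    return max(running) + 1 if running else default


jobs = [(4, "W"), (5, "W"), (6, "R"), (7, "W"), (8, "R"), (9, "W"), (10, "W")]
print(index_beyond_last_running(jobs))  # 9
```

The waiting jobs at ids 4, 5, and 7 are skipped over because a later job is already running, so resampling would start from id 9 in this example.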
melissa.server.deep_learning.active_sampling.torch_server¶
ExperimentalTorchActiveSamplingServer¶
Bases: TorchServer
, ExperimentalDeepMelissaActiveSamplingServer
Extension of TorchServer
for active sampling.