Pre-defined Default Breed Classes¶
The Default Breeder implements a standard fitness-proportional active sampling strategy. It maintains a population of simulations and generates new parameters based on their relative "fitness" (e.g., surrogate model loss or error).
How to use it¶
To use the DefaultBreeder, you initialize it with parameters that control the mutation scale and exploration/exploitation balance over time:
- sigma: The initial mutation covariance. Can be a float (ratio of parameter scale) or an iterable with a specific scale per dimension.
- start & end: The starting and ending breeding ratios (R). These represent the fraction of drawn seeds that will be actively "bred" vs. purely random.
- breakpoint: Number of simulated generations over which to linearly interpolate R between start and end.
- use_true_mixing: If True, parent selections uniformly mix proposal methods; otherwise, each is rolled independently.
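The interpolation of R between start and end can be sketched as below. This is a minimal illustration, not the library's code: the function names are hypothetical, and holding R at its final value once the schedule is exhausted is an assumption. The defaults mirror those documented for DefaultBreeder.

```python
import numpy as np


def breeding_ratio_schedule(start: float = 0.15,
                            end: float = 0.75,
                            breakpoint: int = 10) -> np.ndarray:
    """Return `breakpoint` linearly spaced breeding ratios from start to end."""
    return np.linspace(start, end, breakpoint)


def ratio_at(schedule: np.ndarray, generation: int) -> float:
    """Look up R for a generation index.

    Assumption: once the schedule is exhausted, R stays at its last value.
    """
    return float(schedule[min(generation, len(schedule) - 1)])


schedule = breeding_ratio_schedule()
for generation in (0, 5, 9, 20):
    print(generation, round(ratio_at(schedule, generation), 3))
```

A low starting ratio favours exploration (mostly random draws), while a higher ending ratio favours exploitation (mostly bred draws).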
Implementation Details¶
- Fitness-Proportional Selection: Parents are chosen to breed probabilistically based on their normalized fitness score.
- Gaussian Mutation: For bred simulations, child parameters are generated using scipy.stats.multivariate_normal, centered on the parent's parameters and using an adaptive covariance matrix.
- Boundary Handling: If newly generated parameters fall outside valid bounds, the covariance is aggressively narrowed (oob_factor = 0.3) and resampling is attempted up to max_oob_count times. If every attempt fails, the generator falls back to the parent's original parameters.
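The selection step can be illustrated with a short sketch. It assumes that fitness values are non-negative and that "normalized" means dividing by their sum; select_parents is a hypothetical helper, not part of the Melissa API.

```python
import numpy as np


def select_parents(fitness, nb_children: int, rng: np.random.Generator) -> np.ndarray:
    """Draw parent indices with probability proportional to fitness.

    Assumption: fitness is non-negative with a positive sum.
    """
    fitness = np.asarray(fitness, dtype=float)
    if np.any(fitness < 0) or fitness.sum() <= 0:
        raise ValueError("fitness must be non-negative with a positive sum")
    probs = fitness / fitness.sum()  # normalized fitness scores
    # Sampling with replacement: a fit parent may produce several children.
    return rng.choice(fitness.size, size=nb_children, replace=True, p=probs)


rng = np.random.default_rng(0)
parent_idx = select_parents([0.1, 0.5, 2.0, 0.4], nb_children=8, rng=rng)
print(parent_idx)
```

Simulations with a larger fitness (for example, a higher surrogate loss) are picked more often, so new samples concentrate where the model currently performs worst.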
melissa.server.deep_learning.active_sampling.breeder.BreedMetadata¶
A Metadata class for breeding.
Attributes¶
- sim_id (int): Simulation ID.
- generation (int): Resampling step/generation.
- is_bred (bool): A boolean indicating whether the parameters were bred or not.
- parameters (NDArray): An array representing the parameters associated with the object.
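For orientation, a record with these four attributes could look like the dataclass below. This is a stand-in for illustration only: whether the real BreedMetadata is a dataclass, and whether it carries extra fields, is not stated here, so the class is named BreedMetadataSketch to avoid confusion.

```python
from dataclasses import dataclass

import numpy as np
from numpy.typing import NDArray


@dataclass
class BreedMetadataSketch:
    """Hypothetical stand-in mirroring the documented attributes."""
    sim_id: int          # simulation ID
    generation: int      # resampling step/generation
    is_bred: bool        # True if the parameters were bred, False if random
    parameters: NDArray  # parameter vector for this simulation


child = BreedMetadataSketch(sim_id=12, generation=3, is_bred=True,
                            parameters=np.array([0.2, 0.8]))
print(child)
```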
melissa.server.deep_learning.active_sampling.breeder.ExperimentBreeder¶
Bases: RandomUniformSamplerMixIn, HaltonSamplerMixIn, LatinHypercubeSamplerMixIn, BaseExperiment
A base class for breeding and sampling experimental parameters. It initializes the parameters based on the chosen sampling method and logs relevant experiment data.
Inherits from multiple mix-ins for sampling and static experiment management.
When inheriting this class, override sample() and optionally, draw().
Parameters¶
- non_breed_sampler_t (ParameterSamplerType, default=RANDOM_UNIFORM): The type of parameter sampler to use. Can be RANDOM_UNIFORM, HALTON, or LHS. Note that the first generation will be populated from this sampler type.
Attributes¶
- checkpoint_data (Dict[str, Any]): Stores experiment checkpoint data.
- metric_logger (BaseLogger): Logger for tracking experiment metrics.
- _non_breed_sampler_t (ParameterSamplerType): The sampler type used for non-breeding.
base_sample(nb_samples)
¶
Generates samples using the configured non-breed sampler.
This method dispatches to the correct sampler based on _non_breed_sampler_t.
Without this override, the MRO would always use RandomUniformSamplerMixIn.base_sample().
Parameters¶
- nb_samples (int): The number of samples to generate.
Returns¶
NDArray: A numpy array of shape (nb_samples, nb_params).
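The MRO remark can be reproduced with toy classes. In the sketch below, the mix-ins, the SamplerType enum and both experiment classes are hypothetical stand-ins that return strings instead of arrays; only the dispatch pattern is the point.

```python
from enum import Enum


class SamplerType(Enum):  # hypothetical stand-in for ParameterSamplerType
    RANDOM_UNIFORM = 0
    HALTON = 1
    LHS = 2


class UniformMixIn:
    def base_sample(self, nb_samples):
        return f"uniform x{nb_samples}"


class HaltonMixIn:
    def base_sample(self, nb_samples):
        return f"halton x{nb_samples}"


class LHSMixIn:
    def base_sample(self, nb_samples):
        return f"lhs x{nb_samples}"


class NoOverride(UniformMixIn, HaltonMixIn, LHSMixIn):
    """Without an override, the first base in the MRO always wins."""


class WithDispatch(UniformMixIn, HaltonMixIn, LHSMixIn):
    _DISPATCH = {
        SamplerType.RANDOM_UNIFORM: UniformMixIn,
        SamplerType.HALTON: HaltonMixIn,
        SamplerType.LHS: LHSMixIn,
    }

    def __init__(self, sampler_t=SamplerType.RANDOM_UNIFORM):
        self._non_breed_sampler_t = sampler_t

    def base_sample(self, nb_samples):
        # Call the chosen mix-in explicitly instead of relying on the MRO.
        mixin = self._DISPATCH[self._non_breed_sampler_t]
        return mixin.base_sample(self, nb_samples)


print(NoOverride().base_sample(4))
print(WithDispatch(SamplerType.HALTON).base_sample(4))
```

Because all three mix-ins define a method with the same name, Python resolves the call to the first base class listed unless the subclass routes it explicitly.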
get_non_breed_samples(nb_samples=-1)
¶
Returns nb_samples samples drawn with the default (non-breed) sampler.
next_parameters(**kwargs)
abstractmethod
¶
This method is required for producing the next set of parameters, i.e., the next generation.
It must be called only through the melissa.server.deep_learning.active_sampling module and can be overridden.
Parameters¶
- kwargs (Dict[str, Any]): Keyword arguments for custom preprocessing.
set_metric_logger(metric_logger)
¶
Sets the experiment logger.
melissa.server.deep_learning.active_sampling.breeder.DefaultBreeder¶
Bases: ExperimentBreeder
A class that extends ExperimentBreeder for managing breeding experiments with more advanced
control over sampling and breeding parameters.
Parameters¶
- sigma (Union[float, Iterable], default=0.005): Covariance initialization for breeding.
- start (float, default=0.15): Starting breeding ratio.
- end (float, default=0.75): Ending breeding ratio.
- breakpoint (int, default=10): Number of steps in the breeding ratio transition.
- use_true_mixing (bool, default=False): Use true mixing for breeding.
- log_extra (bool, default=False): Log extra information for debugging.
- scatter_function (Callable[[NDArray], Tuple[str, NDArray, str, NDArray]], default=(lambda x: ("x0", x[:,0], "x1", x[:,1]))): Scatter function that takes one NDArray, self.parameters, of shape (nb_sims, nb_params) and returns two axis labels and two NDArrays of shape (nb_sims,), used to draw a scatter plot on labeled X-Y axes.
- device (str, default="cpu"): Device for computation, e.g., "cpu" or "cuda".
Attributes¶
- sigma_opt (float): Optimal minimum covariance value for breeding.
- Rs (NDArray): Linearly spaced breeding ratios for each breeding step.
- R_i (int): Current breeding ratio index.
- R (float): Current breeding ratio.
- oob_factor (float): Factor by which the covariance decreases when the child is out-of-bounds.
- max_oob_count (int): Maximum allowed attempts for out-of-bounds children.
- current_metadata_list (List[BreedMetadata]): A list of metadata for the current simulations, including covariance, generation, and breeding status.
- temp_children_metadata_list (List[BreedMetadata]): A temporary list of metadata for newly bred children, used before concretizing the parameters.
- last_submitted_sim_id (int): The ID of the last submitted simulation.
- parameters_is_bred (NDArray): A boolean array indicating whether each simulation's parameters were bred. (To be removed in the future.)
set_metric_logger(metric_logger)
¶
Sets the experiment logger.
checkpoint_state()
¶
Saves the current state for checkpointing.
restart_from_checkpoint()
¶
Restores the state from a checkpoint.
concretize_resampled_parameters(last_submitted_sim_id, **kwargs)
¶
Concretizes the current metadata and parameters with newly bred child metadata for the future (unsubmitted) simulations.
Parameters¶
- last_submitted_sim_id (int): The last submitted simulation id beyond which the updates take place.
Returns¶
Tuple[int, int]: A tuple of the first and last simulation ids for which the updates took place.
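The stage-then-commit idea behind concretization can be sketched as follows. This is a toy model under strong assumptions: simulation ids are taken to be list indices, staged children are applied in order starting right after last_submitted_sim_id, and the staging list is cleared afterwards. The function name and these details are illustrative, not the library's behaviour.

```python
from typing import List, Tuple


def concretize(current: List[str], staged: List[str],
               last_submitted_sim_id: int) -> Tuple[int, int]:
    """Overwrite unsubmitted entries of `current` with staged children.

    Returns the first and last ids that were updated. If nothing could be
    updated, the returned last id is smaller than the first id.
    """
    first = last_submitted_sim_id + 1
    nb_updates = max(0, min(len(staged), len(current) - first))
    for offset in range(nb_updates):
        current[first + offset] = staged[offset]
    staged.clear()  # assumption: staged children are consumed once applied
    return first, first + nb_updates - 1


current = ["p0", "p1", "p2", "p3", "p4"]  # p0 and p1 already submitted
staged = ["c0", "c1", "c2"]               # children produced earlier
print(concretize(current, staged, last_submitted_sim_id=1))
print(current)
```

Already submitted simulations keep their parameters; only entries beyond the last submitted id are replaced.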
next_parameters(max_breeding_count=-1, **kwargs)
¶
Overrides the parent class sampling method with custom breed-specific arguments.
This method calculates the fitness for each simulation, selects breeding candidates, and returns a new set of parameters based on the custom breeding algorithm.
Note that this call does not update the parameters. Instead, it stores them in a temporary list that is used during concretization, once the starting simulation id has been obtained from the server.
Parameters¶
- max_breeding_count (int, optional): The maximum number of breed iterations. (Default is -1, i.e., all remaining.)
__breed_algorithm(fitness_per_sim, sim_ids, max_breeding_count=-1)
¶
Breeding algorithm to generate new parameters based on simulation fitness.
This method selects parent simulations based on their fitness scores, performs the breeding process to generate new parameters, and updates the breeding statistics. It also logs various statistics regarding the breeding process with the configured logger.
Parameters¶
- fitness_per_sim (Union[NDArray, list]): Fitness values corresponding to each simulation.
- sim_ids (Union[NDArray, list]): List or array of simulation ids.
- max_breeding_count (int, optional): The maximum number of parameter sets to breed. (Default is -1, i.e., all remaining parameters.)
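Two small pieces of this step lend themselves to a sketch: resolving max_breeding_count = -1 to "all remaining", and deciding per child whether it is bred or random according to the current ratio R. The helpers below are hypothetical, and the independent roll shown is one reading of the behaviour when use_true_mixing is False; the true-mixing variant is not covered.

```python
import numpy as np


def resolve_breeding_count(max_breeding_count: int, nb_remaining: int) -> int:
    """Map -1 (or any negative value) to all remaining parameter sets."""
    if max_breeding_count < 0:
        return nb_remaining
    return min(max_breeding_count, nb_remaining)


def roll_is_bred(nb_children: int, ratio: float,
                 rng: np.random.Generator) -> np.ndarray:
    """Independently mark each child as bred with probability `ratio`."""
    return rng.random(nb_children) < ratio


rng = np.random.default_rng(0)
nb_children = resolve_breeding_count(-1, nb_remaining=12)
is_bred = roll_is_bred(nb_children, ratio=0.15, rng=rng)
print(nb_children, int(is_bred.sum()), "bred,", int((~is_bred).sum()), "random")
```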
__get_proposal_child(parent_id)
¶
Returns the metadata of an actively sampled (bred) child.
__get_random_child()
¶
Returns the metadata of a randomly sampled child.
__get_children(parent_idx_list, nb_total_children)
¶
Unoptimized breeding process to generate child parameters by sampling from the parent parameters and their covariance, with an additional check for out-of-bounds conditions. If a child falls out of bounds, its covariance is reduced, and the child is re-sampled.
This method uses scipy.stats.multivariate_normal to sample iteratively, one simulation at a time.
Parameters¶
- parent_idx_list (NDArray): List of indices for the parent simulations.
- nb_total_children (int): Number of children to breed.
Returns¶
List[BreedMetadata]: A list containing newly sampled children metadata.
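The per-parent loop with out-of-bounds handling can be sketched as below. To keep the example dependent on NumPy only, it swaps in numpy.random.Generator.multivariate_normal for scipy.stats.multivariate_normal. The function name, the box-shaped bounds, the multiplicative shrink of the covariance, and the default max_oob_count = 10 are all assumptions made for illustration.

```python
import numpy as np


def breed_children(parents, covs, parent_idx_list, low, high, rng,
                   oob_factor=0.3, max_oob_count=10):
    """Sample one child per selected parent, retrying when out of bounds."""
    children = []
    for idx in parent_idx_list:
        mean = np.asarray(parents[idx], dtype=float)
        cov = np.array(covs[idx], dtype=float)  # private copy, shrunk on failure
        child = mean.copy()                     # fallback: the parent's parameters
        for _ in range(max_oob_count):
            candidate = rng.multivariate_normal(mean, cov)
            if np.all((candidate >= low) & (candidate <= high)):
                child = candidate
                break
            cov *= oob_factor                   # narrow the search around the parent
        children.append(child)
    return children


rng = np.random.default_rng(0)
parents = np.array([[0.5, 0.5], [0.99, 0.01]])
covs = [np.eye(2) * 0.01, np.eye(2) * 0.01]
low, high = np.zeros(2), np.ones(2)
for child in breed_children(parents, covs, [0, 1, 1], low, high, rng):
    print(child)
```

Shrinking the covariance after each failed draw makes the next candidate more likely to land near the parent, which itself lies inside the bounds.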