Creating a dataset¶
melissa.server.deep_learning.dataset.make_dataset(framework_t, buffer, tb_logger, config_dict, transform)¶
Factory function to create datasets based on the specified deep learning framework.
This function initializes and returns a dataset object for either PyTorch or TensorFlow based on the provided framework type.
Parameters¶
- framework_t (FrameworkType): The type of framework (DEFAULT, TORCH or TENSORFLOW).
- buffer (BaseQueue): The data buffer to be used by the dataset.
- tb_logger (Any): A logger for TensorBoard metrics.
- config_dict (Dict[str, Any]): Configuration dictionary for the dataset.
- transform (Callable): A transformation function to process data before yielding it.
Returns¶
MelissaIterableDataset: A dataset object compatible with the specified framework.
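The dispatch described above can be illustrated with a minimal, self-contained sketch. It does not import Melissa: `FrameworkType` is re-declared here as a stand-in enum with made-up values, and `SketchDataset`, `SketchTorchDataset`, `SketchTfDataset` and `make_dataset_sketch` are hypothetical names invented for illustration. Mapping `DEFAULT` to the base class is an assumption, not documented behaviour.

```python
from enum import Enum
from typing import Any, Callable, Dict, Optional


class FrameworkType(Enum):
    """Stand-in for Melissa's FrameworkType (member values are assumed)."""
    DEFAULT = 0
    TORCH = 1
    TENSORFLOW = 2


class SketchDataset:
    """Hypothetical base dataset that only stores its constructor arguments."""

    def __init__(
        self,
        buffer: Any,
        config_dict: Optional[Dict[str, Any]] = None,
        transform: Optional[Callable] = None,
        tb_logger: Any = None,
    ) -> None:
        self.buffer = buffer
        self.config_dict = config_dict or {}
        self.transform = transform
        self.tb_logger = tb_logger


class SketchTorchDataset(SketchDataset):
    """Hypothetical PyTorch-flavoured variant."""


class SketchTfDataset(SketchDataset):
    """Hypothetical TensorFlow-flavoured variant."""


def make_dataset_sketch(
    framework_t: FrameworkType,
    buffer: Any,
    tb_logger: Any,
    config_dict: Dict[str, Any],
    transform: Optional[Callable],
) -> SketchDataset:
    """Pick a dataset class from the framework type and instantiate it."""
    registry = {
        FrameworkType.DEFAULT: SketchDataset,  # assumption: DEFAULT -> base class
        FrameworkType.TORCH: SketchTorchDataset,
        FrameworkType.TENSORFLOW: SketchTfDataset,
    }
    if framework_t not in registry:
        raise ValueError(f"Unsupported framework type: {framework_t!r}")
    return registry[framework_t](buffer, config_dict, transform, tb_logger)


# Usage: the argument order mirrors the documented signature.
dataset = make_dataset_sketch(FrameworkType.TORCH, [], None, {"batch_size": 4}, None)
```

A lookup table keeps the factory to a single instantiation path, so supporting another framework in this sketch would only mean adding one entry.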
Iterable Dataset Classes¶
melissa.server.deep_learning.dataset.MelissaIterableDataset¶
A dataset class designed to handle streaming simulation data through a buffer, with optional data transformations and logging capabilities.
Parameters¶
- buffer (BaseQueue): The buffer used for storing and retrieving streaming data.
- config_dict (dict, optional): Configuration dictionary for initializing dataset-specific parameters. Defaults to an empty dictionary.
- transform (Callable, optional): A callable transformation function to apply to the data samples. Defaults to None.
- tb_logger (TensorboardLogger, optional): A logger for tracking dataset operations via TensorBoard. Defaults to None.
Attributes¶
- buffer (BaseQueue): Holds the data samples in a queue for processing.
- __tb_logger (Optional[TensorboardLogger]): Logs dataset-related events or metrics for TensorBoard visualization.
- _is_receiving (bool): Indicates whether the dataset is currently receiving data from the buffer.
- sample_number (int): Tracks the number of samples processed.
- config_dict (Dict[str, Any]): Stores configuration settings for the dataset.
- __transform (Callable, optional): Holds the transformation function, if provided.
- __transform_lock (threading.Lock): Ensures thread-safe application of the transformation function.
has_data property¶
Returns whether the server is still receiving and the buffer is not empty.
get_sample_number()¶
Returns the total number of samples that were pulled from the buffer and processed. Useful for logging.
signal_reception_over()¶
Called after reception is done to flush the remaining elements from the buffer.
__iter__()¶
Open-ended iterator that keeps trying to pull samples from the buffer for as long as the buffer is not empty or the server is still receiving data.
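The iteration behaviour described above can be sketched without Melissa installed. The class below is not the library's implementation: `StreamingDatasetSketch` is a hypothetical name, a standard `queue.Queue` stands in for `BaseQueue`, the polling timeout is an arbitrary choice, and the loop condition simply follows the `__iter__` description (keep going while data is still arriving or the buffer still holds samples).

```python
import queue
import threading
from typing import Any, Callable, Dict, Iterator, Optional


class StreamingDatasetSketch:
    """Hypothetical stand-in for a buffer-backed streaming dataset."""

    def __init__(
        self,
        buffer: "queue.Queue[Any]",
        config_dict: Optional[Dict[str, Any]] = None,
        transform: Optional[Callable[[Any], Any]] = None,
    ) -> None:
        self.buffer = buffer
        self.config_dict = config_dict or {}
        self._transform = transform
        self._transform_lock = threading.Lock()
        self._is_receiving = True
        self.sample_number = 0

    def signal_reception_over(self) -> None:
        """Mark reception as finished; iteration then drains what is left."""
        self._is_receiving = False

    def get_sample_number(self) -> int:
        """Number of samples pulled from the buffer and processed so far."""
        return self.sample_number

    def __iter__(self) -> Iterator[Any]:
        # Continue while new data may still arrive or samples remain buffered.
        while self._is_receiving or not self.buffer.empty():
            try:
                # Assumed polling interval; avoids blocking forever on an empty queue.
                sample = self.buffer.get(timeout=0.01)
            except queue.Empty:
                continue
            if self._transform is not None:
                # Serialize transform calls across threads.
                with self._transform_lock:
                    sample = self._transform(sample)
            self.sample_number += 1
            yield sample


# Usage: fill the buffer, declare reception over, then drain it.
buf: "queue.Queue[int]" = queue.Queue()
for value in (1, 2, 3):
    buf.put(value)
stream = StreamingDatasetSketch(buf, transform=lambda x: x * 10)
stream.signal_reception_over()
samples = list(stream)
```

Because reception is flagged as over before iterating, the loop ends once the buffer is empty. While reception is still active, the same loop keeps polling instead of terminating.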
melissa.server.deep_learning.dataset.torch_dataset.TorchMelissaIterableDataset¶
Bases: MelissaIterableDataset, IterableDataset
A dataset class designed to integrate Melissa's iterable dataset functionality with PyTorch's IterableDataset.
This class enables seamless usage of Melissa's streaming simulation data within PyTorch-based deep learning workflows.
melissa.server.deep_learning.dataset.tf_dataset.TfMelissaIterableDataset¶
Bases: MelissaIterableDataset
A TensorFlow-compatible extension of the MelissaIterableDataset.
This class adapts MelissaIterableDataset for use in TensorFlow pipelines, serving as a bridge between the Melissa distributed data system and TensorFlow.