Creating a dataset
melissa.server.deep_learning.dataset.make_dataset(framework_t, buffer, tb_logger, config_dict, transform)
Factory function to create datasets based on the specified deep learning framework.
This function initializes and returns a dataset object for either PyTorch or TensorFlow based on the provided framework type.
Parameters
- framework_t (FrameworkType): The type of framework (DEFAULT, TORCH or TENSORFLOW).
- buffer (BaseQueue): The data buffer to be used by the dataset.
- tb_logger (Any): A logger for TensorBoard metrics.
- config_dict (Dict[str, Any]): Configuration dictionary for the dataset.
- transform (Callable): A transformation function to process data before yielding it.
Returns
MelissaIterableDataset: A dataset object compatible with the specified framework.
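The dispatch described above can be illustrated with a small, self-contained sketch. It does not import Melissa: `FrameworkType`, `BaseDataset`, `TorchDataset`, `TfDataset` and `make_dataset_sketch` are hypothetical stand-ins that only mirror the documented signature, and the handling of `DEFAULT` (returning the plain base class) is an assumption rather than confirmed library behaviour.

```python
from enum import Enum
from queue import Queue


class FrameworkType(Enum):
    """Hypothetical stand-in for Melissa's FrameworkType."""
    DEFAULT = 0
    TORCH = 1
    TENSORFLOW = 2


class BaseDataset:
    """Hypothetical stand-in for MelissaIterableDataset."""

    def __init__(self, buffer, config_dict=None, transform=None, tb_logger=None):
        self.buffer = buffer
        self.config_dict = config_dict or {}
        self.transform = transform
        self.tb_logger = tb_logger


class TorchDataset(BaseDataset):
    """Stand-in for the PyTorch-flavoured dataset."""


class TfDataset(BaseDataset):
    """Stand-in for the TensorFlow-flavoured dataset."""


def make_dataset_sketch(framework_t, buffer, tb_logger, config_dict, transform):
    # Map each framework type to a dataset class.
    # Assumption: DEFAULT falls back to the framework-agnostic base class.
    registry = {
        FrameworkType.DEFAULT: BaseDataset,
        FrameworkType.TORCH: TorchDataset,
        FrameworkType.TENSORFLOW: TfDataset,
    }
    try:
        dataset_cls = registry[framework_t]
    except KeyError:
        raise ValueError(f"Unsupported framework type: {framework_t!r}") from None
    return dataset_cls(
        buffer, config_dict=config_dict, transform=transform, tb_logger=tb_logger
    )


# Usage: request a PyTorch-style dataset backed by an in-memory queue.
dataset = make_dataset_sketch(
    FrameworkType.TORCH, Queue(), None, {"batch_size": 4}, None
)
```

A lookup table keeps the factory flat and makes an unsupported framework type fail loudly instead of silently returning a default.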
Iterable Dataset Classes
melissa.server.deep_learning.dataset.MelissaIterableDataset
A dataset class designed to handle streaming simulation data through a buffer, with optional data transformations and logging capabilities.
Parameters
- buffer (BaseQueue): The buffer used for storing and retrieving streaming data.
- config_dict (dict, optional): Configuration dictionary for initializing dataset-specific parameters. Defaults to an empty dictionary.
- transform (Callable, optional): A callable transformation function to apply to the data samples. Defaults to None.
- tb_logger (TensorboardLogger, optional): A logger for tracking dataset operations via TensorBoard. Defaults to None.
Attributes
- buffer (BaseQueue): Holds the data samples in a queue for processing.
- __tb_logger (Optional[TensorboardLogger]): Logs dataset-related events or metrics for TensorBoard visualization.
- _is_receiving (bool): Indicates whether the dataset is currently receiving data from the buffer.
- sample_number (int): Tracks the number of samples processed.
- config_dict (Dict[str, Any]): Stores configuration settings for the dataset.
- __transform (Callable, optional): Holds the transformation function, if provided.
- __transform_lock (threading.Lock): Ensures thread-safe application of the transformation function.
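To illustrate why a lock around the transformation can matter, here is a minimal sketch of lock-protected transform application. `TransformHolder` and `Tagger` are hypothetical helpers written for this example, not Melissa classes; the sketch only assumes that the transform may carry mutable state and may be called from several threads.

```python
import threading


class TransformHolder:
    """Hypothetical helper that applies an optional transform under a lock."""

    def __init__(self, transform=None):
        self._transform = transform
        self._transform_lock = threading.Lock()

    def apply(self, sample):
        if self._transform is None:
            return sample
        # Only one thread at a time may run the (possibly stateful) transform.
        with self._transform_lock:
            return self._transform(sample)


class Tagger:
    """Deliberately stateful transform: tags each sample with a running index."""

    def __init__(self):
        self.count = 0

    def __call__(self, sample):
        self.count += 1
        return (self.count, sample)


holder = TransformHolder(Tagger())
results = []


def worker():
    for _ in range(250):
        results.append(holder.apply("x"))


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# With the lock in place every tag is handed out exactly once.
tags = sorted(tag for tag, _ in results)
```

Because the increment and the return value are produced inside the same critical section, no two samples can receive the same tag.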
has_data (property)
Returns whether the server is still receiving and the buffer is not empty.
get_sample_number()
Returns the total number of samples that have been pulled from the buffer and processed. Useful for logging.
signal_reception_over()
Called after reception is done to flush the remaining elements from the buffer.
__iter__()
Iterator that keeps trying to pull from the buffer for as long as the buffer is not empty or the server is still receiving data.
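The interplay between the iterator, `has_data`, `signal_reception_over()` and `get_sample_number()` can be sketched with the standard library alone. `StreamingDatasetSketch` is a hypothetical class, not Melissa's implementation; it assumes a `queue.Queue` in place of `BaseQueue`, a short polling timeout, and a loop condition taken from the `__iter__` description (still receiving, or buffer not empty).

```python
import queue
import threading


class StreamingDatasetSketch:
    """Hypothetical stand-in showing the documented iteration behaviour."""

    def __init__(self, buffer, transform=None):
        self.buffer = buffer
        self._transform = transform
        self._is_receiving = True
        self.sample_number = 0

    @property
    def has_data(self):
        # Assumption: keep iterating while data may still arrive
        # or while samples remain in the buffer.
        return self._is_receiving or not self.buffer.empty()

    def signal_reception_over(self):
        # After this call the iterator only drains what is left.
        self._is_receiving = False

    def get_sample_number(self):
        return self.sample_number

    def __iter__(self):
        while self.has_data:
            try:
                sample = self.buffer.get(timeout=0.05)
            except queue.Empty:
                continue  # nothing yet; re-check the loop condition
            if self._transform is not None:
                sample = self._transform(sample)
            self.sample_number += 1
            yield sample


buffer = queue.Queue()
dataset = StreamingDatasetSketch(buffer, transform=lambda x: x * 2)


def producer():
    # Simulates the server pushing samples, then closing reception.
    for i in range(5):
        buffer.put(i)
    dataset.signal_reception_over()


thread = threading.Thread(target=producer)
thread.start()
samples = list(dataset)  # ends once reception is over and the buffer is empty
thread.join()
```

Polling with a timeout, rather than blocking indefinitely on `get()`, lets the loop notice that reception has ended even when the buffer is already empty.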
melissa.server.deep_learning.dataset.torch_dataset.TorchMelissaIterableDataset
Bases: MelissaIterableDataset, IterableDataset
A dataset class designed to integrate Melissa's iterable dataset functionality with PyTorch's IterableDataset.
This class enables seamless usage of Melissa's streaming simulation data within PyTorch-based deep learning workflows.
melissa.server.deep_learning.dataset.tf_dataset.TfMelissaIterableDataset
Bases: MelissaIterableDataset
A TensorFlow-compatible extension of MelissaIterableDataset.
This class adapts MelissaIterableDataset to TensorFlow pipelines, serving as a bridge between the Melissa distributed data system and TensorFlow.