megatron.data.indexed_dataset.MMapIndexedDataset

class megatron.data.indexed_dataset.MMapIndexedDataset(path, skip_warmup=False)

Bases: Dataset

get(idx, offset=0, length=None)

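Retrieves a single item from the dataset, optionally returning only a portion of it: offset is the position within the item at which the portion starts, and length is the number of elements to return.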

With the default offset and length, get(idx) returns the same item as indexing with [idx]. Unlike the [] operator, get() does not support slicing.
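The relationship between get() and [] can be illustrated with a small stand-in. The sketch below does not use Megatron: ToyIndexedDataset is a hypothetical in-memory class written only for this example, and it assumes that offset and length are counted in elements and that length=None means "through the end of the item". The real class reads from memory-mapped files and may differ in details such as the return type.

```python
class ToyIndexedDataset:
    """Hypothetical in-memory stand-in; not Megatron's implementation."""

    def __init__(self, items):
        # Store all items back to back in one flat buffer, and remember
        # where each item starts and how many elements it has.
        self._buffer = []
        self._pointers = []
        self._sizes = []
        for item in items:
            self._pointers.append(len(self._buffer))
            self._sizes.append(len(item))
            self._buffer.extend(item)

    def __len__(self):
        return len(self._sizes)

    def get(self, idx, offset=0, length=None):
        # Return `length` elements of item `idx`, starting `offset` elements in.
        # Assumption: length=None means "everything from offset to the end".
        size = self._sizes[idx]
        if length is None:
            length = size - offset
        start = self._pointers[idx] + offset
        return self._buffer[start:start + length]

    def __getitem__(self, idx):
        if isinstance(idx, slice):
            # Slices are handled here only; get() takes a single index.
            return [self.get(i) for i in range(*idx.indices(len(self)))]
        return self.get(idx)


dataset = ToyIndexedDataset([[10, 11, 12, 13], [20, 21], [30, 31, 32]])

whole = dataset.get(0)                         # the full first item
same = dataset[0]                              # same result as get(0)
portion = dataset.get(0, offset=1, length=2)   # two elements, skipping the first
tail = dataset.get(2, offset=1)                # from offset to the end of item 2
sliced = dataset[0:2]                          # slicing goes through [] only
```

In this sketch, get() is the single-item primitive and [] builds on it, which is one plausible reason for get() not accepting slices: a partial read with offset and length only makes sense for one item at a time.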