megatron.data.t5_dataset

Description

T5-style dataset.

Classes

T5Dataset(name, indexed_dataset, ...)

Functions

build_training_sample(sample, ...[, bos_id, ...])

Build training sample.

make_attention_mask(source_block, target_block)

Returns a 2-dimensional (2-D) attention mask.

Parameters:
source_block: 1-D array
target_block: 1-D array
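As an illustration of what a 2-D attention mask built from two 1-D token arrays can look like, here is a minimal sketch. It is not the module's implementation: the function name `sketch_attention_mask`, the constant `PAD_ID`, and the convention that padding positions hold 0 are all assumptions made for this example.

```python
import numpy as np

PAD_ID = 0  # assumption: padding positions hold 0, real tokens hold other IDs


def sketch_attention_mask(source_block, target_block):
    """Return a (source_length, target_length) array of 0/1 entries.

    Entry [i, j] is 1 only when source_block[i] and target_block[j]
    are both non-padding tokens.
    """
    source_block = np.asarray(source_block)
    target_block = np.asarray(target_block)
    # Broadcast a column of source flags against a row of target flags.
    mask = (source_block[:, None] != PAD_ID) & (target_block[None, :] != PAD_ID)
    return mask.astype(np.int64)


source = np.array([5, 7, 0])  # last position is padding
target = np.array([3, 0])     # last position is padding
mask = sketch_attention_mask(source, target)
print(mask)
```

Under these assumptions, rows correspond to source positions and columns to target positions, so any row or column that belongs to a padding token is entirely zero.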

make_attention_mask_3d(source_block, ...)

Returns a 3-dimensional (3-D) attention mask.

Parameters:
source_block: 1-D array
target_block: 1-D array

make_history_mask(block)
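This entry carries no description. The name suggests a causal ("history") mask in which each position can attend only to itself and earlier positions. The following is a sketch under that assumption, not the module's code; `sketch_history_mask` is a hypothetical name.

```python
import numpy as np


def sketch_history_mask(block):
    """Return a (length, length) lower-triangular array of 0/1 entries.

    Assumption: position i may attend to position j only when j <= i.
    Only the length of `block` is used, not its token values.
    """
    length = np.asarray(block).shape[0]
    positions = np.arange(length)
    # Rows index the attending position, columns the attended position.
    return (positions[None, :] <= positions[:, None]).astype(np.int64)


history = sketch_history_mask(np.array([11, 12, 13]))
print(history)
```

The result is equivalent to `np.tril(np.ones((length, length)))` cast to integers; the broadcasting form is shown because it extends naturally to a batched variant.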

make_history_mask_3d(block)

pad_and_convert_to_numpy(tokens, ...[, ...])

Pad sequences and convert them to numpy.
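The general padding step can be sketched as follows. This is a simplified illustration, not the signature or behavior of `pad_and_convert_to_numpy`: the helper name `sketch_pad_to_numpy`, its parameters `max_seq_length` and `pad_id`, and the returned padding mask are assumptions chosen for the example.

```python
import numpy as np


def sketch_pad_to_numpy(tokens, max_seq_length, pad_id=0):
    """Pad a list of token IDs to a fixed length and return NumPy arrays.

    Returns the padded token array and a 0/1 array marking real tokens.
    All parameter names here are hypothetical.
    """
    num_tokens = len(tokens)
    if num_tokens > max_seq_length:
        raise ValueError("sequence is longer than max_seq_length")
    padding_length = max_seq_length - num_tokens
    padded = np.array(list(tokens) + [pad_id] * padding_length, dtype=np.int64)
    padding_mask = np.array([1] * num_tokens + [0] * padding_length, dtype=np.int64)
    return padded, padding_mask


padded, padding_mask = sketch_pad_to_numpy([4, 9, 2], max_seq_length=5)
print(padded, padding_mask)
```

Fixed-length arrays of this kind allow samples to be stacked into a batch, and the mask lets downstream code ignore the padded positions.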