megatron.data.ict_dataset

Description

Classes

ICTDataset(name, block_dataset, ...[, ...])

Dataset containing sentences and their blocks for an inverse cloze task.
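As an illustration of what an inverse cloze task sample looks like, here is a minimal sketch. It is not the ICTDataset implementation: the function name `make_ict_example` and the `keep_query_prob` parameter are hypothetical, and a block is assumed to be a plain list of tokenized sentences. One sentence is drawn as a pseudo-query, and the rest of the block serves as its context.

```python
import random


def make_ict_example(block, rng, keep_query_prob=0.1):
    """Split a block into a (query, context) pair for an inverse cloze task.

    block: list of sentences, each a list of token ids (assumed layout).
    rng: a random.Random instance, passed in for reproducibility.
    keep_query_prob: hypothetical knob; the chance that the query sentence
        is left inside the context instead of being removed.
    """
    query_idx = rng.randrange(len(block))
    query = block[query_idx]
    if rng.random() < keep_query_prob:
        # Occasionally keep the query in its block.
        context = list(block)
    else:
        # Usual case: the context is the block without the query sentence.
        context = block[:query_idx] + block[query_idx + 1:]
    return query, context


block = [[11, 12], [21, 22, 23], [31]]
query, context = make_ict_example(block, random.Random(0), keep_query_prob=0.0)
```

With `keep_query_prob=0.0` the query sentence is always removed, so `context` holds the two remaining sentences of the block.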

Functions

get_ict_dataset([use_titles, ...])

Get a dataset that uses block sample mappings to retrieve ICT/block indexing data (via get_block()). It is intended for indexing rather than training, since it is built with only a single-epoch sample mapping.

make_attention_mask(source_block, target_block)

Returns a 2-dimensional (2-D) attention mask.

:param source_block: 1-D array
:param target_block: 1-D array
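A 2-D mask can be built from two 1-D arrays with a broadcasted outer product. The sketch below shows that shape relationship only. It assumes that token id 0 marks padding and uses a hypothetical name, `sketch_attention_mask`, so it should not be read as the library's exact behavior.

```python
import numpy as np


def sketch_attention_mask(source_block, target_block, pad_id=0):
    """Build a (source_length, target_length) mask from two 1-D id arrays.

    Entry [i, j] is 1 when both source_block[i] and target_block[j] are
    non-padding tokens, else 0. pad_id=0 is an assumption of this sketch.
    """
    source_block = np.asarray(source_block)
    target_block = np.asarray(target_block)
    # Broadcast a column vector against a row vector to get the 2-D grid.
    mask = (source_block[:, None] != pad_id) & (target_block[None, :] != pad_id)
    return mask.astype(np.int64)


mask = sketch_attention_mask([5, 3, 0], [7, 0])
```

Here `mask` has shape `(3, 2)`: one row per source position and one column per target position, with zeros wherever either side is padding.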