megatron.data.biencoder_dataset_utils
Description
Classes
|
A struct for fully describing a fixed-size block of data as used in REALM |
|
Functions
|
Get samples mapping for a dataset over fixed size blocks. |
|
|
|
Specifically one epoch to be used in an indexing job. |
|
Join a list of strings, handling spaces appropriately |
|
Returns a 2-dimensional (2-D) attention mask. Parameters: source_block (1-D array), target_block (1-D array). |
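
The last two summaries above (joining a list of strings and building a 2-D attention mask from two 1-D blocks) can be illustrated with a minimal sketch. This is not the module's own source: the function names, the `pad_id` parameter, the convention that padding is token id 0, and the WordPiece-style `##` continuation prefix are all assumptions made for illustration.

```python
import numpy as np


def make_attention_mask_sketch(source_block, target_block, pad_id=0):
    """Return an int64 mask of shape (len(source_block), len(target_block)).

    Assumption: token id `pad_id` marks padding; any other id is a real token.
    Entry [i, j] is 1 only when source position i and target position j
    are both real tokens.
    """
    source_valid = np.asarray(source_block) != pad_id
    target_valid = np.asarray(target_block) != pad_id
    # Broadcast a column of source flags against a row of target flags.
    mask = source_valid[:, None] & target_valid[None, :]
    return mask.astype(np.int64)


def join_str_list_sketch(str_list):
    """Join tokens with spaces, gluing continuation pieces to the previous token.

    Assumption: continuation pieces carry a leading "##" (WordPiece style).
    """
    result = ""
    for s in str_list:
        if s.startswith("##"):
            result += s[2:]      # continuation: append without a space
        else:
            result += " " + s    # new word: separate with a space
    return result.strip()


if __name__ == "__main__":
    print(make_attention_mask_sketch([5, 7, 0], [3, 0]))
    print(join_str_list_sketch(["play", "##ing", "chess"]))
```

In this sketch the mask rows index the source block and the columns index the target block, so a padded position in either block zeroes out its whole row or column.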