megatron.text_generation.sampling
Description
Sampling utilities. Part of this code is inspired by:
Functions
|
Set the logits for non-top-k values to -inf. |
|
Set the logits for non-top-p values to -inf. |
|
Sample and generate a token. Note: logits has shape [b, v], where b is the batch size and v is the vocabulary size. If vocab_size is provided, the generated sample is guaranteed to lie in [0, vocab_size). This avoids out-of-vocabulary generations caused by vocabulary padding. |
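
The three behaviours summarized above can be illustrated with a minimal, framework-free sketch. It is not the module's implementation: the names `top_k_filter`, `top_p_filter`, and `sample_tokens` are hypothetical, logits are plain nested lists of shape [b, v] rather than tensors, and treating `top_k == 1` as greedy decoding is an assumption of this sketch. Keeping the token that crosses the top-p threshold, and clamping out-of-range ids to the last valid index, are likewise choices made here for illustration.

```python
import math
import random

NEG_INF = float("-inf")


def softmax(row):
    """Numerically stable softmax over one row of logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]


def top_k_filter(logits, top_k):
    """Return a copy of logits ([b, v]) with non-top-k entries set to -inf."""
    filtered = []
    for row in logits:
        # The k-th largest value is the cutoff; ties at the cutoff are kept.
        cutoff = sorted(row, reverse=True)[top_k - 1]
        filtered.append([x if x >= cutoff else NEG_INF for x in row])
    return filtered


def top_p_filter(logits, top_p):
    """Keep the smallest set of most likely tokens whose mass reaches top_p."""
    filtered = []
    for row in logits:
        probs = softmax(row)
        order = sorted(range(len(row)), key=lambda i: probs[i], reverse=True)
        keep, cumulative = set(), 0.0
        for i in order:
            keep.add(i)  # the most likely token is always kept
            cumulative += probs[i]
            if cumulative >= top_p:
                break
        filtered.append(
            [x if i in keep else NEG_INF for i, x in enumerate(row)]
        )
    return filtered


def sample_tokens(logits, top_k=0, top_p=0.0, temperature=1.0,
                  vocab_size=None, rng=random):
    """Sample one token id per row of logits ([b, v])."""
    if top_k == 1:
        # Assumption of this sketch: top_k == 1 means greedy (argmax).
        tokens = [max(range(len(row)), key=row.__getitem__) for row in logits]
    else:
        if temperature != 1.0:
            logits = [[x / temperature for x in row] for row in logits]
        if top_k > 1:
            logits = top_k_filter(logits, top_k)
        if top_p > 0.0:
            logits = top_p_filter(logits, top_p)
        # Entries at -inf get zero weight and are never drawn.
        tokens = [
            rng.choices(range(len(row)), weights=softmax(row))[0]
            for row in logits
        ]
    if vocab_size is not None:
        # Clamp ids that fall in the padded region into [0, vocab_size).
        tokens = [min(max(t, 0), vocab_size - 1) for t in tokens]
    return tokens
```

For example, with `logits = [[0.0, 0.0, 0.0, 50.0]]` and a vocabulary padded from 3 to 4 entries, `sample_tokens(logits, top_k=1, vocab_size=3)` picks the padded index 3 and then clamps it back to a valid id.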