megatron.text_generation.forward_step.InferenceParams

class megatron.text_generation.forward_step.InferenceParams(max_batch_size, max_sequence_len)

Bases: object

Inference parameters that are passed to the main model in order to efficiently calculate and store the context during inference.

swap_key_value_dict(batch_idx)

Swap the stored context between batches, as selected by batch_idx.
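
The following is a minimal sketch of what swapping a stored context between batches could look like. It is not the Megatron implementation. The class ToyInferenceParams, the attribute key_value_memory_dict, and the [sequence_position][batch_entry] layout are all assumptions made for illustration, and plain Python lists stand in for tensors. Only the constructor arguments and the method name come from the signature documented above.

```python
# Illustrative stand-in only; this is NOT the Megatron implementation.
class ToyInferenceParams:
    """Hypothetical class mirroring the documented constructor signature."""

    def __init__(self, max_batch_size, max_sequence_len):
        self.max_batch_size = max_batch_size
        self.max_sequence_len = max_sequence_len
        # Assumed layout: layer number -> (keys, values), where each is
        # indexed as [sequence_position][batch_entry].
        self.key_value_memory_dict = {}

    def swap_key_value_dict(self, batch_idx):
        """Reorder the batch dimension of every cached layer by batch_idx."""
        if not self.key_value_memory_dict:
            raise ValueError("nothing is cached yet, so there is nothing to swap")
        # Iterate over a snapshot so the dict can be updated in the loop.
        for layer, (keys, values) in list(self.key_value_memory_dict.items()):
            self.key_value_memory_dict[layer] = (
                [[row[i] for i in batch_idx] for row in keys],
                [[row[i] for i in batch_idx] for row in values],
            )


# Toy cache for one layer: 2 sequence positions, 3 batch entries.
params = ToyInferenceParams(max_batch_size=3, max_sequence_len=2)
params.key_value_memory_dict[1] = (
    [["k0a", "k0b", "k0c"], ["k1a", "k1b", "k1c"]],
    [["v0a", "v0b", "v0c"], ["v1a", "v1b", "v1c"]],
)

# New batch slot 0 takes old entry 2, slot 1 takes old entry 0, and so on.
params.swap_key_value_dict([2, 0, 1])
keys, values = params.key_value_memory_dict[1]
print(keys[0])    # ['k0c', 'k0a', 'k0b']
print(values[1])  # ['v1c', 'v1a', 'v1b']
```

In this sketch, batch_idx is read as a list of source positions, one per batch slot, so the cached context follows its sequence when the batch order changes. Whether the real method interprets batch_idx the same way should be checked against the library source.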