megatron.core.tensor_parallel.layers

Description

Classes

ColumnParallelLinear(input_size, output_size, *)

Linear layer with column parallelism.

LinearWithGradAccumulationAndAsyncCommunication(...)

See linear_with_grad_accumulation_and_async_allreduce.

RowParallelLinear(input_size, output_size, *)

Linear layer with row parallelism.

VocabParallelEmbedding(num_embeddings, ...)

Embedding parallelized in the vocabulary dimension.
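
The three partitioning schemes named above can be illustrated with a small single-process sketch. This is not the library's implementation: it uses plain Python lists instead of tensors, simulates each tensor-parallel rank as a loop iteration, and assumes the mathematical convention Y = XA with A stored as (input_size, output_size). The helper names (`matmul`, `split_columns`, `split_rows`, `column_parallel`, `row_parallel`, `vocab_parallel_embedding`) are hypothetical and exist only for this example.

```python
# Single-process sketch of the sharding patterns; no real communication happens.
# All function names here are illustrative, not part of megatron.core.

def matmul(x, w):
    """Multiply x (n x k) by w (k x m); both are lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]

def split_columns(m, parts):
    """Split a matrix into `parts` equal blocks of columns."""
    step = len(m[0]) // parts
    return [[row[i * step:(i + 1) * step] for row in m] for i in range(parts)]

def split_rows(m, parts):
    """Split a matrix into `parts` equal blocks of rows."""
    step = len(m) // parts
    return [m[i * step:(i + 1) * step] for i in range(parts)]

def column_parallel(x, w, world_size):
    # Each simulated rank owns a block of output columns of w.
    partials = [matmul(x, shard) for shard in split_columns(w, world_size)]
    # Concatenating per-rank outputs stands in for an all-gather.
    return [sum((p[r] for p in partials), []) for r in range(len(x))]

def row_parallel(x, w, world_size):
    # Each simulated rank owns a block of rows of w and the matching columns of x.
    x_shards = split_columns(x, world_size)
    w_shards = split_rows(w, world_size)
    partials = [matmul(xs, ws) for xs, ws in zip(x_shards, w_shards)]
    # Summing per-rank outputs elementwise stands in for an all-reduce.
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*partials)]

def vocab_parallel_embedding(token_ids, table, world_size):
    # Each simulated rank owns a contiguous range of vocabulary rows.
    per_rank = len(table) // world_size
    out = [[0] * len(table[0]) for _ in token_ids]
    for rank in range(world_size):
        start, end = rank * per_rank, (rank + 1) * per_rank
        local_table = table[start:end]
        for i, tok in enumerate(token_ids):
            # Tokens outside this rank's range contribute zeros.
            if start <= tok < end:
                row = local_table[tok - start]
                out[i] = [a + b for a, b in zip(out[i], row)]
    return out

# Usage: the sharded results match the unsharded computation.
x = [[1, 2, 3, 4], [5, 6, 7, 8]]
w = [[1, 0, 2, 1], [0, 1, 1, 3], [2, 2, 0, 1], [1, 3, 1, 0]]
table = [[0, 1], [2, 3], [4, 5], [6, 7]]

assert column_parallel(x, w, 2) == matmul(x, w)
assert row_parallel(x, w, 2) == matmul(x, w)
assert vocab_parallel_embedding([3, 0, 2], table, 2) == [table[3], table[0], table[2]]
```

In a real multi-GPU setting the concatenation and summation steps would be collective operations across the tensor-parallel group, and the sketch assumes the sharded dimension divides evenly by the number of ranks.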

Functions

copy_tensor_model_parallel_attributes(...)

linear_with_grad_accumulation_and_async_allreduce(...)

Runs a linear layer with asynchronous communication and fused gradient accumulation in the backward pass.

param_is_not_tensor_parallel_duplicate(param)

set_defaults_if_not_set_tensor_model_parallel_attributes(tensor)

set_tensor_model_parallel_attributes(tensor, ...)