megatron.core.tensor_parallel.layers
Description
Classes
Linear layer with column parallelism.
See linear_with_grad_accumulation_and_async_allreduce.
Linear layer with row parallelism.
Embedding parallelized in the vocabulary dimension.
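The three partitioning schemes summarized above can be illustrated with a small single-process sketch. It is not the Megatron Core implementation: every helper name below is hypothetical, plain Python lists stand in for tensors, and list concatenation and summation stand in for the all-gather and all-reduce collectives that a real tensor-parallel run would use.

```python
# Toy, single-process illustration of tensor-parallel partitioning.
# All names are hypothetical; nothing here is part of Megatron Core.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def split_columns(m, parts):
    """Split a matrix into `parts` equal-width column blocks."""
    width = len(m[0]) // parts
    return [[row[i * width:(i + 1) * width] for row in m] for i in range(parts)]

def split_rows(m, parts):
    """Split a matrix into `parts` equal-height row blocks."""
    height = len(m) // parts
    return [m[i * height:(i + 1) * height] for i in range(parts)]

def column_parallel_linear(x, w, world_size):
    """Each simulated rank holds one column block of w ([in, out] layout)."""
    partial = [matmul(x, shard) for shard in split_columns(w, world_size)]
    # Concatenate per-rank outputs along the feature dimension
    # (this stands in for an all-gather).
    return [sum((p[r] for p in partial), []) for r in range(len(x))]

def row_parallel_linear(x, w, world_size):
    """Each simulated rank holds one row block of w and the matching slice of x."""
    x_shards = split_columns(x, world_size)
    w_shards = split_rows(w, world_size)
    partial = [matmul(xs, ws) for xs, ws in zip(x_shards, w_shards)]
    # Sum per-rank outputs element-wise (this stands in for an all-reduce).
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*partial)]

def vocab_parallel_embedding(token_ids, table, world_size):
    """Each simulated rank holds a contiguous range of vocabulary rows."""
    per_rank = len(table) // world_size
    dim = len(table[0])
    out = [[0] * dim for _ in token_ids]
    for rank in range(world_size):
        start, end = rank * per_rank, (rank + 1) * per_rank
        shard = table[start:end]
        for i, tok in enumerate(token_ids):
            # Tokens outside this rank's range contribute zeros on this rank.
            if start <= tok < end:
                out[i] = [a + b for a, b in zip(out[i], shard[tok - start])]
    return out
```

In each case the sharded computation reproduces the unsharded result, which is the property the parallel layers rely on; the sketch assumes the split dimension is divisible by the number of ranks.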
Functions
Linear layer execution with asynchronous communication and gradient accumulation fusion in backprop.
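As a rough illustration of what gradient accumulation in the backward pass means arithmetically, the sketch below accumulates the weight gradient of a linear layer into a persistent buffer across microbatches. It is a minimal single-process sketch under assumptions: the function and argument names (`linear_backward_accumulate`, `main_grad`) are hypothetical here, the weight is assumed to use an [out, in] layout, and the comment marks where an asynchronous all-reduce of the input gradient could overlap with the weight-gradient computation. A fused kernel would accumulate directly during the matrix multiply; this version materializes the gradient first and only shows the resulting values.

```python
# Toy illustration of accumulating weight gradients across microbatches.
# All names are hypothetical; nothing here is part of Megatron Core.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def linear_backward_accumulate(x, weight, grad_out, main_grad):
    """Backward pass for y = x @ weight^T.

    Shapes: x [batch, in], weight [out, in], grad_out [batch, out],
    main_grad [out, in]. The weight gradient is added to main_grad in place;
    the input gradient is returned.
    """
    grad_input = matmul(grad_out, weight)  # [batch, in]
    # In a tensor-parallel run, an all-reduce of grad_input could be launched
    # asynchronously here and overlap with the computation below.
    grad_weight = matmul(transpose(grad_out), x)  # [out, in]
    for i, row in enumerate(grad_weight):
        for j, g in enumerate(row):
            main_grad[i][j] += g  # accumulate instead of overwriting
    return grad_input

# Usage: two microbatches accumulate into the same buffer.
weight = [[1, 0, 2], [0, 1, 1]]
main_grad = [[0, 0, 0], [0, 0, 0]]
linear_backward_accumulate([[1, 2, 3]], weight, [[1, 0]], main_grad)
linear_backward_accumulate([[0, 1, 1]], weight, [[2, 1]], main_grad)
```

After both calls, `main_grad` holds the same values as the weight gradient computed from the two microbatches concatenated into one batch.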