megatron.model.utils
Description
Utilities for models.
Functions
|
|
|
Simple linear layer with weight initialization. |
|
Init method based on N(0, sigma). |
|
Init method based on N(0, sigma/sqrt(2*num_layers)). |
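The two init methods summarized above differ only in the standard deviation they draw from. The following is a minimal, framework-free sketch of that relationship, not the module's actual code. The names `normal_init`, `scaled_normal_init`, `scaled_std`, and `make_linear` are hypothetical, weights are plain nested lists rather than tensors, and the sketch assumes that `sigma` denotes a standard deviation and that the linear layer's bias starts at zero.

```python
import math
import random


def scaled_std(sigma, num_layers):
    """Standard deviation used by the scaled variant: sigma / sqrt(2 * num_layers)."""
    return sigma / math.sqrt(2.0 * num_layers)


def normal_init(sigma):
    """Return a function that fills a 2-D weight (list of rows) in place from N(0, sigma).

    Assumption: sigma is the standard deviation, not the variance.
    """
    def init_(weight, rng=random):
        for row in weight:
            for j in range(len(row)):
                row[j] = rng.gauss(0.0, sigma)
        return weight
    return init_


def scaled_normal_init(sigma, num_layers):
    """Same as normal_init, but with the standard deviation shrunk by sqrt(2 * num_layers)."""
    return normal_init(scaled_std(sigma, num_layers))


def make_linear(rows, columns, init_method, rng=random):
    """Hypothetical 'simple linear layer': initialized weight plus a zero bias (assumed)."""
    weight = [[0.0] * columns for _ in range(rows)]
    init_method(weight, rng)
    bias = [0.0] * rows
    return weight, bias


if __name__ == "__main__":
    rng = random.Random(0)
    weight, bias = make_linear(4, 3, scaled_normal_init(0.02, num_layers=8), rng)
    print(scaled_std(0.02, 8))          # standard deviation actually used for the weights
    print(len(weight), len(weight[0]))  # shape of the weight: rows, columns
```

With `sigma = 0.02` and `num_layers = 8`, the scaled variant draws from a standard deviation of 0.02 / sqrt(16) = 0.005, so deeper models get proportionally smaller initial weights in the layers that use it.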