megatron.model.utils
Description
Utilities for models.
Functions
|  |  |
|  | Simple linear layer with weight initialization. |
|  | Init method based on N(0, sigma). |
|  | Init method based on N(0, sigma/sqrt(2*num_layers)). |
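The two init methods and the linear-layer helper described above can be illustrated with a minimal sketch. It uses plain Python lists rather than tensors so that it runs without extra dependencies, and it is not the module's actual implementation. The names `normal_init`, `scaled_std`, `scaled_normal_init`, and `make_linear_weights` are hypothetical, as is the zero-bias choice. The sketch also assumes that `sigma` denotes a standard deviation.

```python
import math
import random


def normal_init(sigma):
    """Return a function that fills a weight matrix with N(0, sigma) samples.

    Hypothetical helper; sigma is assumed to be a standard deviation.
    """

    def init_(weight):
        for row in weight:
            for j in range(len(row)):
                row[j] = random.gauss(0.0, sigma)
        return weight

    return init_


def scaled_std(sigma, num_layers):
    """Standard deviation sigma / sqrt(2 * num_layers)."""
    return sigma / math.sqrt(2.0 * num_layers)


def scaled_normal_init(sigma, num_layers):
    """Like normal_init, but with the depth-scaled standard deviation."""
    return normal_init(scaled_std(sigma, num_layers))


def make_linear_weights(rows, columns, init_method):
    """Allocate a rows x columns weight matrix and a zero bias (an assumption),
    then apply the given init method to the weights."""
    weight = [[0.0] * columns for _ in range(rows)]
    bias = [0.0] * rows
    init_method(weight)
    return weight, bias


if __name__ == "__main__":
    random.seed(0)
    weight, bias = make_linear_weights(4, 3, scaled_normal_init(0.02, 8))
    print(len(weight), len(weight[0]), len(bias))
```

Returning a closure keeps the layer constructor independent of the chosen distribution: the caller fixes `sigma` (and, for the scaled variant, `num_layers`) once and passes the resulting function wherever weights are created. Under these assumptions, the scaled variant shrinks the standard deviation as the number of layers grows.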