megatron.model.utils.scaled_init_method_normal

megatron.model.utils.scaled_init_method_normal(sigma, num_layers)

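Init method based on N(0, sigma/sqrt(2*num_layers)), i.e. a zero-mean normal distribution whose standard deviation sigma is scaled down by sqrt(2*num_layers).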
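The scaling can be illustrated with a minimal, standard-library-only sketch. It assumes the function returns a callable that fills a weight container in place with samples from the scaled normal distribution. The names `scaled_std` and `scaled_init_sketch`, the `seed` parameter, and the use of a plain Python list instead of a tensor are hypothetical choices made for illustration, not part of the Megatron API.

```python
import math
import random


def scaled_std(sigma, num_layers):
    """Standard deviation sigma / sqrt(2 * num_layers) (hypothetical helper)."""
    return sigma / math.sqrt(2.0 * num_layers)


def scaled_init_sketch(sigma, num_layers, seed=None):
    """Return a closure that fills a list in place with N(0, scaled_std) samples.

    Illustrative stand-in only: a plain list plays the role of a weight tensor.
    """
    std = scaled_std(sigma, num_layers)
    rng = random.Random(seed)

    def init_(values):
        # Overwrite every entry with a zero-mean Gaussian sample.
        for i in range(len(values)):
            values[i] = rng.gauss(0.0, std)
        return values

    return init_


# Example: sigma = 0.02 and 24 layers give std = 0.02 / sqrt(48).
init_fn = scaled_init_sketch(sigma=0.02, num_layers=24, seed=0)
weights = init_fn([0.0] * 10000)
```

With sigma held fixed, a deeper model (larger `num_layers`) gets a smaller standard deviation, so the initial weights are drawn closer to zero.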