megatron.utils
Description
General utilities.
Functions
Reduce a tensor of losses across all GPUs.
Calculate the L2 norm of parameters.
Check for an autoresume signal and exit if it is received.
Build masks and position IDs for a left-to-right model.
If distributed is initialized, print on the last rank in all nodes.
Print the min, max, and norm of all parameters.
If distributed is initialized, print only on rank 0.
If distributed is initialized, print only on the last rank.
Simple GPU memory report.
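The arithmetic behind the loss reduction and the parameter L2 norm can be illustrated without a GPU. The sketch below is not the module's implementation: it uses plain Python lists as stand-ins for tensors and ranks, the helper names (`average_losses`, `global_l2_norm`, `combine_shard_norms`) are hypothetical, and it assumes the loss reduction is a mean over ranks.

```python
import math


def average_losses(per_rank_losses):
    """Element-wise mean of loss vectors, one vector per rank.

    Hypothetical stand-in for an all-reduce (sum) followed by a
    division by the world size; assumes the reduction is a mean.
    """
    world_size = len(per_rank_losses)
    return [sum(column) / world_size for column in zip(*per_rank_losses)]


def global_l2_norm(params):
    """L2 norm over all parameters, treated as one long vector.

    `params` is an iterable of flat lists of floats (stand-ins for
    parameter tensors).
    """
    squared_sum = sum(value * value for param in params for value in param)
    return math.sqrt(squared_sum)


def combine_shard_norms(shard_norms):
    """Combine per-shard L2 norms into a single global L2 norm.

    When parameters are split across ranks, each rank can compute the
    norm of its own shard; the global norm is the square root of the
    sum of the squared shard norms.
    """
    return math.sqrt(sum(norm * norm for norm in shard_norms))


if __name__ == "__main__":
    print(average_losses([[1.0, 2.0], [3.0, 4.0]]))
    print(global_l2_norm([[3.0], [4.0]]))
    print(combine_shard_norms([3.0, 4.0]))
```

In a real distributed run, the per-rank lists would be replaced by collective operations over tensors; the combination rule for the norm is the same either way, which is why per-shard norms can be squared, summed across ranks, and square-rooted once.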