megatron.optimizer_param_scheduler

Description

Learning rate decay and weight decay increment functions.

Classes

OptimizerParamScheduler(optimizer, max_lr, ...)

Anneals the learning rate and weight decay.
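
As a rough illustration of what annealing these two parameters can look like, the sketch below computes a learning rate with linear warmup followed by cosine decay, and a weight decay that increases linearly over training. It is not the module's implementation: the function names `lr_at_step` and `wd_at_step`, their parameters, and the choice of cosine and linear schedules are all assumptions made for this example.

```python
import math


def lr_at_step(step, max_lr, min_lr, warmup_steps, decay_steps):
    """Hypothetical helper: linear warmup to max_lr, then cosine decay to min_lr."""
    # Warmup phase: scale linearly from 0 up to max_lr.
    if warmup_steps > 0 and step <= warmup_steps:
        return max_lr * step / warmup_steps
    # Past the decay horizon: hold at the floor value.
    if step >= decay_steps:
        return min_lr
    # Cosine decay between the end of warmup and decay_steps.
    progress = (step - warmup_steps) / (decay_steps - warmup_steps)
    coeff = 0.5 * (math.cos(math.pi * progress) + 1.0)
    return min_lr + coeff * (max_lr - min_lr)


def wd_at_step(step, start_wd, end_wd, incr_steps):
    """Hypothetical helper: linear weight decay increment from start_wd to end_wd."""
    if step >= incr_steps:
        return end_wd
    return start_wd + (end_wd - start_wd) * step / incr_steps


if __name__ == "__main__":
    # Print both schedules at a few steps (values chosen only for illustration).
    for step in (0, 10, 55, 100):
        lr = lr_at_step(step, max_lr=1.0, min_lr=0.1, warmup_steps=10, decay_steps=100)
        wd = wd_at_step(step, start_wd=0.01, end_wd=0.1, incr_steps=100)
        print(step, lr, wd)
```

In a training loop, values like these would typically be written into each optimizer parameter group once per step; how `OptimizerParamScheduler` does this, and which schedules it supports, should be checked against the class itself.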