megatron.schedules

Description

Forward-backward schedules for pipeline-parallel training: a no-pipelining fallback, interleaved and non-interleaved 1F1B schedules, and the point-to-point helpers they use to communicate between pipeline stages.

Functions

backward_step(optimizer, input_tensor, ...)

Backward step through passed-in output tensor.

custom_backward(output, grad_output)

Directly call C++ autograd engine.

deallocate_output_tensor(out)

Pseudo-deallocate (i.e., set to scalar) the output tensor's '.data' field.
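
deallocate_output_tensor and custom_backward work as a pair: once a stage's output has been sent downstream, its activation storage can be released, but torch.autograd.backward would later reject the shape mismatch between the shrunken output and its gradient, so the backward call has to go straight to the C++ autograd engine. A condensed sketch of the idea, with error handling omitted and assuming PyTorch's internal Variable._execution_engine (the same entry point torch.autograd.backward uses):

    import torch
    from torch.autograd import Variable

    def deallocate_output_tensor_sketch(out: torch.Tensor) -> None:
        # Replace the (potentially large) activation storage with a one-element
        # placeholder; the autograd graph that produced `out` stays intact, only
        # the '.data' storage is released.
        assert out._base is None, "freeing a view would not release any memory"
        out.data = torch.empty((1,), device=out.device, dtype=out.dtype)

    def custom_backward_sketch(output: torch.Tensor, grad_output: torch.Tensor) -> None:
        # torch.autograd.backward() checks that `output` and `grad_output` have
        # the same shape, which no longer holds after the pseudo-deallocation,
        # so call the C++ engine directly.
        Variable._execution_engine.run_backward(
            tensors=(output,),
            grad_tensors=(grad_output,),
            keep_graph=False,
            create_graph=False,
            inputs=tuple(),
            allow_unreachable=True,
            accumulate_grad=True,
        )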

dummy_handler()

No-op context manager, used in place of the model's no_sync() when gradient synchronization does not need to be deferred.

forward_backward_no_pipelining(...[, ...])

Run forward and backward passes with no pipeline parallelism (no inter-stage communication).
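
In the no-pipelining case the schedule is just a loop over microbatches; the only subtlety is deferring gradient all-reduce until the last microbatch. A rough sketch of that structure, assuming a DDP-style model (the helper name and abbreviated argument lists are illustrative, based on the signatures listed on this page):

    from megatron.schedules import backward_step, dummy_handler, forward_step

    def forward_backward_no_pipelining_sketch(forward_step_func, data_iterator, model,
                                              optimizer, num_microbatches,
                                              forward_only=False):
        losses_reduced = []

        def run_one_microbatch():
            # forward_step(forward_step_func, data_iterator, model, input_tensor, losses_reduced)
            output_tensor = forward_step(forward_step_func, data_iterator, model,
                                         None, losses_reduced)
            if not forward_only:
                # backward_step(optimizer, input_tensor, output_tensor, output_tensor_grad)
                backward_step(optimizer, None, output_tensor, None)

        # Defer gradient all-reduce for all but the last microbatch; dummy_handler
        # is the no-op stand-in when the model has no no_sync().
        context = model.no_sync if hasattr(model, "no_sync") else dummy_handler
        with context():
            for _ in range(num_microbatches - 1):
                run_one_microbatch()
        # The last microbatch runs outside the context so gradients are reduced once.
        run_one_microbatch()
        return losses_reduced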

forward_backward_pipelining_with_interleaving(...)

Run interleaved 1F1B schedule (model split into model chunks), with communication between pipeline stages as needed.
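
With interleaving, every pipeline rank holds several model chunks (the "virtual pipeline"), and microbatches are handed to those chunks round-robin in groups of the pipeline-parallel size. A hedged sketch of that mapping (the function name and explicit parameters are illustrative, not the module's internals):

    def model_chunk_for_microbatch(microbatch_id: int, forward: bool,
                                   pipeline_parallel_size: int,
                                   num_model_chunks: int) -> int:
        # Within each group of pipeline_parallel_size * num_model_chunks microbatches,
        # the first pipeline_parallel_size go to chunk 0, the next to chunk 1, etc.
        in_group = microbatch_id % (pipeline_parallel_size * num_model_chunks)
        chunk_id = in_group // pipeline_parallel_size
        if not forward:
            # The backward pass visits the chunks in reverse order.
            chunk_id = num_model_chunks - chunk_id - 1
        return chunk_id

    # Example: with 4 pipeline stages and 2 chunks per rank, forward microbatches
    # 0-3 run on chunk 0 and microbatches 4-7 run on chunk 1.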

forward_backward_pipelining_without_interleaving(...)

Run non-interleaved 1F1B schedule, with communication between pipeline stages.
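
The non-interleaved 1F1B schedule has three phases: a warmup of forward-only steps, a steady state in which each rank alternates one forward with one backward, and a cooldown that drains the remaining backward steps. The phase lengths follow from a rank's distance to the end of the pipeline; the sketch below restates the standard 1F1B bookkeeping rather than quoting the module:

    def one_f_one_b_phase_lengths(num_microbatches: int,
                                  pipeline_parallel_size: int,
                                  pipeline_parallel_rank: int):
        # Earlier stages must run more forwards before their first backward
        # can arrive from downstream.
        num_warmup = min(pipeline_parallel_size - pipeline_parallel_rank - 1,
                         num_microbatches)
        num_steady = num_microbatches - num_warmup   # 1F1B forward/backward pairs
        num_cooldown = num_warmup                    # leftover backward steps
        return num_warmup, num_steady, num_cooldown

    # Example: 8 microbatches on a 4-stage pipeline gives (3, 5, 3) on the first
    # stage and (0, 8, 0) on the last.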

forward_step(forward_step_func, ...[, ...])

Forward step for passed-in model.

get_forward_backward_func()

Return the forward-backward schedule function that matches the current parallel configuration.
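
The selection itself is simple: no pipeline parallelism means the plain microbatch loop, pipeline parallelism with a virtual pipeline means the interleaved schedule, and otherwise the non-interleaved 1F1B schedule is used. A hedged sketch of that dispatch (the real function reads these sizes from Megatron's parallel state and arguments rather than taking them as parameters):

    from megatron.schedules import (forward_backward_no_pipelining,
                                    forward_backward_pipelining_with_interleaving,
                                    forward_backward_pipelining_without_interleaving)

    def select_schedule(pipeline_model_parallel_size,
                        virtual_pipeline_model_parallel_size):
        if pipeline_model_parallel_size <= 1:
            return forward_backward_no_pipelining
        if virtual_pipeline_model_parallel_size is not None:
            return forward_backward_pipelining_with_interleaving
        return forward_backward_pipelining_without_interleaving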

get_tensor_shapes(rank, model_type)

Determine the shapes of the tensors exchanged between pipeline stages for the given rank and model type.

recv_backward(tensor_shapes, timers)

Receive output-gradient tensors from the next pipeline stage.

recv_forward(tensor_shapes, timers)

Receive activation tensors from the previous pipeline stage.

send_backward(input_tensor_grads, ...)

Send input-gradient tensors to the previous pipeline stage.

send_backward_recv_forward(...)

Send input gradients to the previous stage and receive activations from it.

send_forward(output_tensors, tensor_shapes, ...)

Send activation tensors to the next pipeline stage.

send_forward_recv_backward(output_tensors, ...)

Send activations to the next stage and receive output gradients from it.
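
Taken together, a training step typically fetches the appropriate schedule from get_forward_backward_func() and hands it a forward_step_func. The sketch below is an assumed usage pattern, not code quoted from Megatron's training loop: the forward_step_func contract (return the output tensor plus a callable that reduces it to a loss and a metrics dict) and the schedule's argument order follow the signatures listed above and may differ between Megatron versions.

    from megatron.schedules import get_forward_backward_func

    def train_step(model, optimizer, data_iterator, timers, forward_only=False):
        """Run one optimizer step's worth of microbatches through the selected schedule."""

        def forward_step_func(data_iterator, model):
            # Assumed contract: run one microbatch and return the output tensor plus
            # a callable that reduces it to (loss, {metric name: reduced value}).
            tokens = next(data_iterator)              # illustrative batch
            output_tensor = model(tokens)

            def loss_func(output_tensor):
                loss = output_tensor.float().mean()   # illustrative loss
                return loss, {"lm loss": loss.detach()}

            return output_tensor, loss_func

        forward_backward_func = get_forward_backward_func()
        # Argument order based on the signatures listed above; it may differ
        # between Megatron versions.
        return forward_backward_func(forward_step_func, data_iterator, model,
                                     optimizer, timers, forward_only=forward_only)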