megatron.schedules.forward_backward_pipelining_with_interleaving
- megatron.schedules.forward_backward_pipelining_with_interleaving(forward_step_func, data_iterator, model, optimizer, timers, forward_only, collect_non_loss_data=False)
Run the interleaved 1F1B (one-forward-one-backward) schedule, in which the model is split into model chunks, with communication between pipeline stages as needed.
Returns a dictionary with losses if this is the last pipeline stage, and an empty dict otherwise.
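To make the idea of "model chunks" more concrete, the following is a toy sketch of how an interleaved schedule might assign layers to chunks and pick which chunk runs at each step. It is not the library's code: the helper names `layers_for_chunk` and `chunk_for_step` are hypothetical, and the sketch assumes that chunks are dealt round-robin across pipeline stages and that the layer count divides evenly.

```python
def layers_for_chunk(rank, chunk, pipeline_size, num_chunks, num_layers):
    """Layers held by one model chunk (hypothetical helper, toy model).

    Assumes chunks are dealt round-robin: stage `rank` holds the virtual
    stages rank, rank + pipeline_size, rank + 2 * pipeline_size, ...
    """
    per_chunk = num_layers // (pipeline_size * num_chunks)
    virtual_stage = chunk * pipeline_size + rank
    start = virtual_stage * per_chunk
    return list(range(start, start + per_chunk))


def chunk_for_step(step, pipeline_size, num_chunks, forward):
    """Local chunk index that runs at a given step (hypothetical helper).

    Assumes microbatches are processed in groups of `pipeline_size` per
    chunk, and that backward passes visit the chunks in reverse order.
    """
    position_in_group = step % (pipeline_size * num_chunks)
    chunk = position_in_group // pipeline_size
    return chunk if forward else num_chunks - 1 - chunk


if __name__ == "__main__":
    # Two pipeline stages, two chunks per stage, eight layers in total.
    for rank in range(2):
        for chunk in range(2):
            print(rank, chunk, layers_for_chunk(rank, chunk, 2, 2, 8))
    print([chunk_for_step(s, 2, 2, forward=True) for s in range(8)])
```

Under these assumptions each stage holds non-contiguous slices of the model, which is what distinguishes the interleaved schedule from a plain 1F1B schedule where every stage owns a single contiguous block of layers.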