megatron.schedules.deallocate_output_tensor
- megatron.schedules.deallocate_output_tensor(out)
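Pseudo-deallocate the output tensor’s ‘.data’ field, i.e., replace it with a scalar-sized placeholder so that the memory holding the original values can be released while the tensor object itself stays alive.
This function should be called right after the output tensor has been sent to the next pipeline stage. At this point, the output tensor is needed only for its ‘.grad_fn’ field, not for its ‘.data’.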
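The idea can be illustrated with a minimal PyTorch sketch. The helper `pseudo_deallocate` below is a hypothetical stand-in written for this example, not the library's implementation; it assumes that "set to scalar" means swapping `.data` for a one-element tensor of the same dtype and device.

```python
import torch


def pseudo_deallocate(out):
    """Hypothetical stand-in: shrink out.data to a single element.

    The tensor object and its grad_fn are left untouched; only the
    storage behind .data is swapped for a one-element placeholder.
    """
    if out is None:
        return
    out.data = torch.empty((1,), device=out.device, dtype=out.dtype)


# A non-leaf tensor, standing in for a pipeline stage's output.
x = torch.randn(4, 8, requires_grad=True)
out = x * 2.0
shape_before = tuple(out.shape)
grad_fn_before = out.grad_fn

# In a real pipeline, this is the point after `out` has been sent onward.
pseudo_deallocate(out)

shape_after = tuple(out.shape)
grad_fn_kept = out.grad_fn is grad_fn_before
```

After the call, `out` reports the placeholder's shape while `out.grad_fn` still points at the same autograd node, which is the part the backward pass relies on. Note that the shrunken tensor no longer matches the shape of the gradient that will arrive from the next stage, so code that runs backward through it likely needs to account for that mismatch.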