espnet2.speechlm.dataloader.batch.synchronize_batches
espnet2.speechlm.dataloader.batch.synchronize_batches(batches: List[List[T]]) → List[List[T]]
Synchronize batches across all GPU ranks in distributed training.
Ensures all GPU ranks have the same number of batches by duplicating the last few batches on ranks with fewer batches. This is useful for distributed training where each rank may have different numbers of batches due to data sharding.
- Parameters: batches – List of batches to synchronize.
- Returns: Synchronized list of batches with duplicates added if necessary.
Notes
- If torch.distributed is not initialized, the batches are returned unchanged
- If CUDA is not available, the batches are returned unchanged
- Duplicates are taken from the end of the batch list
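
Below is a minimal sketch of the behavior described above, assuming the ranks agree on the maximum batch count via an all-reduce and then pad by duplicating trailing batches. The function name `synchronize_batches_sketch` and the exact padding logic are illustrative, not the library's implementation.

```python
from typing import List, TypeVar

import torch
import torch.distributed as dist

T = TypeVar("T")


def synchronize_batches_sketch(batches: List[List[T]]) -> List[List[T]]:
    # No-op when distributed training or CUDA is not in use.
    if not dist.is_available() or not dist.is_initialized():
        return batches
    if not torch.cuda.is_available():
        return batches

    # Agree on the maximum number of batches across all ranks.
    count = torch.tensor([len(batches)], device="cuda")
    dist.all_reduce(count, op=dist.ReduceOp.MAX)
    max_batches = int(count.item())

    # Pad this rank by duplicating batches from the end of the list
    # (assumes the shortfall is no larger than the existing batch count).
    shortfall = max_batches - len(batches)
    if shortfall > 0 and batches:
        batches = batches + batches[-shortfall:]
    return batches
```

Because every rank pads up to the same maximum count, all ranks step through the same number of batches per epoch, which avoids collective-communication hangs caused by uneven data sharding.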
