espnet2.speechlm.dataloader.batch.batchfy_pack
espnet2.speechlm.dataloader.batch.batchfy_pack(keys: List[T], key_to_length: Dict[T, int], batch_token: int) → List[List[T]]
Create batches using pack batching strategy.
Uses the Best Fit Decreasing algorithm to maximize batch utilization. Samples are sorted by length in descending order, and each sample is placed in the batch with the least remaining space that can still fit it. Batches at 99% capacity are marked as finished.
- Parameters:
  - keys – List of sample keys to batch.
  - key_to_length – Dictionary mapping each key to its length.
  - batch_token – Maximum number of tokens allowed per batch.
- Returns: List of batches, where each batch is a list of keys.
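
The packing strategy described above can be illustrated with a minimal, standalone sketch. This is not the ESPnet implementation: the function name `pack_best_fit_decreasing`, the `full_ratio` parameter, and the choice to give an over-long sample its own batch are all assumptions made for illustration, and the real function may handle these cases differently.

```python
from typing import Dict, List, TypeVar

T = TypeVar("T")


def pack_best_fit_decreasing(
    keys: List[T],
    key_to_length: Dict[T, int],
    batch_token: int,
    full_ratio: float = 0.99,  # hypothetical knob mirroring the 99% rule
) -> List[List[T]]:
    """Illustrative Best Fit Decreasing packing (not the ESPnet source)."""
    open_batches: List[list] = []  # each entry: [remaining_tokens, [keys...]]
    finished: List[List[T]] = []

    # Longest samples first.
    for key in sorted(keys, key=lambda k: key_to_length[k], reverse=True):
        length = key_to_length[key]

        # Assumption: a sample longer than the budget gets a batch of its own.
        if length > batch_token:
            finished.append([key])
            continue

        # Best fit: the open batch with the least remaining space that still fits.
        best = None
        for i, (remaining, _) in enumerate(open_batches):
            if remaining >= length and (
                best is None or remaining < open_batches[best][0]
            ):
                best = i

        if best is None:
            # Nothing fits, so open a new batch.
            open_batches.append([batch_token - length, [key]])
            best = len(open_batches) - 1
        else:
            open_batches[best][0] -= length
            open_batches[best][1].append(key)

        # Close a batch once it is (almost) full so it is not scanned again.
        used = batch_token - open_batches[best][0]
        if used >= full_ratio * batch_token:
            finished.append(open_batches.pop(best)[1])

    # Whatever is still open is returned as-is.
    finished.extend(batch for _, batch in open_batches)
    return finished


lengths = {"a": 6, "b": 4, "c": 5, "d": 3, "e": 2}
print(pack_best_fit_decreasing(list(lengths), lengths, batch_token=10))
# [['a', 'b'], ['c', 'd', 'e']]
```

In this toy run, `b` goes into the batch holding `a` (4 tokens left) rather than the one holding `c` (5 tokens left), because the former is the tighter fit. Both batches end up exactly full and are closed early.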
