espnet2.speechlm.dataloader.batch.batchfy_pack
espnet2.speechlm.dataloader.batch.batchfy_pack(keys: List[T], key_to_length: Dict[T, int], batch_token: int) → List[List[T]]
Create batches using a diverse Best Fit packing strategy, run in parallel with 8 workers.
Stratified interleaving keeps sample lengths varied within each batch while maintaining high packing efficiency.
- Parameters:
  - keys – List of sample keys to batch.
  - key_to_length – Dictionary mapping each key to its length.
  - batch_token – Maximum number of tokens allowed per batch.
- Returns: List of batches, where each batch is a list of keys.
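To illustrate the input and output shapes described above, here is a minimal single-process sketch of plain Best Fit Decreasing packing. It is not ESPnet's implementation: it omits the stratified interleaving and the 8 parallel workers, and the function name `best_fit_pack_sketch` is hypothetical. How over-long samples are handled is also an assumption (they are skipped here).

```python
from typing import Dict, Hashable, List, TypeVar

T = TypeVar("T", bound=Hashable)


def best_fit_pack_sketch(
    keys: List[T], key_to_length: Dict[T, int], batch_token: int
) -> List[List[T]]:
    """Hypothetical Best Fit Decreasing packer (illustration only)."""
    batches: List[List[T]] = []
    remaining: List[int] = []  # free tokens left in each batch

    # Place the longest samples first (the "decreasing" part).
    for key in sorted(keys, key=lambda k: key_to_length[k], reverse=True):
        length = key_to_length[key]
        if length > batch_token:
            continue  # assumption: samples that can never fit are skipped

        # Best fit: pick the batch with the least free space that still fits.
        best = None
        for i, free in enumerate(remaining):
            if free >= length and (best is None or free < remaining[best]):
                best = i

        if best is None:
            # No existing batch has room, so open a new one.
            batches.append([key])
            remaining.append(batch_token - length)
        else:
            batches[best].append(key)
            remaining[best] -= length
    return batches


lengths = {"a": 60, "b": 50, "c": 40, "d": 30, "e": 20}
batches = best_fit_pack_sketch(list(lengths), lengths, batch_token=100)
print(batches)  # [['a', 'c'], ['b', 'd', 'e']]
```

Both batches in this example are filled to exactly 100 tokens. A plain Best Fit pass like this one tends to group samples by size, which is presumably what the stratified interleaving in `batchfy_pack` is meant to counteract.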
