espnet.utils.training.batchfy.batchfy_by_bin
espnet.utils.training.batchfy.batchfy_by_bin(sorted_data, batch_bins, num_batches=0, min_batch_size=1, shortest_first=False, ikey='input', okey='output')
Make a variably sized batch set that maximizes the number of bins in each batch, up to batch_bins.
- Parameters:
- sorted_data (Dict[str, Dict[str, Any]]) – dictionary loaded from data.json
- batch_bins (int) – Maximum frames of a batch
- num_batches (int) – number of batches to use (for debugging)
- min_batch_size (int) – minimum batch size (for multi-gpu)
- test (int) – return only every test-th batch (listed in the docstring, but not present in the signature above)
- shortest_first (bool) – Sort from batch with shortest samples to longest if true, otherwise reverse
- ikey (str) – key to access input (for ASR ikey="input", for TTS ikey="output")
- okey (str) – key to access output (for ASR okey="output", for TTS okey="input")
- Returns: List[Tuple[str, Dict[str, List[Dict[str, Any]]]]] – list of batches
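
The idea of filling each batch up to batch_bins can be illustrated with a small, self-contained sketch. This is not ESPnet's implementation: `greedy_bin_batches` and `count_bins` are hypothetical helpers written for this example, and the sketch assumes that each entry stores its size under a `"shape"` field as `[length, dimension]`, that one bin is one feature value (length × dimension), and that the data is already sorted from longest to shortest. The cost of a batch is taken as the padded size, that is, the largest input plus the largest output, multiplied by the number of samples.

```python
from typing import Any, Dict, List, Tuple

Sample = Tuple[str, Dict[str, List[Dict[str, Any]]]]


def count_bins(sample: Sample, key: str) -> int:
    """Bins on one side of a sample: length * feature dimension (assumed layout)."""
    length, dim = sample[1][key][0]["shape"]
    return int(length) * int(dim)


def greedy_bin_batches(
    sorted_data: List[Sample],
    batch_bins: int,
    min_batch_size: int = 1,
    ikey: str = "input",
    okey: str = "output",
) -> List[List[Sample]]:
    """Hypothetical helper: greedily group samples so that the padded size
    of each batch stays within batch_bins where possible."""
    if batch_bins <= 0:
        raise ValueError(f"invalid batch_bins={batch_bins}")

    batches: List[List[Sample]] = []
    current: List[Sample] = []
    max_in = max_out = 0
    for sample in sorted_data:
        # Padded size if this sample joined the current batch.
        new_in = max(max_in, count_bins(sample, ikey))
        new_out = max(max_out, count_bins(sample, okey))
        padded = (new_in + new_out) * (len(current) + 1)
        # Close the batch once it would overflow, but never below min_batch_size.
        if current and padded > batch_bins and len(current) >= min_batch_size:
            batches.append(current)
            current = []
            new_in = count_bins(sample, ikey)
            new_out = count_bins(sample, okey)
        current.append(sample)
        max_in, max_out = new_in, new_out
    if current:
        batches.append(current)
    return batches


# Toy data in the assumed layout, sorted from longest to shortest input.
toy_data: List[Sample] = [
    ("utt_a", {"input": [{"shape": [100, 80]}], "output": [{"shape": [10, 50]}]}),
    ("utt_b", {"input": [{"shape": [60, 80]}], "output": [{"shape": [8, 50]}]}),
    ("utt_c", {"input": [{"shape": [50, 80]}], "output": [{"shape": [5, 50]}]}),
    ("utt_d", {"input": [{"shape": [20, 80]}], "output": [{"shape": [3, 50]}]}),
]

batches = greedy_bin_batches(toy_data, batch_bins=10000)
batch_ids = [[utt_id for utt_id, _ in batch] for batch in batches]
print(batch_ids)  # [['utt_a'], ['utt_b'], ['utt_c', 'utt_d']]
```

Long utterances end up alone while shorter ones share a batch, which is the point of bin-based batching: the amount of padded data per batch stays roughly constant instead of the number of utterances. In this sketch, a min_batch_size above 1 can push a batch over batch_bins, since the minimum size takes priority.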