espnet2.train package

espnet2.train.distributed_utils

class espnet2.train.distributed_utils.DistributedOption(distributed: bool = False, dist_backend: str = 'nccl', dist_init_method: str = 'env://', dist_world_size: Union[int, NoneType] = None, dist_rank: Union[int, NoneType] = None, local_rank: Union[int, NoneType] = None, ngpu: int = 0, dist_master_addr: Union[str, NoneType] = None, dist_master_port: Union[int, NoneType] = None, dist_launcher: Union[str, NoneType] = None, multiprocessing_distributed: bool = True)[source]

Bases: object

dist_backend = 'nccl'
dist_init_method = 'env://'
dist_launcher = None
dist_master_addr = None
dist_master_port = None
dist_rank = None
dist_world_size = None
distributed = False
init()[source]
local_rank = None
multiprocessing_distributed = True
ngpu = 0
espnet2.train.distributed_utils.free_port()[source]

Find free port using bind().

There is an interval between finding the port and using it, and another process may take the port in that time. The port is therefore not guaranteed to still be free when it is used.
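The bind()-based approach described above can be sketched as follows. This is a minimal illustration, assuming the common idiom of binding to port 0 so that the operating system picks an unused port; the helper name `find_free_port` is hypothetical and is not taken from the ESPnet source.

```python
import socket


def find_free_port() -> int:
    """Ask the OS for a currently unused TCP port (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Binding to port 0 lets the OS choose any free port.
        sock.bind(("", 0))
        # For AF_INET sockets, getsockname() returns (host, port).
        return sock.getsockname()[1]


port = find_free_port()
# The socket is already closed here, so another process may take the
# port before the caller binds it again.
print(port)
```

Because the socket is released before the port is reused, a caller that needs a firm guarantee would have to keep the socket open or retry on failure.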

espnet2.train.distributed_utils.get_local_rank(prior=None, launcher: str = None) → Optional[int][source]
espnet2.train.distributed_utils.get_master_addr(prior=None, launcher: str = None) → Optional[str][source]
espnet2.train.distributed_utils.get_master_port(prior=None) → Optional[int][source]
espnet2.train.distributed_utils.get_node_rank(prior=None, launcher: str = None) → Optional[int][source]

Get Node Rank.

Used in “multiprocessing distributed” mode. In this mode the initial RANK equals the node ID, and the real rank used by torch.distributed is set to (nGPU * NodeID) + LOCAL_RANK.
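The rank formula above can be checked with a short worked example. The function name `global_rank` and the cluster shape (2 nodes with 4 GPUs each) are assumptions chosen for illustration only.

```python
def global_rank(ngpu: int, node_id: int, local_rank: int) -> int:
    """Compute (nGPU * NodeID) + LOCAL_RANK, as stated in the text."""
    return ngpu * node_id + local_rank


# Hypothetical cluster: 2 nodes, 4 GPUs per node.
ranks = [global_rank(4, node, local) for node in range(2) for local in range(4)]
print(ranks)  # [0, 1, 2, 3, 4, 5, 6, 7]
```

Each process gets a unique rank as long as every node runs the same number of GPU workers.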

espnet2.train.distributed_utils.get_num_nodes(prior=None, launcher: str = None) → Optional[int][source]

Get the number of nodes.

Used in “multiprocessing distributed” mode. In this mode RANK equals the node ID, and the real rank used by torch.distributed is set to (nGPU * NodeID) + LOCAL_RANK.

espnet2.train.distributed_utils.get_rank(prior=None, launcher: str = None) → Optional[int][source]
espnet2.train.distributed_utils.get_world_size(prior=None, launcher: str = None) → int[source]
espnet2.train.distributed_utils.is_in_slurm_job() → bool[source]
espnet2.train.distributed_utils.is_in_slurm_step() → bool[source]
espnet2.train.distributed_utils.resolve_distributed_mode(args)[source]

espnet2.train.iterable_dataset

class espnet2.train.iterable_dataset.IterableESPnetDataset(path_name_type_list: Collection[Tuple[str, str, str]], preprocess: Callable[[str, Dict[str, numpy.ndarray]], Dict[str, numpy.ndarray]] = None, float_dtype: str = 'float32', int_dtype: str = 'long', key_file: str = None)[source]

Bases: torch.utils.data.dataset.IterableDataset

PyTorch Dataset class for ESPnet.

Examples

>>> dataset = IterableESPnetDataset([('wav.scp', 'input', 'sound'),
...                                  ('token_int', 'output', 'text_int')],
...                                )
>>> for uid, data in dataset:
...     data
{'input': per_utt_array, 'output': per_utt_array}
has_name(name) → bool[source]
names() → Tuple[str, ...][source]
espnet2.train.iterable_dataset.load_kaldi(input)[source]

espnet2.train.class_choices

class espnet2.train.class_choices.ClassChoices(name: str, classes: Mapping[str, type], type_check: type = None, default: str = None, optional: bool = False)[source]

Bases: object

Helper class to manage the options for variable objects and their configuration.

Example:

>>> class A:
...     def __init__(self, foo=3):  pass
>>> class B:
...     def __init__(self, bar="aaaa"):  pass
>>> choices = ClassChoices("var", dict(a=A, b=B), default="a")
>>> import argparse
>>> parser = argparse.ArgumentParser()
>>> choices.add_arguments(parser)
>>> args = parser.parse_args(["--var", "a", "--var_conf", "foo=4"])
>>> args.var
'a'
>>> args.var_conf
{'foo': 4}
>>> class_obj = choices.get_class(args.var)
>>> a_object = class_obj(**args.var_conf)
add_arguments(parser)[source]
choices() → Tuple[Optional[str], ...][source]
get_class(name: Optional[str]) → Optional[type][source]
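The pattern in the ClassChoices example (select a class by name, then pass it a parsed configuration) can be tried without ESPnet installed. The sketch below uses only argparse. The `classes` registry and the `parse_conf` helper are hypothetical stand-ins and do not reproduce how ClassChoices is implemented.

```python
import argparse


class A:
    def __init__(self, foo=3):
        self.foo = foo


class B:
    def __init__(self, bar="aaaa"):
        self.bar = bar


# Hypothetical registry standing in for ClassChoices("var", dict(a=A, b=B)).
classes = {"a": A, "b": B}


def parse_conf(text: str) -> dict:
    """Parse 'key=value' into a one-item dict, reading integers when possible."""
    key, value = text.split("=", 1)
    return {key: int(value) if value.lstrip("-").isdigit() else value}


parser = argparse.ArgumentParser()
parser.add_argument("--var", choices=sorted(classes), default="a")
parser.add_argument("--var_conf", type=parse_conf, default={})

args = parser.parse_args(["--var", "a", "--var_conf", "foo=4"])
# Look up the class by name and build it from the parsed configuration.
obj = classes[args.var](**args.var_conf)
print(type(obj).__name__, obj.foo)  # A 4
```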

espnet2.train.reporter

class espnet2.train.reporter.Average(value: Union[float, int, complex, torch.Tensor, numpy.ndarray])[source]

Bases: espnet2.train.reporter.ReportedValue

class espnet2.train.reporter.ReportedValue[source]

Bases: object

class espnet2.train.reporter.Reporter(epoch: int = 0)[source]

Bases: object

Reporter class.

Examples

>>> reporter = Reporter()
>>> with reporter.observe('train') as sub_reporter:
...     for batch in iterator:
...         stats = dict(loss=0.2)
...         sub_reporter.register(stats)
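The observe/register pattern in the example above can be illustrated with a self-contained stand-in. `TinyReporter` and `TinySubReporter` are hypothetical classes written for this sketch. They average each statistic when the context exits, which is an assumption about the general idea rather than a description of the actual Reporter.

```python
from collections import defaultdict
from contextlib import contextmanager


class TinySubReporter:
    """Hypothetical stand-in that collects per-step statistics."""

    def __init__(self):
        self.values = defaultdict(list)

    def register(self, stats):
        for name, value in stats.items():
            self.values[name].append(value)


class TinyReporter:
    """Hypothetical stand-in that stores one averaged result per key."""

    def __init__(self):
        self.stats = {}

    @contextmanager
    def observe(self, key):
        sub = TinySubReporter()
        yield sub
        # On exit, reduce every registered statistic to its mean.
        self.stats[key] = {n: sum(v) / len(v) for n, v in sub.values.items()}


reporter = TinyReporter()
with reporter.observe("train") as sub_reporter:
    for loss in (0.5, 0.25):
        sub_reporter.register(dict(loss=loss))
print(reporter.stats)  # {'train': {'loss': 0.375}}
```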
check_early_stopping(patience: int, key1: str, key2: str, mode: str, epoch: int = None, logger=None) → bool[source]
finish_epoch(sub_reporter: espnet2.train.reporter.SubReporter) → None[source]
get_all_keys(epoch: int = None) → Tuple[Tuple[str, str], ...][source]
get_best_epoch(key: str, key2: str, mode: str, nbest: int = 0) → int[source]
get_epoch() → int[source]
get_keys(epoch: int = None) → Tuple[str, ...][source]

Returns the first-level keys (key1), e.g. train, eval.

get_keys2(key: str, epoch: int = None) → Tuple[str, ...][source]

Returns the second-level keys (key2), e.g. loss, acc.

get_value(key: str, key2: str, epoch: int = None)[source]
has(key: str, key2: str, epoch: int = None) → bool[source]
load_state_dict(state_dict: dict)[source]
log_message(epoch: int = None) → str[source]
matplotlib_plot(output_dir: Union[str, pathlib.Path])[source]

Plot stats using Matplotlib and save images.

observe(key: str, epoch: int = None) → AbstractContextManager[espnet2.train.reporter.SubReporter][source]
set_epoch(epoch: int) → None[source]
sort_epochs(key: str, key2: str, mode: str) → List[int][source]
sort_epochs_and_values(key: str, key2: str, mode: str) → List[Tuple[int, float]][source]

Return the epochs and their values, sorted from best to worst.

Example

>>> val = reporter.sort_epochs_and_values('eval', 'loss', 'min')
>>> e_1best, v_1best = val[0]
>>> e_2best, v_2best = val[1]
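A minimal sketch of the best-first ordering that the example relies on, assuming the per-epoch values are held in a plain dict and that `mode` is either 'min' or 'max'. The standalone function and the sample numbers are hypothetical and independent of the Reporter class.

```python
from typing import Dict, List, Tuple


def sort_epochs_and_values(values: Dict[int, float], mode: str) -> List[Tuple[int, float]]:
    """Return (epoch, value) pairs ordered from best to worst."""
    if mode not in ("min", "max"):
        raise ValueError(f"mode must be 'min' or 'max': {mode}")
    # For 'max' the largest value is best, so sort in descending order.
    return sorted(values.items(), key=lambda item: item[1], reverse=(mode == "max"))


eval_loss = {1: 0.9, 2: 0.4, 3: 0.6}  # hypothetical epoch -> loss
ranked = sort_epochs_and_values(eval_loss, "min")
print(ranked)  # [(2, 0.4), (3, 0.6), (1, 0.9)]
```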
sort_values(key: str, key2: str, mode: str) → List[float][source]
start_epoch(key: str, epoch: int = None) → espnet2.train.reporter.SubReporter[source]
state_dict()[source]
tensorboard_add_scalar(summary_writer: torch.utils.tensorboard.writer.SummaryWriter, epoch: int = None)[source]
wandb_log(epoch: int = None, commit: bool = True)[source]
class espnet2.train.reporter.SubReporter(key: str, epoch: int, total_count: int)[source]

Bases: object

This class is used in Reporter.

See the docstring of Reporter for the usage.

finished() → None[source]
get_epoch() → int[source]
get_total_count() → int[source]

Returns the number of iterations over all epochs.

log_message(start: int = None, end: int = None) → str[source]
measure_iter_time(iterable, name: str)[source]
measure_time(name: str)[source]
next()[source]

Close this step and reset the state for the next step.

register(stats: Dict[str, Union[float, int, complex, torch.Tensor, numpy.ndarray, Dict[str, Union[float, int, complex, torch.Tensor, numpy.ndarray]], None]], weight: Union[float, int, complex, torch.Tensor, numpy.ndarray] = None) → None[source]
tensorboard_add_scalar(summary_writer: torch.utils.tensorboard.writer.SummaryWriter, start: int = None)[source]
wandb_log(start: int = None, commit: bool = True)[source]
class espnet2.train.reporter.WeightedAverage(value: Tuple[Union[float, int, complex, torch.Tensor, numpy.ndarray], Union[float, int, complex, torch.Tensor, numpy.ndarray]], weight: Union[float, int, complex, torch.Tensor, numpy.ndarray])[source]

Bases: espnet2.train.reporter.ReportedValue

espnet2.train.reporter.aggregate(values: Sequence[ReportedValue]) → Union[float, int, complex, torch.Tensor, numpy.ndarray][source]
espnet2.train.reporter.to_reported_value(v: Union[float, int, complex, torch.Tensor, numpy.ndarray], weight: Union[float, int, complex, torch.Tensor, numpy.ndarray] = None) → espnet2.train.reporter.ReportedValue[source]

espnet2.train.collate_fn

class espnet2.train.collate_fn.CommonCollateFn(float_pad_value: Union[float, int] = 0.0, int_pad_value: int = -32768, not_sequence: Collection[str] = ())[source]

Bases: object

Functor class of common_collate_fn()

espnet2.train.collate_fn.common_collate_fn(data: Collection[Tuple[str, Dict[str, numpy.ndarray]]], float_pad_value: Union[float, int] = 0.0, int_pad_value: int = -32768, not_sequence: Collection[str] = ()) → Tuple[List[str], Dict[str, torch.Tensor]][source]

Concatenate ndarray-list to an array and convert to torch.Tensor.

Examples

>>> from espnet2.samplers.constant_batch_sampler import ConstantBatchSampler
>>> import espnet2.tasks.abs_task
>>> from espnet2.train.dataset import ESPnetDataset
>>> sampler = ConstantBatchSampler(...)
>>> dataset = ESPnetDataset(...)
>>> keys = next(iter(sampler))
>>> batch = [dataset[key] for key in keys]
>>> uttids, batch = common_collate_fn(batch)
>>> model(**batch)

Note that the dict keys of the batch are propagated unchanged from those of the dataset.
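The padding step behind this kind of collation can be sketched in pure Python. Plain lists stand in here for numpy arrays and torch tensors, and `pad_collate` is a hypothetical helper, so this shows the idea only and is not the library's implementation. The pad value -32768 is the default int_pad_value from the signature above.

```python
def pad_collate(data, pad_value=0):
    """Pad variable-length sequences to a common length, per dict key."""
    uttids = [uid for uid, _ in data]
    batch = {}
    # The keys of the first sample define the keys of the batch.
    for name in data[0][1]:
        seqs = [list(sample[name]) for _, sample in data]
        max_len = max(len(s) for s in seqs)
        batch[name] = [s + [pad_value] * (max_len - len(s)) for s in seqs]
    return uttids, batch


data = [("utt1", {"text": [1, 2, 3]}), ("utt2", {"text": [4]})]
uttids, batch = pad_collate(data, pad_value=-32768)
print(uttids)  # ['utt1', 'utt2']
print(batch)   # {'text': [[1, 2, 3], [4, -32768, -32768]]}
```

The batch keys ('text' here) are taken from the samples as they are, matching the note above.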

espnet2.train.trainer

class espnet2.train.trainer.Trainer[source]

Bases: object

Trainer with a single optimizer.

To use multiple optimizers, inherit this class and override methods as necessary, at least “train_one_epoch()”.

>>> class TwoOptimizerTrainer(Trainer):
...     num_optimizers: int = 2
...
...     @classmethod
...     def add_arguments(cls, parser):
...         ...
...
...     @classmethod
...     def train_one_epoch(cls, model, optimizers, ...):
...         loss1 = model.model1(...)
...         loss1.backward()
...         optimizers[0].step()
...
...         loss2 = model.model2(...)
...         loss2.backward()
...         optimizers[1].step()
classmethod add_arguments(parser: argparse.ArgumentParser)[source]

Reserved for future development of another Trainer.

classmethod build_options(args: argparse.Namespace) → espnet2.train.trainer.TrainerOptions[source]

Build options consumed by train(), eval(), and plot_attention()

num_optimizers = 1
classmethod plot_attention(model: torch.nn.modules.module.Module, output_dir: Optional[pathlib.Path], summary_writer: Optional[torch.utils.tensorboard.writer.SummaryWriter], iterator: Iterable[Tuple[List[str], Dict[str, torch.Tensor]]], reporter: espnet2.train.reporter.SubReporter, options: espnet2.train.trainer.TrainerOptions) → None[source]
classmethod run(model: espnet2.train.abs_espnet_model.AbsESPnetModel, optimizers: Sequence[torch.optim.optimizer.Optimizer], schedulers: Sequence[Optional[espnet2.schedulers.abs_scheduler.AbsScheduler]], train_iter_factory: espnet2.iterators.abs_iter_factory.AbsIterFactory, valid_iter_factory: espnet2.iterators.abs_iter_factory.AbsIterFactory, plot_attention_iter_factory: Optional[espnet2.iterators.abs_iter_factory.AbsIterFactory], reporter: espnet2.train.reporter.Reporter, scaler: Optional[torch.cuda.amp.grad_scaler.GradScaler], output_dir: pathlib.Path, max_epoch: int, seed: int, patience: Optional[int], keep_nbest_models: int, early_stopping_criterion: Sequence[str], best_model_criterion: Sequence[Sequence[str]], val_scheduler_criterion: Sequence[str], trainer_options, distributed_option: espnet2.train.distributed_utils.DistributedOption, find_unused_parameters: bool = False) → None[source]

Perform training. This method performs the main process of training.

classmethod train_one_epoch(model: torch.nn.modules.module.Module, iterator: Iterable[Tuple[List[str], Dict[str, torch.Tensor]]], optimizers: Sequence[torch.optim.optimizer.Optimizer], schedulers: Sequence[Optional[espnet2.schedulers.abs_scheduler.AbsScheduler]], scaler: Optional[torch.cuda.amp.grad_scaler.GradScaler], reporter: espnet2.train.reporter.SubReporter, summary_writer: Optional[torch.utils.tensorboard.writer.SummaryWriter], options: espnet2.train.trainer.TrainerOptions) → bool[source]
classmethod validate_one_epoch(model: torch.nn.modules.module.Module, iterator: Iterable[Dict[str, torch.Tensor]], reporter: espnet2.train.reporter.SubReporter, options: espnet2.train.trainer.TrainerOptions) → None[source]
class espnet2.train.trainer.TrainerOptions(ngpu: int, train_dtype: str, grad_noise: bool, accum_grad: int, grad_clip: float, grad_clip_type: float, log_interval: Union[int, NoneType], no_forward_run: bool, use_tensorboard: bool, use_wandb: bool)[source]

Bases: object

espnet2.train.abs_espnet_model

class espnet2.train.abs_espnet_model.AbsESPnetModel[source]

Bases: torch.nn.modules.module.Module, abc.ABC

The common abstract class shared by all tasks.

“ESPnetModel” refers to a class that inherits torch.nn.Module, holds the DNN models as member fields and forwards to them (the delegate pattern), and defines “loss”, “stats”, and “weight” for the task.

To implement a new task in ESPnet, the model must inherit this class. In other words, the only “mediator” objects between the training system and your task class are these three values: loss, stats, and weight.

Example

>>> import argparse
>>> from espnet2.tasks.abs_task import AbsTask
>>> class YourESPnetModel(AbsESPnetModel):
...     def forward(self, input, input_lengths):
...         ...
...         return loss, stats, weight
>>> class YourTask(AbsTask):
...     @classmethod
...     def build_model(cls, args: argparse.Namespace) -> YourESPnetModel:
...         ...
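The (loss, stats, weight) contract described above can be illustrated without torch. In the sketch below, `ToyModel` is a hypothetical stand-in. Using the batch size as the weight, and combining the statistics by a weighted average, are assumptions made for illustration rather than a description of the training system.

```python
class ToyModel:
    """Hypothetical stand-in that returns (loss, stats, weight)."""

    def forward(self, values):
        loss = sum(v * v for v in values) / len(values)
        stats = {"loss": loss}
        weight = len(values)  # assumed here to be the batch size
        return loss, stats, weight


model = ToyModel()
batches = [[1.0, 3.0], [2.0, 2.0]]

total, total_weight = 0.0, 0
for batch in batches:
    loss, stats, weight = model.forward(batch)
    # The training loop only needs these three values from the model.
    total += stats["loss"] * weight
    total_weight += weight

average_loss = total / total_weight
print(average_loss)  # 4.5
```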

Initializes internal Module state, shared by both nn.Module and ScriptModule.

abstract collect_feats(**batch) → Dict[str, torch.Tensor][source]
abstract forward(**batch) → Tuple[torch.Tensor, Dict[str, torch.Tensor], torch.Tensor][source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

espnet2.train.preprocessor

class espnet2.train.preprocessor.AbsPreprocessor(train: bool)[source]

Bases: abc.ABC

class espnet2.train.preprocessor.CommonPreprocessor(train: bool, token_type: str = None, token_list: Union[pathlib.Path, str, Iterable[str]] = None, bpemodel: Union[pathlib.Path, str, Iterable[str]] = None, text_cleaner: Collection[str] = None, g2p_type: str = None, unk_symbol: str = '<unk>', space_symbol: str = '<space>', non_linguistic_symbols: Union[pathlib.Path, str, Iterable[str]] = None, delimiter: str = None, speech_name: str = 'speech', text_name: str = 'text')[source]

Bases: espnet2.train.preprocessor.AbsPreprocessor

class espnet2.train.preprocessor.CommonPreprocessor_multi(train: bool, token_type: str = None, token_list: Union[pathlib.Path, str, Iterable[str]] = None, bpemodel: Union[pathlib.Path, str, Iterable[str]] = None, text_cleaner: Collection[str] = None, g2p_type: str = None, unk_symbol: str = '<unk>', space_symbol: str = '<space>', non_linguistic_symbols: Union[pathlib.Path, str, Iterable[str]] = None, delimiter: str = None, speech_name: str = 'speech', text_name: list = ['text'])[source]

Bases: espnet2.train.preprocessor.AbsPreprocessor

espnet2.train.dataset

class espnet2.train.dataset.AbsDataset[source]

Bases: torch.utils.data.dataset.Dataset, abc.ABC

abstract has_name(name) → bool[source]
abstract names() → Tuple[str, ...][source]
class espnet2.train.dataset.AdapterForSoundScpReader(loader, dtype=None)[source]

Bases: collections.abc.Mapping

keys() → a set-like object providing a view on D's keys[source]
class espnet2.train.dataset.ESPnetDataset(path_name_type_list: Collection[Tuple[str, str, str]], preprocess: Callable[[str, Dict[str, numpy.ndarray]], Dict[str, numpy.ndarray]] = None, float_dtype: str = 'float32', int_dtype: str = 'long', max_cache_size: Union[float, int, str] = 0.0, max_cache_fd: int = 0)[source]

Bases: espnet2.train.dataset.AbsDataset

PyTorch Dataset class for ESPnet.

Examples

>>> dataset = ESPnetDataset([('wav.scp', 'input', 'sound'),
...                          ('token_int', 'output', 'text_int')],
...                         )
>>> uttid, data = dataset['uttid']
>>> data
{'input': per_utt_array, 'output': per_utt_array}
has_name(name) → bool[source]
names() → Tuple[str, ...][source]
class espnet2.train.dataset.H5FileWrapper(path: str)[source]

Bases: object

espnet2.train.dataset.kaldi_loader(path, float_dtype=None, max_cache_fd: int = 0)[source]
espnet2.train.dataset.rand_int_loader(filepath, loader_type)[source]
espnet2.train.dataset.sound_loader(path, float_dtype=None)[source]

espnet2.train.__init__