espnet2.utils package

espnet2.utils.eer

Source code from: https://github.com/clovaai/voxceleb_trainer/blob/master/tuneThreshold.py

espnet2.utils.eer.ComputeErrorRates(scores, labels)[source]
espnet2.utils.eer.ComputeMinDcf(fnrs, fprs, thresholds, p_target, c_miss, c_fa)[source]
espnet2.utils.eer.tuneThresholdfromScore(scores, labels, target_fa, target_fr=None)[source]

espnet2.utils.kwargs2args

espnet2.utils.kwargs2args.func(a: int, b, *, c, **kwargs)[source]
espnet2.utils.kwargs2args.kwargs2args(func, kwargs)[source]

espnet2.utils.types

espnet2.utils.types.float_or_none(value: str) → Optional[float][source]

float_or_none.

Examples

>>> import argparse
>>> parser = argparse.ArgumentParser()
>>> _ = parser.add_argument('--foo', type=float_or_none)
>>> parser.parse_args(['--foo', '4.5'])
Namespace(foo=4.5)
>>> parser.parse_args(['--foo', 'none'])
Namespace(foo=None)
>>> parser.parse_args(['--foo', 'null'])
Namespace(foo=None)
>>> parser.parse_args(['--foo', 'nil'])
Namespace(foo=None)
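The behaviour shown in the examples above can be approximated as follows. This is a minimal sketch, not the ESPnet implementation: the name float_or_none_sketch is hypothetical, and the case-insensitive matching and whitespace stripping are assumptions that the examples do not confirm.

```python
from typing import Optional


def float_or_none_sketch(value: str) -> Optional[float]:
    """Hypothetical stand-in: map 'none'/'null'/'nil' to None, else parse a float."""
    # Assumption: matching ignores case and surrounding whitespace.
    if value.strip().lower() in ("none", "null", "nil"):
        return None
    # float() raises ValueError on bad input, which argparse reports as a usage error.
    return float(value)
```

Functions of this shape can be passed directly as the type argument of add_argument, as in the examples above.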
espnet2.utils.types.humanfriendly_parse_size_or_none(value) → Optional[float][source]
espnet2.utils.types.int_or_none(value: str) → Optional[int][source]

int_or_none.

Examples

>>> import argparse
>>> parser = argparse.ArgumentParser()
>>> _ = parser.add_argument('--foo', type=int_or_none)
>>> parser.parse_args(['--foo', '456'])
Namespace(foo=456)
>>> parser.parse_args(['--foo', 'none'])
Namespace(foo=None)
>>> parser.parse_args(['--foo', 'null'])
Namespace(foo=None)
>>> parser.parse_args(['--foo', 'nil'])
Namespace(foo=None)
espnet2.utils.types.remove_parenthesis(value: str)[source]
espnet2.utils.types.remove_quotes(value: str)[source]
espnet2.utils.types.str2bool(value: str) → bool[source]
espnet2.utils.types.str2pair_str(value: str) → Tuple[str, str][source]

str2pair_str.

Examples

>>> import argparse
>>> str2pair_str('abc,def ')
('abc', 'def')
>>> parser = argparse.ArgumentParser()
>>> _ = parser.add_argument('--foo', type=str2pair_str)
>>> parser.parse_args(['--foo', 'abc,def'])
Namespace(foo=('abc', 'def'))
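One way to get the splitting and trimming shown above is sketched below. The name str2pair_str_sketch is hypothetical, and the sketch only covers comma splitting and whitespace stripping; any further normalisation the real function performs (for example via remove_parenthesis or remove_quotes) is not reproduced here.

```python
from typing import Tuple


def str2pair_str_sketch(value: str) -> Tuple[str, str]:
    """Hypothetical stand-in: split 'a,b' into a pair of stripped strings."""
    parts = [part.strip() for part in value.split(",")]
    if len(parts) != 2:
        # argparse turns a ValueError from a type function into a usage error.
        raise ValueError(f"expected exactly two comma-separated items, got {value!r}")
    return parts[0], parts[1]


print(str2pair_str_sketch("abc,def "))  # ('abc', 'def')
```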
espnet2.utils.types.str2triple_str(value: str) → Tuple[str, str, str][source]

str2triple_str.

Examples

>>> str2triple_str('abc,def ,ghi')
('abc', 'def', 'ghi')
espnet2.utils.types.str_or_int(value: str) → Union[str, int][source]
espnet2.utils.types.str_or_none(value: str) → Optional[str][source]

str_or_none.

Examples

>>> import argparse
>>> parser = argparse.ArgumentParser()
>>> _ = parser.add_argument('--foo', type=str_or_none)
>>> parser.parse_args(['--foo', 'aaa'])
Namespace(foo='aaa')
>>> parser.parse_args(['--foo', 'none'])
Namespace(foo=None)
>>> parser.parse_args(['--foo', 'null'])
Namespace(foo=None)
>>> parser.parse_args(['--foo', 'nil'])
Namespace(foo=None)

espnet2.utils.yaml_no_alias_safe_dump

class espnet2.utils.yaml_no_alias_safe_dump.NoAliasSafeDumper(stream, default_style=None, default_flow_style=False, canonical=None, indent=None, width=None, allow_unicode=None, line_break=None, encoding=None, explicit_start=None, explicit_end=None, version=None, tags=None, sort_keys=True)[source]

Bases: yaml.dumper.SafeDumper

ignore_aliases(data)[source]
espnet2.utils.yaml_no_alias_safe_dump.yaml_no_alias_safe_dump(data, stream=None, **kwargs)[source]

Safe-dump data as YAML without anchors or aliases.

espnet2.utils.__init__

espnet2.utils.nested_dict_action

class espnet2.utils.nested_dict_action.NestedDictAction(option_strings, dest, nargs=None, default=None, choices=None, required=False, help=None, metavar=None)[source]

Bases: argparse.Action

Action class to append items to dict object.

Examples

>>> import argparse
>>> parser = argparse.ArgumentParser()
>>> _ = parser.add_argument('--conf', action=NestedDictAction,
...                         default={'a': 4})
>>> parser.parse_args(['--conf', 'a=3', '--conf', 'c=4'])
Namespace(conf={'a': 3, 'c': 4})
>>> parser.parse_args(['--conf', 'c.d=4'])
Namespace(conf={'a': 4, 'c': {'d': 4}})
>>> parser.parse_args(['--conf', 'c.d=4', '--conf', 'c=2'])
Namespace(conf={'a': 4, 'c': 2})
>>> parser.parse_args(['--conf', '{d: 5, e: 9}'])
Namespace(conf={'d': 5, 'e': 9})
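The dotted-key behaviour in the examples above ('c.d=4' creating a nested dict, a later 'c=2' replacing it) can be illustrated with a small helper. This is a sketch under stated assumptions, not the class's actual code: set_nested_sketch is a hypothetical name, and it takes an already parsed value rather than parsing the 'key=value' string itself.

```python
import copy
from typing import Any, Dict


def set_nested_sketch(conf: Dict[str, Any], dotted_key: str, value: Any) -> Dict[str, Any]:
    """Return a copy of conf with value stored at the dotted key path."""
    result = copy.deepcopy(conf)  # leave the default dict untouched
    node = result
    *parents, leaf = dotted_key.split(".")
    for key in parents:
        # A missing or non-dict entry is replaced by a fresh dict so we can descend.
        if not isinstance(node.get(key), dict):
            node[key] = {}
        node = node[key]
    node[leaf] = value
    return result


conf = set_nested_sketch({"a": 4}, "c.d", 4)
print(conf)  # {'a': 4, 'c': {'d': 4}}
conf = set_nested_sketch(conf, "c", 2)
print(conf)  # {'a': 4, 'c': 2}
```

Copying before modification is a design choice of this sketch: it keeps the parser's default value from being changed between calls.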

espnet2.utils.config_argparse

class espnet2.utils.config_argparse.ArgumentParser(*args, **kwargs)[source]

Bases: argparse.ArgumentParser

Simple implementation of ArgumentParser supporting config file

This class originates from https://github.com/bw2/ConfigArgParse, but it lacks some of that project's features and differs from it as follows:

  • Not supporting multiple config files

  • Automatically adding “--config” as an option.

  • Not supporting any formats other than yaml

  • Not checking argument type

parse_known_args(args=None, namespace=None)[source]

espnet2.utils.griffin_lim

Griffin-Lim related modules.

class espnet2.utils.griffin_lim.Spectrogram2Waveform(n_fft: int, n_shift: int, fs: Optional[int] = None, n_mels: Optional[int] = None, win_length: Optional[int] = None, window: Optional[str] = 'hann', fmin: Optional[int] = None, fmax: Optional[int] = None, griffin_lim_iters: Optional[int] = 8)[source]

Bases: object

Spectrogram to waveform conversion module.

Initialize module.

Parameters:
  • fs – Sampling frequency.

  • n_fft – The number of FFT points.

  • n_shift – Shift size in points.

  • n_mels – The number of mel basis.

  • win_length – Window length in points.

  • window – Window function type.

  • fmin – Minimum frequency to analyze.

  • fmax – Maximum frequency to analyze.

  • griffin_lim_iters – The number of iterations.

espnet2.utils.griffin_lim.griffin_lim(spc: numpy.ndarray, n_fft: int, n_shift: int, win_length: Optional[int] = None, window: Optional[str] = 'hann', n_iter: Optional[int] = 32) → numpy.ndarray[source]

Convert linear spectrogram into waveform using Griffin-Lim.

Parameters:
  • spc – Linear spectrogram (T, n_fft // 2 + 1).

  • n_fft – The number of FFT points.

  • n_shift – Shift size in points.

  • win_length – Window length in points.

  • window – Window function type.

  • n_iter – The number of iterations.

Returns:

Reconstructed waveform (N,).

espnet2.utils.griffin_lim.logmel2linear(lmspc: numpy.ndarray, fs: int, n_fft: int, n_mels: int, fmin: Optional[int] = None, fmax: Optional[int] = None) → numpy.ndarray[source]

Convert log Mel filterbank to linear spectrogram.

Parameters:
  • lmspc – Log Mel filterbank (T, n_mels).

  • fs – Sampling frequency.

  • n_fft – The number of FFT points.

  • n_mels – The number of mel basis.

  • fmin – Minimum frequency to analyze.

  • fmax – Maximum frequency to analyze.

Returns:

Linear spectrogram (T, n_fft // 2 + 1).

espnet2.utils.build_dataclass

espnet2.utils.build_dataclass.build_dataclass(dataclass, args: argparse.Namespace)[source]

Helper function to build dataclass from ‘args’.
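A helper of this kind can be sketched with the standard dataclasses module. The code below is an illustration only: build_dataclass_sketch and OptimConf are hypothetical names, and the real function may perform extra checks (such as type validation) that are omitted here.

```python
import argparse
import dataclasses


def build_dataclass_sketch(dataclass, args: argparse.Namespace):
    """Hypothetical stand-in: fill each dataclass field from the same-named attribute of args."""
    kwargs = {}
    for field in dataclasses.fields(dataclass):
        if not hasattr(args, field.name):
            raise ValueError(f"args does not have the attribute '{field.name}'")
        kwargs[field.name] = getattr(args, field.name)
    return dataclass(**kwargs)


@dataclasses.dataclass
class OptimConf:  # hypothetical example dataclass
    lr: float
    momentum: float


# Attributes that are not fields of the dataclass are ignored.
args = argparse.Namespace(lr=0.1, momentum=0.9, unused="x")
conf = build_dataclass_sketch(OptimConf, args)
print(conf)  # OptimConf(lr=0.1, momentum=0.9)
```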

espnet2.utils.sized_dict

class espnet2.utils.sized_dict.SizedDict(shared: bool = False, data: dict = None)[source]

Bases: collections.abc.MutableMapping

espnet2.utils.sized_dict.get_size(obj, seen=None)[source]

Recursively find the size of an object.

Taken from https://github.com/bosswissam/pysize
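The recursive idea can be sketched as below. This is a simplified illustration rather than the code from the linked repository: get_size_sketch is a hypothetical name, and it only descends into dicts and the built-in sequence and set types. The seen set prevents shared or self-referencing objects from being counted twice.

```python
import sys


def get_size_sketch(obj, seen=None):
    """Hypothetical stand-in: sum sys.getsizeof over an object and its contents."""
    if seen is None:
        seen = set()
    if id(obj) in seen:
        # Already counted (shared reference or cycle).
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(
            get_size_sketch(k, seen) + get_size_sketch(v, seen) for k, v in obj.items()
        )
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(get_size_sketch(item, seen) for item in obj)
    return size
```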

espnet2.utils.get_default_kwargs

class espnet2.utils.get_default_kwargs.Invalid[source]

Bases: object

Marker object for non-serializable objects.

espnet2.utils.get_default_kwargs.get_default_kwargs(func)[source]

Get the default values of the input function.

Examples

>>> def func(a, b=3):  pass
>>> get_default_kwargs(func)
{'b': 3}
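The example above can be reproduced with inspect.signature. This is a minimal sketch, with get_default_kwargs_sketch as a hypothetical name; it returns every default as-is and makes no attempt to filter out values that cannot be serialized, which the real function may do.

```python
import inspect


def get_default_kwargs_sketch(func):
    """Hypothetical stand-in: collect the parameters of func that have default values."""
    return {
        name: param.default
        for name, param in inspect.signature(func).parameters.items()
        if param.default is not inspect.Parameter.empty
    }


def func(a, b=3):
    pass


print(get_default_kwargs_sketch(func))  # {'b': 3}
```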