espnet.nets.pytorch_backend.e2e_asr_maskctc.E2E
class espnet.nets.pytorch_backend.e2e_asr_maskctc.E2E(idim, odim, args, ignore_id=-1)
Bases: espnet.nets.pytorch_backend.e2e_asr_transformer.E2E
E2E module.
- Parameters:
- idim (int) – dimension of inputs
- odim (int) – dimension of outputs
- args (Namespace) – argument Namespace containing options
Construct an E2E object.
- Parameters:
- idim (int) – dimension of inputs
- odim (int) – dimension of outputs
- args (Namespace) – argument Namespace containing options
static add_arguments(parser)
Add arguments.
static add_maskctc_arguments(parser)
Add arguments for the Mask CTC model.
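A minimal construction sketch (not from the ESPnet documentation): `add_arguments` populates an `argparse` parser with the model options (including the Mask CTC-specific ones via `add_maskctc_arguments`), and the resulting namespace is passed to the constructor. The character list, feature dimension, and the extra fields normally added by `espnet/bin/asr_train.py` are illustrative placeholders and may vary across ESPnet versions.

```python
import argparse

from espnet.nets.pytorch_backend.e2e_asr_maskctc import E2E

parser = argparse.ArgumentParser()
E2E.add_arguments(parser)     # registers transformer + Mask CTC options with defaults
args = parser.parse_args([])  # take the registered defaults

# Options that usually come from the training script, not from add_arguments.
# Values below are placeholders for illustration only.
args.char_list = ["<blank>", "a", "b", "c", "<eos>"]
args.sym_space = "<space>"
args.sym_blank = "<blank>"
args.report_cer = False
args.report_wer = False
args.mtlalpha = 0.3           # Mask CTC requires 0.0 <= mtlalpha < 1.0
args.lsm_weight = 0.1
args.ctc_type = "builtin"

idim = 83                     # e.g. 83-dim fbank+pitch features (assumption)
odim = len(args.char_list)    # output vocabulary; a <mask> token is added internally
model = E2E(idim, odim, args)
```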
forward(xs_pad, ilens, ys_pad)
E2E forward.
- Parameters:
- xs_pad (torch.Tensor) – batch of padded source sequences (B, Tmax, idim)
- ilens (torch.Tensor) – batch of lengths of source sequences (B)
- ys_pad (torch.Tensor) – batch of padded target sequences (B, Lmax)
- Returns:
  - ctc loss value (torch.Tensor)
  - attention loss value (torch.Tensor)
  - accuracy in attention decoder (float)
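A forward-pass sketch with dummy tensors, continuing from the construction sketch above. Shapes follow the parameter descriptions, (B, Tmax, idim), (B,), and (B, Lmax), and -1 marks padded target positions (the default ignore_id). In recent ESPnet1 versions the call returns the combined training loss tensor, while the CTC/attention components and accuracy listed above are reported through the model's internal reporter.

```python
import torch

# Dummy batch: 2 utterances, up to 100 frames, up to 12 target tokens.
B, Tmax, Lmax = 2, 100, 12
xs_pad = torch.randn(B, Tmax, idim)             # padded acoustic features (B, Tmax, idim)
ilens = torch.tensor([100, 80])                 # true source lengths (B,)
ys_pad = torch.randint(1, odim - 1, (B, Lmax))  # padded target token ids (B, Lmax)
ys_pad[1, 8:] = -1                              # pad the shorter target with ignore_id

model.train()
loss = model(xs_pad, ilens, ys_pad)  # combined training loss (scalar tensor)
loss.backward()
```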
recognize(x, recog_args, char_list=None, rnnlm=None)
Recognize input speech.
- Parameters:
- x (ndarray) – input acoustic feature (B, T, D) or (T, D)
- recog_args (Namespace) – argument Namespace containing options
- char_list (list) – list of characters
- rnnlm (torch.nn.Module) – language model module
- Returns: decoding result
- Return type: list
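A decoding sketch, again continuing from the sketches above. Only the Mask CTC-related options (`maskctc_n_iterations`, `maskctc_probability_threshold`) are set on `recog_args`; in practice the namespace comes from `espnet/bin/asr_recog.py`, and trained weights would be loaded first (e.g. with `espnet.asr.asr_utils.torch_load`). Appending `<mask>` to the character list and reading the result as `hyps[0]["yseq"]` are assumptions based on common ESPnet n-best conventions.

```python
import argparse

import numpy as np
import torch

model.eval()
x = np.random.randn(200, idim).astype(np.float32)  # one utterance (T, D)

# Mask CTC decoding options (illustrative values; normally set via asr_recog flags
# --maskctc-n-iterations and --maskctc-probability-threshold).
recog_args = argparse.Namespace(
    maskctc_n_iterations=10,
    maskctc_probability_threshold=0.999,
)

# "<mask>" is appended so masked positions can be displayed during decoding
# (assumption: the training char_list does not already contain it).
with torch.no_grad():
    hyps = model.recognize(x, recog_args, char_list=args.char_list + ["<mask>"])

print(hyps[0]["yseq"])  # token id sequence of the best hypothesis (assumed n-best format)
```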