espnet2.asr.decoder.mlm_decoder.MLMDecoder
class espnet2.asr.decoder.mlm_decoder.MLMDecoder(vocab_size: int, encoder_output_size: int, attention_heads: int = 4, linear_units: int = 2048, num_blocks: int = 6, dropout_rate: float = 0.1, positional_dropout_rate: float = 0.1, self_attention_dropout_rate: float = 0.0, src_attention_dropout_rate: float = 0.0, input_layer: str = 'embed', use_output_layer: bool = True, pos_enc_class=<class 'espnet.nets.pytorch_backend.transformer.embedding.PositionalEncoding'>, normalize_before: bool = True, concat_after: bool = False)
Bases: AbsDecoder
Initializes internal Module state, shared by both nn.Module and ScriptModule.
forward(hs_pad: Tensor, hlens: Tensor, ys_in_pad: Tensor, ys_in_lens: Tensor) → Tuple[Tensor, Tensor]
Forward decoder.
Parameters:
- hs_pad – encoded memory, float32 (batch, maxlen_in, feat)
- hlens – lengths of the encoded sequences in hs_pad, (batch)
- ys_in_pad – input token ids, int64 (batch, maxlen_out), if input_layer == "embed"; otherwise an input tensor (batch, maxlen_out, #mels)
- ys_in_lens – lengths of the sequences in ys_in_pad, (batch)
Returns: a tuple (x, olens):
- x – decoded token scores before softmax, (batch, maxlen_out, token), if use_output_layer is True
- olens – (batch,)
Return type: tuple
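The padded inputs and their length vectors (hs_pad with hlens, ys_in_pad with ys_in_lens) follow a common batching convention. The following is a minimal pure-Python sketch of that convention, not the ESPnet implementation. It assumes that the length vectors hold the unpadded length of each sequence and that padding is appended on the right. The helper names pad_batch and valid_mask and the pad_id value of 0 are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def pad_batch(seqs: List[List[int]], pad_id: int = 0) -> Tuple[List[List[int]], List[int]]:
    """Right-pad variable-length token sequences to a common length.

    Hypothetical helper: returns the padded batch (batch, maxlen_out)
    and the original lengths (batch), mirroring ys_in_pad / ys_in_lens.
    """
    lens = [len(s) for s in seqs]
    maxlen = max(lens)
    padded = [s + [pad_id] * (maxlen - len(s)) for s in seqs]
    return padded, lens


def valid_mask(lens: List[int], maxlen: int) -> List[List[bool]]:
    """True where a position holds real data, False where it is padding."""
    return [[t < n for t in range(maxlen)] for n in lens]


# Two token sequences of different lengths (token ids are made up).
ys_in_pad, ys_in_lens = pad_batch([[5, 8, 2], [7, 3]])

# Assumption: a decoder can rebuild a mask like this from the lengths alone,
# so that padded positions are ignored.
mask = valid_mask(ys_in_lens, maxlen=len(ys_in_pad[0]))
```

In a real call the same information would be passed as tensors: ys_in_pad as an int64 tensor of shape (batch, maxlen_out) and ys_in_lens as a tensor of shape (batch).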