espnet2.mt package

espnet2.mt.espnet_model

class espnet2.mt.espnet_model.ESPnetMTModel(vocab_size: int, token_list: Union[Tuple[str, ...], List[str]], frontend: Optional[espnet2.asr.frontend.abs_frontend.AbsFrontend], preencoder: Optional[espnet2.asr.preencoder.abs_preencoder.AbsPreEncoder], encoder: espnet2.asr.encoder.abs_encoder.AbsEncoder, postencoder: Optional[espnet2.asr.postencoder.abs_postencoder.AbsPostEncoder], decoder: espnet2.asr.decoder.abs_decoder.AbsDecoder, src_vocab_size: int = 0, src_token_list: Union[Tuple[str, ...], List[str]] = [], ignore_id: int = -1, lsm_weight: float = 0.0, length_normalized_loss: bool = False, report_bleu: bool = True, sym_space: str = '<space>', sym_blank: str = '<blank>', extract_feats_in_collect_stats: bool = True, share_decoder_input_output_embed: bool = False, share_encoder_decoder_input_embed: bool = False)[source]

Bases: espnet2.train.abs_espnet_model.AbsESPnetModel

Encoder-decoder model for machine translation.

collect_feats(text: torch.Tensor, text_lengths: torch.Tensor, src_text: torch.Tensor, src_text_lengths: torch.Tensor, **kwargs) → Dict[str, torch.Tensor][source]
encode(src_text: torch.Tensor, src_text_lengths: torch.Tensor) → Tuple[torch.Tensor, torch.Tensor][source]

Frontend + Encoder. Note that this method is used by mt_inference.py

Parameters:
  • src_text – (Batch, Length, …)

  • src_text_lengths – (Batch, )

forward(text: torch.Tensor, text_lengths: torch.Tensor, src_text: torch.Tensor, src_text_lengths: torch.Tensor, **kwargs) → Tuple[torch.Tensor, Dict[str, torch.Tensor], torch.Tensor][source]

Frontend + Encoder + Decoder + Calc loss

Parameters:
  • text – (Batch, Length)

  • text_lengths – (Batch,)

  • src_text – (Batch, Length)

  • src_text_lengths – (Batch,)

  • kwargs – “utt_id” is among the inputs.
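The lsm_weight and ignore_id constructor arguments above control the label-smoothed cross-entropy computed in forward. A minimal pure-Python sketch of label smoothing for a single token (not ESPnet's PyTorch LabelSmoothingLoss, which operates on whole batches) illustrates the idea: the gold class receives probability 1 - lsm_weight, the remaining mass is spread uniformly, and padded positions marked with ignore_id contribute no loss:

```python
import math

def label_smoothed_nll(log_probs, target, smoothing=0.1, ignore_id=-1):
    """Cross-entropy of one token against a smoothed target distribution.

    log_probs: per-class log-probabilities predicted for this token.
    target: gold class index, or ignore_id for a padded position.
    """
    if target == ignore_id:
        return 0.0  # padding contributes no loss
    vocab = len(log_probs)
    confidence = 1.0 - smoothing          # mass on the gold class
    off_value = smoothing / (vocab - 1)   # mass spread over the rest
    loss = 0.0
    for v, lp in enumerate(log_probs):
        true_p = confidence if v == target else off_value
        loss += -true_p * lp
    return loss

# With a uniform prediction over 4 classes the smoothed distribution still
# sums to 1, so the loss reduces to -log(1/4) = log 4.
uniform = [math.log(0.25)] * 4
print(label_smoothed_nll(uniform, target=2))
```

Setting smoothing=0.0 recovers plain negative log-likelihood of the gold class.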

espnet2.mt.__init__

espnet2.mt.frontend.__init__

espnet2.mt.frontend.embedding

Embedding Frontend for text-based inputs.

class espnet2.mt.frontend.embedding.Embedding(input_size: int = 400, embed_dim: int = 400, pos_enc_class=<class 'espnet.nets.pytorch_backend.transformer.embedding.PositionalEncoding'>, positional_dropout_rate: float = 0.1)[source]

Bases: espnet2.asr.frontend.abs_frontend.AbsFrontend

Embedding Frontend for text-based inputs.

Initialize.

Parameters:
  • input_size – Number of input tokens.

  • embed_dim – Embedding Size.

  • pos_enc_class – PositionalEncoding or ScaledPositionalEncoding

  • positional_dropout_rate – dropout rate after adding positional encoding
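The default pos_enc_class is the sinusoidal PositionalEncoding from the Transformer. A pure-Python sketch of that scheme (not ESPnet's PyTorch implementation, which also applies scaling and dropout) shows how each position maps to an interleaved sine/cosine vector of the embedding dimension:

```python
import math

def sinusoidal_positions(length, dim):
    """Build a (length, dim) table of sinusoidal positional encodings.

    Even feature indices use sin, odd indices use cos, with wavelengths
    increasing geometrically from 2*pi up to 10000 * 2*pi.
    """
    pe = []
    for pos in range(length):
        row = []
        for i in range(0, dim, 2):
            angle = pos / (10000 ** (i / dim))
            row.append(math.sin(angle))   # even index
            if i + 1 < dim:
                row.append(math.cos(angle))  # odd index
        pe.append(row)
    return pe

pe = sinusoidal_positions(5, 4)
# position 0 is always [sin 0, cos 0, ...] = [0, 1, 0, 1]
print(pe[0])
```

Because the table depends only on position, the same encodings can be precomputed once and added to any batch of embeddings.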

forward(input: torch.Tensor, input_lengths: torch.Tensor) → Tuple[torch.Tensor, torch.Tensor][source]

Embed the input tokens and add positional encoding.

Parameters:
  • input – Input token ids with shape (B, T).

  • input_lengths – Input lengths within batch.

Returns:

Output with dimensions (B, T, D), and output lengths within batch.

Return type:

Tuple[Tensor, Tensor]

output_size() → int[source]

Return the output feature dimension D, i.e. the embedding dim.
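The frontend's core shape contract is that forward maps (B, T) token ids to (B, T, D) vectors, where D is the value reported by output_size(). A tiny pure-Python lookup-table sketch (the table and helper below are hypothetical, not ESPnet API) makes that contract concrete:

```python
# Hypothetical embedding table: vocabulary of 3 tokens, embed_dim D = 2.
TABLE = {0: [0.1, 0.2], 1: [0.3, 0.4], 2: [0.5, 0.6]}

def embed_batch(token_ids, table):
    """Map a (B, T) batch of token ids to (B, T, D) vectors by table lookup."""
    return [[table[t] for t in seq] for seq in token_ids]

out = embed_batch([[0, 2, 1]], TABLE)
# B=1 sequence, T=3 tokens, each now a D=2 vector
print(out)
```

In the real class, the table lookup is torch.nn.Embedding and output_size() simply returns the embed_dim passed to the constructor.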