espnet2.mt.frontend.embedding.PatchEmbedding
class espnet2.mt.frontend.embedding.PatchEmbedding(input_size: int = 400, embed_dim: int = 400, token_per_frame: int = 1, pos_enc_class=<class 'espnet.nets.pytorch_backend.transformer.embedding.PositionalEncoding'>, positional_dropout_rate: float = 0.1)
Bases: AbsFrontend
Embedding Frontend for text based inputs.
Initialize.
- Parameters:
- input_size – Number of input tokens (i.e., the vocabulary size).
- embed_dim – Embedding size.
- token_per_frame – Number of tokens per frame in the input.
- pos_enc_class – PositionalEncoding or ScaledPositionalEncoding.
- positional_dropout_rate – Dropout rate after adding positional encoding.
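A minimal construction sketch (the sizes are illustrative, not recommendations; the alternative positional encoding class is the one named in the parameter description above):

```python
from espnet2.mt.frontend.embedding import PatchEmbedding
from espnet.nets.pytorch_backend.transformer.embedding import ScaledPositionalEncoding

# Illustrative sizes: a 400-token vocabulary, 400-dim embeddings,
# and 4 consecutive input tokens merged into each output frame.
frontend = PatchEmbedding(
    input_size=400,
    embed_dim=400,
    token_per_frame=4,
    pos_enc_class=ScaledPositionalEncoding,  # default is PositionalEncoding
    positional_dropout_rate=0.1,
)
```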
forward(input: Tensor, input_lengths: Tensor) → Tuple[Tensor, Tensor]
Apply a sliding window on the input.
- Parameters:
- input – Input token sequence (B, T).
- input_lengths – Input lengths within batch.
- Returns: Output with shape (B, T // token_per_frame, D), and output lengths within batch, divided by token_per_frame.
- Return type: Tuple[Tensor, Tensor]
output_size() → int
Return the size of the output feature dimension D, i.e., the embedding dimension.
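A hedged usage sketch under the documented shape contract, exercising both forward() and output_size(); the sequence lengths here are chosen as multiples of token_per_frame, since the integer division in the return description suggests this is expected:

```python
import torch
from espnet2.mt.frontend.embedding import PatchEmbedding

frontend = PatchEmbedding(input_size=400, embed_dim=400, token_per_frame=4)

# Two padded sequences of token ids, T = 16; both true lengths are
# multiples of token_per_frame so the length division is exact.
tokens = torch.randint(0, 400, (2, 16))   # (B, T)
lengths = torch.tensor([16, 12])

out, out_lengths = frontend(tokens, lengths)
print(out.shape)               # torch.Size([2, 4, 400]) -> (B, T // token_per_frame, D)
print(out_lengths)             # tensor([4, 3]) -> lengths // token_per_frame
print(frontend.output_size())  # 400, the embedding dimension D
```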