espnet.nets.pytorch_backend.transformer.embedding.ConvolutionalPositionalEmbedding
class espnet.nets.pytorch_backend.transformer.embedding.ConvolutionalPositionalEmbedding(embed_dim: int, dropout: float, max_len: int = 5000, num_layers: int = 1, kernel_size: int = 128, groups: int = 16, weight_norm: str = 'new', use_residual: bool = False)
Bases: Module
Convolutional positional embedding.
Used in wav2vec2/HuBERT SSL models. https://arxiv.org/abs/1904.11660
- Parameters:
- embed_dim (int) – Feature dimension of the input Tensor.
- dropout (float) – Unused.
- max_len (int) – Unused.
- num_layers (int) – Number of conv layers.
- kernel_size (int) – The number of frames to be used as the convolution kernel size.
- groups (int) – The number of groups along the feature dimension.
- weight_norm (str) – One of [new, legacy, none]; how to initialize the conv weights. The recommended setting is none when num_layers > 1.
Initialize Convolutional Positional Embedding.
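A minimal construction sketch, assuming the class is importable from the module path above; the embed_dim value (768) and the other hyperparameter choices are illustrative, not values taken from any particular recipe.

```python
import torch

from espnet.nets.pytorch_backend.transformer.embedding import (
    ConvolutionalPositionalEmbedding,
)

# Single-layer convolutional positional embedding in the wav2vec2/HuBERT style.
# dropout and max_len are documented as unused by this class.
pos_conv = ConvolutionalPositionalEmbedding(
    embed_dim=768,      # feature dimension of the input tensor (illustrative)
    dropout=0.0,        # unused
    kernel_size=128,    # number of frames covered by the conv kernel
    groups=16,          # groups along the feature dimension
    weight_norm="new",
)
```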
forward(x)
Forward Method.
- Parameters: x (Tensor) – Input of shape [batch, frame, feature].
- Returns: The resulting feature, with shape [batch, frame, feature].
- Return type: Tensor
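A short usage sketch for forward, continuing from the instance constructed above; the batch and frame sizes are arbitrary.

```python
# Per the docstring, input and output both have shape [batch, frame, feature].
x = torch.randn(4, 100, 768)  # batch=4, 100 frames, embed_dim=768
out = pos_conv(x)             # invokes forward(x) via nn.Module.__call__
print(out.shape)              # expected: torch.Size([4, 100, 768])
```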