espnet.nets.pytorch_backend.transformer.embedding.ConvolutionalPositionalEmbedding
class espnet.nets.pytorch_backend.transformer.embedding.ConvolutionalPositionalEmbedding(embed_dim: int, dropout: float, max_len: int = 5000, num_layers: int = 1, kernel_size: int = 128, groups: int = 16, weight_norm: str = 'new', use_residual: bool = False)
Bases: Module
Convolutional positional embedding.
Used in wav2vec2/HuBERT SSL models. https://arxiv.org/abs/1904.11660
- Parameters:
- embed_dim (int) – Feature dimension of the input Tensor.
- dropout (float) – Unused.
- max_len (int) – Unused.
- num_layers (int) – Number of conv layers.
- kernel_size (int) – The number of frames to be used as the convolution kernel size.
- groups (int) – The number of groups along the feature dimension.
- weight_norm (str) – One of [new, legacy, none]; how to initialize the conv weights. The recommended setting is none when num_layers > 1.
Initialize Convolutional Positional Embedding.
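A minimal construction sketch, assuming the class is importable from the module path above; the embed_dim value (768) and the other hyperparameter choices are illustrative, not values taken from any particular recipe.

```python
import torch

from espnet.nets.pytorch_backend.transformer.embedding import (
    ConvolutionalPositionalEmbedding,
)

# Single-layer convolutional positional embedding in the wav2vec2/HuBERT style.
# dropout and max_len are documented as unused by this class.
pos_conv = ConvolutionalPositionalEmbedding(
    embed_dim=768,      # feature dimension of the input tensor (illustrative)
    dropout=0.0,        # unused
    kernel_size=128,    # number of frames covered by the conv kernel
    groups=16,          # groups along the feature dimension
    weight_norm="new",
)
```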
forward(x)
Forward Method.
- Parameters: x (Tensor) – Input of shape [batch, frame, feature].
- Returns: The resulting feature, with shape [batch, frame, feature].
- Return type: Tensor
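A short usage sketch for forward, continuing from the instance constructed above; the batch and frame sizes are arbitrary.

```python
# Per the docstring, input and output both have shape [batch, frame, feature].
x = torch.randn(4, 100, 768)  # batch=4, 100 frames, embed_dim=768
out = pos_conv(x)             # invokes forward(x) via nn.Module.__call__
print(out.shape)              # expected: torch.Size([4, 100, 768])
```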