espnet.nets.pytorch_backend.transformer.embedding.ConvolutionalPositionalEmbedding
class espnet.nets.pytorch_backend.transformer.embedding.ConvolutionalPositionalEmbedding(embed_dim: int, dropout: float, max_len: int = 5000, num_layers: int = 1, kernel_size: int = 128, groups: int = 16, weight_norm: str = 'new')
Bases: Module
Convolutional positional embedding. Used in wav2vec2/HuBERT SSL models. https://arxiv.org/abs/1904.11660
- Parameters:
- embed_dim (int) – Feature dimension of the input Tensor.
- dropout (float) – unused
- max_len (int) – unused
- kernel_size (int) – The number of frames spanned by the convolution kernel.
- groups (int) – The number of groups the feature dimension is split into for the convolution.
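To make the effect of `kernel_size` and `groups` concrete, the following sketch computes the weight count and the raw output length of a single grouped 1-D convolution. It is only an illustration under stated assumptions: the helper name `conv_pos_embed_stats` is hypothetical, `embed_dim=768` and `frames=100` are example values not taken from this page, and the padding of `kernel_size // 2` with stride 1 follows the common wav2vec2 convention rather than anything confirmed here.

```python
def conv_pos_embed_stats(embed_dim=768, kernel_size=128, groups=16, frames=100):
    """Return (weight_count, raw_output_frames) for one grouped 1-D convolution.

    Hypothetical helper for illustration; not part of ESPnet.
    """
    if embed_dim % groups != 0:
        raise ValueError("embed_dim must be divisible by groups")
    # Each output channel only sees the input channels of its own group.
    channels_per_group = embed_dim // groups
    weight_count = embed_dim * channels_per_group * kernel_size
    # Assumption: padding of kernel_size // 2 on both sides, stride 1.
    padding = kernel_size // 2
    raw_output_frames = frames + 2 * padding - kernel_size + 1
    return weight_count, raw_output_frames


print(conv_pos_embed_stats())                 # (4718592, 101)
print(conv_pos_embed_stats(kernel_size=127))  # (4681728, 100)
```

Under these assumptions an even kernel produces one frame more than the input, so an implementation that promises matching input and output shapes has to drop that extra frame. A larger `groups` value reduces the weight count proportionally.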
Initializes internal Module state, shared by both nn.Module and ScriptModule.
forward(x)
- Parameters: x (Tensor) – Input of shape [batch, frame, feature].
- Returns: The resulting feature, with shape [batch, frame, feature].
- Return type: Tensor
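The shape contract of `forward` can be illustrated with a minimal PyTorch sketch of a convolutional positional embedding. This is a simplified stand-in written under assumptions, not the ESPnet implementation: the class name `ConvPosEmbedSketch` is hypothetical, it uses a single layer, it omits weight normalization, and the padding, frame trimming, and GELU activation follow the usual wav2vec2-style design rather than anything stated on this page.

```python
import torch
import torch.nn.functional as F
from torch import nn


class ConvPosEmbedSketch(nn.Module):
    """Illustrative stand-in for a convolutional positional embedding."""

    def __init__(self, embed_dim: int, kernel_size: int = 128, groups: int = 16):
        super().__init__()
        # Grouped convolution over the frame axis; embed_dim must be divisible by groups.
        self.conv = nn.Conv1d(
            embed_dim,
            embed_dim,
            kernel_size,
            padding=kernel_size // 2,
            groups=groups,
        )
        # Assumption: an even kernel yields one extra frame, which is trimmed.
        self.num_remove = 1 if kernel_size % 2 == 0 else 0

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x.transpose(1, 2)  # [batch, feature, frame], as Conv1d expects
        x = self.conv(x)
        if self.num_remove > 0:
            x = x[..., : -self.num_remove]
        x = F.gelu(x)
        return x.transpose(1, 2)  # back to [batch, frame, feature]


x = torch.randn(2, 50, 64)  # [batch, frame, feature]
out = ConvPosEmbedSketch(embed_dim=64)(x)
print(out.shape)  # torch.Size([2, 50, 64])
```

The transposes are needed because `nn.Conv1d` convolves over the last axis, while the input here keeps features last. Whether the caller adds this output to the original features is not specified on this page, so the sketch returns the convolved features only.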