espnet2.spk.pooling.mean_pooling.MeanPooling
class espnet2.spk.pooling.mean_pooling.MeanPooling(input_size: int = 1536)
Bases: AbsPooling
Average frame-level features to a single utterance-level feature.
- Parameters: input_size – Dimension of the input frame-level embeddings.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
forward(x: Tensor, feat_lengths: Tensor | None = None) → Tensor
Forward pass of mean pooling.
- Parameters:
- x – Input feature tensor of shape (batch_size, feature_dim, seq_len)
- feat_lengths – Optional tensor of shape (batch_size,) containing the valid length of each sequence before padding
- Returns: Utterance-level embeddings of shape (batch_size, feature_dim)
- Return type: Tensor
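
The pooling described above can be sketched in plain PyTorch. This is a minimal illustration, not ESPnet's actual implementation: the function name `masked_mean_pool` is hypothetical, and the way `feat_lengths` is applied here (zeroing padded frames, then dividing by the valid length) is an assumption based on the parameter description.

```python
from typing import Optional

import torch


def masked_mean_pool(
    x: torch.Tensor, feat_lengths: Optional[torch.Tensor] = None
) -> torch.Tensor:
    """Hypothetical helper: average (batch, feature_dim, seq_len) over time."""
    if feat_lengths is None:
        # No length information: average over every frame, padding included.
        return x.mean(dim=-1)

    seq_len = x.size(-1)
    lengths = feat_lengths.to(device=x.device)

    # mask[b, t] is 1 for valid frames (t < length of sequence b), else 0.
    positions = torch.arange(seq_len, device=x.device).unsqueeze(0)  # (1, seq_len)
    mask = (positions < lengths.unsqueeze(1)).unsqueeze(1).to(x.dtype)  # (batch, 1, seq_len)

    # Sum only the valid frames, then divide by the number of valid frames.
    summed = (x * mask).sum(dim=-1)  # (batch, feature_dim)
    denom = lengths.to(dtype=x.dtype).clamp(min=1).unsqueeze(1)  # (batch, 1)
    return summed / denom


# Two utterances, 4 feature dimensions, padded to 5 frames.
x = torch.ones(2, 4, 5)
x[1, :, 3:] = 100.0  # frames 3 and 4 of the second utterance are padding
pooled = masked_mean_pool(x, torch.tensor([5, 3]))
print(pooled.shape)
```

With the lengths supplied, the padded frames of the second utterance do not affect its average; without them, the padding values would be averaged in along with the valid frames.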
output_size()