espnet2.ssl.loss.hubert.HuBERTDecoder
class espnet2.ssl.loss.hubert.HuBERTDecoder(encoder_embed_dim: int, num_classes: int, final_dim: int)
Bases: Module
Generate the logits of masked and unmasked inputs.
- Parameters:
- encoder_embed_dim (int) – The dimension of the transformer embedding output.
- num_classes (int) – The number of classes in the labels.
- final_dim (int) – The dimension to which the final representations and targets are projected.
forward(x: Tensor, mask_m: Tensor, mask_u: Tensor) → Tuple[Tensor, Tensor]
Forward pass of HuBERTDecoder: compute the logits of masked and unmasked frames.
- Parameters:
- x (Tensor) – The feature representation of the last transformer layer.
- mask_m (Tensor) – The masked indices of dimension [batch, frame].
- mask_u (Tensor) – The unmasked indices of dimension [batch, frame].
- Returns: A tuple of two tensors: the logits of masked frames, of dimension [masked_frame, final_dim], and the logits of unmasked frames, of dimension [unmasked_frame, final_dim].
- Return type: Tuple[Tensor, Tensor]
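The shapes above imply that frames from the whole batch are gathered into one flat leading axis, one row per selected frame. The following is a minimal pure-Python sketch of that selection step only, not the ESPnet implementation. It assumes that mask_m and mask_u are boolean masks of shape [batch, frame]; the helper name select_frames and the toy values are hypothetical, and the projection to final_dim and the logit computation are omitted.

```python
from typing import List

Frame = List[float]


def select_frames(x: List[List[Frame]], mask: List[List[bool]]) -> List[Frame]:
    """Flatten [batch, frame, dim] into [selected_frame, dim] using a [batch, frame] mask."""
    return [
        frame
        for utterance, utterance_mask in zip(x, mask)
        for frame, keep in zip(utterance, utterance_mask)
        if keep
    ]


# Toy input with batch=2, frame=3, dim=2 (hypothetical values).
x = [
    [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1]],
    [[3.0, 3.1], [4.0, 4.1], [5.0, 5.1]],
]
mask_m = [[True, False, True], [False, True, False]]
# Assumption: mask_u is the complement of mask_m in this toy case.
# In practice, padded frames could be excluded from both masks.
mask_u = [[not m for m in row] for row in mask_m]

masked = select_frames(x, mask_m)      # 3 frames, each of dim 2
unmasked = select_frames(x, mask_u)    # 3 frames, each of dim 2
print(len(masked), len(unmasked))
```

With PyTorch tensors, the same selection is what boolean indexing such as x[mask_m] produces. Whether HuBERTDecoder uses exactly that form internally is not stated on this page.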