espnet2.ssl.loss.hubert.HuBERTDecoder
class espnet2.ssl.loss.hubert.HuBERTDecoder(encoder_embed_dim: int, num_classes: int, final_dim: int)
Bases: Module
Generate the logits of masked and unmasked inputs.
- Parameters:
- encoder_embed_dim (int) – The dimension of the transformer embedding output.
- num_classes (int) – The number of classes in the labels.
- final_dim (int) – Project final representations and targets to final_dim.
forward(x: Tensor, mask_m: Tensor, mask_u: Tensor) → Tuple[Tensor, Tensor]
- Parameters:
- x (Tensor) – The feature representation of the last transformer layer.
- mask_m (Tensor) – The masked indices of dimension [batch, frame].
- mask_u (Tensor) – The unmasked indices of dimension [batch, frame].
- Returns: A pair of tensors: the logits of masked frames, of dimension [masked_frame, final_dim], and the logits of unmasked frames, of dimension [unmasked_frame, final_dim].
- Return type: Tuple[Tensor, Tensor]
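The shapes documented for forward can be illustrated with a minimal PyTorch sketch. It is not the ESPnet implementation: it assumes that x has shape [batch, frame, encoder_embed_dim], that mask_m and mask_u are boolean tensors, and that a single linear layer (the hypothetical proj below) maps selected frames to final_dim. The sizes are arbitrary example values.

```python
import torch
from torch import nn

batch, frame = 2, 5
encoder_embed_dim, final_dim = 8, 4  # arbitrary example sizes

# Hypothetical stand-in for the output of the last transformer layer
# (assumed shape: [batch, frame, encoder_embed_dim]).
x = torch.randn(batch, frame, encoder_embed_dim)

# Assumed boolean masks of shape [batch, frame]. In this toy setup every
# frame is either masked or unmasked, so mask_u is the complement of mask_m.
mask_m = torch.tensor(
    [[True, False, True, False, False],
     [False, True, True, False, True]]
)
mask_u = ~mask_m

# Hypothetical projection to final_dim (illustration only).
proj = nn.Linear(encoder_embed_dim, final_dim)

# Boolean indexing flattens the selected frames across the batch:
# x[mask_m] has shape [masked_frame, encoder_embed_dim].
out_m = proj(x[mask_m])  # [masked_frame, final_dim]
out_u = proj(x[mask_u])  # [unmasked_frame, final_dim]

print(tuple(out_m.shape), tuple(out_u.shape))
```

Boolean indexing merges the batch and frame axes, so the first dimension of each output is the total number of selected frames across the whole batch rather than a per-utterance count.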