espnet2.beats.tokenizer.BeatsTokenizer
class espnet2.beats.tokenizer.BeatsTokenizer(beats_tokenizer_ckpt_path: str | None = None, tokenizer_config: Dict | None = None, max_layer: int | None = None, use_weighted_representation: bool = False, fbank_mean: float = 15.2913, fbank_std: float = 5.90532)
Bases: BeatsEncoder
Initialize internal Module state, shared by both nn.Module and ScriptModule.
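The `fbank_mean` and `fbank_std` defaults in the signature are dataset statistics for filterbank features. As a rough illustration only, the sketch below assumes they are applied as `(x - mean) / (2 * std)`, the convention used in the reference BEATs code; the helper name `normalize_fbank` is hypothetical and the ESPnet class may apply the statistics differently.

```python
FBANK_MEAN = 15.2913  # default fbank_mean from the class signature
FBANK_STD = 5.90532   # default fbank_std from the class signature


def normalize_fbank(values, mean=FBANK_MEAN, std=FBANK_STD):
    """Scale filterbank values as (x - mean) / (2 * std).

    Assumption: this mirrors the convention of the reference BEATs code;
    it is not taken from the ESPnet implementation.
    """
    return [(v - mean) / (2.0 * std) for v in values]


# A value equal to the mean maps to 0; a value two standard deviations
# above the mean maps to (approximately) 1.
frame = [FBANK_MEAN, FBANK_MEAN + 2.0 * FBANK_STD]
normalized = normalize_fbank(frame)
```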
encode(xs_pad: Tensor, ilens: Tensor | None = None, waveform_input: bool = True)
Encode input audio xs_pad to quantized features.
- Parameters:
- xs_pad (torch.Tensor) – Input tensor (B, T, D) or (B, T, 1).
- ilens (torch.Tensor) – Input length tensor (B,).
- waveform_input (bool) – If True, input is raw waveform.
- Returns:
- embed_ind (torch.Tensor) – Embedding indices (B, T).
- embed_loss (torch.Tensor) – Embedding loss.
- quantize_feature (torch.Tensor) – Quantized features.
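A minimal sketch of calling `encode` on raw waveforms and unpacking its three documented outputs. It assumes the outputs come back as a 3-tuple in the documented order (indices, loss, quantized features). The wrapper `tokenize_waveforms` and the `_StubTokenizer` class are hypothetical; the stub only stands in for a real `BeatsTokenizer` so the example runs without ESPnet or a checkpoint.

```python
from typing import Any, Tuple


def tokenize_waveforms(tokenizer: Any, xs_pad: Any, ilens: Any = None) -> Tuple[Any, Any, Any]:
    """Call `encode` on raw waveforms and unpack the three outputs."""
    # Assumption: encode returns (embed_ind, embed_loss, quantize_feature).
    embed_ind, embed_loss, quantize_feature = tokenizer.encode(
        xs_pad, ilens, waveform_input=True
    )
    return embed_ind, embed_loss, quantize_feature


class _StubTokenizer:
    """Hypothetical stand-in for BeatsTokenizer; returns placeholder values."""

    def encode(self, xs_pad, ilens=None, waveform_input=True):
        # Four fake token ids per utterance; real token counts depend on the model.
        embed_ind = [[0] * 4 for _ in xs_pad]
        quantize_feature = [[[0.0] for _ in row] for row in embed_ind]
        return embed_ind, 0.0, quantize_feature


# Two dummy clips of 16000 samples each (plain lists stand in for tensors).
batch = [[0.0] * 16000, [0.0] * 16000]
ids, loss, feats = tokenize_waveforms(_StubTokenizer(), batch, [16000, 16000])
```

With a real tokenizer, `xs_pad` and `ilens` would be `torch.Tensor` objects with the shapes listed in the parameters above.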
initialize_tokenizer_params()
reload_pretrained_parameters()
Initialization function for Beats.
This must be called last in the initialization procedure. Initialization occurs in three steps:
1. ESPnet initializes all modules.
2. This function initializes the Beats encoder, overriding step 1.
3. Optionally, if a pretrained checkpoint is available, the weights are loaded from it, overriding steps 1 and 2.
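The override order described above can be illustrated with a toy model in which each parameter is just a label recording which step last wrote it. This is a sketch of the ordering only, not the ESPnet code; the function `initialize`, the parameter names, and the `encoder.` prefix are all hypothetical.

```python
def initialize(param_names, checkpoint=None):
    """Apply the three initialization steps in order; later steps win."""
    # Step 1: framework-wide default initialization for every module.
    params = {name: "espnet_default" for name in param_names}

    # Step 2: encoder-specific initialization overrides step 1
    # (only for parameters that belong to the encoder in this toy model).
    for name in params:
        if name.startswith("encoder."):
            params[name] = "beats_init"

    # Step 3 (optional): checkpoint weights override steps 1 and 2
    # for every parameter the checkpoint actually contains.
    if checkpoint is not None:
        for name, value in checkpoint.items():
            if name in params:
                params[name] = value
    return params


names = ["encoder.layer0.weight", "decoder.weight"]
scratch = initialize(names)
loaded = initialize(names, checkpoint={"encoder.layer0.weight": "pretrained"})
```

Here `scratch` keeps the step 2 value for the encoder parameter, while `loaded` ends with the checkpoint value, which is why the reload has to run last.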
