espnet2.beats.tokenizer.BeatsRandomTokenizer
class espnet2.beats.tokenizer.BeatsRandomTokenizer(tokenizer_config: Dict | None = None, fbank_mean: float = 15.2913, fbank_std: float = 5.90532)
Bases: Module
Initialize internal Module state, shared by both nn.Module and ScriptModule.
encode(xs_pad: Tensor, ilens: Tensor, waveform_input: bool = True)
forward(xs_pad: Tensor, ilens: Tensor, waveform_input: bool = True)
Tokenize input audio into BEATs codes.
- Parameters:
- xs_pad (torch.Tensor) – Input tensor of shape (B, T) for raw waveform or (B, T, D) for features.
- ilens (torch.Tensor) – Input length tensor (B,).
- waveform_input (bool) – If True, input is raw waveform.
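The parameters above can be illustrated with a minimal sketch that pads a list of variable-length waveforms into `xs_pad` of shape (B, T) and builds the matching `ilens` of shape (B,). The helper names `pad_batch` and `tokenize` are hypothetical and not part of ESPnet. The sketch also assumes that the default constructor arguments are usable as-is and makes no claim about the structure of the returned value, which this page does not document.

```python
from typing import List, Tuple

import torch
from torch.nn.utils.rnn import pad_sequence


def pad_batch(waveforms: List[torch.Tensor]) -> Tuple[torch.Tensor, torch.Tensor]:
    """Hypothetical helper: zero-pad 1-D waveforms into (B, T) plus lengths (B,)."""
    # Original (unpadded) length of every waveform, in samples.
    ilens = torch.tensor([w.size(0) for w in waveforms], dtype=torch.long)
    # Right-pad each waveform with zeros up to the longest one in the batch.
    xs_pad = pad_sequence(waveforms, batch_first=True)
    return xs_pad, ilens


def tokenize(waveforms: List[torch.Tensor]):
    """Hypothetical wrapper: requires espnet2 to be installed."""
    # Imported lazily so that pad_batch can be used without ESPnet.
    from espnet2.beats.tokenizer import BeatsRandomTokenizer

    # Assumption: the default tokenizer_config, fbank_mean and fbank_std suffice.
    tokenizer = BeatsRandomTokenizer()
    tokenizer.eval()
    xs_pad, ilens = pad_batch(waveforms)
    with torch.no_grad():
        # waveform_input=True because xs_pad holds raw audio of shape (B, T).
        return tokenizer(xs_pad, ilens, waveform_input=True)
```

For precomputed features, the same call would presumably take an `xs_pad` of shape (B, T, D) with `waveform_input=False`.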
forward_padding_mask(features: Tensor, padding_mask: Tensor) → Tensor
Forward padding mask. features has shape (B, T, C) and padding_mask has shape (B, T).
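This page does not show how the padding mask is forwarded. The sketch below is an assumption modelled on the upstream BEATs reference code as commonly described: the sample-level or frame-level mask is trimmed to a multiple of the feature length, grouped, and a feature frame is marked as padding only when every position it covers is padding. The function name `downsample_padding_mask` is hypothetical, and the ESPnet method may differ in detail.

```python
import torch


def downsample_padding_mask(
    features: torch.Tensor, padding_mask: torch.Tensor
) -> torch.Tensor:
    """Hypothetical sketch: reduce a (B, T) mask to the feature length (B, T')."""
    # features: (B, T', C); padding_mask: (B, T), True at padded positions.
    batch, num_frames = features.size(0), features.size(1)
    # Drop trailing positions so that T is an exact multiple of T'.
    extra = padding_mask.size(1) % num_frames
    if extra > 0:
        padding_mask = padding_mask[:, :-extra]
    # Group the mask positions that fall under each feature frame.
    grouped = padding_mask.reshape(batch, num_frames, -1)
    # A frame counts as padding only if all positions under it are padding.
    return grouped.all(dim=-1)
```

With this rule, a frame that straddles the boundary between real audio and padding is treated as valid, which keeps partially filled frames in the computation.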
