espnet2.tts.fastspeech2.variance_predictor.VariancePredictor


class espnet2.tts.fastspeech2.variance_predictor.VariancePredictor(idim: int, n_layers: int = 2, n_chans: int = 384, kernel_size: int = 3, bias: bool = True, dropout_rate: float = 0.5)

Bases: Module

Variance predictor module.

This module implements the variance predictor described in FastSpeech 2: Fast and High-Quality End-to-End Text to Speech.

Initialize variance predictor module.

Parameters:
- idim (int) – Input dimension.
- n_layers (int) – Number of convolutional layers.
- n_chans (int) – Number of channels of convolutional layers.
- kernel_size (int) – Kernel size of convolutional layers.
- bias (bool) – Whether to use bias parameters.
- dropout_rate (float) – Dropout rate.
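
The constructor arguments above can be illustrated with a minimal PyTorch sketch. It follows the FastSpeech 2 paper's description of the variance predictor (1D convolutions with ReLU, each followed by layer normalization and dropout, then a linear projection to one value per frame). The class name `SketchVariancePredictor` is hypothetical and this is not the ESPnet implementation; in particular, applying `bias` to the convolutions is an assumption.

```python
import torch
from torch import nn


class SketchVariancePredictor(nn.Module):
    """Illustrative stand-in only, NOT the ESPnet class."""

    def __init__(self, idim, n_layers=2, n_chans=384, kernel_size=3,
                 bias=True, dropout_rate=0.5):
        super().__init__()
        self.convs = nn.ModuleList()
        self.norms = nn.ModuleList()
        for i in range(n_layers):
            in_chans = idim if i == 0 else n_chans
            # Symmetric padding keeps the time axis at Tmax for odd kernel sizes.
            # Assumption: `bias` controls the convolution bias terms.
            self.convs.append(
                nn.Conv1d(in_chans, n_chans, kernel_size,
                          padding=(kernel_size - 1) // 2, bias=bias)
            )
            self.norms.append(nn.LayerNorm(n_chans))
        self.dropout = nn.Dropout(dropout_rate)
        self.proj = nn.Linear(n_chans, 1)

    def forward(self, xs):
        h = xs.transpose(1, 2)  # (B, idim, Tmax), the layout Conv1d expects
        for conv, norm in zip(self.convs, self.norms):
            h = torch.relu(conv(h))  # (B, n_chans, Tmax)
            # LayerNorm acts on the last axis, so move channels there and back.
            h = norm(h.transpose(1, 2)).transpose(1, 2)
            h = self.dropout(h)
        return self.proj(h.transpose(1, 2))  # (B, Tmax, 1)


# eval() disables dropout so the shape check is deterministic.
model = SketchVariancePredictor(idim=8, n_chans=16).eval()
xs = torch.randn(2, 5, 8)  # (B=2, Tmax=5, idim=8)
with torch.no_grad():
    out = model(xs)
print(out.shape)  # torch.Size([2, 5, 1])
```

The small `idim` and `n_chans` values are chosen only to keep the example fast; the documented defaults are `n_chans=384` and `dropout_rate=0.5`.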

forward(xs: Tensor, x_masks: Tensor | None = None) → Tensor

Calculate forward propagation.

Parameters:
- xs (Tensor) – Batch of input sequences (B, Tmax, idim).
- x_masks (ByteTensor, optional) – Batch of masks indicating padded part (B, Tmax). Defaults to None.
Returns: Batch of predicted sequences (B, Tmax, 1).
Return type: Tensor
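
As a rough illustration of what a mask "indicating padded part" looks like, the sketch below builds a (B, Tmax) boolean mask from per-utterance lengths and uses it to zero the padded frames of a dummy (B, Tmax, 1) prediction. The variable names and the use of `masked_fill` are assumptions made for this example; whether the module itself expects the mask with an extra trailing dimension is not stated here, so check the calling code before relying on this shape.

```python
import torch

# Hypothetical batch: two sequences with 4 and 2 valid frames.
lengths = torch.tensor([4, 2])
tmax = int(lengths.max())

# True marks padded frames: frame index >= sequence length.
x_masks = torch.arange(tmax).unsqueeze(0) >= lengths.unsqueeze(1)  # (B, Tmax)

# Dummy predictions standing in for the (B, Tmax, 1) output.
preds = torch.ones(2, tmax, 1)

# Add a trailing axis so the mask broadcasts against (B, Tmax, 1).
masked = preds.masked_fill(x_masks.unsqueeze(-1), 0.0)

print(x_masks.tolist())
# [[False, False, False, False], [False, False, True, True]]
print(masked.squeeze(-1).tolist())
# [[1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 0.0, 0.0]]
```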