espnet2.gan_tts.parallel_wavegan.parallel_wavegan.ParallelWaveGANGenerator


class espnet2.gan_tts.parallel_wavegan.parallel_wavegan.ParallelWaveGANGenerator(in_channels: int = 1, out_channels: int = 1, kernel_size: int = 3, layers: int = 30, stacks: int = 3, residual_channels: int = 64, gate_channels: int = 128, skip_channels: int = 64, aux_channels: int = 80, aux_context_window: int = 2, dropout_rate: float = 0.0, bias: bool = True, use_weight_norm: bool = True, upsample_conditional_features: bool = True, upsample_net: str = 'ConvInUpsampleNetwork', upsample_params: Dict[str, Any] = {'upsample_scales': [4, 4, 4, 4]})

Bases: Module

Parallel WaveGAN Generator module.

Initialize ParallelWaveGANGenerator module.

Parameters:
- in_channels (int) – Number of input channels.
- out_channels (int) – Number of output channels.
- kernel_size (int) – Kernel size of dilated convolution.
- layers (int) – Number of residual block layers.
- stacks (int) – Number of stacks, i.e., dilation cycles.
- residual_channels (int) – Number of channels in residual conv.
- gate_channels (int) – Number of channels in gated conv.
- skip_channels (int) – Number of channels in skip conv.
- aux_channels (int) – Number of channels for auxiliary feature conv.
- aux_context_window (int) – Context window size for auxiliary feature.
- dropout_rate (float) – Dropout rate. 0.0 means no dropout applied.
- bias (bool) – Whether to use bias parameter in conv layer.
- use_weight_norm (bool) – Whether to use weight norm. If set to true, it will be applied to all of the conv layers.
- upsample_conditional_features (bool) – Whether to use an upsampling network for the conditioning features.
- upsample_net (str) – Upsampling network architecture.
- upsample_params (Dict[str, Any]) – Upsampling network parameters.
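
The relationship between `layers` and `stacks` can be illustrated with a small sketch. It assumes the common WaveNet-style schedule, in which the dilation doubles within each stack and resets at the start of the next one. This page does not state the schedule, so `dilation_schedule` is a hypothetical helper and not part of the ESPnet API.

```python
from typing import List


def dilation_schedule(layers: int = 30, stacks: int = 3) -> List[int]:
    """Return one dilation per residual block (hypothetical helper).

    Assumption: the dilation doubles inside each stack and resets to 1
    at the start of the next stack (WaveNet-style dilation cycles).
    """
    if layers % stacks != 0:
        raise ValueError("layers must be divisible by stacks")
    layers_per_stack = layers // stacks
    return [2 ** (i % layers_per_stack) for i in range(layers)]


schedule = dilation_schedule()
print(schedule[:10])    # dilations of the first stack
print(schedule[10:12])  # the cycle restarts at the second stack
```

Under this assumption, the defaults `layers=30` and `stacks=3` give three cycles of ten blocks each.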

apply_weight_norm()

Apply weight normalization to all of the layers.

forward(c: Tensor, z: Tensor | None = None) → Tensor

Calculate forward propagation.

Parameters:
- c (Tensor) – Local conditioning auxiliary features (B, C, T_feats).
- z (Optional[Tensor]) – Input noise signal (B, 1, T_wav).
Returns: Output tensor (B, out_channels, T_wav)
Return type: Tensor
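
The shapes above imply a fixed ratio between T_feats and T_wav. The sketch below estimates the noise length to pair with a given number of feature frames. It assumes each frame is expanded by the product of `upsample_scales` and ignores any frames an upsampling network might consume for the context window. `expected_wav_length` is a hypothetical helper, not an ESPnet function.

```python
import math
from typing import Sequence


def expected_wav_length(
    t_feats: int, upsample_scales: Sequence[int] = (4, 4, 4, 4)
) -> int:
    """Estimate T_wav from T_feats (hypothetical helper).

    Assumption: every feature frame is upsampled by the product of
    upsample_scales, with no trimming for the context window.
    """
    return t_feats * math.prod(upsample_scales)


# With the default scales, one frame maps to 4 * 4 * 4 * 4 = 256 samples.
print(expected_wav_length(100))
```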

inference(c: Tensor, z: Tensor | None = None) → Tensor

Perform inference.

Parameters:
- c (Tensor) – Local conditioning auxiliary features (T_feats, C).
- z (Optional[Tensor]) – Input noise signal (T_wav, 1).
Returns: Output tensor (T_wav, out_channels)
Return type: Tensor

property receptive_field_size

Return receptive field size.
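
For intuition about this value, here is a minimal sketch of how a receptive field can be computed for a stack of dilated convolutions. It assumes WaveNet-style dilations that double within each stack and reset between stacks. The function is a standalone illustration written for this page, not the property's actual implementation.

```python
def receptive_field_size(
    layers: int = 30, stacks: int = 3, kernel_size: int = 3
) -> int:
    """Receptive field in samples for stacked dilated convolutions.

    Assumption: layer i uses dilation 2 ** (i % layers_per_stack).
    Each layer widens the field by (kernel_size - 1) * dilation samples.
    """
    if layers % stacks != 0:
        raise ValueError("layers must be divisible by stacks")
    layers_per_stack = layers // stacks
    dilations = [2 ** (i % layers_per_stack) for i in range(layers)]
    return (kernel_size - 1) * sum(dilations) + 1


# Default constructor arguments: (3 - 1) * 3 * 1023 + 1 samples.
print(receptive_field_size())  # 6139
```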

remove_weight_norm()

Remove weight normalization module from all of the layers.