espnet2.gan_tts.wavenet.wavenet.WaveNet
class espnet2.gan_tts.wavenet.wavenet.WaveNet(in_channels: int = 1, out_channels: int = 1, kernel_size: int = 3, layers: int = 30, stacks: int = 3, base_dilation: int = 2, residual_channels: int = 64, aux_channels: int = -1, gate_channels: int = 128, skip_channels: int = 64, global_channels: int = -1, dropout_rate: float = 0.0, bias: bool = True, use_weight_norm: bool = True, use_first_conv: bool = False, use_last_conv: bool = False, scale_residual: bool = False, scale_skip_connect: bool = False)
Bases: Module
WaveNet with global conditioning.
Initialize WaveNet module.
- Parameters:
- in_channels (int) – Number of input channels.
- out_channels (int) – Number of output channels.
- kernel_size (int) – Kernel size of dilated convolution.
- layers (int) – Number of residual block layers.
- stacks (int) – Number of stacks, i.e., dilation cycles.
- base_dilation (int) – Base dilation factor.
- residual_channels (int) – Number of channels in residual conv.
- gate_channels (int) – Number of channels in gated conv.
- skip_channels (int) – Number of channels in skip conv.
- aux_channels (int) – Number of channels for local conditioning feature.
- global_channels (int) – Number of channels for global conditioning feature.
- dropout_rate (float) – Dropout rate. 0.0 means no dropout applied.
- bias (bool) – Whether to use bias parameter in conv layer.
- use_weight_norm (bool) – Whether to use weight norm. If set to true, it will be applied to all of the conv layers.
- use_first_conv (bool) – Whether to use the first conv layers.
- use_last_conv (bool) – Whether to use the last conv layers.
- scale_residual (bool) – Whether to scale the residual outputs.
- scale_skip_connect (bool) – Whether to scale the skip connection outputs.
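The interaction of layers, stacks, and base_dilation can be sketched in plain Python. This is an illustration only: it assumes the common WaveNet convention in which layers are split evenly across stacks and the dilation restarts at 1 at the beginning of each stack. The helper name dilation_schedule is hypothetical and is not part of the espnet2 API.

```python
from typing import List


def dilation_schedule(layers: int = 30, stacks: int = 3, base_dilation: int = 2) -> List[int]:
    """Return one dilation factor per residual block (hypothetical helper).

    Assumption: layers are divided evenly among stacks, and within a stack
    the dilation grows as base_dilation ** 0, base_dilation ** 1, ...
    """
    if layers % stacks != 0:
        raise ValueError("layers must be divisible by stacks")
    layers_per_stack = layers // stacks
    # The modulo resets the exponent to 0 at the start of every stack.
    return [base_dilation ** (i % layers_per_stack) for i in range(layers)]


schedule = dilation_schedule()
print(schedule[:10])  # [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]
print(schedule[10])   # 1 (first layer of the second stack)
```

Under this assumption, the default arguments give three cycles of ten layers each.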
apply_weight_norm()
Apply weight normalization to all of the layers.
forward(x: Tensor, x_mask: Tensor | None = None, c: Tensor | None = None, g: Tensor | None = None) → Tensor
Calculate forward propagation.
- Parameters:
- x (Tensor) – Input noise signal (B, 1, T) if use_first_conv else (B, residual_channels, T).
- x_mask (Optional[Tensor]) – Mask tensor (B, 1, T).
- c (Optional[Tensor]) – Local conditioning features (B, aux_channels, T).
- g (Optional[Tensor]) – Global conditioning features (B, global_channels, 1).
- Returns: Output tensor (B, out_channels, T) if use_last_conv else (B, residual_channels, T).
- Return type: Tensor
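The shape rules stated for forward can be restated as a small pure-Python helper, which may be handy for checking tensor shapes before calling the module. The function expected_shapes is hypothetical (not part of espnet2), and it assumes that the "1" in the documented input shape (B, 1, T) corresponds to in_channels at its default value.

```python
from typing import Tuple

Shape = Tuple[int, int, int]


def expected_shapes(
    batch: int,
    length: int,
    in_channels: int = 1,
    out_channels: int = 1,
    residual_channels: int = 64,
    use_first_conv: bool = False,
    use_last_conv: bool = False,
) -> Tuple[Shape, Shape]:
    """Return (input_shape, output_shape) per the documented rules (sketch)."""
    # Without the first conv, the input must already have residual_channels.
    in_ch = in_channels if use_first_conv else residual_channels
    # Without the last conv, the output keeps residual_channels.
    out_ch = out_channels if use_last_conv else residual_channels
    return (batch, in_ch, length), (batch, out_ch, length)


print(expected_shapes(2, 100))
# ((2, 64, 100), (2, 64, 100))
print(expected_shapes(2, 100, use_first_conv=True, use_last_conv=True))
# ((2, 1, 100), (2, 1, 100))
```

The time dimension T is the same for input and output in both cases, as in the documented shapes.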
property receptive_field_size : int
Return receptive field size.
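As a rough illustration, the receptive field of a stack of dilated convolutions can be estimated with the standard formula (kernel_size - 1) * sum(dilations) + 1. The sketch below assumes the dilation restarts at 1 in every stack and that this formula matches what the property returns; neither is confirmed by this page, and the function name is hypothetical.

```python
def estimate_receptive_field(
    layers: int = 30,
    stacks: int = 3,
    kernel_size: int = 3,
    base_dilation: int = 2,
) -> int:
    """Estimate the receptive field in samples (assumed formula, see lead-in)."""
    if layers % stacks != 0:
        raise ValueError("layers must be divisible by stacks")
    layers_per_stack = layers // stacks
    # Assumed dilation pattern: base_dilation ** 0, ** 1, ... repeated per stack.
    dilations = [base_dilation ** (i % layers_per_stack) for i in range(layers)]
    # Each dilated conv widens the field by (kernel_size - 1) * dilation samples.
    return (kernel_size - 1) * sum(dilations) + 1


print(estimate_receptive_field())  # 6139
```

With the default constructor arguments, this estimate comes to 6139 samples.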
remove_weight_norm()
Remove weight normalization module from all of the layers.