espnet2.gan_svs.visinger2.visinger2_vocoder.VISinger2VocoderGenerator

class espnet2.gan_svs.visinger2.visinger2_vocoder.VISinger2VocoderGenerator(in_channels: int = 80, out_channels: int = 1, channels: int = 512, global_channels: int = -1, kernel_size: int = 7, upsample_scales: List[int] = [8, 8, 2, 2], upsample_kernel_sizes: List[int] = [16, 16, 4, 4], resblock_kernel_sizes: List[int] = [3, 7, 11], resblock_dilations: List[List[int]] = [[1, 3, 5], [1, 3, 5], [1, 3, 5]], n_harmonic: int = 64, use_additional_convs: bool = True, bias: bool = True, nonlinear_activation: str = 'LeakyReLU', nonlinear_activation_params: Dict[str, Any] = {'negative_slope': 0.1}, use_weight_norm: bool = True)

Bases: Module

Initialize VISinger2VocoderGenerator module.

Parameters:
- in_channels (int) – Number of input channels.
- out_channels (int) – Number of output channels.
- channels (int) – Number of hidden representation channels.
- global_channels (int) – Number of global conditioning channels.
- kernel_size (int) – Kernel size of initial and final conv layer.
- upsample_scales (List[int]) – List of upsampling scales.
- upsample_kernel_sizes (List[int]) – List of kernel sizes for upsample layers.
- resblock_kernel_sizes (List[int]) – List of kernel sizes for residual blocks.
- resblock_dilations (List[List[int]]) – List of lists of dilations for residual blocks.
- n_harmonic (int) – Number of harmonics used to synthesize a sound signal.
- use_additional_convs (bool) – Whether to use additional conv layers in residual blocks.
- bias (bool) – Whether to add bias parameter in convolution layers.
- nonlinear_activation (str) – Activation function module name.
- nonlinear_activation_params (Dict[str, Any]) – Hyperparameters for activation function.
- use_weight_norm (bool) – Whether to use weight norm. If set to true, it will be applied to all of the conv layers.
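
The relationship between `upsample_scales` and `upsample_kernel_sizes` can be illustrated with a small, dependency-free sketch. It uses the standard output-length formula for a 1-D transposed convolution and assumes the HiFi-GAN-style padding convention (`padding = scale // 2 + scale % 2`, `output_padding = scale % 2`); that convention is an assumption here and is not confirmed for this class by the text above. The helper names `conv_transpose1d_length` and `upsampled_length` are hypothetical.

```python
from math import prod
from typing import List


def conv_transpose1d_length(length: int, kernel_size: int, stride: int,
                            padding: int, output_padding: int = 0) -> int:
    """Output length of a 1-D transposed convolution with dilation 1."""
    return (length - 1) * stride - 2 * padding + (kernel_size - 1) + output_padding + 1


def upsampled_length(frames: int, scales: List[int], kernel_sizes: List[int]) -> int:
    """Length after passing through every upsample layer in turn.

    Assumption: HiFi-GAN-style padding, i.e. padding = scale // 2 + scale % 2
    and output_padding = scale % 2 (hypothetical for this class).
    """
    length = frames
    for scale, kernel in zip(scales, kernel_sizes):
        length = conv_transpose1d_length(
            length,
            kernel_size=kernel,
            stride=scale,
            padding=scale // 2 + scale % 2,
            output_padding=scale % 2,
        )
    return length


# Default values from the class signature.
scales = [8, 8, 2, 2]
kernels = [16, 16, 4, 4]

total_factor = prod(scales)  # 8 * 8 * 2 * 2
print(total_factor)  # 256
print(upsampled_length(100, scales, kernels))  # 25600
```

Under these assumptions, each layer multiplies the length exactly by its scale because the kernel size is twice the scale, so the defaults expand 100 frames to 100 * 256 samples.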

apply_weight_norm()

Apply weight normalization to all of the layers.

forward(c, ddsp, g: Tensor | None = None) → Tensor

Calculate forward propagation.

Parameters:
- c (Tensor) – Input tensor (B, in_channels, T).
- ddsp (Tensor) – Input tensor (B, n_harmonic + 2, T * hop_length).
- g (Optional[Tensor]) – Global conditioning tensor (B, global_channels, 1).
Returns: Output tensor (B, out_channels, T * hop_length).
Return type: Tensor
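
As a quick sanity check before calling `forward`, the expected input shapes can be computed from the constructor arguments. The sketch below is dependency-free and makes two assumptions that the text above does not state: that `hop_length` equals the product of `upsample_scales`, and that `g` is only meaningful when `global_channels` is positive. The helper name `expected_input_shapes` is hypothetical and is not part of ESPnet.

```python
from math import prod
from typing import Dict, Optional, Sequence, Tuple

Shape = Tuple[int, int, int]


def expected_input_shapes(
    batch: int,
    frames: int,
    in_channels: int = 80,
    n_harmonic: int = 64,
    upsample_scales: Sequence[int] = (8, 8, 2, 2),
    global_channels: int = -1,
) -> Dict[str, Optional[Shape]]:
    """Return the shapes documented for the inputs of forward()."""
    # Assumption: hop_length is the total upsampling factor.
    hop_length = prod(upsample_scales)
    return {
        "c": (batch, in_channels, frames),
        "ddsp": (batch, n_harmonic + 2, frames * hop_length),
        # Assumption: no global conditioning unless global_channels > 0.
        "g": (batch, global_channels, 1) if global_channels > 0 else None,
    }


shapes = expected_input_shapes(batch=2, frames=50)
print(shapes["c"])     # (2, 80, 50)
print(shapes["ddsp"])  # (2, 66, 12800)
print(shapes["g"])     # None
```

Comparing these tuples against `tensor.shape` for each input can catch a mismatched hop length or harmonic count before the model raises a less readable error.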

remove_weight_norm()

Remove weight normalization module from all of the layers.

reset_parameters()

Reset parameters.

This initialization follows the official HiFi-GAN implementation: https://github.com/jik876/hifi-gan/blob/master/models.py