espnet2.gan_svs.uhifigan.uhifigan.UHiFiGANGenerator
class espnet2.gan_svs.uhifigan.uhifigan.UHiFiGANGenerator(in_channels=80, out_channels=1, channels=512, global_channels: int = -1, kernel_size=7, downsample_scales=(2, 2, 8, 8), downsample_kernel_sizes=(4, 4, 16, 16), upsample_scales=(8, 8, 2, 2), upsample_kernel_sizes=(16, 16, 4, 4), resblock_kernel_sizes=(3, 7, 11), resblock_dilations=[(1, 3, 5), (1, 3, 5), (1, 3, 5)], projection_filters: List[int] = [0, 1, 1, 1], projection_kernels: List[int] = [0, 5, 7, 11], dropout=0.3, use_additional_convs=True, bias=True, nonlinear_activation='LeakyReLU', nonlinear_activation_params={'negative_slope': 0.1}, use_causal_conv=False, use_weight_norm=True, use_avocodo=False)
Bases: Module
UHiFiGAN generator module.
Initialize Unet-based HiFiGANGenerator module.
- Parameters:
- in_channels (int) – Number of input channels.
- out_channels (int) – Number of output channels.
- channels (int) – Number of hidden representation channels.
- global_channels (int) – Number of global conditioning channels.
- kernel_size (int) – Kernel size of initial and final conv layer.
- upsample_scales (list) – List of upsampling scales.
- upsample_kernel_sizes (list) – List of kernel sizes for upsampling layers.
- resblock_kernel_sizes (list) – List of kernel sizes for residual blocks.
- resblock_dilations (list) – List of dilation lists for residual blocks.
- use_additional_convs (bool) – Whether to use additional conv layers in residual blocks.
- bias (bool) – Whether to add bias parameter in convolution layers.
- nonlinear_activation (str) – Activation function module name.
- nonlinear_activation_params (dict) – Hyperparameters for activation function.
- use_causal_conv (bool) – Whether to use causal structure.
- use_weight_norm (bool) – Whether to use weight norm. If set to true, it will be applied to all of the conv layers.
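As a quick sanity check on the default values in the signature above, the following sketch uses plain Python and does not import ESPnet. It shows that the total downsampling factor matches the total upsampling factor, and that each default kernel size is twice its scale. These relationships are read off the defaults only. Whether the constructor enforces them is not stated here.

```python
import math

# Default values copied from the class signature above.
downsample_scales = (2, 2, 8, 8)
downsample_kernel_sizes = (4, 4, 16, 16)
upsample_scales = (8, 8, 2, 2)
upsample_kernel_sizes = (16, 16, 4, 4)

# Total temporal factor of the downsampling and upsampling paths.
total_down = math.prod(downsample_scales)
total_up = math.prod(upsample_scales)

# Check that every default kernel size is exactly twice its scale.
kernels_are_double = all(
    k == 2 * s
    for k, s in zip(
        downsample_kernel_sizes + upsample_kernel_sizes,
        downsample_scales + upsample_scales,
    )
)

print(total_down, total_up, kernels_are_double)
```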
apply_weight_norm()
Apply weight normalization to all of the layers.
forward(c=None, f0=None, excitation=None, g: Tensor | None = None)
Calculate forward propagation.
- Parameters:
- c (Tensor) – Input tensor (B, in_channels, T).
- f0 (Tensor) – Fundamental frequency (F0) tensor (B, 1, T).
- excitation (Tensor) – Excitation tensor (B, frame_len, T).
- Returns: Output tensor (B, out_channels, T).
- Return type: Tensor
inference(excitation=None, f0=None, c=None, normalize_before=False)
Perform inference.
- Parameters:
- c (Union[Tensor, ndarray]) – Input tensor (T, in_channels).
- normalize_before (bool) – Whether to perform normalization.
- Returns: Output tensor (T * prod(upsample_scales), out_channels).
- Return type: Tensor
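The documented return shape implies a simple relation between the input length T and the number of output samples. A minimal sketch of that arithmetic follows. The helper `expected_output_length` is hypothetical and not part of ESPnet. It assumes the output length is exactly T times the product of `upsample_scales`, as the return description states.

```python
import math


def expected_output_length(num_frames, upsample_scales=(8, 8, 2, 2)):
    """Hypothetical helper: output samples for `num_frames` input frames.

    Follows the documented shape (T * prod(upsample_scales), out_channels).
    """
    return num_frames * math.prod(upsample_scales)


# With the default scales, each input frame maps to 8 * 8 * 2 * 2 = 256 samples.
print(expected_output_length(100))  # 25600
```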
register_stats(stats)
Register stats for de-normalization as buffer.
- Parameters: stats (str) – Path of statistics file (“.npy” or “.h5”).
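The sketch below illustrates how per-channel statistics stored in a ".npy" file could be applied to features of shape (T, in_channels), for example when `normalize_before=True` is requested at inference. The file layout used here (row 0 holds the mean, row 1 holds the scale) is an assumption made for illustration. It is not confirmed by this page, and the sketch does not call `register_stats` itself.

```python
import os
import tempfile

import numpy as np

in_channels = 80

# ASSUMPTION: the stats array has shape (2, in_channels),
# with row 0 = per-channel mean and row 1 = per-channel scale.
stats = np.stack([np.full(in_channels, 0.5), np.full(in_channels, 2.0)])

# Write and re-read the file to mimic loading statistics from disk.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "stats.npy")
    np.save(path, stats)
    loaded = np.load(path)

mean, scale = loaded[0], loaded[1]

# Dummy features with shape (T, in_channels); broadcasting applies
# the per-channel mean and scale to every frame.
features = np.ones((10, in_channels))
normalized = (features - mean) / scale
print(normalized.shape)
```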
remove_weight_norm()
Remove weight normalization module from all of the layers.
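To illustrate what applying and removing weight normalization does to a module, the following sketch uses the PyTorch utilities `torch.nn.utils.weight_norm` and `torch.nn.utils.remove_weight_norm` on a small stand-in model. The helpers `apply_wn` and `remove_wn` and the toy model are hypothetical. They approximate the pattern such methods commonly follow and are not taken from the UHiFiGAN source.

```python
import torch


def apply_wn(m):
    # Hypothetical helper: reparameterize conv weights as g * v / ||v||.
    if isinstance(m, (torch.nn.Conv1d, torch.nn.ConvTranspose1d)):
        torch.nn.utils.weight_norm(m)


def remove_wn(m):
    # Hypothetical helper: fold g and v back into a single weight tensor.
    try:
        torch.nn.utils.remove_weight_norm(m)
    except ValueError:
        # Raised for modules that carry no weight normalization.
        pass


# Toy stand-in for a generator: one conv layer plus an activation.
model = torch.nn.Sequential(
    torch.nn.Conv1d(80, 512, kernel_size=7, padding=3),
    torch.nn.LeakyReLU(0.1),
)
x = torch.randn(1, 80, 20)

with torch.no_grad():
    y_plain = model(x)
    model.apply(apply_wn)
    has_wn_after_apply = hasattr(model[0], "weight_g")
    y_normed = model(x)
    model.apply(remove_wn)
    has_wn_after_remove = hasattr(model[0], "weight_g")

# The reparameterization alone should not change the output.
same_output = torch.allclose(y_plain, y_normed, atol=1e-5)
print(has_wn_after_apply, has_wn_after_remove, same_output)
```

Removing weight normalization before inference is a common choice because it avoids recomputing the normalized weight on every forward pass.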
reset_parameters()
Reset parameters.
This initialization follows the official HiFi-GAN implementation: https://github.com/jik876/hifi-gan/blob/master/models.py
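In the official HiFi-GAN code, convolution weights are initialized by drawing from a normal distribution with mean 0.0 and standard deviation 0.01. A minimal sketch of that scheme on a toy model follows. The function `init_weights` and the two layers are illustrative, and it is an assumption that `reset_parameters` applies exactly this rule to the same layer types.

```python
import torch


def init_weights(m, mean=0.0, std=0.01):
    # Illustrative initializer: draw conv weights from N(mean, std^2).
    if isinstance(m, (torch.nn.Conv1d, torch.nn.ConvTranspose1d)):
        m.weight.data.normal_(mean, std)


torch.manual_seed(0)

# Toy layers standing in for the generator's convolutions.
layers = torch.nn.Sequential(
    torch.nn.Conv1d(80, 512, kernel_size=7, padding=3),
    torch.nn.ConvTranspose1d(512, 256, kernel_size=16, stride=8, padding=4),
)

# Module.apply visits every submodule, so both layers are re-initialized.
layers.apply(init_weights)

weight_std = layers[0].weight.std().item()
weight_mean = layers[0].weight.mean().item()
print(round(weight_std, 3))
```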