espnet2.gan_tts.melgan.melgan.MelGANGenerator
class espnet2.gan_tts.melgan.melgan.MelGANGenerator(in_channels: int = 80, out_channels: int = 1, kernel_size: int = 7, channels: int = 512, bias: bool = True, upsample_scales: List[int] = [8, 8, 2, 2], stack_kernel_size: int = 3, stacks: int = 3, nonlinear_activation: str = 'LeakyReLU', nonlinear_activation_params: Dict[str, Any] = {'negative_slope': 0.2}, pad: str = 'ReflectionPad1d', pad_params: Dict[str, Any] = {}, use_final_nonlinear_activation: bool = True, use_weight_norm: bool = True)
Bases: Module
MelGAN generator module.
Initialize MelGANGenerator module.
- Parameters:
- in_channels (int) – Number of input channels.
- out_channels (int) – Number of output channels.
- kernel_size (int) – Kernel size of initial and final conv layer.
- channels (int) – Initial number of channels for conv layer.
- bias (bool) – Whether to add bias parameter in convolution layers.
- upsample_scales (List[int]) – List of upsampling scales.
- stack_kernel_size (int) – Kernel size of dilated conv layers in residual stack.
- stacks (int) – Number of stacks in a single residual stack.
- nonlinear_activation (str) – Activation function module name.
- nonlinear_activation_params (Dict[str, Any]) – Hyperparameters for activation function.
- pad (str) – Padding function module name before dilated convolution layer.
- pad_params (Dict[str, Any]) – Hyperparameters for padding function.
- use_final_nonlinear_activation (bool) – Whether to apply an activation function to the final layer.
- use_weight_norm (bool) – Whether to use weight norm. If set to true, it will be applied to all of the conv layers.
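The overall upsampling factor implied by `upsample_scales` can be worked out by hand. The following is a minimal sketch in plain Python, not part of ESPnet: the dictionary name `melgan_defaults` is hypothetical, and its values are copied from the constructor signature above.

```python
from math import prod

# Hypothetical dictionary mirroring a subset of the default constructor
# arguments listed in the signature above (not an ESPnet object).
melgan_defaults = {
    "in_channels": 80,
    "out_channels": 1,
    "kernel_size": 7,
    "channels": 512,
    "upsample_scales": [8, 8, 2, 2],
    "stack_kernel_size": 3,
    "stacks": 3,
}

# The time axis is stretched by each scale in turn, so the overall
# upsampling factor is the product of the individual scales.
total_upsampling = prod(melgan_defaults["upsample_scales"])
print(total_upsampling)  # 256
```

With the default scales, each input frame therefore corresponds to 256 output samples.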
apply_weight_norm()
Apply weight normalization module to all of the layers.
forward(c: Tensor) → Tensor
Calculate forward propagation.
- Parameters: c (Tensor) – Input tensor (B, in_channels, T).
- Returns: Output tensor (B, 1, T * prod(upsample_scales)).
- Return type: Tensor
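The shape relation documented for `forward` can be checked without running the model. The helper below, `forward_output_shape`, is a hypothetical illustration written for this page; it only reproduces the shape arithmetic stated above. The default `out_channels=1` is taken from the constructor signature.

```python
from math import prod
from typing import Sequence, Tuple


def forward_output_shape(
    input_shape: Tuple[int, int, int],
    upsample_scales: Sequence[int] = (8, 8, 2, 2),
    out_channels: int = 1,
) -> Tuple[int, int, int]:
    """Return the documented output shape for a (B, C, T) input (hypothetical helper)."""
    batch, _channels, frames = input_shape
    # Only the time axis grows; the channel axis becomes out_channels.
    return (batch, out_channels, frames * prod(upsample_scales))


# A batch of 4 inputs with 80 channels and 100 frames each.
print(forward_output_shape((4, 80, 100)))  # (4, 1, 25600)
```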
inference(c: Tensor) → Tensor
Perform inference.
- Parameters: c (Tensor) – Input tensor (T, in_channels).
- Returns: Output tensor (T * prod(upsample_scales), out_channels).
- Return type: Tensor
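Unlike `forward`, `inference` takes an unbatched, time-first input. The sketch below shows one plausible way to relate the two layouts using shape tuples only. All three function names are hypothetical, and the conversion steps are an assumption made for illustration, not a description of how the method is implemented.

```python
from math import prod
from typing import Sequence, Tuple


def to_batched_layout(shape: Tuple[int, int]) -> Tuple[int, int, int]:
    """(T, C) -> (1, C, T): swap the axes and add a batch axis (assumed step)."""
    frames, channels = shape
    return (1, channels, frames)


def from_batched_layout(shape: Tuple[int, int, int]) -> Tuple[int, int]:
    """(1, C, T) -> (T, C): drop the batch axis and swap the axes back (assumed step)."""
    batch, channels, samples = shape
    assert batch == 1, "inference handles a single utterance"
    return (samples, channels)


def inference_output_shape(
    input_shape: Tuple[int, int],
    upsample_scales: Sequence[int] = (8, 8, 2, 2),
    out_channels: int = 1,
) -> Tuple[int, int]:
    """Return the documented output shape for a (T, in_channels) input."""
    _, _, frames = to_batched_layout(input_shape)
    batched_out = (1, out_channels, frames * prod(upsample_scales))
    return from_batched_layout(batched_out)


# 100 frames of 80-dimensional features.
print(inference_output_shape((100, 80)))  # (25600, 1)
```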
remove_weight_norm()
Remove weight normalization module from all of the layers.
reset_parameters()
Reset parameters.
This initialization follows the official implementation: https://github.com/descriptinc/melgan-neurips/blob/master/mel2wav/modules.py