espnet2.gan_tts.hifigan.hifigan.HiFiGANMultiScaleMultiPeriodDiscriminator
class espnet2.gan_tts.hifigan.hifigan.HiFiGANMultiScaleMultiPeriodDiscriminator(scales: int = 3, scale_downsample_pooling: str = 'AvgPool1d', scale_downsample_pooling_params: Dict[str, Any] = {'kernel_size': 4, 'padding': 2, 'stride': 2}, scale_discriminator_params: Dict[str, Any] = {'bias': True, 'channels': 128, 'downsample_scales': [2, 2, 4, 4, 1], 'in_channels': 1, 'kernel_sizes': [15, 41, 5, 3], 'max_downsample_channels': 1024, 'max_groups': 16, 'nonlinear_activation': 'LeakyReLU', 'nonlinear_activation_params': {'negative_slope': 0.1}, 'out_channels': 1}, follow_official_norm: bool = True, periods: List[int] = [2, 3, 5, 7, 11], period_discriminator_params: Dict[str, Any] = {'bias': True, 'channels': 32, 'downsample_scales': [3, 3, 3, 3, 1], 'in_channels': 1, 'kernel_sizes': [5, 3], 'max_downsample_channels': 1024, 'nonlinear_activation': 'LeakyReLU', 'nonlinear_activation_params': {'negative_slope': 0.1}, 'out_channels': 1, 'use_spectral_norm': False, 'use_weight_norm': True})
Bases: Module
HiFi-GAN multi-scale + multi-period discriminator module.
Initialize HiFiGAN multi-scale + multi-period discriminator module.
- Parameters:
- scales (int) – Number of scales, i.e. the number of scale discriminators.
- scale_downsample_pooling (str) – Pooling module name for downsampling of the inputs.
- scale_downsample_pooling_params (dict) – Parameters for the above pooling module.
- scale_discriminator_params (dict) – Parameters for hifi-gan scale discriminator module.
- follow_official_norm (bool) – Whether to follow the norm setting of the official implementation. If true, the first scale discriminator uses spectral norm and the other scale discriminators use weight norm.
- periods (list) – List of periods.
- period_discriminator_params (dict) – Parameters for hifi-gan period discriminator module. The period parameter will be overwritten.
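The default parameters above imply the input sizes each sub-discriminator works on. The following is a minimal sketch in plain Python, not ESPnet code. It assumes that the pooling module is applied once between consecutive scale discriminators, and that each period discriminator pads the signal to a multiple of its period before folding it into a 2D `(T / period, period)` layout, as in the HiFi-GAN design. The helper names `pooled_length`, `scale_input_lengths`, and `period_input_shapes` are hypothetical. The length formula is the one documented for `torch.nn.AvgPool1d` with `ceil_mode=False`.

```python
from typing import List, Sequence, Tuple


def pooled_length(length: int, kernel_size: int = 4, stride: int = 2, padding: int = 2) -> int:
    """Output length of AvgPool1d (ceil_mode=False) for a given input length."""
    return (length + 2 * padding - kernel_size) // stride + 1


def scale_input_lengths(length: int, scales: int = 3) -> List[int]:
    """Input length seen by each scale discriminator.

    ASSUMPTION: the first discriminator sees the raw signal and the
    pooling is applied once before each subsequent discriminator.
    """
    lengths = [length]
    for _ in range(scales - 1):
        lengths.append(pooled_length(lengths[-1]))
    return lengths


def period_input_shapes(
    length: int, periods: Sequence[int] = (2, 3, 5, 7, 11)
) -> List[Tuple[int, int]]:
    """(rows, period) layout per period discriminator.

    ASSUMPTION: the signal is padded up to the next multiple of the
    period and then reshaped from (T,) to (T / period, period).
    """
    shapes = []
    for period in periods:
        padded = length + (-length) % period  # round up to a multiple of period
        shapes.append((padded // period, period))
    return shapes


scale_lengths = scale_input_lengths(8192)
period_shapes = period_input_shapes(8192)
print(scale_lengths)  # [8192, 4097, 2049]
print(period_shapes)  # [(4096, 2), (2731, 3), (1639, 5), (1171, 7), (745, 11)]
```

With the default `kernel_size=4`, `stride=2`, `padding=2`, each pooling step roughly halves the time resolution, so the three scale discriminators look at progressively coarser versions of the same signal.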
forward(x: Tensor) → List[List[Tensor]]
Calculate forward propagation.
- Parameters: x (Tensor) – Input audio signal (B, 1, T).
- Returns: List of lists of discriminator outputs, where each inner list consists of the output tensors of each layer of that discriminator. The multi-scale and multi-period outputs are concatenated.
- Return type: List[List[Tensor]]
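The nested return value can be easier to work with once it is split back into its two groups. The sketch below is illustrative only: it uses placeholder strings instead of tensors so that it runs without PyTorch or ESPnet, and the helpers `split_discriminator_outputs` and `final_outputs` are hypothetical names, not part of the library. It assumes the multi-scale outputs come first in the concatenated list and that the last element of each inner list is that discriminator's final output; check both against the installed version before relying on them.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_discriminator_outputs(
    outs: Sequence[List[T]], num_scales: int, num_periods: int
) -> Tuple[List[List[T]], List[List[T]]]:
    """Split the concatenated outputs into (scale_outs, period_outs).

    ASSUMPTION: scale discriminator outputs precede period ones.
    """
    if len(outs) != num_scales + num_periods:
        raise ValueError(
            f"expected {num_scales + num_periods} discriminator outputs, got {len(outs)}"
        )
    return list(outs[:num_scales]), list(outs[num_scales:])


def final_outputs(outs: Sequence[List[T]]) -> List[T]:
    """Take the last layer output of every discriminator.

    ASSUMPTION: the last element of each inner list is the final output.
    """
    return [layer_outs[-1] for layer_outs in outs]


# Placeholder data standing in for List[List[Tensor]]:
# 3 scale + 5 period discriminators, 3 layer outputs each (layer count is arbitrary here).
fake_outs = [[f"d{d}_layer{i}" for i in range(3)] for d in range(3 + 5)]

scale_outs, period_outs = split_discriminator_outputs(fake_outs, num_scales=3, num_periods=5)
finals = final_outputs(fake_outs)
print(len(scale_outs), len(period_outs))  # 3 5
print(finals[0], finals[-1])  # d0_layer2 d7_layer2
```

The same helpers work unchanged on real tensors, since they only index into the outer and inner lists.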