espnet2.gan_codec.hificodec.hificodec.HiFiCodecDiscriminator
class espnet2.gan_codec.hificodec.hificodec.HiFiCodecDiscriminator(msstft_discriminator_params: Dict[str, Any] = {'activation': 'LeakyReLU', 'activation_params': {'negative_slope': 0.2}, 'filters': 32, 'hop_lengths': [256, 512, 128, 64, 32], 'in_channels': 1, 'n_ffts': [1024, 2048, 512, 256, 128], 'norm': 'weight_norm', 'out_channels': 1, 'win_lengths': [1024, 2048, 512, 256, 128]}, scales: int = 3, scale_downsample_pooling: str = 'AvgPool1d', scale_downsample_pooling_params: Dict[str, Any] = {'kernel_size': 4, 'padding': 2, 'stride': 2}, scale_discriminator_params: Dict[str, Any] = {'bias': False, 'channels': 128, 'downsample_scales': [2, 2, 4, 4, 1], 'in_channels': 1, 'kernel_sizes': [15, 41, 5, 3], 'max_downsample_channels': 1024, 'max_groups': 16, 'nonlinear_activation': 'LeakyReLU', 'nonlinear_activation_params': {'negative_slope': 0.1}, 'out_channels': 1, 'use_spectral_norm': False, 'use_weight_norm': True}, scale_follow_official_norm: bool = False, periods: List[int] = [2, 3, 5, 7, 11], periods_discriminator_params: Dict[str, Any] = {'bias': False, 'channels': 32, 'downsample_scales': [3, 3, 3, 3, 1], 'in_channels': 1, 'kernel_sizes': [5, 3], 'max_downsample_channels': 1024, 'nonlinear_activation': 'LeakyReLU', 'nonlinear_activation_params': {'negative_slope': 0.1}, 'out_channels': 1, 'use_spectral_norm': False, 'use_weight_norm': True})
Bases: Module
HiFiCodec discriminator module.
Initialize HiFiCodec Discriminator module.
- Parameters:
- msstft_discriminator_params (Dict[str, Any]) – Parameters for multi-scale STFT discriminator module.
- scales (int) – Number of multi-scales.
- scale_downsample_pooling (str) – Pooling module name for downsampling of the inputs.
- scale_downsample_pooling_params (Dict[str, Any]) – Parameters for the above pooling module.
- scale_discriminator_params (Dict[str, Any]) – Parameters for hifi-gan scale discriminator module.
- scale_follow_official_norm (bool) – Whether the scale discriminators follow the norm setting of the official hifi-gan implementation.
- periods (List[int]) – List of periods.
- periods_discriminator_params (Dict[str, Any]) – Parameters for hifi-gan period discriminator module. The period parameter will be overwritten.
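The sketch below shows how the default `scales`, `scale_downsample_pooling_params`, and `periods` values would shape a waveform of length `T` before it reaches each sub-discriminator. It is pure Python and does not call ESPnet. The helper names `pooled_lengths` and `period_shapes` are hypothetical. It assumes that pooling is applied once between consecutive scales (using the standard `AvgPool1d` output-length formula) and that each period discriminator pads the input to a multiple of its period before folding it into a 2D layout, as HiFi-GAN-style discriminators typically do.

```python
from typing import List, Sequence, Tuple


def pooled_lengths(
    t: int,
    scales: int = 3,
    kernel_size: int = 4,
    stride: int = 2,
    padding: int = 2,
) -> List[int]:
    """Return the input length seen by each scale discriminator.

    Assumption: the pooling layer runs once between consecutive scales,
    so the first discriminator sees the raw length.
    """
    lengths = []
    for _ in range(scales):
        lengths.append(t)
        # AvgPool1d output length with the default ceil_mode=False.
        t = (t + 2 * padding - kernel_size) // stride + 1
    return lengths


def period_shapes(
    t: int, periods: Sequence[int] = (2, 3, 5, 7, 11)
) -> List[Tuple[int, int]]:
    """Return the (frames, period) layout each period discriminator would use.

    Assumption: the signal is right-padded to a multiple of the period
    and then reshaped from (B, 1, T) to (B, 1, T // period, period).
    """
    shapes = []
    for p in periods:
        padded = t + (-t) % p  # smallest multiple of p that is >= t
        shapes.append((padded // p, p))
    return shapes


if __name__ == "__main__":
    print(pooled_lengths(16000))  # [16000, 8001, 4001]
    print(period_shapes(16000))
```

With the defaults, a one-second 16 kHz input would be seen at lengths 16000, 8001, and 4001 by the three scale discriminators, under the assumptions stated above.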
forward(x: Tensor) → List[List[Tensor]]
Calculate forward propagation.
- Parameters: x (Tensor) – Input audio signal (B, 1, T).
- Returns: List of lists of discriminator outputs, where each inner list holds the output tensors of every layer of one discriminator. The multi-scale and multi-period outputs are concatenated.
- Return type: List[List[Tensor]]
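A common way to consume a `List[List[Tensor]]` result of this kind is to treat the last element of each inner list as that discriminator's final score and the earlier elements as intermediate feature maps (for example, for a feature-matching loss). The sketch below illustrates that split using placeholder strings in place of tensors. The helper `split_outputs` is hypothetical, and the "last element is the final score" convention is an assumption that this page does not confirm.

```python
from typing import Any, List, Sequence, Tuple


def split_outputs(
    outs: Sequence[Sequence[Any]],
) -> Tuple[List[Any], List[List[Any]]]:
    """Split nested discriminator outputs into final scores and features.

    Assumption: each inner sequence lists one discriminator's layer outputs
    in order, with the final score last.
    """
    finals = [layers[-1] for layers in outs]
    feats = [list(layers[:-1]) for layers in outs]
    return finals, feats


# Placeholder strings stand in for tensors so the sketch runs without PyTorch.
dummy_outs = [
    ["d0_layer0", "d0_layer1", "d0_score"],
    ["d1_layer0", "d1_score"],
]
finals, feats = split_outputs(dummy_outs)
print(finals)  # ['d0_score', 'd1_score']
print(feats)   # [['d0_layer0', 'd0_layer1'], ['d1_layer0']]
```

The same indexing works unchanged when the inner elements are tensors, since only list slicing is involved.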