espnet2.gan_codec.soundstream.soundstream.SoundStreamDiscriminator
class espnet2.gan_codec.soundstream.soundstream.SoundStreamDiscriminator(scales: int = 3, scale_downsample_pooling: str = 'AvgPool1d', scale_downsample_pooling_params: Dict[str, Any] = {'kernel_size': 4, 'padding': 2, 'stride': 2}, scale_discriminator_params: Dict[str, Any] = {'bias': True, 'channels': 128, 'downsample_scales': [2, 2, 4, 4, 1], 'in_channels': 1, 'kernel_sizes': [15, 41, 5, 3], 'max_downsample_channels': 1024, 'max_groups': 16, 'nonlinear_activation': 'LeakyReLU', 'nonlinear_activation_params': {'negative_slope': 0.1}, 'out_channels': 1}, scale_follow_official_norm: bool = False, complexstft_discriminator_params: Dict[str, Any] = {'chan_mults': [1, 2, 4, 4, 8, 8], 'channels': 32, 'hop_length': 256, 'in_channels': 1, 'n_fft': 1024, 'stft_normalized': False, 'strides': [[1, 2], [2, 2], [1, 2], [2, 2], [1, 2], [2, 2]], 'win_length': 1024})
Bases: Module
SoundStream discriminator module.
Initialize SoundStream Discriminator module.
- Parameters:
- scales (int) – Number of multi-scales.
- scale_downsample_pooling (str) – Pooling module name for downsampling of the inputs.
- scale_downsample_pooling_params (Dict[str, Any]) – Parameters for the above pooling module.
- scale_discriminator_params (Dict[str, Any]) – Parameters for the HiFi-GAN scale discriminator module.
- scale_follow_official_norm (bool) – Whether to follow the norm setting of the official implementation, in which the first discriminator uses spectral norm and the other discriminators use weight norm.
- complexstft_discriminator_params (Dict[str, Any]) – Parameters for the complex STFT discriminator module.
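
As a rough illustration of how `scales` and `scale_downsample_pooling_params` interact, the sketch below computes the input length each scale discriminator would see under the default pooling settings (`kernel_size=4`, `stride=2`, `padding=2`). It assumes the HiFi-GAN multi-scale convention, in which the first scale receives the raw input and each later scale receives the previous input pooled once; that behavior is not stated on this page. The helper names `avg_pool1d_out_len` and `per_scale_input_lengths` are hypothetical and not part of ESPnet.

```python
def avg_pool1d_out_len(length, kernel_size=4, stride=2, padding=2):
    """Output length of a 1-D average pooling layer (floor mode).

    Uses the standard formula floor((L + 2*padding - kernel_size) / stride) + 1.
    """
    return (length + 2 * padding - kernel_size) // stride + 1


def per_scale_input_lengths(length, scales=3, **pool_params):
    """Input length seen by each scale discriminator.

    Assumption: scale 0 sees the raw signal and every following scale sees
    the previous scale's input after one pooling step.
    """
    lengths = []
    for _ in range(scales):
        lengths.append(length)
        length = avg_pool1d_out_len(length, **pool_params)
    return lengths


# One second of 16 kHz audio with the default pooling parameters.
lengths = per_scale_input_lengths(16000, scales=3)
print(lengths)  # [16000, 8001, 4001]
```

With the default parameters each pooling step roughly halves the time resolution, so the three scales inspect the waveform at approximately full, half, and quarter rate.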
forward(x: Tensor) → List[List[Tensor]]
Calculate forward propagation.
- Parameters: x (Tensor) – Input noise signal (B, 1, T).
- Returns: List of per-discriminator output lists, each of which contains that discriminator's layer output tensors. The outputs of the multi-scale discriminators and the complex STFT discriminator are concatenated into one list.
- Return type: List[List[Tensor]]
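
The nested return value can be walked with two loops. The sketch below is a minimal, framework-free illustration that separates each discriminator's last entry from its earlier entries. It assumes, following the HiFi-GAN convention, that the last element of each inner list is that discriminator's final output and the preceding elements are intermediate feature maps; this page does not confirm that ordering. The helper `split_discriminator_outputs` is hypothetical, and plain strings stand in for tensors.

```python
from typing import Any, List, Tuple


def split_discriminator_outputs(
    outs: List[List[Any]],
) -> Tuple[List[Any], List[Any]]:
    """Split nested discriminator outputs into final outputs and features.

    Assumption: within each inner list, the last item is the final output
    and all earlier items are intermediate layer outputs.
    """
    finals, feats = [], []
    for layer_outs in outs:
        finals.append(layer_outs[-1])   # last layer output of this discriminator
        feats.extend(layer_outs[:-1])   # remaining intermediate outputs
    return finals, feats


# Stand-in values shaped like List[List[Tensor]]: two scale discriminators
# followed by one discriminator that returns a single output.
outs = [
    ["scale0_layer0", "scale0_layer1", "scale0_out"],
    ["scale1_layer0", "scale1_out"],
    ["stft_out"],
]
finals, feats = split_discriminator_outputs(outs)
print(finals)  # ['scale0_out', 'scale1_out', 'stft_out']
print(feats)   # ['scale0_layer0', 'scale0_layer1', 'scale1_layer0']
```

A split of this kind is typically what an adversarial loss (on the final outputs) and a feature-matching loss (on the intermediate outputs) would consume.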