espnet2.gan_codec.shared.loss.freq_loss.MultiScaleMelSpectrogramLoss
class espnet2.gan_codec.shared.loss.freq_loss.MultiScaleMelSpectrogramLoss(fs: int = 22050, range_start: int = 6, range_end: int = 11, window: str = 'hann', n_mels: int = 80, fmin: int | None = 0, fmax: int | None = None, center: bool = True, normalized: bool = False, onesided: bool = True, log_base: float | None = 10.0, alphas: bool = True)
Bases: Module
Multi-scale Mel-spectrogram loss.
- Parameters:
- fs (int) – Sampling rate.
- range_start (int) – Power of 2 to use for the first scale.
- range_end (int) – Power of 2 to use for the last scale.
- window (str) – Window type.
- n_mels (int) – Number of mel bins.
- fmin (Optional[int]) – Minimum frequency for Mel.
- fmax (Optional[int]) – Maximum frequency for Mel.
- center (bool) – Whether to center the analysis windows.
- normalized (bool) – Whether to normalize the spectrogram.
- onesided (bool) – Whether to use a one-sided spectrum.
- log_base (Optional[float]) – Log base value.
- alphas (bool) – Whether to use alphas as coefficients or not.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
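To make the scale range concrete, here is a minimal sketch of the power-of-two sizes implied by `range_start` and `range_end`. The helper `scale_window_sizes` and its `inclusive` flag are hypothetical and not part of ESPnet. The text above does not state whether the exponent `range_end` itself is included, so the sketch assumes a half-open range by default.

```python
def scale_window_sizes(range_start=6, range_end=11, inclusive=False):
    """Return the power-of-two sizes for exponents range_start..range_end.

    Hypothetical helper for illustration only. With inclusive=False the
    exponent range_end itself is excluded (an assumption, not confirmed here).
    """
    if range_end <= range_start:
        raise ValueError("range_end must be greater than range_start")
    stop = range_end + 1 if inclusive else range_end
    return [2 ** i for i in range(range_start, stop)]


print(scale_window_sizes())                # [64, 128, 256, 512, 1024]
print(scale_window_sizes(inclusive=True))  # [64, 128, 256, 512, 1024, 2048]
```

With the default arguments, the smallest scale is 64 samples, which is shorter than the default `n_mels` of 80, so the defaults are worth revisiting when changing either value.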
forward(y_hat: Tensor, y: Tensor) → Tensor
Calculate Mel-spectrogram loss.
- Parameters:
- y_hat (Tensor) – Generated waveform tensor (B, 1, T).
- y (Tensor) – Ground-truth waveform tensor (B, 1, T).
- Returns: Mel-spectrogram loss value.
- Return type: Tensor
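The calling convention can be illustrated with a simplified stand-in, assuming only PyTorch is installed. The function `multi_scale_spec_l1` below is hypothetical and is not the ESPnet implementation: it compares log-magnitude STFTs directly, with no Mel filterbank and no alpha weighting, and it assumes a hop of a quarter window and a half-open exponent range. It is meant only to show (B, 1, T) waveforms going in and a single loss value coming out.

```python
import torch
import torch.nn.functional as F


def multi_scale_spec_l1(y_hat, y, range_start=6, range_end=11):
    """Simplified stand-in: mean L1 distance between log-magnitude STFTs.

    Not the ESPnet loss: no Mel filterbank and no alpha weighting.
    """
    assert y_hat.shape == y.shape, "inputs must have the same shape"
    assert y.dim() == 3 and y.size(1) == 1, "expected shape (B, 1, T)"
    exponents = range(range_start, range_end)  # assumption: half-open range
    total = y.new_zeros(())
    for i in exponents:
        n_fft = 2 ** i
        window = torch.hann_window(n_fft, device=y.device, dtype=y.dtype)
        specs = []
        for wav in (y_hat, y):
            stft = torch.stft(
                wav.squeeze(1),          # (B, 1, T) -> (B, T)
                n_fft=n_fft,
                hop_length=n_fft // 4,   # assumption: quarter-window hop
                window=window,
                center=True,
                return_complex=True,
            )
            # Clamp before the log so silent bins do not produce -inf.
            specs.append(torch.log10(stft.abs().clamp(min=1e-5)))
        total = total + F.l1_loss(specs[0], specs[1])
    return total / len(exponents)


y = torch.randn(2, 1, 4000)             # ground-truth batch (B, 1, T)
y_hat = y + 0.1 * torch.randn_like(y)   # stand-in for generator output
loss = multi_scale_spec_l1(y_hat, y)
print(loss.shape)  # torch.Size([])
```

With ESPnet installed, the class documented here follows the same pattern according to its signature: construct `MultiScaleMelSpectrogramLoss` once, then pass `y_hat` and `y` of shape (B, 1, T) to `forward`.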