espnet2.gan_tts.hifigan.loss.MelSpectrogramLoss
class espnet2.gan_tts.hifigan.loss.MelSpectrogramLoss(fs: int = 22050, n_fft: int = 1024, hop_length: int = 256, win_length: int | None = None, window: str = 'hann', n_mels: int = 80, fmin: int | None = 0, fmax: int | None = None, center: bool = True, normalized: bool = False, onesided: bool = True, log_base: float | None = 10.0)
Bases: Module
Mel-spectrogram loss.
Initialize Mel-spectrogram loss.
- Parameters:
- fs (int) – Sampling rate.
- n_fft (int) – FFT points.
- hop_length (int) – Hop length.
- win_length (Optional[int]) – Window length.
- window (str) – Window type.
- n_mels (int) – Number of Mel basis filters.
- fmin (Optional[int]) – Minimum frequency for Mel.
- fmax (Optional[int]) – Maximum frequency for Mel.
- center (bool) – Whether to use a centered window.
- normalized (bool) – Whether to normalize the STFT output.
- onesided (bool) – Whether to use a one-sided STFT.
- log_base (Optional[float]) – Log base value.
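The STFT parameters above determine the shape of the spectrogram that the loss compares. As a rough sketch, assuming the underlying STFT follows the framing conventions of `torch.stft` (this page does not confirm that), the number of frames and frequency bins can be estimated as follows. The helper `stft_output_shape` is hypothetical and not part of ESPnet.

```python
def stft_output_shape(
    num_samples: int,
    n_fft: int = 1024,
    hop_length: int = 256,
    center: bool = True,
    onesided: bool = True,
) -> tuple:
    """Estimate (num_frames, num_freq_bins), assuming torch.stft-style framing."""
    # A one-sided STFT keeps only the non-negative frequency bins.
    num_freq_bins = n_fft // 2 + 1 if onesided else n_fft
    if center:
        # The signal is padded by n_fft // 2 on both sides before framing.
        num_frames = 1 + num_samples // hop_length
    else:
        # Without padding, only frames that fit entirely in the signal are kept.
        num_frames = 1 + (num_samples - n_fft) // hop_length
    return num_frames, num_freq_bins


# One second of audio at the default sampling rate (fs = 22050).
frames, bins = stft_output_shape(22050)
print(frames, bins)
```

With the defaults, the frequency axis has `n_fft // 2 + 1` bins, which matches the last dimension of the `spec` argument of `forward`.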
forward(y_hat: Tensor, y: Tensor, spec: Tensor | None = None, use_mse: bool = False) → Tensor
Calculate Mel-spectrogram loss.
- Parameters:
- y_hat (Tensor) – Generated waveform tensor (B, 1, T).
- y (Tensor) – Ground-truth waveform tensor (B, 1, T).
- spec (Optional[Tensor]) – Ground-truth linear amplitude spectrum tensor (B, T, n_fft // 2 + 1). If provided, it is used instead of the ground-truth waveform.
- use_mse (bool) – Whether to use MSE loss instead of L1 loss.
- Returns: Mel-spectrogram loss value.
- Return type: Tensor
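To make the computation concrete, here is a minimal sketch of a log-Mel-spectrogram loss written in plain PyTorch. It is not the ESPnet implementation: the functions `mel_basis`, `log_mel`, and `mel_loss_sketch` are hypothetical, the triangular HTK-style filterbank is an assumed stand-in for whatever Mel basis the library uses, and the clamping constant is an assumption. It only illustrates the general idea of converting both waveforms to log-Mel features and comparing them with an L1 or MSE distance.

```python
import math

import torch
import torch.nn.functional as F


def mel_basis(fs: int, n_fft: int, n_mels: int, fmin: float, fmax: float) -> torch.Tensor:
    """Build an assumed triangular Mel filterbank of shape (n_mels, n_fft // 2 + 1)."""
    freqs = torch.linspace(0.0, fs / 2, n_fft // 2 + 1)
    # HTK-style Hz <-> Mel conversion (an assumption made for this sketch).
    mel_min = 2595.0 * math.log10(1.0 + fmin / 700.0)
    mel_max = 2595.0 * math.log10(1.0 + fmax / 700.0)
    mel_pts = torch.linspace(mel_min, mel_max, n_mels + 2)
    hz_pts = 700.0 * (10.0 ** (mel_pts / 2595.0) - 1.0)
    lower = hz_pts[:-2].unsqueeze(1)
    center = hz_pts[1:-1].unsqueeze(1)
    upper = hz_pts[2:].unsqueeze(1)
    rising = (freqs - lower) / (center - lower)
    falling = (upper - freqs) / (upper - center)
    return torch.clamp(torch.minimum(rising, falling), min=0.0)


def log_mel(wav, basis, n_fft=1024, hop_length=256, log_base=10.0, eps=1e-10):
    """Convert a waveform (B, 1, T) to log-Mel features (B, frames, n_mels)."""
    window = torch.hann_window(n_fft)
    spec = torch.stft(
        wav.squeeze(1), n_fft, hop_length=hop_length, window=window,
        center=True, return_complex=True,
    )
    amp = spec.abs()                 # (B, n_freqs, frames)
    mel = torch.matmul(basis, amp)   # (B, n_mels, frames)
    out = torch.log(torch.clamp(mel, min=eps))
    if log_base is not None:
        out = out / math.log(log_base)  # change of base from natural log
    return out.transpose(1, 2)


def mel_loss_sketch(y_hat, y, use_mse=False, fs=22050, n_fft=1024, n_mels=80):
    """Compare log-Mel features of generated and ground-truth waveforms."""
    basis = mel_basis(fs, n_fft, n_mels, 0.0, fs / 2)
    mel_hat = log_mel(y_hat, basis, n_fft=n_fft)
    mel = log_mel(y, basis, n_fft=n_fft)
    return F.mse_loss(mel_hat, mel) if use_mse else F.l1_loss(mel_hat, mel)


torch.manual_seed(0)
y = torch.randn(2, 1, 4096)       # stand-in for ground-truth waveforms
y_hat = torch.randn(2, 1, 4096)   # stand-in for generated waveforms
loss = mel_loss_sketch(y_hat, y)
```

In a training loop the class documented here would normally be used directly, with the generated waveform as `y_hat` and the reference as `y`; the sketch only shows the kind of computation such a call performs.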