espnet.nets.pytorch_backend.frontends.feature_transform.LogMel
class espnet.nets.pytorch_backend.frontends.feature_transform.LogMel(fs: int = 16000, n_fft: int = 512, n_mels: int = 80, fmin: float = 0.0, fmax: float | None = None, htk: bool = False, norm=1)
Bases: Module
Convert STFT features to log-Mel filterbank (fbank) features.
The arguments are the same as those of librosa.filters.mel.
- Parameters:
- fs – number > 0 [scalar] sampling rate of the incoming signal
- n_fft – int > 0 [scalar] number of FFT components
- n_mels – int > 0 [scalar] number of Mel bands to generate
- fmin – float >= 0 [scalar] lowest frequency (in Hz)
- fmax – float >= 0 [scalar] highest frequency (in Hz). If None, use fmax = fs / 2.0
- htk – use HTK formula instead of Slaney
- norm – {None, 1, np.inf} [scalar] if 1, divide the triangular mel weights by the width of the mel band (area normalization). Otherwise, leave all the triangles with a peak value of 1.0
Initialize LogMel.
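To make the parameters more concrete, here is a minimal sketch of how fs, n_mels, fmin and fmax determine the mel band edges, and how n_fft determines the number of input frequency bins. It is not the ESPnet or librosa code: the helper names (hz_to_mel_htk, mel_to_hz_htk, mel_band_edges) are hypothetical, and it assumes the HTK mel formula (the htk=True case), whereas the default htk=False uses the Slaney scale.

```python
import math


def hz_to_mel_htk(freq_hz: float) -> float:
    """HTK mel scale: 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz_htk(mel: float) -> float:
    """Inverse of hz_to_mel_htk."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(fs=16000, n_mels=80, fmin=0.0, fmax=None):
    """Return the n_mels + 2 band-edge frequencies in Hz (hypothetical helper).

    Each triangular filter spans three consecutive edges, so n_mels filters
    need n_mels + 2 edges spaced evenly on the mel scale.
    """
    if fmax is None:
        fmax = fs / 2.0  # documented default: the Nyquist frequency
    lo, hi = hz_to_mel_htk(fmin), hz_to_mel_htk(fmax)
    step = (hi - lo) / (n_mels + 1)
    return [mel_to_hz_htk(lo + i * step) for i in range(n_mels + 2)]


edges = mel_band_edges()      # defaults from the class signature
n_freq_bins = 512 // 2 + 1    # one-sided spectrum size for n_fft = 512
print(len(edges), n_freq_bins)  # prints: 82 257
```

With the default arguments, the filterbank therefore maps 257 frequency bins to 80 mel bands, with edges running from 0 Hz to 8000 Hz.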
extra_repr()
Append an extra string representation.
forward(feat: Tensor, ilens: LongTensor) → Tuple[Tensor, LongTensor]
Calculate Logmel forward propagation.
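The arithmetic behind a log-Mel forward pass can be sketched in plain Python, without tensors, so each step is visible. This is an illustration under stated assumptions, not the module's implementation: log_mel_forward is a hypothetical function, the input is assumed to be a non-negative spectrum of shape (batch, frames, frequency bins), and both the small eps added before the logarithm and the zeroing of padded frames are assumptions.

```python
import math


def log_mel_forward(feat, ilens, melmat, eps=1e-20):
    """Plain-Python analogue of a log-Mel forward pass (hypothetical).

    feat:   nested lists, batch x frames x freq_bins
    ilens:  number of valid (non-padded) frames per utterance
    melmat: nested lists, freq_bins x n_mels filterbank weights
    """
    n_mels = len(melmat[0])
    out = []
    for frames, n_valid in zip(feat, ilens):
        utt = []
        for t, frame in enumerate(frames):
            if t >= n_valid:
                # Assumption: frames beyond the valid length are set to zero.
                utt.append([0.0] * n_mels)
                continue
            # Project the spectrum onto each mel band, then take the log.
            utt.append([
                math.log(sum(x * row[m] for x, row in zip(frame, melmat)) + eps)
                for m in range(n_mels)
            ])
        out.append(utt)
    return out, ilens


# Toy data: 2 utterances, 3 frames each, 4 frequency bins, 2 mel bands.
melmat = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.0, 0.5]]
feat = [
    [[1.0, 2.0, 3.0, 4.0]] * 3,
    [[2.0, 2.0, 2.0, 2.0]] * 2 + [[0.0] * 4],  # last frame is padding
]
ilens = [3, 2]
logmel, olens = log_mel_forward(feat, ilens, melmat)
```

Here the first valid frame projects to mel energies 2.0 and 6.0, so logmel[0][0] is approximately [log 2, log 6], and the lengths are returned unchanged, matching the (Tensor, LongTensor) pair in the signature above.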