espnet2.layers.augmentation.time_stretch
espnet2.layers.augmentation.time_stretch(waveform, sample_rate: int, factor: float, n_fft: float = 0.032, win_length: float | None = None, hop_length: float = 0.008, window: str | None = 'hann')
Time scaling (changing the playback speed without modifying the pitch) via a phase vocoder.
Note: This function should be used with caution as it changes the signal duration.
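To make the change in duration concrete, the following minimal sketch estimates the output length for a given factor. The helper name `stretched_num_samples` is hypothetical and not part of ESPnet, and the rounding rule is an assumption; the actual implementation may differ by a sample or so.

```python
def stretched_num_samples(num_samples: int, factor: float) -> int:
    """Estimate the number of output samples after stretching by `factor`.

    Hypothetical helper for illustration only; not part of ESPnet.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    # A speed-up factor above 1 shortens the signal, below 1 lengthens it.
    # Rounding to the nearest sample is an assumption of this sketch.
    return int(round(num_samples / factor))


one_second = 16000  # one second of audio at 16 kHz
slower = stretched_num_samples(one_second, 0.9)  # 90% speed -> 17778 samples
faster = stretched_num_samples(one_second, 1.3)  # 130% speed -> 12308 samples
```

Any labels aligned to the original time axis (frame-level targets, timestamps) would need to be rescaled by the same ratio.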
- Parameters:
- waveform (torch.Tensor) – audio signal (…, time)
- sample_rate (int) – sampling rate in Hz
- factor (float) – speed-up factor (e.g., 0.9 for 90% speed and 1.3 for 130% speed)
- n_fft (float) – length of the FFT (in seconds)
- win_length (float or None) – The window length (in seconds) used for the STFT. If None, it is treated as equal to n_fft
- hop_length (float) – The hop size (in seconds) used for the STFT
- window (str or None) – The windowing function applied to the signal after padding with zeros
- Returns: perturbed signal (…, time)
- Return type: ret (torch.Tensor)
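Because n_fft, win_length, and hop_length are given in seconds rather than samples, it can help to see what they correspond to at a given sampling rate. The sketch below is an illustration under stated assumptions: the helper `stft_sizes_in_samples` is hypothetical (not an ESPnet function), and multiplying by the sampling rate and rounding to the nearest integer is an assumed conversion that may not match the library's exact rule.

```python
from typing import Optional, Tuple


def stft_sizes_in_samples(
    sample_rate: int,
    n_fft: float = 0.032,
    win_length: Optional[float] = None,
    hop_length: float = 0.008,
) -> Tuple[int, int, int]:
    """Convert STFT sizes from seconds to samples.

    Hypothetical helper for illustration only; not part of ESPnet.
    Returns (n_fft, win_length, hop_length) in samples.
    """
    # Assumed conversion: seconds * sampling rate, rounded to an integer.
    n_fft_samples = int(round(n_fft * sample_rate))
    hop_samples = int(round(hop_length * sample_rate))
    # As documented above, a win_length of None falls back to n_fft.
    if win_length is None:
        win_samples = n_fft_samples
    else:
        win_samples = int(round(win_length * sample_rate))
    return n_fft_samples, win_samples, hop_samples


# With the default values at 16 kHz: 512-sample FFT and window, 128-sample hop.
sizes_16k = stft_sizes_in_samples(16000)
# An explicit 25 ms window at 16 kHz corresponds to 400 samples.
sizes_25ms = stft_sizes_in_samples(16000, win_length=0.025)
```

Under these assumptions the defaults give a hop of one quarter of the window (75% overlap) at any sampling rate, since both sizes scale together.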