espnet2.layers.augmentation.codecs
espnet2.layers.augmentation.codecs(waveform, sample_rate: int, format: str, compression: float | None = None, encoding: str | None = None, bits_per_sample: int | None = None)
Apply the specified codecs to the input signal.
Warning: This function is not expected to work before torchaudio 2.1.
NOTE
- This function supports only the CPU backend.
- The GSM codec can be used to emulate phone line channel effects.
- Parameters:
waveform (torch.Tensor) – audio signal (…, time)
sample_rate (int) – sampling rate in Hz
format (str) – file format. Valid values are “wav”, “mp3”, “ogg”, “vorbis”, “amr-nb”, “amb”, “flac”, “sph”, “gsm”, and “htk”.
compression (float or None, optional) – used for formats other than WAV. For more details see torchaudio.backend.sox_io_backend.save().
encoding (str or None, optional) – changes the encoding for the supported formats. Valid values are “PCM_S” (signed integer Linear PCM), “PCM_U” (unsigned integer Linear PCM), “PCM_F” (floating point PCM), “ULAW” (mu-law), and “ALAW” (a-law). For more details see torchaudio.backend.sox_io_backend.save().
bits_per_sample (int or None, optional) – changes the bit depth for the supported formats. For more details see torchaudio.backend.sox_io_backend.save().
- Returns: compressed signal (…, time)
- Return type: ret (torch.Tensor)
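Example: a minimal sketch of how the parameters above might be checked before calling codecs(), followed by a GSM call that emulates a phone line channel. The helper names build_codec_kwargs and apply_gsm are hypothetical and not part of ESPnet, and the 8000 Hz default is an assumption (a rate typical of telephone speech), not something this page specifies. Whether the final call succeeds depends on the installed torchaudio version, as the warning above notes.

```python
# Valid values copied from the parameter descriptions on this page.
VALID_FORMATS = (
    "wav", "mp3", "ogg", "vorbis", "amr-nb", "amb", "flac", "sph", "gsm", "htk",
)
VALID_ENCODINGS = ("PCM_S", "PCM_U", "PCM_F", "ULAW", "ALAW")


def build_codec_kwargs(sample_rate, format, compression=None,
                       encoding=None, bits_per_sample=None):
    """Hypothetical helper: validate arguments and return kwargs for codecs()."""
    if not isinstance(sample_rate, int) or sample_rate <= 0:
        raise ValueError(f"sample_rate must be a positive int, got {sample_rate!r}")
    if format not in VALID_FORMATS:
        raise ValueError(f"unsupported format: {format!r}")
    if encoding is not None and encoding not in VALID_ENCODINGS:
        raise ValueError(f"unsupported encoding: {encoding!r}")
    return {
        "sample_rate": sample_rate,
        "format": format,
        "compression": compression,
        "encoding": encoding,
        "bits_per_sample": bits_per_sample,
    }


def apply_gsm(waveform, sample_rate=8000):
    """Hypothetical wrapper: pass a (..., time) tensor through the GSM codec.

    The 8000 Hz default is an assumption, not taken from this page.
    """
    # Imported lazily so the validation helper works without ESPnet installed.
    from espnet2.layers.augmentation import codecs

    kwargs = build_codec_kwargs(sample_rate, "gsm")
    # Only the CPU backend is supported, so move the tensor to the CPU first.
    return codecs(waveform.cpu(), **kwargs)
```

Validating the format and encoding strings up front surfaces a typo as a ValueError in Python before any audio is handed to the codec backend.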