espnet2.layers.augmentation.codecs

espnet2.layers.augmentation.codecs(waveform, sample_rate: int, format: str, compression: float | None = None, encoding: str | None = None, bits_per_sample: int | None = None)

Apply the specified codecs to the input signal.

Warning: This function is not expected to work with torchaudio versions earlier than 2.1.

NOTE

- This function only supports the CPU backend.
- The GSM codec can be used to emulate phone line channel effects.

Parameters:
- waveform (torch.Tensor) – audio signal (…, time)
- sample_rate (int) – sampling rate in Hz
- format (str) – file format. Valid values are “wav”, “mp3”, “ogg”, “vorbis”, “amr-nb”, “amb”, “flac”, “sph”, “gsm”, and “htk”.
- compression (float or None, optional) – used for formats other than WAV. For more details, see torchaudio.backend.sox_io_backend.save().
- encoding (str or None, optional) – changes the encoding for the supported formats. Valid values are “PCM_S” (signed integer linear PCM), “PCM_U” (unsigned integer linear PCM), “PCM_F” (floating-point PCM), “ULAW” (mu-law), and “ALAW” (a-law). For more details, see torchaudio.backend.sox_io_backend.save().
- bits_per_sample (int or None, optional) – changes the bit depth for the supported formats. For more details, see torchaudio.backend.sox_io_backend.save().
Returns: compressed signal (…, time)
Return type: ret (torch.Tensor)
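
Example (sketch): the snippet below shows one way to check arguments against the values listed above before calling the function. The helpers `build_codec_kwargs` and `apply_codec_on_cpu` are hypothetical names introduced for illustration and are not part of ESPnet. The positive-sample-rate check and the `.cpu()` call are assumptions based on the parameter descriptions and the CPU-only note.

```python
from typing import Any, Dict, Optional

# Values copied from the parameter descriptions above.
VALID_FORMATS = (
    "wav", "mp3", "ogg", "vorbis", "amr-nb",
    "amb", "flac", "sph", "gsm", "htk",
)
VALID_ENCODINGS = ("PCM_S", "PCM_U", "PCM_F", "ULAW", "ALAW")


def build_codec_kwargs(
    sample_rate: int,
    format: str,
    compression: Optional[float] = None,
    encoding: Optional[str] = None,
    bits_per_sample: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: validate the arguments and return them as kwargs."""
    # Assumption: a sampling rate in Hz should be a positive number.
    if sample_rate <= 0:
        raise ValueError(f"sample_rate must be positive, got {sample_rate}")
    if format not in VALID_FORMATS:
        raise ValueError(f"unsupported format: {format!r}")
    if encoding is not None and encoding not in VALID_ENCODINGS:
        raise ValueError(f"unsupported encoding: {encoding!r}")
    return {
        "sample_rate": sample_rate,
        "format": format,
        "compression": compression,
        "encoding": encoding,
        "bits_per_sample": bits_per_sample,
    }


def apply_codec_on_cpu(waveform, **kwargs):
    """Hypothetical wrapper: move the tensor to the CPU, then call codecs()."""
    # Imported lazily so the validation helper can be used without ESPnet.
    from espnet2.layers.augmentation import codecs

    # The function is documented as CPU-only, so the tensor is moved first.
    return codecs(waveform.cpu(), **kwargs)
```

As an illustration, `apply_codec_on_cpu(waveform, **build_codec_kwargs(8000, "gsm"))` would request the GSM codec for a `(…, time)` tensor sampled at 8000 Hz. Whether the call succeeds depends on the installed torchaudio version, as the warning above indicates.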