espnet2.layers.sinc_conv.SincConv


class espnet2.layers.sinc_conv.SincConv(in_channels: int, out_channels: int, kernel_size: int, stride: int = 1, padding: int = 0, dilation: int = 1, window_func: str = 'hamming', scale_type: str = 'mel', fs: int | float = 16000)

Bases: Module

Sinc Convolution.

This module performs a convolution that uses Sinc filters in the time domain as kernels. Sinc filters act as band-pass filters in the spectral domain. Because the filtering is done as a convolution in the time domain, no transformation to the spectral domain is necessary.
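To make the band-pass idea concrete, here is a minimal pure-Python sketch of a time-domain band-pass kernel built as the difference of two ideal low-pass (sinc) responses, in the style of the SincNet formulation. The helper names `bandpass_sinc_kernel` and `magnitude_response` are hypothetical and are not part of ESPnet; the library's actual kernel construction may differ in scaling and parameterization.

```python
import cmath
import math


def bandpass_sinc_kernel(f_low, f_high, kernel_size):
    """Return band-pass FIR taps; f_low/f_high are in cycles per sample (0..0.5)."""
    if kernel_size % 2 == 0:
        raise ValueError("kernel_size must be odd so the kernel has a center tap")
    half = kernel_size // 2
    taps = []
    for n in range(-half, half + 1):
        if n == 0:
            # Limit of the expression below as n -> 0.
            taps.append(2.0 * (f_high - f_low))
        else:
            # Difference of two ideal low-pass (sinc) impulse responses.
            taps.append(
                (math.sin(2 * math.pi * f_high * n) - math.sin(2 * math.pi * f_low * n))
                / (math.pi * n)
            )
    return taps


def magnitude_response(taps, freq):
    """Magnitude of the filter's frequency response at `freq` (cycles per sample)."""
    return abs(sum(h * cmath.exp(-2j * math.pi * freq * n) for n, h in enumerate(taps)))


# Illustrative pass band of 1-2 kHz at a 16 kHz sample rate.
fs = 16000
taps = bandpass_sinc_kernel(1000 / fs, 2000 / fs, kernel_size=101)
in_band = magnitude_response(taps, 1500 / fs)      # close to 1 inside the band
out_of_band = magnitude_response(taps, 5000 / fs)  # close to 0 outside the band
```

Convolving a signal with `taps` keeps content between the two cutoff frequencies and attenuates the rest, which is why no explicit spectral transform is needed.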

This implementation of the Sinc convolution is heavily inspired by Ravanelli et al. (https://github.com/mravanelli/SincNet) and adapted for the ESPnet toolkit. Combine Sinc convolutions with a log compression activation function, as described in https://arxiv.org/abs/2010.07597
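As a rough illustration of what a log compression activation does, the sketch below applies log(|x| + 1) to a few filter outputs. This exact form is an assumption made for illustration; the function name `log_compression` is hypothetical, and the referenced paper and the ESPnet code should be consulted for the definition actually used.

```python
import math


def log_compression(x):
    """Assumed form: log(|x| + 1), applied element-wise to a filter output."""
    return math.log1p(abs(x))


values = [-3.0, 0.0, 0.5, 100.0]
compressed = [log_compression(v) for v in values]
```

Under this assumption, zero maps to zero, the sign of the input is discarded, and large magnitudes are compressed strongly.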

Notes: Currently, the same filters are applied to all input channels. The windowing function is applied to the kernel to obtain a smoother filter, not to the input values, which differs from traditional ASR.

Initialize Sinc convolutions.

Parameters:
- in_channels – Number of input channels.
- out_channels – Number of output channels.
- kernel_size – Sinc filter kernel size (needs to be an odd number).
- stride – See torch.nn.functional.conv1d.
- padding – See torch.nn.functional.conv1d.
- dilation – See torch.nn.functional.conv1d.
- window_func – Window function applied to the filter, one of ["hamming", "none"].
- scale_type – Frequency scale used to initialize the filters (default: "mel").
- fs (str, int, float) – Sample rate of the input data.

forward(xs: Tensor) → Tensor

Sinc convolution forward function.

Parameters: xs – Batch in form of torch.Tensor (B, C_in, D_in).
Returns: Batch in form of torch.Tensor (B, C_out, D_out).
Return type: Tensor

get_odim(idim: int) → int

Obtain the output dimension of the filter.
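The relationship between D_in and D_out can be sketched with the standard 1-D convolution shape formula documented for torch.nn.Conv1d. That get_odim follows this same formula is an assumption, and `conv1d_output_length` is a hypothetical standalone helper, not an ESPnet function.

```python
def conv1d_output_length(idim, kernel_size, stride=1, padding=0, dilation=1):
    """Output length of a 1-D convolution, per the torch.nn.Conv1d shape formula."""
    effective_kernel = dilation * (kernel_size - 1) + 1
    return (idim + 2 * padding - effective_kernel) // stride + 1


# 400 input samples, 101-tap kernel, no padding: 400 - 101 + 1 = 300
odim_default = conv1d_output_length(400, kernel_size=101)
# Same kernel with padding 50 and stride 2: (400 + 100 - 101) // 2 + 1 = 200
odim_strided = conv1d_output_length(400, kernel_size=101, stride=2, padding=50)
```

With padding equal to half of an odd kernel size and stride 1, the output length equals the input length.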

static hamming_window(x: Tensor) → Tensor

Hamming Windowing function.
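For reference, the textbook symmetric Hamming window and its application to a set of kernel taps can be sketched as follows. The helpers `hamming` and `apply_window` are hypothetical; the library method takes a tensor `x` and may use a different length or indexing convention than this sketch.

```python
import math


def hamming(kernel_size):
    """Textbook symmetric Hamming window with `kernel_size` points."""
    if kernel_size == 1:
        return [1.0]
    return [
        0.54 - 0.46 * math.cos(2 * math.pi * n / (kernel_size - 1))
        for n in range(kernel_size)
    ]


def apply_window(taps):
    """Multiply filter taps element-wise by a Hamming window of the same length."""
    return [h * w for h, w in zip(taps, hamming(len(taps)))]


window = hamming(5)
tapered = apply_window([1.0, 1.0, 1.0, 1.0, 1.0])
```

The window tapers the outer taps toward 0.08 and leaves the center tap unchanged, which reduces the ripple caused by truncating an ideal sinc response.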

init_filters()

Initialize filters with filterbank values.
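One plausible way to obtain filterbank values on a mel scale is to space band edges evenly in mel and map them back to Hz. The sketch below uses the common HTK-style mel formula; the formula, the `f_min` lower bound, and all function names are assumptions for illustration and are not taken from the ESPnet source.

```python
import math


def hz_to_mel(f_hz):
    """Common HTK-style mel mapping (an assumption, not taken from ESPnet)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(num_filters, fs, f_min=30.0):
    """Return (low, high) edge frequencies in Hz for `num_filters` adjacent bands."""
    mel_lo, mel_hi = hz_to_mel(f_min), hz_to_mel(fs / 2.0)
    step = (mel_hi - mel_lo) / num_filters
    points = [mel_to_hz(mel_lo + i * step) for i in range(num_filters + 1)]
    return list(zip(points[:-1], points[1:]))


# Eight adjacent bands between f_min and the Nyquist frequency of 16 kHz audio.
edges = mel_band_edges(num_filters=8, fs=16000)
widths = [high - low for low, high in edges]
```

Bands spaced this way are narrow at low frequencies and widen toward the Nyquist frequency, mirroring the resolution of a mel filterbank.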

static none_window(x: Tensor) → Tensor

Identity-like windowing function.

static sinc(x: Tensor) → Tensor

Sinc function.
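A scalar sketch of the sinc function is shown below. It uses the unnormalized convention sin(x) / x and handles x = 0 explicitly; whether the library method uses this convention or the normalized form sin(πx) / (πx), and how it treats x = 0, is not stated here, so treat both choices as assumptions.

```python
import math


def sinc(x):
    """Unnormalized sinc, sin(x) / x, with the removable singularity at 0 filled in."""
    if x == 0.0:
        return 1.0
    return math.sin(x) / x


samples = [sinc(x) for x in (-math.pi, -1.0, 0.0, 1.0, math.pi)]
```

The function is even, peaks at 1 for x = 0, and crosses zero at nonzero integer multiples of π.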