espnet2.gan_codec.shared.encoder.seanet.SEANetEncoder
class espnet2.gan_codec.shared.encoder.seanet.SEANetEncoder(channels: int = 1, dimension: int = 128, n_filters: int = 32, n_residual_layers: int = 1, ratios: List[int] = [8, 5, 4, 2], activation: str = 'ELU', activation_params: dict = {'alpha': 1.0}, norm: str = 'weight_norm', norm_params: Dict[str, Any] = {}, kernel_size: int = 7, last_kernel_size: int = 7, residual_kernel_size: int = 3, dilation_base: int = 2, causal: bool = False, pad_mode: str = 'reflect', true_skip: bool = False, compress: int = 2, lstm: int = 2)
Bases: Module
SEANet encoder.
- Parameters:
- channels (int) – Audio channels.
- dimension (int) – Intermediate representation dimension.
- n_filters (int) – Base width for the model.
- n_residual_layers (int) – Number of residual layers.
- ratios (List[int]) – Kernel size and stride ratios. The encoder uses downsampling ratios instead of upsampling ratios, so it applies the ratios in the reverse of the order specified here. The order specified here must match the decoder order.
- activation (str) – Activation function.
- activation_params (dict) – Parameters to provide to the activation function
- norm (str) – Normalization method.
- norm_params (dict) – Parameters to provide to the underlying normalization used along with the convolution.
- kernel_size (int) – Kernel size for the initial convolution.
- last_kernel_size (int) – Kernel size for the last convolution.
- residual_kernel_size (int) – Kernel size for the residual layers.
- dilation_base (int) – How much to increase the dilation with each layer.
- causal (bool) – Whether to use fully causal convolution.
- pad_mode (str) – Padding mode for the convolutions.
- true_skip (bool) – Whether to use true skip connection or a simple (streamable) convolution as the skip connection in the residual network blocks.
- compress (int) – Reduced dimensionality in residual branches (from Demucs v3).
- lstm (int) – Number of LSTM layers at the end of the encoder.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
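The description of `ratios` above implies two quantities that are easy to check by hand: the order in which the encoder applies the strides, and the overall downsampling factor (hop length). The following is a minimal sketch in plain Python, not the library's implementation. The helper names `encoder_strides` and `hop_length` and the 24 kHz sample rate are assumptions made for illustration only.

```python
from math import prod
from typing import List


def encoder_strides(ratios: List[int]) -> List[int]:
    """Return the strides in the order the encoder applies them.

    Hypothetical helper: the documented behavior is that the encoder
    uses `ratios` in reverse order.
    """
    return list(reversed(ratios))


def hop_length(ratios: List[int]) -> int:
    """Total downsampling factor, i.e. the product of all stride ratios."""
    return prod(ratios)


ratios = [8, 5, 4, 2]   # default value from the class signature
sample_rate = 24_000    # assumed sample rate, not a constructor argument

strides = encoder_strides(ratios)
hop = hop_length(ratios)
frame_rate = sample_rate / hop

print(strides)     # [2, 4, 5, 8]
print(hop)         # 320
print(frame_rate)  # 75.0
```

Under these assumptions, each output frame of the encoder covers 320 input samples, so one second of 24 kHz audio maps to roughly 75 frames. The exact number of frames also depends on the padding applied by the convolutions.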
forward(x)
Defines the computation performed at every call.
Should be overridden by all subclasses.
NOTE
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.