espnet2.enh.layers.dc_crn.DC_CRN

class espnet2.enh.layers.dc_crn.DC_CRN(input_dim, input_channels: List = [2, 16, 32, 64, 128, 256], enc_hid_channels=8, enc_kernel_size=(1, 3), enc_padding=(0, 1), enc_last_kernel_size=(1, 4), enc_last_stride=(1, 2), enc_last_padding=(0, 1), enc_layers=5, skip_last_kernel_size=(1, 3), skip_last_stride=(1, 1), skip_last_padding=(0, 1), glstm_groups=2, glstm_layers=2, glstm_bidirectional=False, glstm_rearrange=False, output_channels=2)

Bases: Module

Densely-Connected Convolutional Recurrent Network (DC-CRN).

Reference: Fig. 3 and Section III-B in [1]

Parameters:
- input_dim (int) – input feature dimension
- input_channels (list) – number of input channels for the stacked DenselyConnectedBlock layers. Its length should equal the number of DenselyConnectedBlock layers. An even number of channels is recommended to avoid an AssertionError when glstm_bidirectional=True.
- enc_hid_channels (int) – common number of intermediate channels for all DenselyConnectedBlock of the encoder
- enc_kernel_size (tuple) – common kernel size for all DenselyConnectedBlock of the encoder
- enc_padding (tuple) – common padding for all DenselyConnectedBlock of the encoder
- enc_last_kernel_size (tuple) – common kernel size for the last Conv layer in all DenselyConnectedBlock of the encoder
- enc_last_stride (tuple) – common stride for the last Conv layer in all DenselyConnectedBlock of the encoder
- enc_last_padding (tuple) – common padding for the last Conv layer in all DenselyConnectedBlock of the encoder
- enc_layers (int) – common total number of Conv layers for all DenselyConnectedBlock layers of the encoder
- skip_last_kernel_size (tuple) – common kernel size for the last Conv layer in all DenselyConnectedBlock of the skip pathways
- skip_last_stride (tuple) – common stride for the last Conv layer in all DenselyConnectedBlock of the skip pathways
- skip_last_padding (tuple) – common padding for the last Conv layer in all DenselyConnectedBlock of the skip pathways
- glstm_groups (int) – number of groups in each Grouped LSTM layer
- glstm_layers (int) – number of Grouped LSTM layers
- glstm_bidirectional (bool) – whether to use BLSTM or unidirectional LSTM in Grouped LSTM layers
- glstm_rearrange (bool) – whether to apply the rearrange operation after each grouped LSTM layer
- output_channels (int) – number of output channels (must be an even number to recover both real and imaginary parts)
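
The constraints stated in the parameter list can be checked before building the model. The following is a minimal sketch, not part of ESPnet: `conv_out_size`, `trace_freq_bins`, and `check_dc_crn_config` are hypothetical helper names. It assumes that the strided last Conv layer of each encoder block follows the standard convolution output-size formula (dilation 1), and that the number of encoder blocks (`num_blocks`) is supplied by the caller. With the default `enc_last_kernel_size=(1, 4)`, `enc_last_stride=(1, 2)`, and `enc_last_padding=(0, 1)`, only the frequency axis would shrink; the time axis (kernel 1, stride 1, padding 0) keeps its length.

```python
from typing import List, Sequence


def conv_out_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Standard convolution output size along one axis (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_freq_bins(
    input_dim: int,
    num_blocks: int,  # assumption: number of strided encoder blocks
    kernel: int = 4,  # frequency part of enc_last_kernel_size
    stride: int = 2,  # frequency part of enc_last_stride
    padding: int = 1,  # frequency part of enc_last_padding
) -> List[int]:
    """Return the frequency-axis size before and after each encoder block."""
    sizes = [input_dim]
    for _ in range(num_blocks):
        sizes.append(conv_out_size(sizes[-1], kernel, stride, padding))
    return sizes


def check_dc_crn_config(
    input_channels: Sequence[int],
    output_channels: int,
    glstm_bidirectional: bool = False,
) -> List[str]:
    """Collect violations of the constraints documented in the parameter list."""
    problems = []
    if output_channels % 2 != 0:
        problems.append(
            "output_channels must be even to recover real and imaginary parts"
        )
    if glstm_bidirectional and any(c % 2 != 0 for c in input_channels):
        problems.append(
            "odd channel counts may trigger an AssertionError "
            "when glstm_bidirectional=True"
        )
    return problems


if __name__ == "__main__":
    print(trace_freq_bins(input_dim=256, num_blocks=5))
    print(check_dc_crn_config([2, 16, 32, 64, 128, 256], output_channels=2))
```

Under these assumptions, each block halves the frequency axis, so an `input_dim` that is divisible by two at every stage avoids rounding in the size computation.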

forward(x)

DC-CRN forward.

Parameters: x (torch.Tensor) – Concatenated real and imaginary spectrum features (B, input_channels[0], T, F)
Returns: out – tensor of shape (B, 2, output_channels, T, F)
Return type: torch.Tensor
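
To illustrate what "concatenated real and imaginary spectrum features" means for the input layout, here is a small pure-Python sketch. It is illustrative only: `stack_real_imag` is a hypothetical helper, the batch dimension is omitted, and the real-then-imaginary channel order is an assumption that should be checked against the calling code. For a single-channel complex spectrum of shape (T, F) it produces 2 channels, which matches the default `input_channels[0] = 2`.

```python
from typing import List


def stack_real_imag(spec: List[List[complex]]) -> List[List[List[float]]]:
    """Turn a complex (T, F) spectrum into a (2, T, F) real-valued feature.

    Channel 0 holds the real parts and channel 1 the imaginary parts
    (assumed order, for illustration).
    """
    real = [[z.real for z in frame] for frame in spec]
    imag = [[z.imag for z in frame] for frame in spec]
    return [real, imag]


if __name__ == "__main__":
    # Two frames (T=2) with two frequency bins each (F=2).
    spec = [[1 + 2j, 3 - 4j], [0j, 5j]]
    feats = stack_real_imag(spec)
    print(len(feats), len(feats[0]), len(feats[0][0]))  # channels, T, F
```

With PyTorch tensors, the same layout would typically be obtained by concatenating the real and imaginary parts along the channel dimension before calling `forward`.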