espnet2.enh.decoder.conv_decoder.ConvDecoder

class espnet2.enh.decoder.conv_decoder.ConvDecoder(channel: int, kernel_size: int, stride: int)

Bases: AbsDecoder

Transposed convolutional decoder for speech enhancement and separation.

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(input: Tensor, ilens: Tensor, fs: int | None = None)

Forward pass: decodes the input frames with the transposed convolution.

Parameters:
- input (torch.Tensor) – spectrum [Batch, T, F]
- ilens (torch.Tensor) – input lengths [Batch]
- fs (int, optional) – sampling rate in Hz (not used)
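As a rough illustration of what a transposed convolutional decoder computes, the sketch below implements a naive transposed 1-D convolution with a single output channel and no bias, in plain Python, for one utterance. It is not the ESPnet implementation: the function name `transposed_conv1d`, the `weight` layout `[F, kernel_size]`, and the absence of a bias term are all assumptions made for the example, and the handling of `ilens` is left out.

```python
from typing import List


def transposed_conv1d(
    frames: List[List[float]],  # [T, F] feature frames for one utterance
    weight: List[List[float]],  # [F, kernel_size], hypothetical decoder basis
    stride: int,
) -> List[float]:
    """Naive transposed 1-D convolution: F input channels, 1 output channel."""
    num_frames = len(frames)
    kernel_size = len(weight[0])
    # Output length of an unpadded transposed convolution.
    out_len = (num_frames - 1) * stride + kernel_size
    wav = [0.0] * out_len
    for t, frame in enumerate(frames):
        for f, value in enumerate(frame):
            # Each feature value scales its basis and is added at offset t * stride.
            for k in range(kernel_size):
                wav[t * stride + k] += value * weight[f][k]
    return wav


frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]           # T=3, F=2
weight = [[1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 0.5, 0.5]]   # F=2, kernel_size=4
wav = transposed_conv1d(frames, weight, stride=2)
print(len(wav))  # (3 - 1) * 2 + 4 = 8 samples
```

Under these assumptions, T frames expand to (T - 1) * stride + kernel_size samples, and consecutive frames overlap whenever kernel_size is larger than stride.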

forward_streaming(input_frame: Tensor)

streaming_merge(chunks: Tensor, ilens: Tensor | None = None)

Stream Merge.

Merges the frame-level processed audio chunks produced in the streaming simulation. In real applications, the processed audio should instead be sent to the output channel frame by frame. You may refer to this function when managing your own streaming output buffer.

Parameters:
- chunks – list of frame-level chunks, each of shape (B, frame_size)
- ilens – input lengths [B]
Returns: merge_audio – merged audio [B, T]
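One common way to merge overlapping frame-level chunks is overlap-add, sketched below in plain Python for a single batch item. This is an illustration only, not the library's code: the function name `overlap_add`, the `hop_size` parameter (assumed here to play the role of the decoder stride, with `frame_size` playing the role of the kernel size), and the optional `out_len` trimming are assumptions.

```python
from typing import List, Optional


def overlap_add(
    chunks: List[List[float]],      # list of frames, each of length frame_size
    hop_size: int,                  # assumed shift between consecutive frames
    out_len: Optional[int] = None,  # optional target length, like a max input length
) -> List[float]:
    """Merge processed frames into one signal by summing overlapping regions."""
    frame_size = len(chunks[0])
    total = hop_size * (len(chunks) - 1) + frame_size
    merged = [0.0] * total
    for i, chunk in enumerate(chunks):
        start = i * hop_size
        for k, sample in enumerate(chunk):
            merged[start + k] += sample
    if out_len is not None:
        # Trim to the requested length.
        merged = merged[:out_len]
    return merged


chunks = [[1.0] * 4, [2.0] * 4, [3.0] * 4]  # three frames, frame_size=4
merged = overlap_add(chunks, hop_size=2)
trimmed = overlap_add(chunks, hop_size=2, out_len=7)
print(merged)
```

In a real-time setting the same accumulation would typically run on a rolling buffer, emitting hop_size samples as soon as no later frame can still contribute to them.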