espnet2.enh.separator.bsrnn_separator.BSRNNSeparator
class espnet2.enh.separator.bsrnn_separator.BSRNNSeparator(input_dim: int, num_spk: int = 1, num_channels: int = 16, num_layers: int = 6, target_fs: int = 48000, causal: bool = True, norm_type: str = 'GN', ref_channel: int | None = None)
Bases: AbsSeparator
Band-split RNN (BSRNN) separator.
References:
- [1] J. Yu, H. Chen, Y. Luo, R. Gu, and C. Weng, “High fidelity speech enhancement with band-split RNN,” in Proc. ISCA Interspeech, 2023. https://isca-speech.org/archive/interspeech_2023/yu23b_interspeech.html
- [2] J. Yu and Y. Luo, “Efficient monaural speech enhancement with universal sample rate band-split RNN,” in Proc. ICASSP, 2023. https://ieeexplore.ieee.org/document/10096020
Parameters:
- input_dim (int) – maximum number of frequency bins, corresponding to target_fs.
- num_spk (int) – number of speakers.
- num_channels (int) – feature dimension in the BandSplit block.
- num_layers (int) – number of processing layers.
- target_fs (int) – maximum sampling frequency (in Hz) that the model can handle.
- causal (bool) – whether to apply causal modeling. If True, an LSTM is used instead of a BLSTM for time modeling.
- norm_type (str) – type of the normalization layer (cfLN / cLN / BN / GN).
- ref_channel (int or None) – reference channel. Not used for now.
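Since input_dim is tied to target_fs, a quick way to pick a consistent value is to derive it from the STFT size. The following is a minimal sketch, assuming a one-sided STFT (which yields `n_fft // 2 + 1` frequency bins) and a 20 ms analysis window; both the window length and the helper names `n_fft_for_window` and `stft_num_bins` are illustrative assumptions, not part of ESPnet.

```python
def n_fft_for_window(fs_hz: int, window_ms: float) -> int:
    """Hypothetical helper: FFT size (in samples) for a window of window_ms at fs_hz."""
    return int(round(fs_hz * window_ms / 1000))


def stft_num_bins(n_fft: int) -> int:
    """Hypothetical helper: number of bins in a one-sided STFT of size n_fft."""
    if n_fft <= 0:
        raise ValueError("n_fft must be positive")
    return n_fft // 2 + 1


target_fs = 48000                          # default target_fs in the signature
n_fft = n_fft_for_window(target_fs, 20.0)  # assumed 20 ms window
input_dim = stft_num_bins(n_fft)           # candidate value for input_dim

print(f"n_fft={n_fft}, input_dim={input_dim}")
```

Under these assumptions, a different window length or sampling rate changes input_dim accordingly, so the value passed to the constructor should match the STFT front end actually in use.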
forward(input: Tensor | ComplexTensor, ilens: Tensor, additional: Dict | None = None) → Tuple[List[Tensor | ComplexTensor], Tensor, OrderedDict]
BSRNN Forward.
Parameters:
- input (torch.Tensor or ComplexTensor) – STFT spectrum [B, T, (C,) F (,2)]
- ilens (torch.Tensor) – input lengths [Batch]
- additional (Dict or None) – other data included in the model. Unused in this model.
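The input layout `[B, T, (C,) F (,2)]` allows either a complex spectrum or a real tensor with a trailing axis of size 2. The sketch below illustrates the latter layout with plain Python lists, assuming the trailing axis holds the real and imaginary parts in that order (an assumption; the description above does not spell this out). The helper `complex_to_real_imag` is hypothetical and handles a single unbatched `[T, F]` spectrum.

```python
from typing import List


def complex_to_real_imag(spec: List[List[complex]]) -> List[List[List[float]]]:
    """Hypothetical helper: map a [T][F] complex spectrum to a [T][F][2] real layout.

    Assumes index 0 of the last axis is the real part and index 1 the imaginary part.
    """
    return [[[z.real, z.imag] for z in frame] for frame in spec]


# Toy spectrum with T=2 frames and F=3 frequency bins.
spec = [
    [1 + 2j, 0 - 1j, 3 + 0j],
    [0.5 + 0.5j, 2 - 2j, 0j],
]
stacked = complex_to_real_imag(spec)

# Shape of the stacked layout: (T, F, 2)
shape = (len(stacked), len(stacked[0]), len(stacked[0][0]))
print(shape)
```

With real tensors the same stacking would add one leading batch axis (and optionally a channel axis) in front, giving the documented `[B, T, (C,) F, 2]` form.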
Returns:
- masked (List[Union[torch.Tensor, ComplexTensor]]) – [(B, T, F), …]
- ilens (torch.Tensor) – (B,)
- others (OrderedDict) – other predicted data, e.g. masks:
  OrderedDict[
    'mask_spk1': torch.Tensor(Batch, Frames, Freq),
    'mask_spk2': torch.Tensor(Batch, Frames, Freq),
    …
    'mask_spkn': torch.Tensor(Batch, Frames, Freq),
  ]
property num_spk