espnet2.enh.separator.dpcl_e2e_separator.DPCLE2ESeparator
class espnet2.enh.separator.dpcl_e2e_separator.DPCLE2ESeparator(input_dim: int, rnn_type: str = 'blstm', num_spk: int = 2, predict_noise: bool = False, nonlinear: str = 'tanh', layer: int = 2, unit: int = 512, emb_D: int = 40, dropout: float = 0.0, alpha: float = 5.0, max_iteration: int = 500, threshold: float = 1e-05)
Bases: AbsSeparator
Deep Clustering End-to-End Separator
References
Single-Channel Multi-Speaker Separation using Deep Clustering; Yusuf Isik et al., 2016; https://www.isca-speech.org/archive/interspeech_2016/isik16_interspeech.html
- Parameters:
- input_dim – input feature dimension
 - rnn_type – string, select from ‘blstm’, ‘lstm’ etc.
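 - bidirectional – bool, whether the RNN layers are bidirectional. This argument does not appear in the constructor signature above; directionality is selected through rnn_type (e.g. ‘blstm’ vs. ‘lstm’).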
 - num_spk – number of speakers
 - predict_noise – whether to output the estimated noise signal
 - nonlinear – the nonlinear function for mask estimation, select from ‘relu’, ‘tanh’, ‘sigmoid’
 - layer – int, number of stacked RNN layers. Default is 2.
 - unit – int, dimension of the hidden state.
 - emb_D – int, dimension of the feature vector for a tf-bin.
 - dropout – float, dropout ratio. Default is 0.
 - alpha – float, the clustering hardness parameter.
 - max_iteration – int, the maximum number of soft k-means iterations.
 - threshold – float, the threshold to end the soft k-means process.
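
The alpha, max_iteration and threshold arguments all control the soft k-means step. As a rough illustration of how such parameters typically interact, here is a minimal, generic soft k-means sketch in plain Python. It is not the ESPnet implementation: the function name `soft_kmeans`, the list-based data layout, the small constant in the denominator, and the stopping rule (squared centroid shift below `threshold`) are assumptions made for this example.

```python
import math


def soft_kmeans(points, centers, alpha=5.0, max_iteration=500, threshold=1e-5):
    """Generic soft k-means on lists of floats (illustrative, not ESPnet code)."""
    centers = [list(c) for c in centers]
    gamma = []
    for _ in range(max_iteration):
        # Soft assignment: softmax over clusters of -alpha * squared distance.
        # A larger alpha makes the assignments closer to hard (one-hot).
        gamma = []
        for p in points:
            logits = [-alpha * sum((x - m) ** 2 for x, m in zip(p, c)) for c in centers]
            top = max(logits)  # subtract the max for numerical stability
            weights = [math.exp(z - top) for z in logits]
            total = sum(weights)
            gamma.append([w / total for w in weights])
        # Centroid update: assignment-weighted mean of all points.
        new_centers = []
        for k in range(len(centers)):
            mass = sum(g[k] for g in gamma) + 1e-8  # assumed guard against division by zero
            new_centers.append(
                [sum(g[k] * p[d] for g, p in zip(gamma, points)) / mass
                 for d in range(len(points[0]))]
            )
        # Assumed stopping rule: total squared centroid shift below threshold.
        shift = sum((a - b) ** 2
                    for old, new in zip(centers, new_centers)
                    for a, b in zip(old, new))
        centers = new_centers
        if shift < threshold:
            break
    return centers, gamma


# Toy data: two well-separated groups of 2-D "embeddings".
points = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
centers, gamma = soft_kmeans(points, centers=[[0.0, 0.0], [1.0, 1.0]])
```

In this sketch each row of `gamma` holds one point's soft membership across the clusters and sums to one; with the separator, the points would be the emb_D-dimensional embeddings of the time-frequency bins and the number of clusters would be num_spk.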
 
 
forward(input: Tensor | ComplexTensor, ilens: Tensor, additional: Dict | None = None) → Tuple[List[Tensor | ComplexTensor], Tensor, OrderedDict]
Forward.
Parameters:
- input (torch.Tensor or ComplexTensor) – Encoded feature [B, T, F]
 - ilens (torch.Tensor) – input lengths [Batch]
 
Returns:
- masked (List[Union(torch.Tensor, ComplexTensor)]) – [(B, T, N), …]
 - ilens (torch.Tensor) – (B,)
 - others (OrderedDict) – other predicted data, e.g. masks: OrderedDict[ ‘mask_spk1’: torch.Tensor(Batch, Frames, Freq), ‘mask_spk2’: torch.Tensor(Batch, Frames, Freq), … ‘mask_spkn’: torch.Tensor(Batch, Frames, Freq), ]
 
Return type: Tuple[List[Tensor | ComplexTensor], Tensor, OrderedDict]
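
To make the relationship between the returned masks and the masked features concrete, the following plain-Python sketch multiplies a toy feature matrix by per-speaker masks keyed in the same ‘mask_spk1’, ‘mask_spk2’ style as above. It assumes that each masked output is the element-wise product of the input feature and the corresponding mask; the helper name `apply_masks` and the nested-list layout are hypothetical and stand in for the tensors used by the real class.

```python
from collections import OrderedDict


def apply_masks(feature, masks):
    """Multiply one [T][F] feature matrix by each speaker mask (hypothetical helper)."""
    return [
        [[x * m for x, m in zip(frame, mask_frame)]
         for frame, mask_frame in zip(feature, mask)]
        for mask in masks.values()
    ]


# Toy input: T=2 frames, F=2 frequency bins, two speakers.
feature = [[1.0, 2.0], [3.0, 4.0]]
masks = OrderedDict(
    mask_spk1=[[1.0, 0.0], [0.5, 0.25]],
    mask_spk2=[[0.0, 1.0], [0.5, 0.75]],
)
masked = apply_masks(feature, masks)  # one [T][F] matrix per speaker
```

The result has one entry per speaker, in the same order as the mask keys, mirroring the list-of-tensors shape described for the first return value.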
property num_spk
