espnet2.enh.layers.beamformer.get_rank1_mwf_vector
espnet2.enh.layers.beamformer.get_rank1_mwf_vector
espnet2.enh.layers.beamformer.get_rank1_mwf_vector(psd_speech, psd_noise, reference_vector: Tensor | int, denoising_weight: float = 1.0, approx_low_rank_psd_speech: bool = False, iterations: int = 3, diagonal_loading: bool = True, diag_eps: float = 1e-07, eps: float = 1e-08)
Return the R1-MWF (Rank-1 Multi-channel Wiener Filter) vector
h = (Npsd^-1 @ Spsd) / (mu + Tr(Npsd^-1 @ Spsd)) @ u
Reference: : [1] Rank-1 constrained multichannel Wiener filter for speech recognition in noisy environments; Z. Wang et al, 2018 https://hal.inria.fr/hal-01634449/document [2] Low-rank approximation based multichannel Wiener filter algorithms for noise reduction with application in cochlear implants; R. Serizel, 2014 https://ieeexplore.ieee.org/document/6730918
- Parameters:
- psd_speech (torch.complex64/ComplexTensor) – speech covariance matrix (…, F, C, C)
- psd_noise (torch.complex64/ComplexTensor) – noise covariance matrix (…, F, C, C)
- reference_vector (torch.Tensor or int) – (…, C) or scalar
- denoising_weight (float) – a trade-off parameter between noise reduction and speech distortion. A larger value leads to more noise reduction at the expense of more speech distortion. When denoising_weight = 0, it corresponds to MVDR beamformer.
- approx_low_rank_psd_speech (bool) – whether to replace original input psd_speech with its low-rank approximation as in [1]
- iterations (int) – number of iterations in power method, only used when approx_low_rank_psd_speech = True
- diagonal_loading (bool) – Whether to add a tiny term to the diagonal of psd_n
- diag_eps (float)
- eps (float)
- Returns: (…, F, C)
- Return type: beamform_vector (torch.complex64/ComplexTensor)