espnet2.enh.layers.beamformer.get_sdw_mwf_vector

espnet2.enh.layers.beamformer.get_sdw_mwf_vector(psd_speech, psd_noise, reference_vector: Tensor | int, denoising_weight: float = 1.0, approx_low_rank_psd_speech: bool = False, iterations: int = 3, diagonal_loading: bool = True, diag_eps: float = 1e-07, eps: float = 1e-08)

Return the SDW-MWF (Speech Distortion Weighted Multi-channel Wiener Filter) vector:

h = (Spsd + mu * Npsd)^-1 @ Spsd @ u

Here Spsd is psd_speech, Npsd is psd_noise, mu is denoising_weight, and u is the reference vector.
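The formula above can be sketched in a few lines of NumPy. This is only an illustration of the closed-form expression, not the ESPnet implementation: the function name `sdw_mwf_sketch` and its `ref_channel` and `mu` arguments are hypothetical, the reference vector is assumed to be a one-hot vector selecting one channel, and diagonal loading and the low-rank approximation are omitted.

```python
import numpy as np

def sdw_mwf_sketch(psd_speech, psd_noise, ref_channel=0, mu=1.0):
    """Illustrative SDW-MWF: h = (Spsd + mu * Npsd)^-1 @ Spsd @ u.

    psd_speech, psd_noise: complex arrays of shape (..., F, C, C).
    Returns an array of shape (..., F, C).
    """
    C = psd_speech.shape[-1]
    # Assumption: u is a one-hot vector that picks the reference channel.
    u = np.zeros(C, dtype=psd_speech.dtype)
    u[ref_channel] = 1.0
    # Right-hand side Spsd @ u, shape (..., F, C).
    rhs = psd_speech @ u
    # Solve the linear system instead of forming the explicit inverse.
    h = np.linalg.solve(psd_speech + mu * psd_noise, rhs[..., None])
    return h[..., 0]
```

With `psd_noise` set to zero and an invertible `psd_speech`, the sketch returns the reference vector itself, which is a quick sanity check on the formula.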

Reference:
- [1] Spatially pre-processed speech distortion weighted multi-channel Wiener filtering for noise reduction; A. Spriet et al., 2004. https://dl.acm.org/doi/abs/10.1016/j.sigpro.2004.07.028
- [2] Rank-1 constrained multichannel Wiener filter for speech recognition in noisy environments; Z. Wang et al., 2018. https://hal.inria.fr/hal-01634449/document
- [3] Low-rank approximation based multichannel Wiener filter algorithms for noise reduction with application in cochlear implants; R. Serizel, 2014. https://ieeexplore.ieee.org/document/6730918

Parameters:
- psd_speech (torch.complex64/ComplexTensor) – speech covariance matrix (…, F, C, C)
- psd_noise (torch.complex64/ComplexTensor) – noise covariance matrix (…, F, C, C)
- reference_vector (torch.Tensor or int) – (…, C) or scalar
- denoising_weight (float) – a trade-off parameter between noise reduction and speech distortion. A larger value leads to more noise reduction at the expense of more speech distortion. The plain MWF is obtained with denoising_weight = 1 (the default).
- approx_low_rank_psd_speech (bool) – whether to replace original input psd_speech with its low-rank approximation as in [2]
- iterations (int) – number of iterations in power method, only used when approx_low_rank_psd_speech = True
- diagonal_loading (bool) – whether to add a tiny term to the diagonal of psd_noise
- diag_eps (float)
- eps (float)
Returns: beamform_vector (…, F, C)
Return type: torch.complex64/ComplexTensor
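The diagonal_loading option exists because the matrix being inverted can be singular or badly conditioned. A minimal NumPy sketch of one common form of diagonal loading follows; it is an assumption for illustration and not necessarily the exact scheme ESPnet applies. The function name `diagonal_loading_sketch` and its `reg` and `eps` arguments are hypothetical, with defaults chosen to mirror diag_eps and eps in the signature above.

```python
import numpy as np

def diagonal_loading_sketch(mat, reg=1e-7, eps=1e-8):
    """Add a small, trace-scaled term to the diagonal of mat (..., C, C)."""
    C = mat.shape[-1]
    # Scale the loading by the trace so it adapts to the matrix magnitude.
    trace = np.trace(mat, axis1=-2, axis2=-1).real[..., None, None]
    # eps keeps the loading nonzero even when the trace is zero.
    return mat + (reg * trace + eps) * np.eye(C, dtype=mat.dtype)
```

Only the diagonal changes; the off-diagonal entries, which carry the inter-channel correlations, are left untouched.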