espnet.nets.ctc_prefix_score.CTCPrefixScoreTH
class espnet.nets.ctc_prefix_score.CTCPrefixScoreTH(x, xlens, blank, eos, margin=0)
Bases: object
Batch processing of CTCPrefixScore, based on Algorithm 2 in Watanabe et al., “Hybrid CTC/Attention Architecture for End-to-End Speech Recognition,” but extended to efficiently compute the label probabilities for multiple hypotheses simultaneously. See also Seki et al., “Vectorized Beam Search for CTC-Attention-Based Speech Recognition,” in INTERSPEECH (pp. 3825-3829), 2019.
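To make the underlying recursion concrete, here is a minimal single-hypothesis sketch of a CTC prefix score in plain Python. It is not this class's implementation: it works in the probability domain on nested lists, handles one hypothesis at a time, and the names `initial_state` and `extend` are hypothetical. A batched scorer would typically apply the same recursion to tensors, and usually in the log domain for numerical stability.

```python
def initial_state(probs, blank=0):
    """Forward variables of the empty prefix for every frame t."""
    gamma_n = [0.0] * len(probs)  # ending in a non-blank: impossible for an empty prefix
    gamma_b = []                  # ending in blank: all-blank paths up to frame t
    acc = 1.0
    for frame in probs:
        acc *= frame[blank]
        gamma_b.append(acc)
    return gamma_n, gamma_b


def extend(probs, state, last, c, blank=0):
    """Extend prefix g (forward variables `state`, last label `last`) by label c.

    Returns (psi, new_state), where psi is the probability that the CTC
    output starts with g + [c]. Pass last=None for the empty prefix.
    """
    prev_n, prev_b = state
    T = len(probs)
    new_n = [0.0] * T
    new_b = [0.0] * T
    if last is None:  # empty prefix: c may be emitted on the very first frame
        new_n[0] = probs[0][c]
    psi = new_n[0]
    for t in range(1, T):
        # Mass of g at frame t-1 from which c can be newly emitted at frame t.
        # A repeated label needs a blank in between, so only prev_b counts.
        phi = prev_b[t - 1] if c == last else prev_n[t - 1] + prev_b[t - 1]
        new_n[t] = (new_n[t - 1] + phi) * probs[t][c]
        new_b[t] = (new_n[t - 1] + new_b[t - 1]) * probs[t][blank]
        psi += phi * probs[t][c]
    return psi, (new_n, new_b)


# Toy posteriors: T = 2 frames, labels 0 = blank, 1 = "a", 2 = "b".
probs = [[0.5, 0.3, 0.2], [0.4, 0.4, 0.2]]
state0 = initial_state(probs)
psi_a, state_a = extend(probs, state0, None, 1)  # prefix "a"
psi_ab, _ = extend(probs, state_a, 1, 2)         # prefix "ab"
```

The state returned by one extension is the input to the next, which is why the scorer has to carry a per-hypothesis state through beam search.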
Construct a CTC prefix scorer.
- Parameters:
- x (torch.Tensor) – input label posterior sequences (B, T, O)
- xlens (torch.Tensor) – input lengths (B,)
- blank (int) – blank label id
- eos (int) – end-of-sequence id
- margin (int) – margin parameter for windowing (0 means no windowing)
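The shapes above imply that utterances of different lengths are padded to a common T, with the true lengths passed in `xlens`. The sketch below shows one way such a batch could be assembled, using nested lists instead of `torch.Tensor` so that it runs without dependencies. The helper `pad_posteriors` is hypothetical, and so is the padding convention (padded frames put all mass on the blank label). Whether the class expects probabilities or log-probabilities is not stated here, so check that before adapting this.

```python
def pad_posteriors(seqs, blank=0):
    """Pad per-utterance posterior sequences to a common length.

    seqs: list of B sequences, each a list of frames with O label posteriors.
    Returns (x, xlens) with x shaped (B, T, O) and xlens shaped (B,).
    """
    max_len = max(len(s) for s in seqs)
    odim = len(seqs[0][0])
    # Assumed convention: a padded frame emits blank with probability 1.
    pad_frame = [0.0] * odim
    pad_frame[blank] = 1.0
    x = []
    for s in seqs:
        frames = [list(f) for f in s]
        frames += [list(pad_frame) for _ in range(max_len - len(s))]
        x.append(frames)
    xlens = [len(s) for s in seqs]
    return x, xlens


# Two utterances with 3 and 1 frames over O = 2 labels.
x, xlens = pad_posteriors(
    [[[0.1, 0.9], [0.2, 0.8], [0.5, 0.5]], [[0.3, 0.7]]], blank=0
)
```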
extend_prob(x)
Extend CTC prob.
- Parameters: x (torch.Tensor) – input label posterior sequences (B, T, O)
extend_state(state)
Compute CTC prefix state.
- Parameters: state – CTC state
- Returns: ctc_state
index_select_state(state, best_ids)
Select CTC states according to best ids.
- Parameters:
- state – CTC state
- best_ids – index numbers selected by beam pruning (B, W)
- Returns: selected_state
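As an illustration of what selecting states by `best_ids` can look like, the sketch below assumes that each id indexes a flattened table of W × O candidate extensions per utterance, so that `divmod(id, O)` recovers the parent hypothesis and the appended label. That layout is an assumption made for this example, not something this page specifies. The helper `select_states` and the nested-list state are hypothetical stand-ins for the tensor-based state.

```python
def select_states(states, best_ids, odim):
    """Keep only the candidate states that survived beam pruning.

    states[b][w][c]: state after extending hypothesis w of utterance b with label c.
    best_ids[b]: W flattened indices into the (W * odim) candidates of utterance b.
    """
    selected = []
    for b, row in enumerate(best_ids):
        kept = []
        for flat_id in row:
            w, c = divmod(flat_id, odim)  # parent hypothesis, appended label
            kept.append(states[b][w][c])
        selected.append(kept)
    return selected


# B = 1 utterance, W = 2 hypotheses, O = 3 labels; strings stand in for states.
states = [[[f"h{w}+{c}" for c in range(3)] for w in range(2)]]
best_ids = [[4, 0]]
selected = select_states(states, best_ids, odim=3)
```

With tensors, the same gather is usually a single indexed selection over a flattened batch dimension rather than a Python loop.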