espnet2.asr.transducer.beam_search_transducer_streaming.BeamSearchTransducerStreaming
class espnet2.asr.transducer.beam_search_transducer_streaming.BeamSearchTransducerStreaming(decoder: AbsDecoder, joint_network: JointNetwork, beam_size: int, lm: Module | None = None, lm_weight: float = 0.1, search_type: str = 'default', max_sym_exp: int = 2, u_max: int = 50, nstep: int = 1, prefix_alpha: int = 1, expansion_gamma: int = 2.3, expansion_beta: int = 2, score_norm: bool = True, score_norm_during: bool = False, nbest: int = 1, penalty: float = 0.0, token_list: List[str] | None = None, hold_n: int = 0)
Bases: object
Beam search implementation for Transducer.
Initialize Transducer search module.
- Parameters:
- decoder – Decoder module.
- joint_network – Joint network module.
- beam_size – Beam size.
- lm – Language model module (optional).
- lm_weight – LM weight for soft fusion.
- search_type – Search algorithm to use during inference.
- max_sym_exp – Maximum number of symbol expansions at each time step. (TSD)
- u_max – Maximum output sequence length. (ALSD)
- nstep – Maximum number of expansion steps at each time step. (NSC/mAES)
- prefix_alpha – Maximum prefix length in prefix search. (NSC/mAES)
- expansion_beta – Number of additional candidates for expanded hypotheses selection. (mAES)
- expansion_gamma – Allowed logp difference for prune-by-value method. (mAES)
- score_norm – Normalize final scores by length. (default)
- score_norm_during – Normalize scores by length during search. (default, TSD, ALSD)
- nbest – Number of final hypotheses.
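To make the role of `lm_weight` concrete, here is a minimal sketch of soft (shallow) fusion, assuming the LM log-probabilities are simply scaled by `lm_weight` and added to the transducer log-probabilities for each candidate token. The helper `fuse_scores` and the toy distributions are hypothetical and are not part of ESPnet.

```python
import math

def fuse_scores(am_logp, lm_logp, lm_weight=0.1):
    """Add weighted LM log-probs to transducer log-probs (hypothetical helper)."""
    return [a + lm_weight * l for a, l in zip(am_logp, lm_logp)]

def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=values.__getitem__)

# Toy per-token distributions over a 3-token vocabulary.
am_logp = [math.log(0.6), math.log(0.3), math.log(0.1)]
lm_logp = [math.log(0.2), math.log(0.7), math.log(0.1)]

# A small weight leaves the transducer's preferred token on top.
best_light = argmax(fuse_scores(am_logp, lm_logp, lm_weight=0.1))

# A large weight lets the LM override the transducer.
best_heavy = argmax(fuse_scores(am_logp, lm_logp, lm_weight=1.0))
```

With these toy numbers the two weights select different tokens, which is why `lm_weight` is usually tuned on held-out data.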
align_length_sync_decoding(enc_out: Tensor) → List[Hypothesis]
Alignment-length synchronous beam search implementation.
Based on https://ieeexplore.ieee.org/document/9053040
- Parameters: enc_out – Encoder output sequence. (T, D)
- Returns: N-best hypotheses.
- Return type: nbest_hyps
default_beam_search(enc_out: Tensor) → List[Hypothesis]
Beam search implementation.
Modified from https://arxiv.org/pdf/1211.3711.pdf
- Parameters: enc_out – Encoder output sequence. (T, D)
- Returns: N-best hypotheses.
- Return type: nbest_hyps
greedy_search(enc_out: Tensor) → List[Hypothesis]
Greedy search implementation.
- Parameters: enc_out – Encoder output sequence. (T, D_enc)
- Returns: 1-best hypothesis.
- Return type: hyp
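The greedy strategy can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the ESPnet code: it assumes blank has ID 0, that at most one non-blank symbol is emitted per encoder frame, and it replaces the decoder and joint network with hypothetical `predict` and `joint` callables.

```python
import math

def greedy_transducer(enc_out, predict, joint, blank=0):
    """Greedy transducer decoding over a list of encoder frames (sketch)."""
    tokens = [blank]           # label history starts with blank
    score = 0.0
    dec_out = predict(tokens)  # decoder output for the current history
    for enc_t in enc_out:
        logp = joint(enc_t, dec_out)
        pred = max(range(len(logp)), key=logp.__getitem__)
        if pred != blank:
            # Emit the symbol and advance the decoder only on non-blank.
            tokens.append(pred)
            score += logp[pred]
            dec_out = predict(tokens)
    return tokens, score

# Toy stand-ins: the "joint network" just takes the log of per-frame
# probabilities and ignores the decoder output.
def toy_predict(tokens):
    return None

def toy_joint(enc_t, dec_out):
    return [math.log(p) for p in enc_t]

frames = [
    [0.1, 0.8, 0.1],    # token 1 most likely
    [0.9, 0.05, 0.05],  # blank most likely, nothing emitted
    [0.2, 0.1, 0.7],    # token 2 most likely
]
tokens, score = greedy_transducer(frames, toy_predict, toy_joint)
```

Here `tokens` holds the leading blank followed by the emitted labels, and `score` accumulates only the log-probabilities of the emitted non-blank symbols.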
modified_adaptive_expansion_search(enc_out: Tensor) → List[ExtendedHypothesis]
Modified Adaptive Expansion Search (mAES) implementation.
Based on/modified from https://ieeexplore.ieee.org/document/9250505 and NSC.
- Parameters: enc_out – Encoder output sequence. (T, D_enc)
- Returns: N-best hypotheses.
- Return type: nbest_hyps
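The interaction between `expansion_beta` and `expansion_gamma` can be illustrated with a small sketch. It assumes that, for each hypothesis, the top `beam_size + expansion_beta` tokens are considered and that a candidate is kept only if its score is within `expansion_gamma` of the best candidate (prune-by-value). The function name `select_expansions` is hypothetical, and this is one reading of the parameter descriptions rather than the ESPnet implementation.

```python
def select_expansions(hyp_score, logp, beam_size, expansion_beta, expansion_gamma):
    """Return (token, score) candidates that survive prune-by-value (sketch)."""
    # Consider a few more candidates than the beam size.
    k = min(beam_size + expansion_beta, len(logp))
    top = sorted(range(len(logp)), key=logp.__getitem__, reverse=True)[:k]
    cands = [(tok, hyp_score + logp[tok]) for tok in top]
    best = cands[0][1]  # candidates are already sorted by score
    # Keep only candidates close enough to the best one.
    return [(tok, s) for tok, s in cands if s >= best - expansion_gamma]

logp = [-0.2, -1.0, -2.4, -3.0, -5.0]  # toy log-probs over 5 tokens
kept = select_expansions(
    hyp_score=-1.0, logp=logp, beam_size=2, expansion_beta=2, expansion_gamma=2.3
)
kept_tokens = [tok for tok, _ in kept]
```

In this toy case four candidates are considered, and the fourth is dropped because it falls more than `expansion_gamma` below the best candidate.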
nsc_beam_search(enc_out: Tensor) → List[ExtendedHypothesis]
N-step constrained beam search implementation.
Based on/modified from https://arxiv.org/pdf/2002.03577.pdf. Please reference ESPnet (b-flo, PR #2444) for any usage outside ESPnet until further modifications.
- Parameters: enc_out – Encoder output sequence. (T, D_enc)
- Returns: N-best hypotheses.
- Return type: nbest_hyps
prefix_search(hyps: List[ExtendedHypothesis], enc_out_t: Tensor) → List[ExtendedHypothesis]
Prefix search for NSC and mAES strategies.
Based on https://arxiv.org/pdf/1211.3711.pdf
reset()
sort_nbest(hyps: List[Hypothesis] | List[ExtendedHypothesis]) → List[Hypothesis] | List[ExtendedHypothesis]
Sort hypotheses by score or score given sequence length.
- Parameters: hyps – Hypotheses.
- Returns: Sorted hypotheses.
- Return type: hyps
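The effect of length normalization on the final ranking can be shown with a minimal sketch. It assumes each hypothesis carries a cumulative log-probability `score` and a label sequence `yseq`, and that normalization divides the score by the sequence length. The `Hyp` dataclass and the standalone `sort_nbest` function below are hypothetical stand-ins, not the ESPnet classes.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Hyp:
    """Hypothetical stand-in for a hypothesis object."""
    score: float
    yseq: List[int]

def sort_nbest(hyps, nbest=1, score_norm=True):
    """Sort by score, or by score per label when score_norm is set (sketch)."""
    if score_norm:
        key = lambda h: h.score / len(h.yseq)
    else:
        key = lambda h: h.score
    return sorted(hyps, key=key, reverse=True)[:nbest]

long_hyp = Hyp(score=-4.0, yseq=[0, 1, 2, 3, 4])  # -0.8 per label
short_hyp = Hyp(score=-3.0, yseq=[0, 1, 2])       # -1.0 per label

best_raw = sort_nbest([long_hyp, short_hyp], nbest=1, score_norm=False)[0]
best_norm = sort_nbest([long_hyp, short_hyp], nbest=1, score_norm=True)[0]
```

Without normalization the shorter hypothesis wins because it accumulates fewer negative log-probabilities; with normalization the longer one ranks first.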
time_sync_decoding(enc_out: Tensor) → List[Hypothesis]
Time synchronous beam search implementation.
Based on https://ieeexplore.ieee.org/document/9053040
- Parameters: enc_out – Encoder output sequence. (T, D)
- Returns: N-best hypotheses.
- Return type: nbest_hyps