espnet2.asr_transducer.beam_search_transducer.BeamSearchTransducer
class espnet2.asr_transducer.beam_search_transducer.BeamSearchTransducer(decoder: AbsDecoder, joint_network: JointNetwork, beam_size: int, lm: Module | None = None, lm_weight: float = 0.1, search_type: str = 'default', max_sym_exp: int = 3, u_max: int = 50, nstep: int = 2, expansion_gamma: float = 2.3, expansion_beta: int = 2, score_norm: bool = False, nbest: int = 1, streaming: bool = False)
Bases: object
Beam search implementation for Transducer.
- Parameters:
- decoder – Decoder module.
- joint_network – Joint network module.
- beam_size – Size of the beam.
- lm – LM module.
- lm_weight – LM weight for soft fusion.
- search_type – Search algorithm to use during inference.
- max_sym_exp – Maximum number of symbol expansions at each time step. (TSD)
- u_max – Maximum expected target sequence length. (ALSD)
- nstep – Maximum number of expansion steps at each time step. (mAES)
- expansion_gamma – Allowed logp difference for prune-by-value method. (mAES)
- expansion_beta – Number of additional candidates for expanded hypotheses selection. (mAES)
- score_norm – Normalize final scores by length.
- nbest – Number of final hypotheses to return.
- streaming – Whether to perform chunk-by-chunk beam search.
Construct a BeamSearchTransducer object.
align_length_sync_decoding(enc_out: Tensor) → List[Hypothesis]
Alignment-length synchronous beam search implementation.
Based on https://ieeexplore.ieee.org/document/9053040
- Parameters:enc_out – Encoder output sequence. (T, D)
- Returns: N-best hypotheses.
- Return type: nbest_hyps
create_lm_batch_inputs(hyps_seq: List[List[int]]) → Tensor
Make batch of inputs with left padding for LM scoring.
- Parameters:hyps_seq – Hypothesis sequences.
- Returns: Padded batch of sequences.
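Left padding can be illustrated with a minimal sketch in plain Python. The helper name `left_pad` and the padding ID `0` are assumptions made for illustration; the actual method returns a Tensor and may use a different padding value or prepend a start symbol.

```python
from typing import List


def left_pad(hyps_seq: List[List[int]], pad_id: int = 0) -> List[List[int]]:
    """Left-pad every sequence to the length of the longest one.

    pad_id is an assumed padding value, not taken from the library.
    """
    max_len = max(len(seq) for seq in hyps_seq)
    # Padding goes in front, so the most recent labels stay right-aligned.
    return [[pad_id] * (max_len - len(seq)) + seq for seq in hyps_seq]


batch = left_pad([[5, 7, 9], [4], [3, 8]])
print(batch)
```

With left padding, the last column of the batch always holds the most recent label of each hypothesis, which is convenient when an LM only needs to score the next token.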
default_beam_search(enc_out: Tensor) → List[Hypothesis]
Beam search implementation without prefix search.
Modified from https://arxiv.org/pdf/1211.3711.pdf
- Parameters:enc_out – Encoder output sequence. (T, D)
- Returns: N-best hypotheses.
- Return type: nbest_hyps
modified_adaptive_expansion_search(enc_out: Tensor) → List[ExtendedHypothesis]
Modified version of Adaptive Expansion Search (mAES).
Based on AES (https://ieeexplore.ieee.org/document/9250505) and NSC (https://arxiv.org/abs/2201.05420).
- Parameters:enc_out – Encoder output sequence. (T, D_enc)
- Returns: N-best hypotheses.
- Return type: nbest_hyps
recombine_hyps(hyps: List[Hypothesis]) → List[Hypothesis]
Recombine hypotheses with same label ID sequence.
- Parameters:hyps – Hypotheses.
- Returns: Recombined hypotheses.
- Return type: final
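One plausible way to recombine hypotheses is to key them by label sequence and merge the scores of duplicates in the log domain. The sketch below assumes scores are log-probabilities and that duplicates are merged with log-sum-exp; `Hyp`, `log_add`, and `recombine` are hypothetical stand-ins, not the library's classes or functions.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Hyp:
    """Hypothetical stand-in for a hypothesis: log-score plus label IDs."""
    score: float
    yseq: List[int]


def log_add(a: float, b: float) -> float:
    """Numerically stable log(exp(a) + exp(b))."""
    hi, lo = max(a, b), min(a, b)
    return hi + math.log1p(math.exp(lo - hi))


def recombine(hyps: List[Hyp]) -> List[Hyp]:
    """Merge hypotheses that share the same label ID sequence."""
    merged: Dict[Tuple[int, ...], Hyp] = {}
    for hyp in hyps:
        key = tuple(hyp.yseq)
        if key in merged:
            # Same labels reached via a different path: add probabilities.
            merged[key].score = log_add(merged[key].score, hyp.score)
        else:
            merged[key] = Hyp(hyp.score, list(hyp.yseq))
    return list(merged.values())


hyps = [
    Hyp(math.log(0.2), [0, 3]),
    Hyp(math.log(0.1), [0, 3]),
    Hyp(math.log(0.4), [0, 5]),
]
final = recombine(hyps)
```

Here the two hypotheses with labels `[0, 3]` collapse into one whose probability is the sum of the two originals, while `[0, 5]` is left untouched.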
reset_cache() → None
Reset cache for streaming decoding.
select_k_expansions(hyps: List[ExtendedHypothesis], topk_idx: Tensor, topk_logp: Tensor) → List[ExtendedHypothesis]
Return K candidate hypotheses for expansion from a list of hypotheses.
K candidates are selected according to the extended hypotheses' probabilities and a prune-by-value method, where K is equal to beam_size + expansion_beta.
- Parameters:
- hyps – Hypotheses.
- topk_idx – Indices of candidate hypotheses.
- topk_logp – Log-probabilities of candidate hypotheses.
- Returns: Best K expansion hypothesis candidates.
- Return type: k_expansions
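The prune-by-value idea can be sketched for a single hypothesis as follows: each of the top-K candidates is scored by adding its log-probability to the hypothesis score, and only candidates within `gamma` of the best one are kept. The function name `prune_by_value` and the use of plain lists instead of tensors are assumptions for illustration only.

```python
from typing import List, Tuple


def prune_by_value(
    hyp_score: float,
    topk_idx: List[int],
    topk_logp: List[float],
    gamma: float = 2.3,
) -> List[Tuple[int, float]]:
    """Keep candidates whose expanded score is within gamma of the best."""
    # Expanded score = current hypothesis score + candidate log-probability.
    candidates = [
        (idx, hyp_score + logp) for idx, logp in zip(topk_idx, topk_logp)
    ]
    best = max(score for _, score in candidates)
    kept = [c for c in candidates if c[1] >= best - gamma]
    # Best candidates first.
    return sorted(kept, key=lambda c: c[1], reverse=True)


# K = 4 candidates (for example beam_size = 2 plus expansion_beta = 2).
kept = prune_by_value(-1.0, [7, 2, 9, 4], [-0.5, -1.0, -3.5, -0.2])
print([idx for idx, _ in kept])
```

In this example the candidate with label 9 falls more than `gamma` below the best expanded score and is discarded, so fewer than K candidates survive.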
sort_nbest(hyps: List[Hypothesis]) → List[Hypothesis]
Sort hypotheses in place by score, or by score normalized by sequence length.
- Parameters:hyps – Hypotheses.
- Returns: Sorted hypotheses.
- Return type: hyps
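The effect of length normalization on the ranking can be seen in a small sketch. It assumes the normalized score is `score / len(yseq)` and that the result is truncated to `nbest` entries; both are assumptions about the behavior, and `Hyp` is a hypothetical stand-in for the hypothesis class.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hyp:
    """Hypothetical stand-in for a hypothesis: log-score plus label IDs."""
    score: float
    yseq: List[int]


def sort_nbest(
    hyps: List[Hyp], score_norm: bool = False, nbest: int = 1
) -> List[Hyp]:
    """Sort in place (best first) and return the top nbest hypotheses."""
    if score_norm:
        # Assumed normalization: divide the log-score by the label count.
        hyps.sort(key=lambda h: h.score / len(h.yseq), reverse=True)
    else:
        hyps.sort(key=lambda h: h.score, reverse=True)
    return hyps[:nbest]


short = Hyp(-2.0, [0, 4])
long = Hyp(-3.0, [0, 4, 8, 2])

raw = sort_nbest([short, long])
norm = sort_nbest([short, long], score_norm=True)
```

Without normalization the shorter hypothesis wins because its raw log-score is higher; with normalization the longer one wins, since its per-label score (-0.75) is better than that of the shorter one (-1.0).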
time_sync_decoding(enc_out: Tensor) → List[Hypothesis]
Time synchronous beam search implementation.
Based on https://ieeexplore.ieee.org/document/9053040
- Parameters:enc_out – Encoder output sequence. (T, D)
- Returns: N-best hypotheses.
- Return type: nbest_hyps