espnet2.asr.transducer.beam_search_transducer_streaming.BeamSearchTransducerStreaming

class espnet2.asr.transducer.beam_search_transducer_streaming.BeamSearchTransducerStreaming(decoder: AbsDecoder, joint_network: JointNetwork, beam_size: int, lm: Module | None = None, lm_weight: float = 0.1, search_type: str = 'default', max_sym_exp: int = 2, u_max: int = 50, nstep: int = 1, prefix_alpha: int = 1, expansion_gamma: int = 2.3, expansion_beta: int = 2, score_norm: bool = True, score_norm_during: bool = False, nbest: int = 1, penalty: float = 0.0, token_list: List[str] | None = None, hold_n: int = 0)

Bases: object

Beam search implementation for Transducer.

Initialize Transducer search module.

Parameters:
- decoder – Decoder module.
- joint_network – Joint network module.
- beam_size – Beam size.
- lm – LM class.
- lm_weight – LM weight for soft fusion.
- search_type – Search algorithm to use during inference.
- max_sym_exp – Maximum number of symbol expansions at each time step. (TSD)
- u_max – Maximum output sequence length. (ALSD)
- nstep – Maximum number of expansion steps at each time step. (NSC/mAES)
- prefix_alpha – Maximum prefix length in prefix search. (NSC/mAES)
- expansion_beta – Number of additional candidates for expanded hypotheses selection. (mAES)
- expansion_gamma – Allowed logp difference for prune-by-value method. (mAES)
- score_norm – Normalize final scores by length. (default)
- score_norm_during – Normalize scores by length during search. (default, TSD, ALSD)
- nbest – Number of final hypotheses.
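
The tags in parentheses above indicate which search strategies read each option. As a quick reference, that mapping can be written as a small lookup table. This is only a sketch: `STRATEGY_OPTIONS` and `options_for` are hypothetical names and not part of ESPnet, and the keys are the tags from the list above, which are not necessarily the strings accepted by `search_type`.

```python
from typing import Dict, List

# Strategy-specific options, grouped by the tag shown in the parameter list.
# Hypothetical documentation helper; this is not an ESPnet API.
STRATEGY_OPTIONS: Dict[str, List[str]] = {
    "default": ["score_norm", "score_norm_during"],
    "TSD": ["max_sym_exp", "score_norm_during"],
    "ALSD": ["u_max", "score_norm_during"],
    "NSC": ["nstep", "prefix_alpha"],
    "mAES": ["nstep", "prefix_alpha", "expansion_beta", "expansion_gamma"],
}


def options_for(strategy: str) -> List[str]:
    """Return the options tagged for one strategy, sorted alphabetically."""
    return sorted(STRATEGY_OPTIONS[strategy])


print(options_for("mAES"))
```

Options without a tag (for example `beam_size`, `lm`, `lm_weight` and `nbest`) are left out of the table because the list above does not tie them to a particular strategy.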

align_length_sync_decoding(enc_out: Tensor) → List[Hypothesis]

Alignment-length synchronous beam search implementation.

Based on https://ieeexplore.ieee.org/document/9053040

Parameters: enc_out – Encoder output sequence. (T, D)
Returns: nbest_hyps – N-best hypotheses.
Return type: List[Hypothesis]

default_beam_search(enc_out: Tensor) → List[Hypothesis]

Beam search implementation.

Modified from https://arxiv.org/pdf/1211.3711.pdf

Parameters: enc_out – Encoder output sequence. (T, D)
Returns: nbest_hyps – N-best hypotheses.
Return type: List[Hypothesis]

greedy_search(enc_out: Tensor) → List[Hypothesis]

Greedy search implementation.

Parameters: enc_out – Encoder output sequence. (T, D_enc)
Returns: hyp – 1-best hypothesis.
Return type: List[Hypothesis]
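
To illustrate the idea behind greedy transducer decoding, the following is a minimal, framework-free sketch. It is not the ESPnet implementation: `toy_greedy_search` and `toy_joint` are hypothetical, the blank index is assumed to be 0, at most one label is emitted per frame, and the toy joint function stands in for the real decoder and joint network.

```python
import math
from typing import Callable, List, Sequence, Tuple

BLANK = 0  # assumed blank index for this toy example


def toy_joint(frame: int, labels: List[int]) -> List[float]:
    """Toy joint network: favour the frame's token unless it was just emitted."""
    vocab_size = 3
    target = BLANK if frame == labels[-1] else frame
    probs = [0.1] * vocab_size
    probs[target] = 0.8  # the three probabilities sum to 1.0
    return [math.log(p) for p in probs]


def toy_greedy_search(
    enc_out: Sequence[int],
    joint: Callable[[int, List[int]], List[float]],
) -> Tuple[List[int], float]:
    """Pick the single best token at every frame and never revisit a choice."""
    labels = [BLANK]  # the label history starts with blank
    score = 0.0
    for frame in enc_out:
        logp = joint(frame, labels)
        best = max(range(len(logp)), key=logp.__getitem__)
        score += logp[best]  # accumulate the log-probability of each decision
        if best != BLANK:
            labels.append(best)  # non-blank: extend the label history
    return labels, score


labels, score = toy_greedy_search([1, 1, 2, 0], toy_joint)
print(labels, score)
```

Because only one path is tracked, the result is a single hypothesis, which is why this strategy has no beam-related options.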

modified_adaptive_expansion_search(enc_out: Tensor) → List[ExtendedHypothesis]

Modified Adaptive Expansion Search (mAES) implementation.

Based on/modified from https://ieeexplore.ieee.org/document/9250505 and NSC.

Parameters: enc_out – Encoder output sequence. (T, D_enc)
Returns: nbest_hyps – N-best hypotheses.
Return type: List[ExtendedHypothesis]
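
The two mAES-specific options, `expansion_gamma` and `expansion_beta`, control how many expansions of a hypothesis survive pruning. The sketch below shows one plausible way the two could combine, based only on their descriptions in the parameter list: candidates whose score falls more than `expansion_gamma` below the best one are dropped, and at most `beam_size + expansion_beta` candidates are kept. `select_expansions` is a hypothetical helper and should not be read as the library's exact selection routine.

```python
from typing import List, Sequence, Tuple


def select_expansions(
    hyp_score: float,
    logp: Sequence[float],
    beam_size: int,
    expansion_gamma: float = 2.3,
    expansion_beta: int = 2,
) -> List[Tuple[int, float]]:
    """Return (token_id, score) pairs that survive prune-by-value."""
    # Score of each candidate expansion: hypothesis score plus token log-prob.
    candidates = [(k, hyp_score + v) for k, v in enumerate(logp)]
    best = max(score for _, score in candidates)
    # Prune by value: drop candidates more than gamma below the best one.
    kept = [c for c in candidates if c[1] >= best - expansion_gamma]
    kept.sort(key=lambda c: c[1], reverse=True)
    # Keep a few more than the beam size (beta extra candidates).
    return kept[: beam_size + expansion_beta]


print(select_expansions(0.0, [-0.1, -1.0, -2.0, -3.0, -5.0], beam_size=1))
```

A larger `expansion_gamma` tolerates weaker candidates, while a larger `expansion_beta` raises the cap on how many are carried forward.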

nsc_beam_search(enc_out: Tensor) → List[ExtendedHypothesis]

N-step constrained beam search implementation.

Based on/modified from https://arxiv.org/pdf/2002.03577.pdf. Please cite ESPnet (b-flo, PR #2444) for any usage outside ESPnet until further modifications.

Parameters: enc_out – Encoder output sequence. (T, D_enc)
Returns: nbest_hyps – N-best hypotheses.
Return type: List[ExtendedHypothesis]

prefix_search(hyps: List[ExtendedHypothesis], enc_out_t: Tensor) → List[ExtendedHypothesis]

Prefix search for NSC and mAES strategies.

Based on https://arxiv.org/pdf/1211.3711.pdf

reset()

sort_nbest(hyps: List[Hypothesis] | List[ExtendedHypothesis]) → List[Hypothesis] | List[ExtendedHypothesis]

Sort hypotheses by score, or by score normalized by sequence length.

Parameters: hyps – Hypotheses to sort.
Returns: hyps – Sorted hypotheses.
Return type: List[Hypothesis] | List[ExtendedHypothesis]
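
The difference between the two orderings is easiest to see on a small example. The sketch below uses a hypothetical `ToyHypothesis` dataclass with `yseq` and `score` fields and a hypothetical `toy_sort_nbest` function; it assumes that length normalization means dividing the score by the number of labels and that the result is truncated to `nbest` entries.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyHypothesis:
    """Stand-in for a hypothesis: label sequence plus accumulated log-prob."""

    yseq: List[int]
    score: float


def toy_sort_nbest(
    hyps: List[ToyHypothesis], score_norm: bool = True, nbest: int = 1
) -> List[ToyHypothesis]:
    """Sort by score (optionally divided by length) and keep the best ones."""
    if score_norm:
        ranked = sorted(hyps, key=lambda h: h.score / len(h.yseq), reverse=True)
    else:
        ranked = sorted(hyps, key=lambda h: h.score, reverse=True)
    return ranked[:nbest]


short = ToyHypothesis(yseq=[0, 4], score=-2.0)  # -1.0 per label
long = ToyHypothesis(yseq=[0, 4, 7, 2, 9], score=-3.0)  # -0.6 per label

print(toy_sort_nbest([short, long], score_norm=False)[0].yseq)
print(toy_sort_nbest([short, long], score_norm=True)[0].yseq)
```

Without normalization the shorter hypothesis wins because fewer log-probabilities were summed; with normalization the longer one ranks first because its per-label score is higher.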

time_sync_decoding(enc_out: Tensor) → List[Hypothesis]

Time synchronous beam search implementation.

Based on https://ieeexplore.ieee.org/document/9053040

Parameters: enc_out – Encoder output sequence. (T, D)
Returns: nbest_hyps – N-best hypotheses.
Return type: List[Hypothesis]