espnet2.asr_transducer.beam_search_transducer.BeamSearchTransducer

class espnet2.asr_transducer.beam_search_transducer.BeamSearchTransducer(decoder: AbsDecoder, joint_network: JointNetwork, beam_size: int, lm: Module | None = None, lm_weight: float = 0.1, search_type: str = 'default', max_sym_exp: int = 3, u_max: int = 50, nstep: int = 2, expansion_gamma: float = 2.3, expansion_beta: int = 2, score_norm: bool = False, nbest: int = 1, streaming: bool = False)

Bases: object

Beam search implementation for Transducer.

Parameters:
- decoder – Decoder module.
- joint_network – Joint network module.
- beam_size – Size of the beam.
- lm – LM module.
- lm_weight – LM weight for soft fusion.
- search_type – Search algorithm to use during inference.
- max_sym_exp – Maximum number of symbol expansions at each time step. (TSD)
- u_max – Maximum expected target sequence length. (ALSD)
- nstep – Maximum number of expansion steps at each time step. (mAES)
- expansion_gamma – Allowed logp difference for prune-by-value method. (mAES)
- expansion_beta – Number of additional candidates for expanded hypothesis selection. (mAES)
- score_norm – Normalize final scores by length.
- nbest – Number of final hypotheses to return.
- streaming – Whether to perform chunk-by-chunk beam search.

Construct a BeamSearchTransducer object.
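
The lm_weight parameter is described as a weight for soft fusion. A common form of such fusion adds the weighted LM log-probability to the joint network log-probability of each candidate token. The sketch below shows only that arithmetic in plain Python. The helper name `fuse_scores` and the exact formula are assumptions for illustration, not code taken from this class.

```python
from typing import List


def fuse_scores(
    joint_logp: List[float], lm_logp: List[float], lm_weight: float = 0.1
) -> List[float]:
    """Add weighted LM log-probabilities to joint network log-probabilities.

    Hypothetical helper: assumes both lists are indexed by the same token IDs.
    """
    if len(joint_logp) != len(lm_logp):
        raise ValueError("score lists must have the same length")
    return [j + lm_weight * l for j, l in zip(joint_logp, lm_logp)]


# Two candidate tokens, each with a joint network and an LM log-probability.
fused = fuse_scores([-1.0, -2.0], [-3.0, -0.5], lm_weight=0.1)
```

With a small weight such as the default 0.1, the LM only nudges the ranking produced by the joint network rather than overriding it.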

align_length_sync_decoding(enc_out: Tensor) → List[Hypothesis]

Alignment-length synchronous beam search implementation.

Based on https://ieeexplore.ieee.org/document/9053040

Parameters: enc_out – Encoder output sequence. (T, D)
Returns: N-best hypotheses.
Return type: List[Hypothesis]

create_lm_batch_inputs(hyps_seq: List[List[int]]) → Tensor

Make batch of inputs with left padding for LM scoring.

Parameters: hyps_seq – Hypothesis sequences.
Returns: Padded batch of sequences.
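
Left padding can be illustrated with plain Python lists. This is a minimal sketch under stated assumptions: the helper name `left_pad_batch` and the padding ID 0 are hypothetical, and the actual method returns a Tensor and may treat the start symbol differently.

```python
from typing import List


def left_pad_batch(hyps_seq: List[List[int]], pad_id: int = 0) -> List[List[int]]:
    """Pad every sequence on the left so all rows share the longest length.

    Hypothetical helper: pad_id=0 is an assumption, not a documented value.
    """
    max_len = max(len(seq) for seq in hyps_seq)
    return [[pad_id] * (max_len - len(seq)) + list(seq) for seq in hyps_seq]


# Three hypotheses of different lengths become a rectangular batch.
batch = left_pad_batch([[5, 7, 9], [4], [2, 3]])
```

Padding on the left keeps the most recent label of every hypothesis in the last column, which is convenient when only the LM output at the final position is needed.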

default_beam_search(enc_out: Tensor) → List[Hypothesis]

Beam search implementation without prefix search.

Modified from https://arxiv.org/pdf/1211.3711.pdf

Parameters: enc_out – Encoder output sequence. (T, D)
Returns: N-best hypotheses.
Return type: List[Hypothesis]

modified_adaptive_expansion_search(enc_out: Tensor) → List[ExtendedHypothesis]

Modified version of Adaptive Expansion Search (mAES).

Based on AES (https://ieeexplore.ieee.org/document/9250505) and NSC (https://arxiv.org/abs/2201.05420).

Parameters: enc_out – Encoder output sequence. (T, D_enc)
Returns: N-best hypotheses.
Return type: List[ExtendedHypothesis]

recombine_hyps(hyps: List[Hypothesis]) → List[Hypothesis]

Recombine hypotheses with same label ID sequence.

Parameters: hyps – Hypotheses.
Returns: Recombined hypotheses.
Return type: List[Hypothesis]
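
Recombination can be sketched as grouping hypotheses by label sequence and merging their scores. The sketch below assumes scores are log-probabilities and that duplicates are merged with a log-sum; both points are assumptions, and `Hyp`, `log_add`, and `recombine` are hypothetical names used only for illustration.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Hyp:
    """Hypothetical stand-in for a hypothesis: label IDs plus a log-score."""

    yseq: List[int]
    score: float


def log_add(a: float, b: float) -> float:
    """Numerically stable log(exp(a) + exp(b))."""
    hi, lo = max(a, b), min(a, b)
    return hi + math.log1p(math.exp(lo - hi))


def recombine(hyps: List[Hyp]) -> List[Hyp]:
    """Merge hypotheses that share a label sequence (assumed log-sum merge)."""
    merged: Dict[Tuple[int, ...], Hyp] = {}
    for hyp in hyps:
        key = tuple(hyp.yseq)
        if key in merged:
            merged[key].score = log_add(merged[key].score, hyp.score)
        else:
            # Copy so the caller's objects are left untouched.
            merged[key] = Hyp(list(hyp.yseq), hyp.score)
    return list(merged.values())


hyps = [
    Hyp([0, 4], math.log(0.2)),
    Hyp([0, 7], math.log(0.1)),
    Hyp([0, 4], math.log(0.3)),
]
final = recombine(hyps)  # two entries; the [0, 4] paths are merged
```

Merging in the log domain sums the probabilities of different alignment paths that yield the same label sequence, so the merged entry carries their combined probability mass.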

reset_cache() → None

Reset cache for streaming decoding.

select_k_expansions(hyps: List[ExtendedHypothesis], topk_idx: Tensor, topk_logp: Tensor) → List[ExtendedHypothesis]

Return K hypothesis candidates for expansion from a list of hypotheses.

K candidates are selected according to the extended hypothesis probabilities and a prune-by-value method, where K is equal to beam_size + beta (the expansion_beta parameter).

Parameters:
- hyps – Hypotheses.
- topk_idx – Indices of candidate hypotheses.
- topk_logp – Log-probabilities of candidate hypotheses.
Returns: Best K expansion hypothesis candidates.
Return type: List[ExtendedHypothesis]
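
The prune-by-value idea can be sketched for a single hypothesis as follows. This is an illustration under assumptions: a candidate is kept when its total log-score lies within gamma of the best candidate, and the helper name `prune_by_value`, the plain-list inputs, and the tuple output are hypothetical rather than the method's actual types.

```python
from typing import List, Tuple


def prune_by_value(
    hyp_score: float,
    topk_idx: List[int],
    topk_logp: List[float],
    gamma: float = 2.3,
) -> List[Tuple[int, float]]:
    """Keep expansions whose total score is within `gamma` of the best one.

    Returns (token ID, total log-score) pairs sorted from best to worst.
    """
    candidates = [(idx, hyp_score + logp) for idx, logp in zip(topk_idx, topk_logp)]
    best = max(score for _, score in candidates)
    kept = [c for c in candidates if c[1] >= best - gamma]
    return sorted(kept, key=lambda c: c[1], reverse=True)


# Totals are about -1.1, -2.0, and -5.0; the last one falls outside the margin.
kept = prune_by_value(-1.0, [3, 8, 1], [-0.1, -1.0, -4.0], gamma=2.3)
```

A value-based threshold adapts the number of surviving candidates to how peaked the distribution is, instead of always keeping a fixed count.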

sort_nbest(hyps: List[Hypothesis]) → List[Hypothesis]

Sort hypotheses in place by score, or by score normalized by sequence length when score_norm is enabled.

Parameters: hyps – Hypotheses.
Returns: Sorted hypotheses.
Return type: List[Hypothesis]
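
The effect of length normalization on the ranking can be seen in a small sketch. The `Hyp` dataclass and the standalone `sort_nbest` function below are hypothetical stand-ins, and truncating the result to nbest entries is an assumption about how the nbest parameter is applied.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hyp:
    """Hypothetical stand-in for a hypothesis: label IDs plus a log-score."""

    yseq: List[int]
    score: float


def sort_nbest(hyps: List[Hyp], nbest: int = 1, score_norm: bool = False) -> List[Hyp]:
    """Sort in place, best first, then keep the top `nbest` (assumed truncation)."""
    if score_norm:
        hyps.sort(key=lambda h: h.score / len(h.yseq), reverse=True)
    else:
        hyps.sort(key=lambda h: h.score, reverse=True)
    return hyps[:nbest]


long_hyp = Hyp([0, 1, 2, 3], -4.0)   # normalized score: -1.0
short_hyp = Hyp([0, 1], -3.0)        # normalized score: -1.5

best_raw = sort_nbest([long_hyp, short_hyp], nbest=1)[0]
best_norm = sort_nbest([long_hyp, short_hyp], nbest=1, score_norm=True)[0]
```

Without normalization the shorter hypothesis wins because log-scores accumulate with every label; with normalization the longer hypothesis ranks first in this example.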

time_sync_decoding(enc_out: Tensor) → List[Hypothesis]

Time synchronous beam search implementation.

Based on https://ieeexplore.ieee.org/document/9053040

Parameters: enc_out – Encoder output sequence. (T, D)
Returns: N-best hypotheses.
Return type: List[Hypothesis]