espnet.nets.beam_search.beam_search

espnet.nets.beam_search.beam_search(x: Tensor, sos: int, eos: int, beam_size: int, vocab_size: int, scorers: Dict[str, ScorerInterface], weights: Dict[str, float], token_list: List[str] | None = None, maxlenratio: float = 0.0, minlenratio: float = 0.0, pre_beam_ratio: float = 1.5, pre_beam_score_key: str = 'full') → list

Perform beam search with scorers.

Parameters:
- x (torch.Tensor) – Encoded speech features of shape (T, D)
- sos (int) – Start-of-sequence token id
- eos (int) – End-of-sequence token id
- beam_size (int) – Number of hypotheses kept during the search
- vocab_size (int) – Vocabulary size
- scorers (dict[str, ScorerInterface]) – Dict of decoder modules, e.g., Decoder, CTCPrefixScorer, LM. A scorer is ignored if it is None.
- weights (dict[str, float]) – Dict of weights, one per scorer. A scorer is ignored if its weight is 0.
- token_list (list[str]) – List of tokens, used for debug logging
- maxlenratio (float) – Input length ratio used to obtain the maximum output length. If maxlenratio=0.0 (default), an end-detect function is used to automatically find the maximum hypothesis length.
- minlenratio (float) – Input length ratio used to obtain the minimum output length
- pre_beam_score_key (str) – Key of the scores used to perform the pre-beam search
- pre_beam_ratio (float) – Beam size in the pre-beam search is int(pre_beam_ratio * beam_size)
Returns: N-best decoding results
Return type: list
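
The parameter notes above can be illustrated with a small standalone sketch. It does not call ESPnet. The helper names `pre_beam_size`, `active_weights`, and `combine` are hypothetical and exist only for this example. The pre-beam formula and the "ignored if None / weight 0" rule come from the parameter descriptions. The weighted-sum combination in `combine` is an assumption about how per-scorer scores are merged, not a statement of the library's implementation.

```python
from typing import Dict, Optional


def pre_beam_size(beam_size: int, pre_beam_ratio: float = 1.5) -> int:
    """Number of candidates kept by the pre-beam search (formula from the docs)."""
    return int(pre_beam_ratio * beam_size)


def active_weights(
    scorers: Dict[str, Optional[object]], weights: Dict[str, float]
) -> Dict[str, float]:
    """Keep only scorers that are not None and have a non-zero weight."""
    return {
        name: weights.get(name, 0.0)
        for name, scorer in scorers.items()
        if scorer is not None and weights.get(name, 0.0) != 0
    }


def combine(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Assumed rule: weighted sum of the per-scorer scores for one token."""
    return sum(weights[name] * scores[name] for name in weights)


# Placeholder objects stand in for real scorer modules.
scorers = {"decoder": object(), "ctc": object(), "lm": None}
weights = {"decoder": 0.5, "ctc": 0.5, "lm": 0.3}

kept = active_weights(scorers, weights)  # "lm" is dropped because it is None
total = combine({"decoder": -1.0, "ctc": -2.0}, kept)
```

With the default `pre_beam_ratio=1.5`, a `beam_size` of 10 gives a pre-beam size of 15, and a `beam_size` of 3 gives 4, because `int()` truncates 4.5.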