espnet.nets.beam_search_partially_AR.PartiallyARBeamSearch
class espnet.nets.beam_search_partially_AR.PartiallyARBeamSearch(*args, **kwargs)
Bases: BatchBeamSearch
Partially autoregressive beam search implementation. A partially autoregressive hypothesis is a set of BatchHypothesis objects.
Use the add_mask function to add a hypothesis for a mask. Before the search and beam search methods run, each partially autoregressive hypothesis is expanded into BatchHypothesis objects, which are then processed in the same way as in batched_beam_search.
Initialize beam search.
- Parameters:
- scorers (dict[str, ScorerInterface]) – Dict of decoder modules, e.g., Decoder, CTCPrefixScorer, LM. A scorer is ignored if it is None.
- weights (dict[str, float]) – Dict of weights for each scorer. A scorer is ignored if its weight is 0.
- beam_size (int) – The number of hypotheses kept during search
- vocab_size (int) – The vocabulary size
- sos (int) – Start of sequence id
- eos (int) – End of sequence id
- token_list (list[str]) – List of tokens for debug log
- pre_beam_score_key (str) – Key of the scores used to perform pre-beam search
- pre_beam_ratio (float) – Beam size in the pre-beam search will be int(pre_beam_ratio * beam_size)
- return_hs (bool) – Whether to return hidden intermediates
- normalize_length (bool) – If true, select the best ended hypotheses based on length-normalized scores rather than the accumulated scores
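The two filtering rules and the pre-beam size formula above can be illustrated with a minimal sketch. The helper names `active_scorers` and `pre_beam_size` are hypothetical and not part of ESPnet, and treating a scorer with no entry in `weights` as having weight 0 is an assumption.

```python
def active_scorers(scorers, weights):
    """Keep only scorers that are not None and have a non-zero weight.

    Assumption: a scorer with no entry in `weights` is treated as weight 0.
    """
    return {
        name: scorer
        for name, scorer in scorers.items()
        if scorer is not None and weights.get(name, 0.0) != 0.0
    }


def pre_beam_size(pre_beam_ratio, beam_size):
    """Beam size used in the pre-beam search, as stated in the parameter list."""
    return int(pre_beam_ratio * beam_size)


# Placeholder objects stand in for real scorer modules.
scorers = {"decoder": object(), "ctc": object(), "lm": None}
weights = {"decoder": 0.7, "ctc": 0.3, "lm": 0.0}

kept = active_scorers(scorers, weights)
print(sorted(kept))            # ['ctc', 'decoder']
print(pre_beam_size(1.5, 10))  # 15
```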
add_mask(primer: List[int], eos: int)
Add a mask to a batch of hypotheses.
- Parameters:
- primer (List[int]) – Primer yseq.
- eos (int) – End of sequence id
batch_beam(weighted_scores: Tensor) → Tuple[Tensor, Tensor]
Batch-compute topk full token ids and partial token ids.
- Parameters: weighted_scores (torch.Tensor) – The weighted sum scores for each token. Its shape is (n_beam, self.vocab_size).
- Returns: The topk full (prev_hyp, new_token) ids and partial (prev_hyp, new_token) ids. Their shapes are all (self.beam_size,).
- Return type: Tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor]
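One common way to obtain (prev_hyp, new_token) id pairs from a score matrix of shape (n_beam, vocab_size) is to flatten it, take the top-k flat indices, and split each index with integer division and modulo. The sketch below shows that pattern in plain Python. It is not confirmed to be what this class does internally, the function name is hypothetical, and only the "full" ids are covered.

```python
import heapq


def topk_prev_hyp_and_token(weighted_scores, beam_size):
    """Return (prev_hyp_ids, new_token_ids) for the beam_size best entries.

    weighted_scores is a list of rows, one row of vocab_size scores per
    running hypothesis (hypothetical plain-Python stand-in for a tensor).
    """
    vocab_size = len(weighted_scores[0])
    # Flatten so that flat index = hyp_index * vocab_size + token_index.
    flat = [score for row in weighted_scores for score in row]
    top_flat = heapq.nlargest(beam_size, range(len(flat)), key=flat.__getitem__)
    prev_hyp_ids = [i // vocab_size for i in top_flat]
    new_token_ids = [i % vocab_size for i in top_flat]
    return prev_hyp_ids, new_token_ids


scores = [
    [-0.1, -2.0, -3.0],  # hypothesis 0
    [-0.5, -0.2, -4.0],  # hypothesis 1
]
prev_ids, token_ids = topk_prev_hyp_and_token(scores, beam_size=2)
print(prev_ids, token_ids)  # [0, 1] [0, 1]
```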
forward(x: Tensor, max_seq_len: int | None = None) → List[Hypothesis]
Perform beam search.
- Parameters:
- x (torch.Tensor) – Encoded speech feature (T, D)
- maxlenratio (float) – Input length ratio to obtain max output length. If maxlenratio=0.0 (default), an end-detect function is used to automatically find the maximum hypothesis length. If maxlenratio<0.0, its absolute value is interpreted as a constant max output length.
- minlenratio (float) – Input length ratio to obtain min output length.
- Returns: N-best decoding results
- Return type: list[Hypothesis]
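The three cases described for maxlenratio can be sketched as a small helper. The function name `max_output_length` is hypothetical. Using the input length as the upper bound when maxlenratio is 0.0 and clamping the positive-ratio result to at least 1 are both assumptions, since the description only says that end detection decides when to stop.

```python
def max_output_length(maxlenratio, input_length):
    """Illustrative mapping from maxlenratio to a maximum output length."""
    if maxlenratio == 0.0:
        # Assumption: bound the search by the input length and rely on
        # end detection to stop earlier.
        return input_length
    if maxlenratio < 0.0:
        # The absolute value is a constant maximum output length.
        return int(abs(maxlenratio))
    # Positive ratio: scale the input length (assumed to be clamped to >= 1).
    return max(1, int(maxlenratio * input_length))


print(max_output_length(0.0, 100))    # 100
print(max_output_length(-20.0, 100))  # 20
print(max_output_length(0.5, 100))    # 50
```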
init_hyp(x: Tensor) → PartiallyARHypothesis
Get the initial hypothesis data for each mask.
- Parameters: x (torch.Tensor) – The encoder output feature
- Returns: The initial hypothesis.
- Return type: PartiallyARHypothesis
init_masks()
post_process(i: int, maxlen: int, running_hyps: PartiallyARHypothesis, ended_hyps: List[List[Hypothesis]]) → BatchHypothesis
Perform post-processing of a beam search iteration. Extract the BatchHypothesis for each mask, post-process each one, then merge them.
- Parameters:
- i (int) – The length of hypothesis tokens.
- maxlen (int) – The maximum length of tokens in beam search.
- maxlenratio (int) – The maximum length ratio in beam search.
- running_hyps (BatchHypothesis) – The running hypotheses in beam search.
- ended_hyps (List[Hypothesis]) – The ended hypotheses in beam search.
- Returns: The new running hypotheses.
- Return type: BatchHypothesis
score_full(hyp: PartiallyARHypothesis, x: Tensor, is_first: bool = False) → Tuple[Dict[str, Tensor], Dict[str, Any]]
Score a new hypothesis with self.full_scorers.
- Parameters:
- hyp (PartiallyARHypothesis) – Hypothesis with prefix tokens to score
- x (torch.Tensor) – Corresponding input feature
- Returns: Tuple of a score dict of hyp, which has the string keys of self.full_scorers and tensor score values of shape (self.n_vocab,), and a state dict, which has string keys and the state values of self.full_scorers
- Return type: Tuple[Dict[str, torch.Tensor], Dict[str, Any]]
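A score dict of this shape is typically combined into a single per-token score by weighting each scorer's output and summing. The sketch below illustrates that with plain lists in place of tensors. The function name `weighted_sum` is hypothetical, and it assumes every scorer in the score dict has an entry in `weights`.

```python
def weighted_sum(scores, weights):
    """Combine per-scorer token scores into one weighted score per token.

    scores maps a scorer name to a list of vocab_size scores; weights maps
    the same names to floats (assumed to cover every key in scores).
    """
    vocab_size = len(next(iter(scores.values())))
    total = [0.0] * vocab_size
    for name, values in scores.items():
        weight = weights[name]
        for token_id, score in enumerate(values):
            total[token_id] += weight * score
    return total


scores = {"decoder": [-1.0, -2.0], "ctc": [-3.0, -1.0]}
weights = {"decoder": 0.5, "ctc": 0.5}
print(weighted_sum(scores, weights))  # [-2.0, -1.5]
```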
search(running_hyps: PartiallyARHypothesis, x: Tensor) → PartiallyARHypothesis
Search new tokens for running hypotheses and encoded speech x.
- Parameters:
- running_hyps (BatchHypothesis) – Running hypotheses on beam
- x (torch.Tensor) – Encoded speech feature (T, D)
- Returns: Best sorted hypotheses
- Return type: BatchHypothesis