espnet2.asr.ctc.CTC
class espnet2.asr.ctc.CTC(odim: int, encoder_output_size: int, dropout_rate: float = 0.0, ctc_type: str = 'builtin', reduce: bool = True, ignore_nan_grad: bool | None = None, zero_infinity: bool = True, brctc_risk_strategy: str = 'exp', brctc_group_strategy: str = 'end', brctc_risk_factor: float = 0.0)
Bases: Module
CTC module.
- Parameters:
- odim – dimension of outputs
- encoder_output_size – number of encoder projection units
- dropout_rate – dropout rate (0.0 ~ 1.0)
- ctc_type – builtin or gtnctc
- reduce – reduce the CTC loss into a scalar
- ignore_nan_grad – same as zero_infinity (kept for backward compatibility)
- zero_infinity – Whether to zero infinite losses and the associated gradients.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
argmax(hs_pad)
argmax of frame activations
- Parameters: hs_pad (torch.Tensor) – 3D tensor (B, Tmax, eprojs)
- Returns: 2D tensor (B, Tmax) holding the argmax index for each frame
- Return type: torch.Tensor
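The per-frame argmax is the usual starting point for greedy CTC decoding. The following is a minimal pure-Python sketch of that idea for a single utterance, not the ESPnet implementation: the helper names `argmax_frames` and `greedy_ctc_decode` are hypothetical, the scores are made-up numbers, and the blank symbol is assumed to have id 0.

```python
from itertools import groupby


def argmax_frames(scores):
    """Return the index of the largest score in each frame of one utterance."""
    return [max(range(len(frame)), key=frame.__getitem__) for frame in scores]


def greedy_ctc_decode(frame_ids, blank_id=0):
    """Collapse consecutive repeats, then drop blanks (blank_id=0 is an assumption)."""
    collapsed = [k for k, _ in groupby(frame_ids)]
    return [k for k in collapsed if k != blank_id]


# Illustrative scores for 5 frames over 3 output symbols (id 0 = blank).
scores = [
    [0.1, 2.0, 0.3],
    [0.2, 1.5, 0.1],
    [3.0, 0.1, 0.2],
    [0.1, 0.2, 4.0],
    [0.1, 0.3, 2.5],
]

frame_ids = argmax_frames(scores)       # one id per frame
tokens = greedy_ctc_decode(frame_ids)   # repeats merged, blanks removed
```

A blank between two identical ids keeps them as separate tokens, which is why repeats are collapsed before blanks are removed.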
forward(hs_pad, hlens, ys_pad, ys_lens)
Calculate CTC loss.
- Parameters:
- hs_pad – batch of padded hidden state sequences (B, Tmax, D)
- hlens – batch of lengths of hidden state sequences (B)
- ys_pad – batch of padded character id sequence tensor (B, Lmax)
- ys_lens – batch of lengths of character sequence (B)
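To make the quantity behind the CTC loss concrete, here is a hedged pure-Python sketch of the standard CTC forward (alpha) recursion for one utterance. It is an illustration only, not the code this class runs: `ctc_nll` and `logsumexp` are hypothetical helpers, the input is assumed to already be per-frame log-probabilities, and the blank id is assumed to be 0.

```python
import math

NEG_INF = float("-inf")


def logsumexp(*xs):
    """Numerically stable log(sum(exp(x))) that tolerates -inf inputs."""
    m = max(xs)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(x - m) for x in xs))


def ctc_nll(log_probs, target, blank_id=0):
    """Negative log-likelihood of `target` under CTC.

    log_probs: list of T frames, each a list of per-symbol log-probabilities.
    target: list of label ids without blanks.
    """
    # Extended label sequence with blanks interleaved: [b, y1, b, y2, ..., b]
    ext = [blank_id]
    for y in target:
        ext += [y, blank_id]
    S = len(ext)

    # A path may start in the leading blank or in the first label.
    alpha = [NEG_INF] * S
    alpha[0] = log_probs[0][ext[0]]
    if S > 1:
        alpha[1] = log_probs[0][ext[1]]

    for t in range(1, len(log_probs)):
        new = [NEG_INF] * S
        for s in range(S):
            cands = [alpha[s]]                 # stay on the same symbol
            if s >= 1:
                cands.append(alpha[s - 1])     # advance by one position
            # Skipping a blank is allowed only between two different labels.
            if s >= 2 and ext[s] != blank_id and ext[s] != ext[s - 2]:
                cands.append(alpha[s - 2])
            new[s] = logsumexp(*cands) + log_probs[t][ext[s]]
        alpha = new

    # A valid path ends in the last label or in the trailing blank.
    total = logsumexp(alpha[-1], alpha[-2]) if S > 1 else alpha[-1]
    return -total


# Two frames, two symbols (blank and label 1), uniform probabilities.
uniform = [[math.log(0.5)] * 2 for _ in range(2)]
loss = ctc_nll(uniform, [1])  # paths (1,1), (0,1), (1,0) match: -log(0.75)

# One frame cannot be aligned to the target [1, 1], so the loss is infinite.
too_short = ctc_nll([[math.log(0.5)] * 2], [1, 1])
```

The second call shows the case that `zero_infinity` addresses: when the input is too short to be aligned to the target, no valid path exists and the loss is infinite.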
log_softmax(hs_pad)
log_softmax of frame activations
- Parameters: hs_pad (torch.Tensor) – 3D tensor (B, Tmax, eprojs)
- Returns: log-softmax applied 3D tensor (B, Tmax, odim)
- Return type: torch.Tensor
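For reference, the normalization applied along the last axis can be sketched in plain Python for a single frame. This covers only the log-softmax step; the shapes above (eprojs in, odim out) suggest a projection to odim happens first, which this sketch leaves out. The function below is a hypothetical stand-in, not the method itself.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one frame of scores."""
    m = max(logits)  # subtract the max so exp() cannot overflow
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


frame = [2.0, 1.0, 0.1]                      # illustrative scores
log_probs = log_softmax(frame)
probs = [math.exp(lp) for lp in log_probs]   # exponentiating gives the softmax
```

Exponentiating the result recovers the softmax probabilities, so the two methods differ only by that final `exp`.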
loss_fn(th_pred, th_target, th_ilen, th_olen) → Tensor
softmax(hs_pad)
softmax of frame activations
- Parameters: hs_pad (torch.Tensor) – 3D tensor (B, Tmax, eprojs)
- Returns: softmax applied 3D tensor (B, Tmax, odim)
- Return type: torch.Tensor