espnet2.cls.espnet_model.ESPnetClassificationModel
class espnet2.cls.espnet_model.ESPnetClassificationModel(vocab_size: int, token_list: Tuple[str, ...] | List[str], frontend: AbsFrontend | None, specaug: AbsSpecAug | None, normalize: AbsNormalize | None, preencoder: AbsPreEncoder | None, encoder: AbsEncoder, decoder: AbsDecoder, classification_type='multi-class', lsm_weight: float = 0.0, mixup_probability: float = 0.0, log_epoch_metrics: bool = False)
Bases: AbsESPnetModel
A simple classification model.
Initializes internal Module state, shared by both nn.Module and ScriptModule.
collect_feats(speech: Tensor, speech_lengths: Tensor, **kwargs) → Dict[str, Tensor]
encode(speech: Tensor, speech_lengths: Tensor) → Tuple[Tensor, Tensor]
Encode the input speech.
- Parameters:
- speech – (Batch, Length, …)
- speech_lengths – (Batch,)
- Returns: scores – (Batch, Length, n_classes)
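The (Batch, Length, …) and (Batch,) shapes above imply that variable-length inputs are padded to a common length, with the true lengths passed separately. The following is a minimal sketch of that batching convention, not ESPnet's own collation code: `pad_batch` and `pad_value` are hypothetical names, and plain Python lists stand in for tensors so the example has no dependencies.

```python
from typing import List, Tuple


def pad_batch(
    waveforms: List[List[float]], pad_value: float = 0.0
) -> Tuple[List[List[float]], List[int]]:
    """Right-pad variable-length sequences to a common length.

    Returns the padded batch, shaped (Batch, Length), and the
    original lengths, shaped (Batch,). Hypothetical helper.
    """
    lengths = [len(w) for w in waveforms]
    max_len = max(lengths, default=0)
    # Append pad_value to each sequence until it reaches max_len.
    padded = [w + [pad_value] * (max_len - len(w)) for w in waveforms]
    return padded, lengths


# Three utterances of different lengths become one rectangular batch.
speech, speech_lengths = pad_batch([[0.1, 0.2, 0.3], [0.5], [0.4, 0.6]])
```

In practice both values would be converted to tensors before being passed as `speech` and `speech_lengths`.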
forward(speech: Tensor, speech_lengths: Tensor, label: Tensor, label_lengths: Tensor, **kwargs) → Tuple[Tensor, Dict[str, Tensor], Tensor]
Pass the input through the model and calculate the loss.
- Parameters:
- speech – (Batch, samples)
- speech_lengths – (Batch, )
- label – (Batch, Length)
- label_lengths – (Batch, )
- Returns: loss – (1,); stats – dict; weight
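The constructor accepts `lsm_weight`, which suggests the loss can be label-smoothed. This page does not state how the weight is applied, so the sketch below assumes the common uniform scheme for the multi-class case: the target distribution puts `1 - lsm_weight` on the true class and spreads `lsm_weight` evenly over all classes. The function names are hypothetical and the code works on a single example with plain floats.

```python
import math
from typing import List


def log_softmax(logits: List[float]) -> List[float]:
    """Numerically stable log-softmax over one score vector."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def smoothed_cross_entropy(
    logits: List[float], target: int, lsm_weight: float = 0.0
) -> float:
    """Cross-entropy against a uniformly smoothed one-hot target.

    Assumed scheme (not confirmed by the source):
    q_i = lsm_weight / n_classes + (1 - lsm_weight) * [i == target]
    """
    n_classes = len(logits)
    log_probs = log_softmax(logits)
    loss = 0.0
    for i, lp in enumerate(log_probs):
        q = lsm_weight / n_classes
        if i == target:
            q += 1.0 - lsm_weight
        loss -= q * lp
    return loss


plain = smoothed_cross_entropy([5.0, 0.0, 0.0], target=0, lsm_weight=0.0)
smoothed = smoothed_cross_entropy([5.0, 0.0, 0.0], target=0, lsm_weight=0.1)
```

With `lsm_weight=0.0` this reduces to ordinary cross-entropy, which matches the default in the signature. For a confident, correct prediction the smoothed loss is larger than the plain one, which is the intended regularizing effect.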
get_vocab_size()
score(speech: Tensor, speech_lengths: Tensor | None = None) → Tensor
Forward pass at scoring (inference)
- Parameters:
- speech – (Batch, samples)
- speech_lengths – (Batch, )
- Returns: scores – (Batch, n_classes)
Assumes Batch=1
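A (Batch, n_classes) score matrix still has to be mapped back to entries of `token_list`. The sketch below shows two plausible ways to do that for a single row. It rests on several assumptions: a `'multi-label'` mode exists alongside the documented `'multi-class'` default, the scores are comparable to a 0.5 threshold, and column i corresponds to `token_list[i]`. The helper names and the threshold are hypothetical.

```python
from typing import List, Sequence


def predict_multi_class(scores: Sequence[float], token_list: Sequence[str]) -> str:
    """Return the single label with the highest score (argmax)."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return token_list[best]


def predict_multi_label(
    scores: Sequence[float], token_list: Sequence[str], threshold: float = 0.5
) -> List[str]:
    """Return every label whose score reaches the threshold."""
    return [tok for tok, s in zip(token_list, scores) if s >= threshold]


token_list = ["dog", "cat", "bird"]
# One row of a (Batch=1, n_classes=3) score matrix.
top_label = predict_multi_class([0.1, 0.7, 0.2], token_list)
active_labels = predict_multi_label([0.6, 0.7, 0.2], token_list)
```

If the model returns raw logits rather than probabilities, the multi-label threshold would need to be applied after a sigmoid, or moved to 0.0 on the logit scale.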
setup_metrics_()
update_mAP(mAP_computer)