espnet2.cls.espnet_model.ESPnetClassificationModel

class espnet2.cls.espnet_model.ESPnetClassificationModel(vocab_size: int, token_list: Tuple[str, ...] | List[str], frontend: AbsFrontend | None, specaug: AbsSpecAug | None, normalize: AbsNormalize | None, preencoder: AbsPreEncoder | None, encoder: AbsEncoder, decoder: AbsDecoder, classification_type='multi-class', lsm_weight: float = 0.0, mixup_probability: float = 0.0, log_epoch_metrics: bool = False)

Bases: AbsESPnetModel

A simple classification model.

Initializes internal Module state, shared by both nn.Module and ScriptModule.

collect_feats(speech: Tensor, speech_lengths: Tensor, **kwargs) → Dict[str, Tensor]

encode(speech: Tensor, speech_lengths: Tensor) → Tuple[Tensor, Tensor]

Encode the input speech.

Parameters:
- speech – (Batch, Length, …)
- speech_lengths – (Batch,)
Returns:
- scores – (Batch, Length, n_classes)

forward(speech: Tensor, speech_lengths: Tensor, label: Tensor, label_lengths: Tensor, **kwargs) → Tuple[Tensor, Dict[str, Tensor], Tensor]

Pass the input through the model and calculate the loss.

Parameters:
- speech – (Batch, samples)
- speech_lengths – (Batch, )
- label – (Batch, Length)
- label_lengths – (Batch, )
Returns:
- loss – (1,)
- stats – dict
- weight
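The shapes listed for forward imply that variable-length inputs are padded to a common length, with the true lengths passed separately. The following is a minimal sketch of that batching convention, not ESPnet's own collate code: plain Python lists stand in for tensors so it runs without dependencies, `pad_batch` is a hypothetical helper, and the padding values (0.0 for speech, -1 for labels) are assumptions.

```python
from typing import List, Sequence, Tuple


def pad_batch(seqs: Sequence[Sequence], pad_value) -> Tuple[List[list], List[int]]:
    """Pad every sequence to the longest one and return (padded, lengths).

    Hypothetical helper for illustration; it is not part of ESPnet.
    """
    lengths = [len(s) for s in seqs]
    max_len = max(lengths)
    padded = [list(s) + [pad_value] * (max_len - len(s)) for s in seqs]
    return padded, lengths


# Two utterances of different length, as raw samples.
waveforms = [[0.1, -0.2, 0.3, 0.0], [0.5, 0.4]]
# One or more class indices per utterance.
labels = [[2], [0, 3]]

# speech: (Batch, samples), speech_lengths: (Batch,)
speech, speech_lengths = pad_batch(waveforms, pad_value=0.0)
# label: (Batch, Length), label_lengths: (Batch,)
# The -1 padding value is an assumption, not taken from the documentation.
label, label_lengths = pad_batch(labels, pad_value=-1)
```

In real use these four values would be torch tensors of the same shapes, passed as the `speech`, `speech_lengths`, `label`, and `label_lengths` arguments.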

get_vocab_size()

score(speech: Tensor, speech_lengths: Tensor | None = None) → Tensor

Forward pass for scoring (inference).

Parameters:
- speech – (Batch, samples)
- speech_lengths – (Batch, )
Returns:
- scores – (Batch, n_classes)

Assumes Batch=1
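Since score returns one row of per-class scores for a single utterance, a caller usually needs to map that row back to entries of `token_list`. The sketch below shows one way this might be done, using nested Python lists in place of a tensor. The helper `decode_scores`, the `"multi-label"` branch, and the 0.5 threshold are illustrative assumptions rather than documented behaviour, and the threshold only makes sense if the scores are probabilities.

```python
from typing import List, Sequence


def decode_scores(
    scores: Sequence[Sequence[float]],
    token_list: Sequence[str],
    classification_type: str = "multi-class",
    threshold: float = 0.5,  # assumed cut-off for the multi-label case
) -> List[str]:
    """Map a (1, n_classes) score matrix to class names.

    Hypothetical helper for illustration; it is not part of ESPnet.
    """
    if len(scores) != 1:
        raise ValueError("expected Batch=1")
    row = scores[0]
    if len(row) != len(token_list):
        raise ValueError("n_classes must match len(token_list)")
    if classification_type == "multi-class":
        # Exactly one label: pick the highest-scoring class.
        best = max(range(len(row)), key=row.__getitem__)
        return [token_list[best]]
    # Otherwise keep every class whose score reaches the threshold.
    return [tok for tok, s in zip(token_list, row) if s >= threshold]


tokens = ["dog", "cat", "bird"]
single = decode_scores([[0.1, 0.7, 0.2]], tokens)
multi = decode_scores([[0.6, 0.7, 0.2]], tokens, classification_type="multi-label")
```

The argmax branch gives the same answer whether the scores are logits or probabilities; the threshold branch does not, so check what the model returns before relying on it.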

setup_metrics_()

update_mAP(mAP_computer)