espnet2.asr_transducer.frontend.online_audio_processor.OnlineAudioProcessor
class espnet2.asr_transducer.frontend.online_audio_processor.OnlineAudioProcessor(feature_extractor: Module, normalization_module: Module, decoding_window: int, encoder_sub_factor: int, frontend_conf: Dict, device: device, audio_sampling_rate: int = 16000)
Bases: object
OnlineAudioProcessor module definition.
- Parameters:
- feature_extractor – Feature extractor module.
- normalization_module – Normalization module.
- decoding_window – Size of the decoding window (in ms).
- encoder_sub_factor – Encoder subsampling factor.
- frontend_conf – Frontend configuration.
- device – Device to pin module tensors on.
- audio_sampling_rate – Input sampling rate.
Construct an OnlineAudioProcessor.
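Because `decoding_window` is given in milliseconds while audio arrives as samples at `audio_sampling_rate`, the window has to be converted to a sample count somewhere. The sketch below shows that arithmetic, assuming a plain duration-times-rate conversion. The helper name `window_to_samples` is hypothetical and not part of ESPnet.

```python
def window_to_samples(decoding_window_ms: int, audio_sampling_rate: int = 16000) -> int:
    """Convert a window length in milliseconds to a number of samples."""
    return round(decoding_window_ms * audio_sampling_rate / 1000)


print(window_to_samples(640))        # 10240
print(window_to_samples(640, 8000))  # 5120
```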
compute_features(samples: Tensor, is_final: bool) → None
Compute features from input samples.
- Parameters:
- samples – Speech data. (S)
- is_final – Whether speech corresponds to the final chunk of data.
- Returns:
- feats – Features sequence. (1, chunk_sz_bs, D_feats)
- feats_length – Features length sequence. (1,)
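In streaming use, `compute_features` would typically be called once per incoming chunk, with `is_final` set to `True` only for the last one. The sketch below shows one way to produce such (chunk, flag) pairs. `iter_chunks` is a hypothetical helper, and a plain list stands in for the sample tensor so the example runs without ESPnet or PyTorch.

```python
from typing import Iterator, List, Sequence, Tuple


def iter_chunks(
    samples: Sequence[float], chunk_size: int
) -> Iterator[Tuple[List[float], bool]]:
    """Yield (chunk, is_final) pairs covering `samples` in order."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    n = len(samples)
    for start in range(0, n, chunk_size):
        end = min(start + chunk_size, n)
        # The chunk is final when it reaches the end of the input.
        yield list(samples[start:end]), end >= n


audio = [0.0] * 10
flags = [(len(chunk), is_final) for chunk, is_final in iter_chunks(audio, 4)]
print(flags)  # [(4, False), (4, False), (2, True)]
```

With a real processor instance, each chunk would be converted to a tensor of shape (S) and passed together with its flag to `compute_features(samples, is_final)`.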
get_current_feats(feats: Tensor, feats_length: Tensor, is_final: bool) → Tuple[Tensor, Tensor]
Get features for current decoding window.
- Parameters:
- feats – Computed features sequence. (1, F, D_feats)
- feats_length – Computed features sequence length. (1,)
- is_final – Whether feats corresponds to the final chunk of data.
- Returns:
- feats – Decoding window features sequence. (1, chunk_sz_bs, D_feats)
- feats_length – Decoding window features length sequence. (1,)
- Return type: Tuple[Tensor, Tensor]
get_current_samples(samples: Tensor, is_final: bool) → Tensor
Get samples for feature computation.
- Parameters:
- samples – Speech data. (S)
- is_final – Whether speech corresponds to the final chunk of data.
- Returns: samples – New speech data. (1, decoding_samples)
- Return type: Tensor
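The documented return shape is (1, decoding_samples), so a final chunk shorter than the decoding window has to be brought up to length in some way. The sketch below assumes zero-padding on the right, which is an assumption about the behavior and is not confirmed by this page. `pad_final_chunk` is a hypothetical helper that works on plain lists rather than tensors.

```python
from typing import List


def pad_final_chunk(samples: List[float], decoding_samples: int) -> List[List[float]]:
    """Right-pad a short final chunk with zeros and add a leading batch axis."""
    # Assumption: missing samples are filled with 0.0; longer inputs are left as-is.
    missing = max(0, decoding_samples - len(samples))
    return [list(samples) + [0.0] * missing]


padded = pad_final_chunk([0.1, 0.2, 0.3], 5)
print(len(padded), len(padded[0]))  # 1 5
```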
reset_cache() → None
Reset cache parameters.
- Parameters: None
- Returns: None