espnet2.train.preprocessor.S2TCTCPreprocessor
class espnet2.train.preprocessor.S2TCTCPreprocessor(train: bool, token_type: str | None = None, token_list: Path | str | Iterable[str] | None = None, bpemodel: Path | str | Iterable[str] | None = None, text_cleaner: Collection[str] | None = None, g2p_type: str | None = None, unk_symbol: str = '<unk>', space_symbol: str = '<space>', non_linguistic_symbols: Path | str | Iterable[str] | None = None, delimiter: str | None = None, rir_scp: str | None = None, rir_apply_prob: float = 1.0, noise_scp: str | None = None, noise_apply_prob: float = 1.0, noise_db_range: str = '3_10', short_noise_thres: float = 0.5, speech_volume_normalize: float | None = None, speech_name: str = 'speech', text_name: str = 'text', text_prev_name: str = 'text_prev', text_ctc_name: str = 'text_ctc', fs: int = 16000, na_symbol: str = '<na>', speech_length: float = 30, speech_init_silence: float = 1.0, text_prev_apply_prob: float = 0.5, lang_apply_prob: float = 0.5, nolang_symbol: str = '<nolang>')
Bases: CommonPreprocessor
Preprocessor for OWSM-CTC.
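As a rough illustration of how some of the default arguments in the signature might be read, here is a minimal sketch that does not import espnet2 and is not taken from the ESPnet source. It assumes that `speech_length` is given in seconds, that `fs` is a sampling rate in Hz, and that `noise_db_range` is a string of the form `"low_high"` (or a single value). The names `parse_db_range`, `defaults`, and `max_samples` are hypothetical helpers introduced only for this example.

```python
from typing import Tuple

# Default values copied from the constructor signature above.
defaults = {
    "fs": 16000,               # assumed: sampling rate in Hz
    "speech_length": 30,       # assumed: utterance length in seconds
    "noise_db_range": "3_10",  # assumed: "low_high" range in dB
}


def parse_db_range(spec: str) -> Tuple[float, float]:
    """Hypothetical helper: split a 'low_high' string into two floats.

    A single value such as '5' is treated as low == high. This mirrors
    the assumed string format only; it is not the ESPnet implementation.
    """
    parts = spec.split("_")
    if len(parts) == 1:
        low = high = float(parts[0])
    elif len(parts) == 2:
        low, high = float(parts[0]), float(parts[1])
    else:
        raise ValueError(f"Expected 'low_high' or a single value, got: {spec!r}")
    return low, high


# Number of samples in one utterance under the assumptions above.
max_samples = int(defaults["speech_length"] * defaults["fs"])

noise_low, noise_high = parse_db_range(defaults["noise_db_range"])
```

Under these assumptions, the default `speech_length` and `fs` correspond to 480000 samples per utterance. Whether the preprocessor pads or trims speech to exactly this length is not stated in this section.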