espnet2.text.char_tokenizer.CharTokenizer
class espnet2.text.char_tokenizer.CharTokenizer(non_linguistic_symbols: Path | str | Iterable[str] | None = None, space_symbol: str = '<space>', remove_non_linguistic_symbols: bool = False, nonsplit_symbols: Iterable[str] | None = None)
Bases: AbsTokenizer
text2tokens(line: str) → List[str]
tokens2text(tokens: Iterable[str]) → str
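The signatures above suggest a round trip between a string and a list of single-character tokens, with spaces mapped to space_symbol. The following is a minimal standalone sketch of that behavior, not ESPnet's actual implementation: the class name SketchCharTokenizer is hypothetical, it accepts non_linguistic_symbols only as an iterable of strings (no file path), and it omits nonsplit_symbols. It assumes non-linguistic symbols are matched as whole prefixes and either kept as one token or dropped.

```python
from typing import Iterable, List, Optional


class SketchCharTokenizer:
    """Hypothetical stand-in that mirrors the documented signature in spirit."""

    def __init__(
        self,
        non_linguistic_symbols: Optional[Iterable[str]] = None,
        space_symbol: str = "<space>",
        remove_non_linguistic_symbols: bool = False,
    ):
        # Longest symbols first so overlapping symbols match deterministically;
        # empty strings are skipped because they would never consume input.
        self.non_linguistic_symbols = sorted(
            {s for s in (non_linguistic_symbols or []) if s},
            key=len,
            reverse=True,
        )
        self.space_symbol = space_symbol
        self.remove_non_linguistic_symbols = remove_non_linguistic_symbols

    def text2tokens(self, line: str) -> List[str]:
        tokens: List[str] = []
        while line:
            for sym in self.non_linguistic_symbols:
                if line.startswith(sym):
                    # Assumption: a non-linguistic symbol is one token, or dropped.
                    if not self.remove_non_linguistic_symbols:
                        tokens.append(sym)
                    line = line[len(sym):]
                    break
            else:
                # Ordinary character: emit it, replacing a space with space_symbol.
                ch = line[0]
                tokens.append(self.space_symbol if ch == " " else ch)
                line = line[1:]
        return tokens

    def tokens2text(self, tokens: Iterable[str]) -> str:
        # Inverse mapping: space_symbol becomes a literal space again.
        return "".join(" " if t == self.space_symbol else t for t in tokens)


keep = SketchCharTokenizer(non_linguistic_symbols=["[noise]"])
kept_tokens = keep.text2tokens("hi [noise]a")
# kept_tokens == ['h', 'i', '<space>', '[noise]', 'a']
restored = keep.tokens2text(kept_tokens)
# restored == 'hi [noise]a'

drop = SketchCharTokenizer(
    non_linguistic_symbols=["[noise]"], remove_non_linguistic_symbols=True
)
dropped_tokens = drop.text2tokens("hi [noise]a")
# dropped_tokens == ['h', 'i', '<space>', 'a']
```

With the real class, the equivalent calls would presumably be CharTokenizer(...).text2tokens(line) and .tokens2text(tokens); check the installed ESPnet version for the exact handling of file paths and nonsplit_symbols.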