espnet2.text.qwen2audio_tokenizer.Qwen2AudioTokenizer
class espnet2.text.qwen2audio_tokenizer.Qwen2AudioTokenizer(model_name: str = 'Qwen/Qwen2-Audio-7B-Instruct')
Bases: AbsTokenizer
Qwen2-Audio tokenizer that handles both text and audio inputs.
create_multimodal_query(text_input: str, audio_input: Tuple[List[ndarray], int] | None = None) → Dict
Create a query with both text and audio inputs for Qwen2-Audio.
This is the core tokenization process from the original example.
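A minimal sketch of how the audio_input argument might be assembled, assuming the tuple is (list of waveforms, sample rate), which the type hint suggests but this page does not state. The helper build_query, the stand-in class _EchoTokenizer, and the 16000 Hz default are all illustrative assumptions; the stand-in exists only so the sketch runs without downloading a model.

```python
from typing import Any, Dict, Optional, Sequence


def build_query(
    tokenizer: Any,
    text: str,
    waveforms: Optional[Sequence[Any]] = None,
    sample_rate: int = 16000,  # assumed default; not stated on this page
) -> Dict:
    """Pack waveforms into (list, rate) and call create_multimodal_query."""
    audio_input = None
    if waveforms is not None:
        # Assumed layout of the documented Tuple[List[ndarray], int].
        audio_input = (list(waveforms), sample_rate)
    return tokenizer.create_multimodal_query(text, audio_input)


class _EchoTokenizer:
    """Hypothetical stand-in that mirrors only the documented signature."""

    def create_multimodal_query(self, text_input, audio_input=None):
        # Returns its arguments unchanged; the real return keys are unknown.
        return {"text": text_input, "audio": audio_input}


query = build_query(_EchoTokenizer(), "What is said?", [[0.0, 0.1]], 16000)
text_only = build_query(_EchoTokenizer(), "Hello")
```

With the real class, an instance of Qwen2AudioTokenizer would replace the stand-in and each waveform would be a NumPy array. The keys of the returned Dict are not documented here, so inspect the result before relying on any of them.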
text2tokens(line: str) → List[str]
Convert text to tokens using the Qwen2-Audio processor.
tokens2text(tokens: Iterable[str]) → str
Convert tokens back to text.
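The text2tokens / tokens2text pair can be illustrated with a toy tokenizer that follows the same two-method shape. This is a sketch only: WhitespaceTokenizer and round_trip are hypothetical names, and splitting on whitespace stands in for the Qwen2-Audio processor, whose actual token strings will differ.

```python
from typing import Any, Iterable, List


class WhitespaceTokenizer:
    """Hypothetical tokenizer exposing the two documented methods."""

    def text2tokens(self, line: str) -> List[str]:
        # Split on runs of whitespace; the real class uses a model processor.
        return line.split()

    def tokens2text(self, tokens: Iterable[str]) -> str:
        # Rejoin with single spaces.
        return " ".join(tokens)


def round_trip(tokenizer: Any, line: str) -> str:
    """Tokenize a line and convert the tokens straight back to text."""
    return tokenizer.tokens2text(tokenizer.text2tokens(line))


toy = WhitespaceTokenizer()
tokens = toy.text2tokens("transcribe this audio")
restored = round_trip(toy, "transcribe this audio")
```

Any object with these two methods can be passed to round_trip, including a Qwen2AudioTokenizer instance. Whether the real tokenizer reproduces the input exactly is not stated on this page, so treat exact round-tripping as something to check, not assume.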
