espnet.lm.lm_utils.count_tokens
espnet.lm.lm_utils.count_tokens(data, unk_id=None)
Count the total number of tokens and the number of out-of-vocabulary (OOV) tokens in a list of token ID sequences.
- Parameters:
- data (list[np.ndarray]) – list of token ID sequences
- unk_id (int) – ID of unknown token
- Returns: tuple of (number of token occurrences, number of OOV tokens)
- Return type: tuple
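The counting described above can be sketched with NumPy alone. This is an illustrative stand-in, not the library's source: the name `count_tokens_sketch` is hypothetical, and the choice to report zero OOV tokens when `unk_id` is `None` is an assumption made for this example.

```python
import numpy as np


def count_tokens_sketch(data, unk_id=None):
    """Hypothetical stand-in that mirrors the documented interface.

    data:   list of 1-D np.ndarray, each holding token IDs for one sequence
    unk_id: ID of the unknown token, or None to skip OOV counting (assumption)
    """
    n_tokens = 0
    n_oovs = 0
    for seq in data:
        # Every element of a sequence counts as one token occurrence.
        n_tokens += len(seq)
        if unk_id is not None:
            # Occurrences of the unknown-token ID are treated as OOV tokens.
            n_oovs += int(np.count_nonzero(seq == unk_id))
    return n_tokens, n_oovs


# Two toy sequences; ID 1 plays the role of the unknown token here.
data = [
    np.array([5, 1, 7, 1], dtype=np.int32),
    np.array([3, 1], dtype=np.int32),
]
n_tokens, n_oovs = count_tokens_sketch(data, unk_id=1)
oov_rate = n_oovs / n_tokens
```

A typical use of the two counts is to compute an OOV rate for a data set, as in the last line of the sketch.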