asr_recog.py
Transcribe text from speech using a speech recognition model on one CPU or GPU
usage: asr_recog.py [-h] [--config CONFIG] [--config2 CONFIG2]
                    [--config3 CONFIG3] [--ngpu NGPU]
                    [--dtype {float16,float32,float64}]
                    [--backend {chainer,pytorch}] [--debugmode DEBUGMODE]
                    [--seed SEED] [--verbose VERBOSE] [--batchsize BATCHSIZE]
                    [--preprocess-conf PREPROCESS_CONF] [--api {v1,v2}]
                    [--recog-json RECOG_JSON] --result-label RESULT_LABEL
                    --model MODEL [--model-conf MODEL_CONF]
                    [--num-spkrs {1,2}] [--num-encs NUM_ENCS] [--nbest NBEST]
                    [--beam-size BEAM_SIZE] [--penalty PENALTY]
                    [--maxlenratio MAXLENRATIO] [--minlenratio MINLENRATIO]
                    [--ctc-weight CTC_WEIGHT]
                    [--weights-ctc-dec WEIGHTS_CTC_DEC]
                    [--ctc-window-margin CTC_WINDOW_MARGIN]
                    [--search-type {default,nsc,tsd,alsd,maes}]
                    [--nstep NSTEP] [--prefix-alpha PREFIX_ALPHA]
                    [--max-sym-exp MAX_SYM_EXP] [--u-max U_MAX]
                    [--expansion-gamma EXPANSION_GAMMA]
                    [--expansion-beta EXPANSION_BETA]
                    [--score-norm [SCORE_NORM]]
                    [--softmax-temperature SOFTMAX_TEMPERATURE]
                    [--rnnlm RNNLM] [--rnnlm-conf RNNLM_CONF]
                    [--word-rnnlm WORD_RNNLM]
                    [--word-rnnlm-conf WORD_RNNLM_CONF]
                    [--word-dict WORD_DICT] [--lm-weight LM_WEIGHT]
                    [--ngram-model NGRAM_MODEL] [--ngram-weight NGRAM_WEIGHT]
                    [--ngram-scorer {full,part}]
                    [--streaming-mode {window,segment}]
                    [--streaming-window STREAMING_WINDOW]
                    [--streaming-min-blank-dur STREAMING_MIN_BLANK_DUR]
                    [--streaming-onset-margin STREAMING_ONSET_MARGIN]
                    [--streaming-offset-margin STREAMING_OFFSET_MARGIN]
                    [--maskctc-n-iterations MASKCTC_N_ITERATIONS]
                    [--maskctc-probability-threshold MASKCTC_PROBABILITY_THRESHOLD]
                    [--quantize-config [QUANTIZE_CONFIG ...]]
                    [--quantize-dtype {float16,qint8}]
                    [--quantize-asr-model QUANTIZE_ASR_MODEL]
                    [--quantize-lm-model QUANTIZE_LM_MODEL]
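In the usage synopsis, only --result-label and --model appear without square brackets, so they are the two required options; everything else is optional. As a rough illustration of how an invocation might be assembled from a Python driver script, here is a minimal sketch. The helper name build_recog_command, the script location, all file paths, and the option values are hypothetical placeholders chosen for the example; only the flag names are taken from the synopsis.

```python
import shlex
import sys


def build_recog_command(model, result_label, script="asr_recog.py", **options):
    """Assemble an argv list for asr_recog.py (hypothetical helper).

    `model` and `result_label` map to the two required flags in the
    usage synopsis; any keyword in `options` is turned into an optional
    flag by replacing underscores with hyphens.
    """
    cmd = [
        sys.executable, script,
        "--model", str(model),
        "--result-label", str(result_label),
    ]
    for name, value in options.items():
        # e.g. beam_size=10 becomes "--beam-size", "10"
        cmd += ["--" + name.replace("_", "-"), str(value)]
    return cmd


# All paths and values below are placeholders, not recommended settings.
cmd = build_recog_command(
    model="path/to/model",
    result_label="path/to/result.json",
    recog_json="path/to/recog.json",
    backend="pytorch",
    beam_size=10,
    ctc_weight=0.3,
)

# Print a shell-quoted form of the command for inspection.
print(shlex.join(cmd))
```

The sketch only builds and prints the command. To actually run it, the list could be passed to subprocess.run(cmd, check=True) from an environment where the script and its dependencies are available.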