asr_align.py
Align text to audio using CTC segmentation with a pre-trained speech recognition model.
usage: asr_align.py [-h] [--config CONFIG] [--ngpu NGPU]
                    [--dtype {float16,float32,float64}] [--backend {pytorch}]
                    [--debugmode DEBUGMODE] [--verbose VERBOSE]
                    [--preprocess-conf PREPROCESS_CONF]
                    [--data-json DATA_JSON] [--utt-text UTT_TEXT]
                    --model MODEL [--model-conf MODEL_CONF] [--num-encs NUM_ENCS]
                    [--subsampling-factor SUBSAMPLING_FACTOR]
                    [--frame-duration FRAME_DURATION]
                    [--min-window-size MIN_WINDOW_SIZE]
                    [--max-window-size MAX_WINDOW_SIZE]
                    [--use-dict-blank USE_DICT_BLANK] [--set-blank SET_BLANK]
                    [--gratis-blank GRATIS_BLANK]
                    [--replace-spaces-with-blanks REPLACE_SPACES_WITH_BLANKS]
                    [--scoring-length SCORING_LENGTH] --output OUTPUT
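
In the usage text above, only --model and --output appear without brackets, so they are the required arguments. As a minimal sketch, the following Python helper assembles a command line from keyword options using only flags listed in that usage text. The helper name build_align_command and every path are hypothetical placeholders, and the choice to also pass --data-json and --utt-text is an assumption about a typical run, not something the usage text states.

```python
import shlex
import sys

# Flags shown without brackets in the usage text, i.e. required.
REQUIRED = ("model", "output")


def build_align_command(script="asr_align.py", **options):
    """Build an argv list for asr_align.py from keyword options.

    Keyword names become flags by replacing underscores with hyphens,
    e.g. data_json -> --data-json. Values are passed through as strings.
    This helper is illustrative and not part of asr_align.py itself.
    """
    missing = [name for name in REQUIRED if name not in options]
    if missing:
        raise ValueError(f"missing required option(s): {', '.join(missing)}")
    argv = [sys.executable, script]
    for name, value in options.items():
        argv += ["--" + name.replace("_", "-"), str(value)]
    return argv


# All paths below are hypothetical placeholders.
cmd = build_align_command(
    model="path/to/model",
    data_json="path/to/data.json",
    utt_text="path/to/utt_text",
    output="path/to/segments.txt",
)

# Print the command without the interpreter path for readability.
print(shlex.join(cmd[1:]))
```

Passing the resulting list to subprocess.run(cmd, check=True) would launch the script, assuming asr_align.py is reachable from the working directory. Building the command as a list rather than a single string avoids shell quoting problems when paths contain spaces.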