asr_align.py

Align text to audio using CTC segmentation with a pre-trained speech recognition model.

usage: asr_align.py [-h] [--config CONFIG] [--ngpu NGPU]
                    [--dtype {float16,float32,float64}] [--backend {pytorch}]
                    [--debugmode DEBUGMODE] [--verbose VERBOSE]
                    [--preprocess-conf PREPROCESS_CONF]
                    [--data-json DATA_JSON] [--utt-text UTT_TEXT] --model
                    MODEL [--model-conf MODEL_CONF] [--num-encs NUM_ENCS]
                    [--subsampling-factor SUBSAMPLING_FACTOR]
                    [--frame-duration FRAME_DURATION]
                    [--min-window-size MIN_WINDOW_SIZE]
                    [--max-window-size MAX_WINDOW_SIZE]
                    [--use-dict-blank USE_DICT_BLANK] [--set-blank SET_BLANK]
                    [--gratis-blank GRATIS_BLANK]
                    [--replace-spaces-with-blanks REPLACE_SPACES_WITH_BLANKS]
                    [--scoring-length SCORING_LENGTH] --output OUTPUT
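As a rough illustration of the synopsis above, the following sketch assembles a command line that passes the two required options (--model and --output) plus two of the optional inputs (--data-json and --utt-text). Only flags that appear in the usage text are used. The helper name build_align_command and every file path are hypothetical placeholders, and the assumption that --data-json and --utt-text are the inputs you would normally supply is not confirmed by the synopsis itself.

```python
import shlex


def build_align_command(model, output, data_json=None, utt_text=None):
    """Assemble an argv list for asr_align.py (hypothetical helper).

    --model and --output are shown without brackets in the usage text,
    so they are always included; the other two flags are optional.
    """
    cmd = ["asr_align.py", "--model", model, "--output", output]
    if data_json is not None:
        cmd += ["--data-json", data_json]
    if utt_text is not None:
        cmd += ["--utt-text", utt_text]
    return cmd


# All paths below are placeholders, not files shipped with the tool.
cmd = build_align_command(
    model="exp/model.acc.best",
    output="aligned_segments.txt",
    data_json="data.json",
    utt_text="utt_text.txt",
)

# Print a shell-quoted form that can be inspected before running it.
print(shlex.join(cmd))
```

The list form can be handed to subprocess.run(cmd, check=True) once the script and the referenced files actually exist in your environment.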

Named Arguments