Tools
Catalog
Espnet Bin
asr_align.py
asr_enhance.py
asr_recog.py
asr_train.py
lm_train.py
mt_train.py
mt_trans.py
st_train.py
st_trans.py
tts_decode.py
tts_train.py
vc_decode.py
vc_train.py
Espnet2 Bin
aggregate_stats_dirs.py
asr_align.py
asr_inference_maskctc.py
asr_inference_streaming.py
asr_inference.py
asr_transducer_inference.py
asvspoof_inference.py
diar_inference.py
enh_inference_streaming.py
enh_inference.py
enh_scoring.py
enh_tse_inference.py
gan_codec_inference.py
hugging_face_export_vocabulary.py
launch.py
lm_calc_perplexity.py
lm_inference.py
mt_inference.py
pack.py
s2st_inference.py
s2t_ctc_align.py
s2t_inference_ctc.py
s2t_inference_language.py
s2t_inference.py
slu_inference.py
speechlm_inference.py
spk_embed_extract.py
spk_inference.py
split_scps.py
st_inference_streaming.py
st_inference.py
svs_inference.py
tokenize_text.py
tts_inference.py
tts2_inference.py
uasr_extract_feature.py
uasr_inference_k2.py
uasr_inference.py
whisper_export_vocabulary.py
Spm
spm_decode
spm_encode
Utils
asr_align_wav.sh
clean_corpus.sh
convert_fbank.sh
data2json.sh
divide_lang.sh
download_from_google_drive.sh
dump_pcm.sh
dump.sh
eval_source_separation.sh
feat_to_shape.sh
free-gpu.sh
generate_wav.sh
make_fbank.sh
make_stft.sh
pack_model.sh
recog_wav.sh
reduce_data_dir.sh
remove_longshortdata.sh
score_bleu.sh
score_sclite_case.sh
score_sclite_wo_dict.sh
score_sclite.sh
show_result.sh
speed_perturb.sh
synth_wav.sh
translate_wav.sh
trim_silence.sh
update_json.sh
Utils Py
addjson.py
apply-cmvn.py
average_checkpoints.py
calculate_rtf.py
change_yaml.py
compute-cmvn-stats.py
compute-fbank-feats.py
compute-stft-feats.py
concat_json_multiref.py
concatjson.py
convert_fbank_to_wav.py
copy-feats.py
dump-pcm.py
eval_perm_free_error.py
eval-source-separation.py
feat-to-shape.py
feats2npy.py
filt.py
generate_wav_from_fbank.py
get_yaml.py
json2sctm.py
json2text.py
json2trn_mt.py
json2trn_wo_dict.py
json2trn.py
make_pair_json.py
mcd_calculate.py
merge_scp2json.py
mergejson.py
mix-mono-wav-scp.py
result2json.py
score_lang_id.py
scp2json.py
splitjson.py
text2token.py
text2vocabulary.py
trim_silence.py
trn2ctm.py
trn2stm.py