Asr
Catalog
espnet2.asr.bayes_risk_ctc.BayesRiskCTC
espnet2.asr.bayes_risk_ctc.log_substraction_exp
espnet2.asr.ctc.CTC
espnet2.asr.decoder.abs_decoder.AbsDecoder
espnet2.asr.decoder.hugging_face_transformers_decoder.get_hugging_face_model_lm_head
espnet2.asr.decoder.hugging_face_transformers_decoder.get_hugging_face_model_network
espnet2.asr.decoder.hugging_face_transformers_decoder.HuggingFaceTransformersDecoder
espnet2.asr.decoder.hugging_face_transformers_decoder.read_json_config
espnet2.asr.decoder.linear_decoder.LinearDecoder
espnet2.asr.decoder.mlm_decoder.MLMDecoder
espnet2.asr.decoder.rnn_decoder.build_attention_list
espnet2.asr.decoder.rnn_decoder.RNNDecoder
espnet2.asr.decoder.s4_decoder.S4Decoder
espnet2.asr.decoder.transducer_decoder.TransducerDecoder
espnet2.asr.decoder.transformer_decoder.BaseTransformerDecoder
espnet2.asr.decoder.transformer_decoder.DynamicConvolution2DTransformerDecoder
espnet2.asr.decoder.transformer_decoder.DynamicConvolutionTransformerDecoder
espnet2.asr.decoder.transformer_decoder.LightweightConvolution2DTransformerDecoder
espnet2.asr.decoder.transformer_decoder.LightweightConvolutionTransformerDecoder
espnet2.asr.decoder.transformer_decoder.TransformerDecoder
espnet2.asr.decoder.transformer_decoder.TransformerMDDecoder
espnet2.asr.decoder.whisper_decoder.ExpandedTokenEmbedding
espnet2.asr.decoder.whisper_decoder.OpenAIWhisperDecoder
espnet2.asr.discrete_asr_espnet_model.ESPnetDiscreteASRModel
espnet2.asr.encoder.abs_encoder.AbsEncoder
espnet2.asr.encoder.avhubert_encoder.AVHubertConfig
espnet2.asr.encoder.avhubert_encoder.AVHubertModel
espnet2.asr.encoder.avhubert_encoder.BasicBlock
espnet2.asr.encoder.avhubert_encoder.conv3x3
espnet2.asr.encoder.avhubert_encoder.download_avhubert
espnet2.asr.encoder.avhubert_encoder.downsample_basic_block
espnet2.asr.encoder.avhubert_encoder.downsample_basic_block_v2
espnet2.asr.encoder.avhubert_encoder.FairseqAVHubertEncoder
espnet2.asr.encoder.avhubert_encoder.GradMultiply
espnet2.asr.encoder.avhubert_encoder.index_put
espnet2.asr.encoder.avhubert_encoder.is_xla_tensor
espnet2.asr.encoder.avhubert_encoder.ResEncoder
espnet2.asr.encoder.avhubert_encoder.ResNet
espnet2.asr.encoder.avhubert_encoder.SamePad
espnet2.asr.encoder.avhubert_encoder.SubModel
espnet2.asr.encoder.avhubert_encoder.time_masking
espnet2.asr.encoder.beats_encoder.BeatsConfig
espnet2.asr.encoder.beats_encoder.BeatsEncoder
espnet2.asr.encoder.beats_encoder.gelu
espnet2.asr.encoder.beats_encoder.gelu_accurate
espnet2.asr.encoder.beats_encoder.get_activation_fn
espnet2.asr.encoder.beats_encoder.GLU_Linear
espnet2.asr.encoder.beats_encoder.init_bert_params
espnet2.asr.encoder.beats_encoder.MultiheadAttention
espnet2.asr.encoder.beats_encoder.quant_noise
espnet2.asr.encoder.beats_encoder.Swish
espnet2.asr.encoder.beats_encoder.TransformerSentenceEncoderLayer
espnet2.asr.encoder.branchformer_encoder.BranchformerEncoder
espnet2.asr.encoder.branchformer_encoder.BranchformerEncoderLayer
espnet2.asr.encoder.conformer_encoder.ConformerEncoder
espnet2.asr.encoder.contextual_block_conformer_encoder.ContextualBlockConformerEncoder
espnet2.asr.encoder.contextual_block_transformer_encoder.ContextualBlockTransformerEncoder
espnet2.asr.encoder.e_branchformer_ctc_encoder.EBranchformerCTCEncoder
espnet2.asr.encoder.e_branchformer_ctc_encoder.EBranchformerEncoderLayer
espnet2.asr.encoder.e_branchformer_encoder.EBranchformerEncoder
espnet2.asr.encoder.hubert_encoder.download_hubert
espnet2.asr.encoder.hubert_encoder.FairseqHubertEncoder
espnet2.asr.encoder.hubert_encoder.FairseqHubertPretrainEncoder
espnet2.asr.encoder.hubert_encoder.TorchAudioHuBERTPretrainEncoder
espnet2.asr.encoder.hugging_face_transformers_encoder.HuggingFaceTransformersEncoder
espnet2.asr.encoder.linear_encoder.LinearEncoder
espnet2.asr.encoder.longformer_encoder.LongformerEncoder
espnet2.asr.encoder.multiconvformer_encoder.MultiConvConformerEncoder
espnet2.asr.encoder.rnn_encoder.RNNEncoder
espnet2.asr.encoder.transformer_encoder_multispkr.TransformerEncoder
espnet2.asr.encoder.vgg_rnn_encoder.VGGRNNEncoder
espnet2.asr.encoder.wav2vec2_encoder.download_w2v
espnet2.asr.encoder.wav2vec2_encoder.FairSeqWav2Vec2Encoder
espnet2.asr.encoder.whisper_encoder.OpenAIWhisperEncoder
espnet2.asr.espnet_model.ESPnetASRModel
espnet2.asr.frontend.abs_frontend.AbsFrontend
espnet2.asr.frontend.asteroid_frontend.AsteroidFrontend
espnet2.asr.frontend.cnn.CNNFrontend
espnet2.asr.frontend.cnn.ConvLayerBlock
espnet2.asr.frontend.cnn.dim_1_layer_norm
espnet2.asr.frontend.cnn.Dim1LayerNorm
espnet2.asr.frontend.cnn.TransposedLayerNorm
espnet2.asr.frontend.default.DefaultFrontend
espnet2.asr.frontend.espnet_ssl.ESPnetSSLFrontend
espnet2.asr.frontend.espnet_ssl.Featureizer
espnet2.asr.frontend.fused.FusedFrontends
espnet2.asr.frontend.huggingface.HuggingFaceFrontend
espnet2.asr.frontend.melspec_torch.MelSpectrogramTorch
espnet2.asr.frontend.s3prl.S3prlFrontend
espnet2.asr.frontend.whisper.WhisperFrontend
espnet2.asr.frontend.windowing.SlidingWindow
espnet2.asr.layers.cgmlp.ConvolutionalGatingMLP
espnet2.asr.layers.cgmlp.ConvolutionalSpatialGatingUnit
espnet2.asr.layers.fastformer.FastSelfAttention
espnet2.asr.layers.multiconv_cgmlp.MultiConvolutionalGatingMLP
espnet2.asr.layers.multiconv_cgmlp.MultiConvolutionalSpatialGatingUnit
espnet2.asr.maskctc_model.MaskCTCInference
espnet2.asr.maskctc_model.MaskCTCModel
espnet2.asr.partially_AR_model.PartiallyARInference
espnet2.asr.pit_espnet_model.PITLossWrapper
espnet2.asr.postencoder.abs_postencoder.AbsPostEncoder
espnet2.asr.postencoder.hugging_face_transformers_postencoder.HuggingFaceTransformersPostEncoder
espnet2.asr.postencoder.length_adaptor_postencoder.LengthAdaptorPostEncoder
espnet2.asr.preencoder.abs_preencoder.AbsPreEncoder
espnet2.asr.preencoder.linear.LinearProjection
espnet2.asr.preencoder.sinc.LightweightSincConvs
espnet2.asr.preencoder.sinc.SpatialDropout
espnet2.asr.specaug.abs_specaug.AbsSpecAug
espnet2.asr.specaug.specaug.SpecAug
espnet2.asr.state_spaces.attention.MultiHeadedAttention
espnet2.asr.state_spaces.base.SequenceIdentity
espnet2.asr.state_spaces.base.SequenceModule
espnet2.asr.state_spaces.base.TransposedModule
espnet2.asr.state_spaces.block.SequenceResidualBlock
espnet2.asr.state_spaces.cauchy.cauchy_mult
espnet2.asr.state_spaces.cauchy.cauchy_mult_keops
espnet2.asr.state_spaces.cauchy.cauchy_mult_torch
espnet2.asr.state_spaces.cauchy.CauchyMultiply
espnet2.asr.state_spaces.cauchy.CauchyMultiplySymmetric
espnet2.asr.state_spaces.components.Activation
espnet2.asr.state_spaces.components.DropoutNd
espnet2.asr.state_spaces.components.get_initializer
espnet2.asr.state_spaces.components.LinearActivation
espnet2.asr.state_spaces.components.Normalization
espnet2.asr.state_spaces.components.ReversibleInstanceNorm1dInput
espnet2.asr.state_spaces.components.ReversibleInstanceNorm1dOutput
espnet2.asr.state_spaces.components.SquaredReLU
espnet2.asr.state_spaces.components.stochastic_depth
espnet2.asr.state_spaces.components.StochasticDepth
espnet2.asr.state_spaces.components.TransposedLinear
espnet2.asr.state_spaces.components.TransposedLN
espnet2.asr.state_spaces.components.TSInverseNormalization
espnet2.asr.state_spaces.components.TSNormalization
espnet2.asr.state_spaces.ff.FF
espnet2.asr.state_spaces.model.SequenceModel
espnet2.asr.state_spaces.pool.DownAvgPool
espnet2.asr.state_spaces.pool.DownLinearPool
espnet2.asr.state_spaces.pool.DownPool
espnet2.asr.state_spaces.pool.DownPool2d
espnet2.asr.state_spaces.pool.downsample
espnet2.asr.state_spaces.pool.DownSample
espnet2.asr.state_spaces.pool.DownSpectralPool
espnet2.asr.state_spaces.pool.UpPool
espnet2.asr.state_spaces.pool.upsample
espnet2.asr.state_spaces.pool.UpSample
espnet2.asr.state_spaces.residual.Affine
espnet2.asr.state_spaces.residual.DecayResidual
espnet2.asr.state_spaces.residual.Feedforward
espnet2.asr.state_spaces.residual.Highway
espnet2.asr.state_spaces.residual.Residual
espnet2.asr.state_spaces.s4.combination
espnet2.asr.state_spaces.s4.dplr
espnet2.asr.state_spaces.s4.get_logger
espnet2.asr.state_spaces.s4.nplr
espnet2.asr.state_spaces.s4.OptimModule
espnet2.asr.state_spaces.s4.power
espnet2.asr.state_spaces.s4.rank_correction
espnet2.asr.state_spaces.s4.rank_zero_only
espnet2.asr.state_spaces.s4.S4
espnet2.asr.state_spaces.s4.SSKernel
espnet2.asr.state_spaces.s4.SSKernelDiag
espnet2.asr.state_spaces.s4.SSKernelNPLR
espnet2.asr.state_spaces.s4.ssm
espnet2.asr.state_spaces.s4.transition
espnet2.asr.state_spaces.utils.extract_attrs_from_obj
espnet2.asr.state_spaces.utils.get_class
espnet2.asr.state_spaces.utils.instantiate
espnet2.asr.state_spaces.utils.is_dict
espnet2.asr.state_spaces.utils.is_list
espnet2.asr.state_spaces.utils.omegaconf_filter_keys
espnet2.asr.state_spaces.utils.to_dict
espnet2.asr.state_spaces.utils.to_list
espnet2.asr.transducer.beam_search_transducer_streaming.BeamSearchTransducerStreaming
espnet2.asr.transducer.beam_search_transducer.BeamSearchTransducer
espnet2.asr.transducer.beam_search_transducer.ExtendedHypothesis
espnet2.asr.transducer.beam_search_transducer.Hypothesis
espnet2.asr.transducer.error_calculator.ErrorCalculatorTransducer
espnet2.asr.transducer.rnnt_multi_blank.rnnt_multi_blank._MultiblankRNNTNumba
espnet2.asr.transducer.rnnt_multi_blank.rnnt_multi_blank._RNNTNumba
espnet2.asr.transducer.rnnt_multi_blank.rnnt_multi_blank.certify_inputs
espnet2.asr.transducer.rnnt_multi_blank.rnnt_multi_blank.check_contiguous
espnet2.asr.transducer.rnnt_multi_blank.rnnt_multi_blank.check_dim
espnet2.asr.transducer.rnnt_multi_blank.rnnt_multi_blank.check_type
espnet2.asr.transducer.rnnt_multi_blank.rnnt_multi_blank.multiblank_rnnt_loss
espnet2.asr.transducer.rnnt_multi_blank.rnnt_multi_blank.MultiblankRNNTLossNumba
espnet2.asr.transducer.rnnt_multi_blank.rnnt_multi_blank.rnnt_loss
espnet2.asr.transducer.rnnt_multi_blank.rnnt_multi_blank.RNNTLossNumba
espnet2.asr.transducer.rnnt_multi_blank.rnnt.multiblank_rnnt_loss_gpu
espnet2.asr.transducer.rnnt_multi_blank.rnnt.rnnt_loss_cpu
espnet2.asr.transducer.rnnt_multi_blank.rnnt.rnnt_loss_gpu
espnet2.asr.transducer.rnnt_multi_blank.utils.cpu_utils.cpu_rnnt.CPURNNT
espnet2.asr.transducer.rnnt_multi_blank.utils.cpu_utils.cpu_rnnt.CpuRNNT_index
espnet2.asr.transducer.rnnt_multi_blank.utils.cpu_utils.cpu_rnnt.CpuRNNT_metadata
espnet2.asr.transducer.rnnt_multi_blank.utils.cpu_utils.cpu_rnnt.log_sum_exp
espnet2.asr.transducer.rnnt_multi_blank.utils.cpu_utils.cpu_rnnt.LogSoftmaxGradModification
espnet2.asr.transducer.rnnt_multi_blank.utils.cuda_utils.gpu_rnnt_kernel.compute_alphas_kernel
espnet2.asr.transducer.rnnt_multi_blank.utils.cuda_utils.gpu_rnnt_kernel.compute_betas_kernel
espnet2.asr.transducer.rnnt_multi_blank.utils.cuda_utils.gpu_rnnt_kernel.compute_grad_kernel
espnet2.asr.transducer.rnnt_multi_blank.utils.cuda_utils.gpu_rnnt_kernel.compute_multiblank_alphas_kernel
espnet2.asr.transducer.rnnt_multi_blank.utils.cuda_utils.gpu_rnnt_kernel.compute_multiblank_betas_kernel
espnet2.asr.transducer.rnnt_multi_blank.utils.cuda_utils.gpu_rnnt_kernel.compute_multiblank_grad_kernel
espnet2.asr.transducer.rnnt_multi_blank.utils.cuda_utils.gpu_rnnt_kernel.logp
espnet2.asr.transducer.rnnt_multi_blank.utils.cuda_utils.gpu_rnnt.GPURNNT
espnet2.asr.transducer.rnnt_multi_blank.utils.cuda_utils.gpu_rnnt.MultiblankGPURNNT
espnet2.asr.transducer.rnnt_multi_blank.utils.cuda_utils.reduce.CTAReduce
espnet2.asr.transducer.rnnt_multi_blank.utils.cuda_utils.reduce.I_Op
espnet2.asr.transducer.rnnt_multi_blank.utils.cuda_utils.reduce.R_Op
espnet2.asr.transducer.rnnt_multi_blank.utils.cuda_utils.reduce.reduce_exp
espnet2.asr.transducer.rnnt_multi_blank.utils.cuda_utils.reduce.reduce_max
espnet2.asr.transducer.rnnt_multi_blank.utils.cuda_utils.reduce.ReduceHelper
espnet2.asr.transducer.rnnt_multi_blank.utils.global_constants.dtype
espnet2.asr.transducer.rnnt_multi_blank.utils.global_constants.RNNTStatus
espnet2.asr.transducer.rnnt_multi_blank.utils.global_constants.threads_per_block
espnet2.asr.transducer.rnnt_multi_blank.utils.global_constants.warp_size
espnet2.asr.transducer.rnnt_multi_blank.utils.rnnt_helper.add
espnet2.asr.transducer.rnnt_multi_blank.utils.rnnt_helper.compute_costs_data
espnet2.asr.transducer.rnnt_multi_blank.utils.rnnt_helper.copy_data_1d
espnet2.asr.transducer.rnnt_multi_blank.utils.rnnt_helper.div_up
espnet2.asr.transducer.rnnt_multi_blank.utils.rnnt_helper.exponential
espnet2.asr.transducer.rnnt_multi_blank.utils.rnnt_helper.flatten_tensor
espnet2.asr.transducer.rnnt_multi_blank.utils.rnnt_helper.get_workspace_size
espnet2.asr.transducer.rnnt_multi_blank.utils.rnnt_helper.identity
espnet2.asr.transducer.rnnt_multi_blank.utils.rnnt_helper.log_plus
espnet2.asr.transducer.rnnt_multi_blank.utils.rnnt_helper.maximum
espnet2.asr.transducer.rnnt_multi_blank.utils.rnnt_helper.negate