espnet2.svs.singing_tacotron.encoder.Duration_Encoder
class espnet2.svs.singing_tacotron.encoder.Duration_Encoder(idim, embed_dim=512, dropout_rate=0.5, padding_idx=0)
Bases: Module
Duration_Encoder module of Spectrogram prediction network.
This module is part of the encoder of the spectrogram prediction network in Singing-Tacotron. It converts a sequence of duration and tempo features into a sequence of transition tokens.
Reference: "END-TO-END SINGING VOICE SYNTHESIS", https://arxiv.org/abs/2202.07907
Initialize Singing-Tacotron encoder module.
- Parameters:
- idim (int)
- embed_dim (int, optional) – Defaults to 512.
- dropout_rate (float, optional) – Defaults to 0.5.
- padding_idx (int, optional) – Defaults to 0.
forward(xs)
Calculate forward propagation.
- Parameters: xs (Tensor) – Batch of duration sequences (B, Tmax, feature_len).
- Returns: Batch of sequences of transition tokens (B, Tmax, 1). LongTensor: Batch of the lengths of each sequence (B,).
- Return type: Tensor
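The shape contract of forward(xs) can be illustrated with a small stand-in module. This is a minimal sketch, not ESPnet's implementation: `DurationEncoderSketch` is a hypothetical class that only mirrors the documented constructor signature and the documented input and output shapes, and it assumes that `idim` equals `feature_len`. Its internal layers (one dropout and one linear projection) are placeholders chosen for brevity.

```python
import torch


class DurationEncoderSketch(torch.nn.Module):
    """Hypothetical stand-in that mimics the documented interface only."""

    def __init__(self, idim, embed_dim=512, dropout_rate=0.5, padding_idx=0):
        super().__init__()
        # embed_dim and padding_idx are accepted only to mirror the documented
        # signature; this sketch does not use them.
        self.dropout = torch.nn.Dropout(dropout_rate)
        # Placeholder layer: maps each frame's features to a single value.
        self.proj = torch.nn.Linear(idim, 1)

    def forward(self, xs):
        # xs: (B, Tmax, feature_len) -> output: (B, Tmax, 1)
        return self.proj(self.dropout(xs))


# Assumed toy sizes: batch of 2, 7 frames, 4 duration/tempo features per frame.
B, Tmax, feature_len = 2, 7, 4
xs = torch.rand(B, Tmax, feature_len)

encoder = DurationEncoderSketch(idim=feature_len)  # assumes idim == feature_len
encoder.eval()  # disable dropout for a deterministic forward pass
with torch.no_grad():
    out = encoder(xs)

print(tuple(out.shape))  # (2, 7, 1)
```

If the documented shapes hold, replacing the stand-in with the real class (imported from espnet2.svs.singing_tacotron.encoder) should give an output of the same shape for the same input.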
inference(x)
Inference.