espnet2.svs.singing_tacotron.encoder.Duration_Encoder

class espnet2.svs.singing_tacotron.encoder.Duration_Encoder(idim, embed_dim=512, dropout_rate=0.5, padding_idx=0)

Bases: Module

Duration_Encoder module of Spectrogram prediction network.

This module is part of the encoder of the spectrogram prediction network in Singing-Tacotron. It converts the sequence of duration and tempo features into a sequence of transition tokens.

END-TO-END SINGING VOICE SYNTHESIS: https://arxiv.org/abs/2202.07907

Initialize Singing-Tacotron encoder module.

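Parameters:
- idim (int) – Dimension of the inputs.
- embed_dim (int, optional) – Defaults to 512.
- dropout_rate (float, optional) – Defaults to 0.5.
- padding_idx (int, optional) – Defaults to 0.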

forward(xs)

Calculate forward propagation.

Parameters: xs (Tensor) – Batch of the duration sequence (B, Tmax, feature_len).
Returns:
- Tensor: Batch of the sequences of transition tokens (B, Tmax, 1).
- LongTensor: Batch of lengths of each sequence (B,).
Return type: Tensor
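
The shape contract described for forward can be sketched without running the model. The helper below is hypothetical and not part of ESPnet; it assumes that feature_len must equal idim (an inference from the parameter name, not something this page states) and simply maps a documented input shape to the documented transition-token shape.

```python
from typing import Tuple


def transition_token_shape(
    xs_shape: Tuple[int, int, int], idim: int
) -> Tuple[int, int, int]:
    """Map an input shape (B, Tmax, feature_len) to the documented output shape.

    Hypothetical helper for illustration only; not part of ESPnet.
    Assumption: feature_len must equal idim.
    """
    if len(xs_shape) != 3:
        raise ValueError(f"expected (B, Tmax, feature_len), got {xs_shape}")
    batch, tmax, feature_len = xs_shape
    if feature_len != idim:
        raise ValueError(f"feature_len={feature_len} does not match idim={idim}")
    # One transition token per time step, as stated in the Returns description.
    return (batch, tmax, 1)


print(transition_token_shape((2, 10, 4), idim=4))  # (2, 10, 1)
```

With ESPnet and PyTorch installed, and if the class behaves as documented here, calling `Duration_Encoder(idim=4)` on a float tensor of shape (2, 10, 4) should yield a tensor of shape (2, 10, 1).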

inference(x)

Inference.