espnet.nets.pytorch_backend.fastspeech.length_regulator.LengthRegulator
class espnet.nets.pytorch_backend.fastspeech.length_regulator.LengthRegulator(pad_value=0.0)
Bases: Module
Length regulator module for feed-forward Transformer.
This module implements the length regulator described in FastSpeech: Fast, Robust and Controllable Text to Speech. The length regulator expands char- or phoneme-level embedding features to frame level by repeating each feature according to its predicted duration.
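The core idea can be sketched in plain PyTorch (a minimal illustration, not the module's actual implementation): each embedding is repeated as many times as its predicted duration says.

```python
import torch

# Toy example: 3 token embeddings with D = 4 and predicted durations [2, 3, 1].
xs = torch.randn(3, 4)
ds = torch.tensor([2, 3, 1])

# Repeat each embedding according to its duration to obtain frame-level features.
frames = torch.repeat_interleave(xs, ds, dim=0)
print(frames.shape)  # torch.Size([6, 4]) -- 2 + 3 + 1 frames
```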
Initialize length regulator module.
- Parameters: pad_value (float, optional) – Value used for padding.
forward(xs, ds, alpha=1.0)
Calculate forward propagation.
- Parameters:
- xs (Tensor) – Batch of sequences of char or phoneme embeddings (B, Tmax, D).
- ds (LongTensor) – Batch of durations for each char or phoneme (B, T).
- alpha (float, optional) – Alpha value to control the speed of speech.
- Returns: Replicated input tensor based on durations (B, T*, D).
- Return type: Tensor
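A hedged usage sketch of the batched forward pass is shown below. The shapes follow the docstring; the concrete duration values and the assumption that alpha rescales the predicted durations (values below 1.0 shortening the output) are illustrative.

```python
import torch
from espnet.nets.pytorch_backend.fastspeech.length_regulator import LengthRegulator

B, Tmax, D = 2, 4, 8
xs = torch.randn(B, Tmax, D)          # char/phoneme embeddings (B, Tmax, D)
ds = torch.tensor([[1, 2, 1, 0],      # durations per token; trailing zero for padding
                   [2, 1, 1, 1]])

length_regulator = LengthRegulator(pad_value=0.0)

ys = length_regulator(xs, ds)                   # (B, T*, D); T* is the longest total duration
ys_fast = length_regulator(xs, ds, alpha=0.5)   # assumed: alpha rescales durations
print(ys.shape, ys_fast.shape)
```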