espnet.nets.pytorch_backend.fastspeech.length_regulator.LengthRegulator
class espnet.nets.pytorch_backend.fastspeech.length_regulator.LengthRegulator(pad_value=0.0)
Bases: Module
Length regulator module for feed-forward Transformer.
This module implements the length regulator described in FastSpeech: Fast, Robust and Controllable Text to Speech. The length regulator expands char- or phoneme-level embedding features to frame level by repeating each feature according to its predicted duration.
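The core idea can be sketched in plain PyTorch (a minimal illustration, not the module's actual implementation): each embedding is repeated as many times as its predicted duration says.

```python
import torch

# Toy example: 3 token embeddings with D = 4 and predicted durations [2, 3, 1].
xs = torch.randn(3, 4)
ds = torch.tensor([2, 3, 1])

# Repeat each embedding according to its duration to obtain frame-level features.
frames = torch.repeat_interleave(xs, ds, dim=0)
print(frames.shape)  # torch.Size([6, 4]) -- 2 + 3 + 1 frames
```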
Initialize length regulator module.
- Parameters: pad_value (float, optional) – Value used for padding.
forward(xs, ds, alpha=1.0)
Calculate forward propagation.
- Parameters:
- xs (Tensor) – Batch of sequences of char or phoneme embeddings (B, Tmax, D).
- ds (LongTensor) – Batch of durations for each char or phoneme (B, T).
- alpha (float, optional) – Alpha value to control the speed of speech.
- Returns: Replicated input tensor based on durations (B, T*, D).
- Return type: Tensor
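A hedged usage sketch of the batched forward pass is shown below. The shapes follow the docstring; the concrete duration values and the assumption that alpha rescales the predicted durations (values below 1.0 shortening the output) are illustrative.

```python
import torch
from espnet.nets.pytorch_backend.fastspeech.length_regulator import LengthRegulator

B, Tmax, D = 2, 4, 8
xs = torch.randn(B, Tmax, D)          # char/phoneme embeddings (B, Tmax, D)
ds = torch.tensor([[1, 2, 1, 0],      # durations per token; trailing zero for padding
                   [2, 1, 1, 1]])

length_regulator = LengthRegulator(pad_value=0.0)

ys = length_regulator(xs, ds)                   # (B, T*, D); T* is the longest total duration
ys_fast = length_regulator(xs, ds, alpha=0.5)   # assumed: alpha rescales durations
print(ys.shape, ys_fast.shape)
```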