espnet2.gan_tts.jets.alignments.average_by_duration
espnet2.gan_tts.jets.alignments.average_by_duration(ds, xs, text_lengths, feats_lengths)
Average frame-level features into token-level features according to the token durations.
- Parameters:
  - ds (Tensor) – Batched token durations (B, T_text).
  - xs (Tensor) – Batched feature sequences to be averaged (B, T_feats).
  - text_lengths (Tensor) – Text length tensor (B,).
  - feats_lengths (Tensor) – Feature length tensor (B,).
- Returns: Batched features averaged according to the token durations (B, T_text).
- Return type: Tensor
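
The averaging described above can be illustrated with a minimal PyTorch sketch. This is not the ESPnet source: the name `average_by_duration_sketch` is hypothetical, and the sketch assumes that durations are integer frame counts, that token n covers the frames immediately following those of token n-1, and that zero-duration tokens and padded positions are set to 0.

```python
import torch


def average_by_duration_sketch(ds, xs, text_lengths, feats_lengths):
    """Illustrative sketch only (hypothetical helper, not the ESPnet code)."""
    B, T_text = ds.shape
    # Padded positions and zero-duration tokens stay at 0 (an assumption).
    out = torch.zeros(B, T_text, dtype=xs.dtype)
    for b in range(B):
        n_tokens = int(text_lengths[b])
        # Drop padded frames before averaging.
        x = xs[b, : int(feats_lengths[b])]
        start = 0
        for n in range(n_tokens):
            # Token n covers frames [start, end).
            end = start + int(ds[b, n])
            segment = x[start:end]
            if segment.numel() > 0:
                out[b, n] = segment.mean()
            start = end
    return out


# Two utterances: the second has one real token, the rest is padding.
ds = torch.tensor([[2, 1, 0], [3, 0, 0]])
xs = torch.tensor([[1.0, 3.0, 5.0], [2.0, 4.0, 6.0]])
text_lengths = torch.tensor([3, 1])
feats_lengths = torch.tensor([3, 3])

avg = average_by_duration_sketch(ds, xs, text_lengths, feats_lengths)
print(avg)
```

In the first utterance, the first token spans two frames (1.0 and 3.0) and the second token spans one frame (5.0), so the output holds one averaged value per token and has the same (B, T_text) shape as `ds`.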