espnet.nets.pytorch_backend.transformer.attention.LegacyRelPositionMultiHeadedAttention
class espnet.nets.pytorch_backend.transformer.attention.LegacyRelPositionMultiHeadedAttention(n_head, n_feat, dropout_rate, zero_triu=False)
Bases: MultiHeadedAttention
Multi-Head Attention layer with relative position encoding (old version).
Details can be found in https://github.com/espnet/espnet/pull/2816.
Paper: https://arxiv.org/abs/1901.02860
- Parameters:
- n_head (int) – The number of heads.
- n_feat (int) – The number of features.
- dropout_rate (float) – Dropout rate.
- zero_triu (bool) – Whether to zero the upper triangular part of the attention matrix.
Construct a LegacyRelPositionMultiHeadedAttention object.
forward(query, key, value, pos_emb, mask)
Compute scaled dot-product attention with relative positional encoding.
- Parameters:
- query (torch.Tensor) – Query tensor (#batch, time1, size).
- key (torch.Tensor) – Key tensor (#batch, time2, size).
- value (torch.Tensor) – Value tensor (#batch, time2, size).
- pos_emb (torch.Tensor) – Positional embedding tensor (#batch, time1, size).
- mask (torch.Tensor) – Mask tensor (#batch, 1, time2) or (#batch, time1, time2).
- Returns: Output tensor (#batch, time1, d_model).
- Return type: torch.Tensor
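As a rough illustration of what forward computes per head, the sketch below follows the Transformer-XL score decomposition from the paper linked above: a content term and a position term, each with a learned global bias. All names (rel_scores, q, k, pos, u, v) are illustrative, not the module's internal attributes; the sketch assumes a single head with time1 == time2 and omits the relative shift that the real module applies to the position term.

```python
import numpy as np

def rel_scores(q, k, pos, u, v):
    """Sketch of relative-position attention scores for one head.

    q    : (time1, d) query projections
    k    : (time2, d) key projections
    pos  : (time1, d) projected positional embeddings
    u, v : (d,) learned content/position biases (Transformer-XL)
    """
    d = q.shape[-1]
    matrix_ac = (q + u) @ k.T    # content-based term, (time1, time2)
    matrix_bd = (q + v) @ pos.T  # position-based term; the real module
                                 # passes this through rel_shift before summing
    return (matrix_ac + matrix_bd) / np.sqrt(d)
```

With the biases and positional embeddings zeroed, this reduces to plain scaled dot-product scores, which is a quick sanity check on the decomposition.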
rel_shift(x)
Apply the relative shift to the position-term score matrix (legacy variant).
- Parameters: x (torch.Tensor) – Input tensor (batch, head, time1, time2).
- Returns: Output tensor (batch, head, time1, time2).
- Return type: torch.Tensor
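The legacy relative shift is the pad-and-reshape trick from Transformer-XL: prepend a zero column, reinterpret the memory layout, and drop the pad so that row i of the score matrix is shifted left by (time1 - 1 - i). A minimal NumPy sketch (the actual module does the same with torch.cat and view):

```python
import numpy as np

def legacy_rel_shift(x):
    """Relative shift via zero-padding and reshape (NumPy sketch).

    x : (batch, head, time1, time2) position-term score matrix.
    Returns an array of the same shape with each row i shifted
    left by (time1 - 1 - i); vacated positions become zero.
    """
    b, h, t1, t2 = x.shape
    zero_pad = np.zeros((b, h, t1, 1), dtype=x.dtype)
    x_padded = np.concatenate([zero_pad, x], axis=-1)  # (b, h, t1, t2 + 1)
    x_padded = x_padded.reshape(b, h, t2 + 1, t1)      # realign rows in memory
    return x_padded[:, :, 1:].reshape(b, h, t1, t2)    # drop pad, restore shape
```

For a 2x2 input [[1, 2], [3, 4]], the shift yields [[2, 0], [3, 4]]: the first row moves left by one and the last row is unchanged.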