espnet.nets.pytorch_backend.rnn.attentions.AttMultiHeadAdd


class espnet.nets.pytorch_backend.rnn.attentions.AttMultiHeadAdd(eprojs, dunits, aheads, att_dim_k, att_dim_v, han_mode=False)

Bases: Module

Multi head additive attention.

Reference: Attention Is All You Need (https://arxiv.org/abs/1706.03762)

This is multi-head attention in which each head computes its weights with additive attention.
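
As a rough illustration of what "additive attention for each head" means, the sketch below scores every encoder frame with a small tanh MLP per head, then concatenates the per-head context vectors and projects them back to the encoder dimension. It is a standalone approximation written for this page, not the ESPnet source: the function name `multi_head_additive_attention`, the layer names, the toy sizes, and the omission of any score scaling or padding mask are all assumptions.

```python
import torch
import torch.nn.functional as F


def multi_head_additive_attention(enc_hs, dec_z, mlp_q, mlp_k, mlp_v, gvec, mlp_o):
    """Hypothetical sketch: additive attention per head, then a joint projection.

    enc_hs: (B, T, eprojs), dec_z: (B, dunits).
    """
    contexts, weights = [], []
    for q, k, v, g in zip(mlp_q, mlp_k, mlp_v, gvec):
        # Additive score g(tanh(k(h_t) + q(z))) for every frame t -> (B, T)
        e = g(torch.tanh(k(enc_hs) + q(dec_z).unsqueeze(1))).squeeze(2)
        w = F.softmax(e, dim=1)
        # Weighted sum of projected values over time -> (B, att_dim_v)
        contexts.append(torch.sum(v(enc_hs) * w.unsqueeze(2), dim=1))
        weights.append(w)
    # Concatenate the heads and project back to the encoder dimension
    return mlp_o(torch.cat(contexts, dim=1)), weights


# Toy sizes, chosen only for illustration
B, T, eprojs, dunits, aheads, att_dim_k, att_dim_v = 2, 5, 8, 6, 3, 4, 4
mlp_q = [torch.nn.Linear(dunits, att_dim_k) for _ in range(aheads)]
mlp_k = [torch.nn.Linear(eprojs, att_dim_k, bias=False) for _ in range(aheads)]
mlp_v = [torch.nn.Linear(eprojs, att_dim_v, bias=False) for _ in range(aheads)]
gvec = [torch.nn.Linear(att_dim_k, 1) for _ in range(aheads)]
mlp_o = torch.nn.Linear(aheads * att_dim_v, eprojs, bias=False)

c, w = multi_head_additive_attention(
    torch.randn(B, T, eprojs), torch.randn(B, dunits), mlp_q, mlp_k, mlp_v, gvec, mlp_o
)
```

Here `c` has shape (B, eprojs) and `w` is a list with one (B, T) weight tensor per head, which matches the return values documented for `forward` on this page.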

Parameters:
- eprojs (int) – # projection-units of encoder
- dunits (int) – # units of decoder
- aheads (int) – # heads of multi head attention
- att_dim_k (int) – dimension k in multi head attention
- att_dim_v (int) – dimension v in multi head attention
- han_mode (bool) – flag to switch on hierarchical attention mode, in which pre_compute_k and pre_compute_v are not stored

Initialize AttMultiHeadAdd.

forward(enc_hs_pad, enc_hs_len, dec_z, att_prev, **kwargs)

Calculate AttMultiHeadAdd forward propagation.

Parameters:
- enc_hs_pad (torch.Tensor) – padded encoder hidden state (B x T_max x D_enc)
- enc_hs_len (list) – lengths of the encoder hidden state sequences (B)
- dec_z (torch.Tensor) – decoder hidden state (B x D_dec)
- att_prev (torch.Tensor) – dummy (not used)
Returns: attention weighted encoder state (B x D_enc), and list of previous attention weights (B x T_max) * aheads
Return type: tuple of (torch.Tensor, list)
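
To make the role of `enc_hs_len` concrete, the sketch below builds a padding mask from a list of lengths and applies it before the softmax, so that padded frames receive zero attention weight. Whether AttMultiHeadAdd masks in exactly this way is an assumption here; `masked_attention_weights` is a hypothetical helper written for this page, not an ESPnet function.

```python
import torch
import torch.nn.functional as F


def masked_attention_weights(e, enc_hs_len):
    """Hypothetical helper: softmax over time that ignores padded frames.

    e: raw attention scores (B, T_max); enc_hs_len: list of B valid lengths.
    """
    t_max = e.size(1)
    lengths = torch.tensor(enc_hs_len, device=e.device)
    # True where the frame index is at or beyond the utterance length
    pad_mask = torch.arange(t_max, device=e.device).unsqueeze(0) >= lengths.unsqueeze(1)
    return F.softmax(e.masked_fill(pad_mask, -float("inf")), dim=1)


scores = torch.zeros(2, 4)  # uniform scores, B=2, T_max=4
w = masked_attention_weights(scores, [4, 2])
```

With uniform scores, the first utterance (4 valid frames) gets a weight of 0.25 on every frame, while the second (2 valid frames) gets 0.5 on its first two frames and 0 on the padding.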

reset()

Reset states.
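
The `han_mode` description implies that projected keys and values (`pre_compute_k`, `pre_compute_v`) are normally stored between decoder steps, which is the kind of state a `reset()` call would clear. The sketch below shows that caching pattern in isolation. It is an assumption about the general idea, not the ESPnet implementation; `CachedKeyProjection` and its layer name are hypothetical.

```python
import torch


class CachedKeyProjection(torch.nn.Module):
    """Hypothetical module that caches a projection of the encoder output."""

    def __init__(self, eprojs, att_dim_k):
        super().__init__()
        self.mlp_k = torch.nn.Linear(eprojs, att_dim_k, bias=False)
        self.pre_compute_k = None

    def reset(self):
        # Drop the cache so the next call recomputes it for a new batch
        self.pre_compute_k = None

    def forward(self, enc_hs_pad):
        # Project once per batch, then reuse on every later decoder step
        if self.pre_compute_k is None:
            self.pre_compute_k = self.mlp_k(enc_hs_pad)
        return self.pre_compute_k


proj = CachedKeyProjection(eprojs=8, att_dim_k=4)
first = proj(torch.randn(2, 5, 8))
stale = proj(torch.randn(3, 7, 8))  # cache hit: still the first batch's keys
proj.reset()
fresh = proj(torch.randn(3, 7, 8))  # recomputed after reset()
```

Under this pattern, forgetting to call `reset()` before a new batch would silently reuse the previous batch's projections, which is why the state is cleared between batches.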