espnet.nets.chainer_backend.transformer.attention.MultiHeadAttention
class espnet.nets.chainer_backend.transformer.attention.MultiHeadAttention(n_units, h=8, dropout=0.1, initialW=None, initial_bias=None)
Bases: Chain
Multi Head Attention Layer.
- Parameters:
- n_units (int) – Number of input units (feature dimension).
- h (int) – Number of attention heads.
- dropout (float) – Dropout rate.
- initialW – Initializer for the weights.
- initial_bias – Initializer for the biases.
Initialize MultiHeadAttention.
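A minimal construction sketch, assuming Chainer is installed; the argument values below are illustrative, not defaults taken from any recipe:

```python
from espnet.nets.chainer_backend.transformer.attention import MultiHeadAttention

# 256 input/output units split across 4 attention heads.
mha = MultiHeadAttention(256, h=4, dropout=0.1)
```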
forward(e_var, s_var=None, mask=None, batch=1)
Core function of the Multi-head attention layer.
- Parameters:
- e_var (chainer.Variable) – Input array.
- s_var (chainer.Variable) – Source array from the encoder. If None, self-attention is computed over e_var.
- mask (chainer.Variable) – Attention mask.
- batch (int) – Batch size.
- Returns: Output of the multi-head attention layer.
- Return type: chainer.Variable
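Continuing the sketch above, a hedged example of calling the layer. The flattened (batch * time, n_units) input layout is an assumption based on the batch argument in the signature (the Chainer backend reshapes internally using it); consult the ESPnet source for the exact contract:

```python
import numpy as np
import chainer

batch, time, n_units = 2, 10, 256

# Assumed layout: batch and time flattened into one leading axis;
# the `batch` argument tells the layer how to split it back out.
e = chainer.Variable(
    np.random.randn(batch * time, n_units).astype(np.float32)
)

# Self-attention: s_var is left as None, so keys and values
# come from e_var itself. `mha` is the layer built above.
y = mha(e, batch=batch)
print(y.shape)  # expected: (batch * time, n_units)
```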