espnet2.asr.encoder.branchformer_encoder.BranchformerEncoderLayer

class espnet2.asr.encoder.branchformer_encoder.BranchformerEncoderLayer(size: int, attn: Module | None, cgmlp: Module | None, dropout_rate: float, merge_method: str, cgmlp_weight: float = 0.5, attn_branch_drop_rate: float = 0.0, stochastic_depth_rate: float = 0.0)

Bases: Module

Branchformer encoder layer module.

Parameters:
- size (int) – model dimension
- attn – standard self-attention or efficient attention, optional
- cgmlp – ConvolutionalGatingMLP, optional
- dropout_rate (float) – dropout probability
- merge_method (str) – how the outputs of the two branches are merged; one of concat, learned_ave, fixed_ave
- cgmlp_weight (float) – weight of the cgmlp branch, between 0 and 1, used if merge_method is fixed_ave
- attn_branch_drop_rate (float) – probability of dropping the attn branch, used if merge_method is learned_ave
- stochastic_depth_rate (float) – stochastic depth probability
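To make the role of cgmlp_weight concrete, here is a minimal sketch of a fixed weighted average of two branch outputs. It is not the ESPnet implementation: it uses plain Python lists instead of tensors, the function name fixed_ave_merge is hypothetical, and it assumes the attention branch receives the complementary weight 1 - cgmlp_weight.

```python
from typing import List


def fixed_ave_merge(
    attn_out: List[float],
    cgmlp_out: List[float],
    cgmlp_weight: float = 0.5,
) -> List[float]:
    """Weighted average of two branch outputs (illustrative sketch only).

    Assumption: the attention branch gets weight (1 - cgmlp_weight).
    """
    if not 0.0 <= cgmlp_weight <= 1.0:
        raise ValueError("cgmlp_weight must be between 0 and 1")
    if len(attn_out) != len(cgmlp_out):
        raise ValueError("both branches must produce the same number of features")

    attn_weight = 1.0 - cgmlp_weight
    # Element-wise weighted sum of the two branch outputs.
    return [attn_weight * a + cgmlp_weight * c for a, c in zip(attn_out, cgmlp_out)]


# Example: the cgmlp branch contributes 25%, the attention branch 75%.
merged = fixed_ave_merge([1.0, 2.0], [3.0, 6.0], cgmlp_weight=0.25)
print(merged)
```

Under this assumption, cgmlp_weight = 0 keeps only the attention output and cgmlp_weight = 1 keeps only the cgmlp output; the default of 0.5 is a plain average.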

Initialize internal Module state, shared by both nn.Module and ScriptModule.

forward(x_input, mask, cache=None)

Compute encoded features.

Parameters:
- x_input (Union[Tuple, torch.Tensor]) – Input tensor with or without positional embedding.
  - With positional embedding: Tuple of tensors [(#batch, time, size), (1, time, size)].
  - Without positional embedding: Tensor (#batch, time, size).
- mask (torch.Tensor) – Mask tensor for the input (#batch, 1, time).
- cache (torch.Tensor) – Cache tensor of the input (#batch, time - 1, size).
Returns:
- Output tensor (#batch, time, size).
- torch.Tensor: Mask tensor (#batch, time).
Return type: torch.Tensor
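The two accepted forms of x_input can be told apart with a simple type check. The sketch below shows one way a caller or wrapper might do this, along with the shapes documented above. The helper names split_x_input and expected_shapes are hypothetical and not part of ESPnet, and plain Python objects stand in for tensors so the example runs without torch.

```python
from typing import Any, Dict, Optional, Tuple, Union


def split_x_input(x_input: Union[Tuple[Any, Any], Any]) -> Tuple[Any, Optional[Any]]:
    """Return (x, pos_emb); pos_emb is None if no positional embedding was passed."""
    if isinstance(x_input, tuple):
        # Tuple form: (features, positional embedding).
        x, pos_emb = x_input
        return x, pos_emb
    # Tensor form: features only.
    return x_input, None


def expected_shapes(
    batch: int, time: int, size: int, with_pos_emb: bool
) -> Dict[str, Tuple[int, int, int]]:
    """Input shapes as listed in the parameter documentation of forward()."""
    shapes = {"x": (batch, time, size), "mask": (batch, 1, time)}
    if with_pos_emb:
        shapes["pos_emb"] = (1, time, size)
    return shapes


# Strings stand in for tensors here; only the unpacking logic is illustrated.
print(split_x_input(("features", "pos_emb")))
print(split_x_input("features"))
print(expected_shapes(batch=2, time=5, size=4, with_pos_emb=True))
```

Note that the positional embedding has a leading dimension of 1, so a single embedding is shared across the batch.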