espnet.nets.pytorch_backend.transducer.conv1d_nets.Conv1d
class espnet.nets.pytorch_backend.transducer.conv1d_nets.Conv1d(idim: int, odim: int, kernel_size: int | Tuple, stride: int | Tuple = 1, dilation: int | Tuple = 1, groups: int | Tuple = 1, bias: bool = True, batch_norm: bool = False, relu: bool = True, dropout_rate: float = 0.0)
Bases: Module
1D convolution module for custom encoder.
- Parameters:
- idim – Input dimension.
- odim – Output dimension.
- kernel_size – Size of the convolving kernel.
- stride – Stride of the convolution.
- dilation – Spacing between the kernel points.
- groups – Number of blocked connections from input channels to output channels.
- bias – Whether to add a learnable bias to the output.
- batch_norm – Whether to use batch normalization after convolution.
- relu – Whether to use a ReLU activation after convolution.
- dropout_rate – Dropout rate.
Construct a Conv1d module object.
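A minimal construction sketch; the dimensions and the dropout rate below are illustrative choices, not library defaults (defaults are noted in the comments):

```python
from espnet.nets.pytorch_backend.transducer.conv1d_nets import Conv1d

# Illustrative dimensions for a custom encoder block.
conv = Conv1d(
    idim=256,          # input feature dimension
    odim=256,          # output feature dimension
    kernel_size=3,     # size of the convolving kernel
    stride=1,          # default stride
    batch_norm=False,  # default: no batch normalization
    relu=True,         # default: ReLU activation after convolution
    dropout_rate=0.1,  # illustrative; default is 0.0
)
```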
create_new_mask(mask: Tensor) → Tensor
Create new mask.
- Parameters: mask – Mask of input sequences. (B, 1, T)
- Returns: Mask of output sequences. (B, 1, sub(T))
- Return type: Tensor
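A short sketch of the mask subsampling, assuming the `conv` instance from the construction example above; the batch size and sequence length are illustrative:

```python
import torch

mask = torch.ones(4, 1, 100, dtype=torch.bool)  # (B, 1, T) with B=4, T=100
new_mask = conv.create_new_mask(mask)           # (B, 1, sub(T))
```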
create_new_pos_embed(pos_embed: Tensor) → Tensor
Create new positional embedding vector.
- Parameters: pos_embed – Positional embedding of input sequences. (B, 2 * (T - 1), D_att)
- Returns: Positional embedding of output sequences. (B, 2 * (sub(T) - 1), D_att)
- Return type: Tensor
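An analogous sketch for the relative positional embedding, again assuming the `conv` instance above; D_att = 256 is an assumption:

```python
import torch

pos_embed = torch.randn(4, 2 * (100 - 1), 256)        # (B, 2 * (T - 1), D_att)
new_pos_embed = conv.create_new_pos_embed(pos_embed)  # (B, 2 * (sub(T) - 1), D_att)
```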
forward(sequence: Tensor | Tuple[Tensor, Tensor], mask: Tensor) → Tuple[Tensor | Tuple[Tensor, Tensor], Tensor]
Forward Conv1d module object.
- Parameters:
  - sequence – Input sequences. (B, T, D_in) or ((B, T, D_in), (B, 2 * (T - 1), D_att))
  - mask – Mask of input sequences. (B, 1, T)
- Returns:
  - sequence – Output sequences. (B, sub(T), D_out) or ((B, sub(T), D_out), (B, 2 * (sub(T) - 1), D_att))
  - mask – Mask of output sequences. (B, 1, sub(T))
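An end-to-end usage sketch covering both input forms, assuming the `conv` instance from the construction example above; all shapes are illustrative:

```python
import torch

x = torch.randn(4, 100, 256)                    # (B, T, D_in)
mask = torch.ones(4, 1, 100, dtype=torch.bool)  # (B, 1, T)

# Plain-tensor input:
out, out_mask = conv(x, mask)                   # (B, sub(T), D_out), (B, 1, sub(T))

# Tuple input carrying a relative positional embedding:
pos_embed = torch.randn(4, 2 * (100 - 1), 256)  # (B, 2 * (T - 1), D_att)
(out, new_pos_embed), out_mask = conv((x, pos_embed), mask)
```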