Transformer
- class lib.model.networks.clip.Transformer(width: int, num_layers: int, heads: int, attn_mask: KerasTensor = None, name: str = 'transformer')
Bases: object
A class representing a Transformer model with attention mechanism and residual connections.
- Parameters:
width (int) – The dimension of the input and output vectors.
num_layers (int) – The number of layers in the Transformer.
heads (int) – The number of attention heads.
attn_mask (keras.KerasTensor, optional) – The attention mask, by default None.
name (str, optional) – The name of the Transformer model, by default “transformer”.
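To illustrate what an `attn_mask` value might look like, the sketch below builds a causal (lower-triangular) boolean mask in plain Python. This is an assumption rather than something this page specifies: it follows the convention used by Keras's `MultiHeadAttention`, where a true entry means the query position may attend to the key position. The helper name `causal_mask` is hypothetical and is not part of `lib.model.networks.clip`.

```python
def causal_mask(seq_len: int) -> list[list[bool]]:
    """Return a seq_len x seq_len mask where query i may attend to key j <= i.

    Hypothetical helper for illustration only; not part of the documented class.
    """
    return [[key <= query for key in range(seq_len)] for query in range(seq_len)]


mask = causal_mask(4)
for row in mask:
    # Print each row as 1/0 so the lower-triangular shape is visible.
    print(" ".join("1" if allowed else "0" for allowed in row))
```

Such a mask would still need converting to a tensor before being passed as `attn_mask`. Whether the class expects a boolean or an additive mask is not stated here, so check the source before relying on either form.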
- __call__() :class:`keras.models.Model`:
Calls the Transformer layers.
- Parameters:
inputs (KerasTensor)
- Return type:
KerasTensor
Methods Summary
__call__(inputs) – Call the Transformer layers
residual_attention_block(inputs, key_dim, ...) – Call the residual attention block
Methods Documentation
- __call__(inputs: KerasTensor) KerasTensor
Call the Transformer layers
- Parameters:
inputs (keras.KerasTensor) – The input Tensor
- Returns:
The return Tensor
- Return type:
keras.KerasTensor
- residual_attention_block(inputs: KerasTensor, key_dim: int, num_heads: int, attn_mask: KerasTensor, name: str = 'ResidualAttentionBlock') KerasTensor
Call the residual attention block
- Parameters:
inputs (keras.KerasTensor) – The input Tensor
key_dim (int) – key dimension per head for MultiHeadAttention
num_heads (int) – Number of heads for MultiHeadAttention
attn_mask (keras.KerasTensor, optional) – Default: None
name (str, optional) – The name for the layer. Default: “ResidualAttentionBlock”
- Returns:
The return Tensor
- Return type:
keras.KerasTensor
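The general shape of a residual attention block can be sketched numerically with NumPy. This is a conceptual illustration only, not this class's Keras implementation. It assumes a pre-normalization layout (layer norm, then self-attention, then a residual add, followed by layer norm, a two-layer MLP, and a second residual add), a per-head size of `width // num_heads`, and a ReLU in place of whichever activation the real block uses. All function and parameter names below are hypothetical.

```python
import numpy as np


def layer_norm(x, eps=1e-5):
    """Normalize the last axis to zero mean and unit variance (no learned scale or shift)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads, mask=None):
    """Self-attention over x of shape (seq_len, width)."""
    seq_len, width = x.shape
    head_dim = width // num_heads  # assumed per-head size

    def split(t):
        # (seq_len, width) -> (num_heads, seq_len, head_dim)
        return t.reshape(seq_len, num_heads, head_dim).transpose(1, 0, 2)

    q, k, v = split(x @ w_q), split(x @ w_k), split(x @ w_v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(head_dim)  # (heads, seq, seq)
    if mask is not None:
        # Boolean mask: True means "may attend"; blocked pairs get a very low score.
        scores = np.where(mask, scores, -1e9)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    merged = (weights @ v).transpose(1, 0, 2).reshape(seq_len, width)
    return merged @ w_o


def residual_block_sketch(x, params, num_heads, mask=None):
    """Apply attention and MLP sub-layers, each wrapped in a residual connection."""
    # First residual branch: normalized input through self-attention.
    x = x + multi_head_attention(
        layer_norm(x), params["w_q"], params["w_k"], params["w_v"], params["w_o"],
        num_heads, mask,
    )
    # Second residual branch: normalized result through a two-layer MLP (ReLU stand-in).
    hidden = np.maximum(layer_norm(x) @ params["w_fc"], 0.0)
    return x + hidden @ params["w_proj"]


rng = np.random.default_rng(0)
seq_len, width, heads = 4, 8, 2
shapes = {
    "w_q": (width, width), "w_k": (width, width), "w_v": (width, width),
    "w_o": (width, width), "w_fc": (width, 4 * width), "w_proj": (4 * width, width),
}
params = {name: rng.normal(scale=0.1, size=shape) for name, shape in shapes.items()}
x = rng.normal(size=(seq_len, width))

out = residual_block_sketch(x, params, heads)
print(out.shape)  # (4, 8): the block preserves the input shape
```

Because each sub-layer only adds to its input, the output keeps the input's shape, which is what would allow `num_layers` such blocks to be stacked one after another.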