Transformer

class lib.model.networks.clip.Transformer(width: int, num_layers: int, heads: int, attn_mask: KerasTensor = None, name: str = 'transformer')

Bases: object

A class representing a Transformer model with an attention mechanism and residual connections.

Parameters:
  • width (int) – The dimension of the input and output vectors.

  • num_layers (int) – The number of layers in the Transformer.

  • heads (int) – The number of attention heads.

  • attn_mask (keras.KerasTensor, optional) – The attention mask, by default None.

  • name (str, optional) – The name of the Transformer model, by default “transformer”.

__call__(inputs) → KerasTensor

Calls the Transformer layers.

Parameters:

inputs (KerasTensor)

Return type:

KerasTensor

Methods Summary

__call__(inputs)

Call the Transformer layers

residual_attention_block(inputs, key_dim, ...)

Call the residual attention block

Methods Documentation

__call__(inputs: KerasTensor) → KerasTensor

Call the Transformer layers

Parameters:

inputs (keras.KerasTensor) – The input Tensor

Returns:

The output Tensor produced by the Transformer layers

Return type:

keras.KerasTensor
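
The layer-by-layer call described above can be sketched without Keras. This is a minimal illustration under stated assumptions, not the library's implementation: each layer is assumed to be a residual block of the form x + f(x), plain Python lists stand in for tensors, and `make_residual_block` and `call_transformer` are hypothetical names introduced only for this example.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Block = Callable[[Vector], Vector]


def make_residual_block(scale: float) -> Block:
    """Return a toy block computing x + scale * x (a stand-in for x + f(x))."""
    def block(x: Vector) -> Vector:
        update = [scale * v for v in x]            # stand-in for attention + MLP
        return [a + b for a, b in zip(x, update)]  # residual connection
    return block


def call_transformer(inputs: Vector, blocks: Sequence[Block]) -> Vector:
    """Apply each block in order, feeding each output into the next block."""
    var_x = inputs
    for block in blocks:
        var_x = block(var_x)
    return var_x


num_layers = 3  # plays the role of the num_layers parameter
blocks = [make_residual_block(1.0) for _ in range(num_layers)]
outputs = call_transformer([1.0, 2.0], blocks)
```

Because every block adds its update to its own input, the output keeps the same width as the input, which is consistent with `width` being described as the dimension of both the input and output vectors.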

residual_attention_block(inputs: KerasTensor, key_dim: int, num_heads: int, attn_mask: KerasTensor, name: str = 'ResidualAttentionBlock') → KerasTensor

Call the residual attention block

Parameters:
  • inputs (keras.KerasTensor) – The input Tensor

  • key_dim (int) – key dimension per head for MultiHeadAttention

  • num_heads (int) – Number of heads for MultiHeadAttention

  • attn_mask (keras.KerasTensor) – The attention mask, or None for no mask. Must be passed explicitly, as the signature declares no default

  • name (str, optional) – The name for the layer. Default: “ResidualAttentionBlock”

Returns:

The output Tensor of the residual attention block

Return type:

keras.KerasTensor
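
To make the idea of a residual attention block concrete, here is a heavily simplified sketch in plain Python. It is not the code behind this method. It assumes a pre-norm layout (normalize, attend, then add the input back), uses a single head with identity projections instead of `key_dim` and `num_heads`, omits any feed-forward stage, and assumes a boolean mask in which True means a position may be attended to. All function names are hypothetical.

```python
import math
from typing import List, Optional

Matrix = List[List[float]]  # one row per token, one column per feature


def layer_norm(token: List[float], eps: float = 1e-5) -> List[float]:
    """Normalize one token to zero mean and unit variance (no learned scale/shift)."""
    mean = sum(token) / len(token)
    var = sum((v - mean) ** 2 for v in token) / len(token)
    return [(v - mean) / math.sqrt(var + eps) for v in token]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over one row of attention scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(seq: Matrix, mask: Optional[List[List[bool]]] = None) -> Matrix:
    """Single-head scaled dot-product self-attention with identity projections."""
    dim = len(seq[0])
    out = []
    for i, query in enumerate(seq):
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim) for key in seq]
        if mask is not None:
            # Assumed convention: True means position i may attend to position j.
            scores = [s if mask[i][j] else -1e9 for j, s in enumerate(scores)]
        weights = softmax(scores)
        out.append([sum(w * tok[c] for w, tok in zip(weights, seq)) for c in range(dim)])
    return out


def toy_residual_attention_block(seq: Matrix, mask=None) -> Matrix:
    """Pre-norm attention followed by a residual add: x + attention(layer_norm(x))."""
    normed = [layer_norm(tok) for tok in seq]
    attended = self_attention(normed, mask)
    return [[x + a for x, a in zip(tok, att)] for tok, att in zip(seq, attended)]


tokens = [[1.0, 3.0], [2.0, 0.0], [0.5, 0.5]]
causal_mask = [[j <= i for j in range(len(tokens))] for i in range(len(tokens))]
result = toy_residual_attention_block(tokens, causal_mask)
```

With the causal mask, the first token can attend only to itself, so its output equals what the block produces when run on that token alone. The output has the same shape as the input, which is what allows blocks like this to be stacked.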
