AttentionPool2d

class lib.model.networks.clip.AttentionPool2d(spatial_dim: int, embed_dim: int, num_heads: int, output_dim: int | None = None, name='AttentionPool2d')

Bases: object

An Attention Pooling layer that applies a multi-head self-attention mechanism over a spatial grid of features.

Parameters:
  • spatial_dim (int) – The dimensionality of the spatial grid of features.

  • embed_dim (int) – The dimensionality of the feature embeddings.

  • num_heads (int) – The number of attention heads.

  • output_dim (int, optional) – The output dimensionality of the attention layer. If None, it defaults to embed_dim. Default: None

  • name (str, optional) – The name of the layer. Default: 'AttentionPool2d'

Methods Summary

__call__(inputs)

Performs the attention pooling operation on the input tensor.

Methods Documentation

__call__(inputs: KerasTensor) → KerasTensor

Performs the attention pooling operation on the input tensor.

Parameters:

inputs (keras.KerasTensor) – The input tensor of shape [batch_size, height, width, embed_dim].

Returns:

The result of the attention pooling operation.

Return type:

keras.KerasTensor
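
For intuition, the pooling operation can be sketched in plain NumPy. This is a minimal illustration, not this class's implementation. It assumes the layer follows the attention pooling design of the CLIP reference model: the grid is flattened into tokens, the mean token is prepended and used as the only query, a learned positional embedding with spatial_dim ** 2 + 1 rows is added, and the attended result is projected to output_dim. The function name attention_pool_2d, the params dictionary and its keys are hypothetical, and bias terms are omitted for brevity.

```python
import numpy as np


def attention_pool_2d(inputs, params, num_heads):
    """Pool a [batch, height, width, embed_dim] grid to [batch, output_dim].

    Illustrative only: `params` is a hypothetical dict of weight arrays,
    not the library's actual variable layout. Biases are omitted.
    """
    batch, height, width, embed_dim = inputs.shape
    head_dim = embed_dim // num_heads

    # Flatten the grid into a sequence of height * width tokens.
    tokens = inputs.reshape(batch, height * width, embed_dim)
    # Prepend the mean token; it acts as the single pooling query.
    mean_token = tokens.mean(axis=1, keepdims=True)
    tokens = np.concatenate([mean_token, tokens], axis=1)
    # Add the positional embedding, one row per token (assumed layout).
    tokens = tokens + params["pos"][None, :, :]

    def split_heads(x):
        # [batch, seq, embed_dim] -> [batch, heads, seq, head_dim]
        return x.reshape(batch, -1, num_heads, head_dim).transpose(0, 2, 1, 3)

    query = split_heads(tokens[:, :1] @ params["w_q"])
    key = split_heads(tokens @ params["w_k"])
    value = split_heads(tokens @ params["w_v"])

    # Scaled dot-product attention of the mean token over all tokens.
    scores = query @ key.transpose(0, 1, 3, 2) / np.sqrt(head_dim)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)

    pooled = weights @ value  # [batch, heads, 1, head_dim]
    pooled = pooled.transpose(0, 2, 1, 3).reshape(batch, embed_dim)
    # Final projection to output_dim.
    return pooled @ params["w_out"]


# Example values chosen for illustration only.
rng = np.random.default_rng(0)
spatial_dim, embed_dim, num_heads, output_dim = 7, 64, 8, 32
params = {
    "pos": rng.normal(size=(spatial_dim ** 2 + 1, embed_dim)),
    "w_q": rng.normal(size=(embed_dim, embed_dim)) * 0.1,
    "w_k": rng.normal(size=(embed_dim, embed_dim)) * 0.1,
    "w_v": rng.normal(size=(embed_dim, embed_dim)) * 0.1,
    "w_out": rng.normal(size=(embed_dim, output_dim)) * 0.1,
}
features = rng.normal(size=(2, spatial_dim, spatial_dim, embed_dim))
pooled = attention_pool_2d(features, params, num_heads)
print(pooled.shape)  # (2, 32)
```

Under these assumptions the spatial axes are collapsed entirely, so each input grid yields a single vector of length output_dim (or embed_dim when output_dim is None).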
