VisualTransformer

class lib.model.networks.clip.VisualTransformer(input_resolution: int, patch_size: int, width: int, num_layers: int, heads: int, output_dim: int, name: str = 'VisualTransformer')

Bases: object

A class representing a Visual Transformer model for image classification tasks.

Parameters:
  • input_resolution (int) – The input resolution of the images.

  • patch_size (int) – The size of the patches to be extracted from the images.

  • width (int) – The dimension of the input and output vectors.

  • num_layers (int) – The number of layers in the Transformer.

  • heads (int) – The number of attention heads.

  • output_dim (int) – The dimension of the output vector.

  • name (str, optional) – The name of the Visual Transformer model. Default: "VisualTransformer".
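The parameters above constrain one another. As a minimal sketch, assuming this class follows the standard CLIP vision transformer layout (square inputs, non-overlapping square patches, one extra class token, and `width` split evenly across `heads`), the derived sizes can be checked before building. The helper names below are hypothetical and are not part of lib.model.networks.clip; this page does not confirm the internal layout.

```python
def patch_grid(input_resolution: int, patch_size: int) -> int:
    """Number of patches along one side of a square image (assumed layout)."""
    if patch_size <= 0 or input_resolution < patch_size:
        raise ValueError("patch_size must be positive and no larger than input_resolution")
    # Floor division: pixels that do not fill a whole patch are ignored.
    return input_resolution // patch_size


def sequence_length(input_resolution: int, patch_size: int) -> int:
    """Token count seen by the Transformer: all patches plus one class token (assumption)."""
    return patch_grid(input_resolution, patch_size) ** 2 + 1


def head_dim(width: int, heads: int) -> int:
    """Per-head dimension, assuming width is divided evenly between attention heads."""
    if heads <= 0 or width % heads != 0:
        raise ValueError("width must be divisible by heads")
    return width // heads


# Illustrative values only (a ViT-B/32-like configuration), not defaults of this class.
print(sequence_length(224, 32))  # 7 * 7 patches + 1 class token = 50
print(head_dim(768, 12))         # 64
```

Checking these values up front turns a mismatched `width`/`heads` pair into an immediate error rather than a shape failure during model construction.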

__call__() → keras.models.Model

Builds and returns the Visual Transformer model.

Return type:

Model

Methods Summary

__call__()

Builds and returns the Visual Transformer model.

Methods Documentation

__call__() Model

Builds and returns the Visual Transformer model.

Returns:

The Visual Transformer model.

Return type:

keras.models.Model
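Because the class is a builder, the documented usage pattern is two steps: construct it with the hyperparameters, then call the instance to obtain the Keras model. The sketch below takes its keyword names from the class signature; the numeric values are illustrative assumptions (a ViT-B/32-like configuration), and `build_visual_model` is a hypothetical wrapper, not a function provided by the library. The import is deferred so that the snippet can be read and loaded without the package installed.

```python
# Illustrative hyperparameters only; choose values that match your checkpoint.
EXAMPLE_CONFIG = {
    "input_resolution": 224,
    "patch_size": 32,
    "width": 768,
    "num_layers": 12,
    "heads": 12,
    "output_dim": 512,
}


def build_visual_model(config: dict, name: str = "VisualTransformer"):
    """Hypothetical wrapper: instantiate the builder, then call it to get the model."""
    # Deferred import: requires the package that provides lib.model.networks.clip.
    from lib.model.networks.clip import VisualTransformer

    builder = VisualTransformer(name=name, **config)  # stores the hyperparameters
    return builder()  # __call__() builds and returns a keras.models.Model
```

Keeping the configuration in a plain dictionary makes it easy to swap in a different set of hyperparameters without touching the build step.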
