VisualTransformer

class lib.model.networks.clip.VisualTransformer(input_resolution: int, patch_size: int, width: int, num_layers: int, heads: int, output_dim: int, name: str = 'VisualTransformer')

Bases: object

A class representing a Visual Transformer model for image classification tasks.

Parameters:

input_resolution (int) – The input resolution of the images.
patch_size (int) – The size of the patches to be extracted from the images.
width (int) – The dimension of the input and output vectors.
num_layers (int) – The number of layers in the Transformer.
heads (int) – The number of attention heads.
output_dim (int) – The dimension of the output vector.
name (str, optional) – The name of the Visual Transformer model. Default: “VisualTransformer”.
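
As a rough illustration of how input_resolution and patch_size interact, the sketch below computes the patch grid and the resulting token sequence length. It assumes square, non-overlapping patches and one extra class token, as in the reference CLIP vision transformer; this page does not confirm that the class follows that layout. The helper name patch_grid is hypothetical and not part of the module.

```python
def patch_grid(input_resolution: int, patch_size: int) -> tuple:
    """Return (patches_per_side, sequence_length) for a square input image."""
    # Rejecting non-divisible sizes is a choice made by this sketch only.
    if input_resolution % patch_size != 0:
        raise ValueError("input_resolution must be a multiple of patch_size")
    per_side = input_resolution // patch_size
    # One token per patch, plus one class token (assumed, as in CLIP's ViT).
    return per_side, per_side * per_side + 1


print(patch_grid(224, 32))  # (7, 50)
```

Under these assumptions, a 224 pixel input with 32 pixel patches gives a 7 x 7 grid and a sequence of 50 tokens, each of dimension width.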

__call__() → keras.models.Model

Builds and returns the Visual Transformer model.

Return type: keras.models.Model

Methods Summary

__call__()

Builds and returns the Visual Transformer model.

Methods Documentation

__call__() → Model

Builds and returns the Visual Transformer model.

Returns: The Visual Transformer model.
Return type: keras.models.Model
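
Because the model is only built when the instance is called, usage is a two-step pattern: construct the object with its hyperparameters, then call it. The following is a minimal sketch of that pattern, not a tested recipe. The VIT_B_32 values are the commonly published CLIP ViT-B/32 image encoder settings and are used purely for illustration, and build_visual_transformer is a hypothetical helper. The import is deferred so the snippet can be loaded without the package that provides lib.model.networks.clip.

```python
# Illustrative hyperparameters (published CLIP ViT-B/32 image encoder values).
VIT_B_32 = dict(
    input_resolution=224,
    patch_size=32,
    width=768,
    num_layers=12,
    heads=12,
    output_dim=512,
)


def build_visual_transformer(config: dict):
    """Instantiate the builder, then call it to obtain the Keras model."""
    # Deferred import: requires the package providing lib.model.networks.clip.
    from lib.model.networks.clip import VisualTransformer

    builder = VisualTransformer(**config)  # step 1: store hyperparameters
    return builder()                       # step 2: __call__ builds the model
```

Calling build_visual_transformer(VIT_B_32) would then be expected to return a keras.models.Model, per the return type documented above.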
