lib.model package

The Model Package handles interfacing with the neural network backend and holds custom objects.

losses package

lib.model.losses.feature_loss Module

Custom Feature Map Loss Functions for faceswap.py

class lib.model.losses.feature_loss.LPIPSLoss(trunk_network: Literal['alex', 'squeeze', 'vgg16'], trunk_pretrained: bool = True, trunk_eval_mode: bool = True, linear_pretrained: bool = True, linear_eval_mode: bool = True, linear_use_dropout: bool = True, lpips: bool = True, spatial_output: bool = True, normalize: bool = True, ret_per_layer: bool = False, crop: bool = False, color_order: Literal['bgr', 'rgb'] = 'bgr')

LPIPS Loss Function.

A perceptual loss function that uses linear outputs from pretrained CNNs feature layers.

Notes

Channels Last implementation. All trunks implemented from the original paper.

References

https://richzhang.github.io/PerceptualSimilarity/

Parameters:
  • trunk_network (T.Literal['alex', 'squeeze', 'vgg16']) – The name of the trunk network to use. One of “alex”, “squeeze” or “vgg16”

  • trunk_pretrained (bool) – True Load the imagenet pretrained weights for the trunk network. False randomly initialize the trunk network. Default: True

  • trunk_eval_mode (bool) – True for running inference on the trunk network (standard mode), False for training the trunk network. Default: True

  • linear_pretrained (bool) – True loads the pretrained weights for the linear network layers. False randomly initializes the layers. Default: True

  • linear_eval_mode (bool) – True for running inference on the linear network (standard mode), False for training the linear network. Default: True

  • linear_use_dropout (bool) – True if a dropout layer should be used in the Linear network otherwise False. Default: True

  • lpips (bool) – True to use linear network on top of the trunk network. False to just average the output from the trunk network. Default True

  • spatial_output (bool) – True output the loss in the spatial domain (i.e. as a grayscale tensor of height and width of the input image). Bool reduce the spatial dimensions for loss calculation. Default: True

  • normalize (bool) – True if the input Tensor needs to be normalized from the 0. to 1. range to the -1. to 1. range. Default: True

  • ret_per_layer (bool) – True to return the loss value per feature output layer otherwise False. Default: False

  • crop (bool) – Crop the zero-padded borders from the feature maps. Can help reduce moire pattern. Default: False

  • color_order (T.Literal['bgr', 'rgb']) – The RGB/BGR order of the input images

forward(y_true: Tensor, y_pred: Tensor) Tensor | tuple[Tensor, list[Tensor]]

Perform the LPIPS Loss Function.

Parameters:
  • y_true (Tensor) – The ground truth batch of images

  • y_pred (Tensor) – The predicted batch of images

Return type:

The final loss value for each item in the batch

class lib.model.losses.feature_loss.NetInfo(model_id: int = 0, model_name: str = '', net: Callable | None = None, outputs: list[str] | list[int] = <factory>, pad_amount: list[int] | int = 0)

Data class for holding information about Trunk and Linear Layer nets.

Parameters:
  • model_id (int) – The model ID for the model stored in the deepfakes Model repo

  • model_name (str) – The filename of the decompressed model/weights file

  • net (Callable | None) – The net definition to load, if any. Default:None

  • outputs (list[str] | list[int]) – For trunk networks the name of the output feature layers. For linear networks the number of input channels to each layer

  • pad_amount (list[int] | int) – For trunk networks, the amount of zero padding applied to each feature output

Classes

LPIPSLoss(trunk_network[, trunk_pretrained, ...])

LPIPS Loss Function.

NetInfo(model_id, model_name, net, outputs, ...)

Data class for holding information about Trunk and Linear Layer nets.

Class Inheritance Diagram

Inheritance diagram of lib.model.losses.feature_loss.LPIPSLoss, lib.model.losses.feature_loss.NetInfo

lib.model.losses.loss Module

Custom Loss Functions for faceswap.py

class lib.model.losses.loss.FocalFrequencyLoss(alpha: float = 1.0, patch_factor: int = 1, ave_spectrum: bool = False, log_matrix: bool = False, batch_matrix: bool = False, epsilon: float = 1e-06, spatial_output: bool = True)

Focal frequency Loss Function.

Parameters:
  • alpha (float) – Scaling factor of the spectrum weight matrix for flexibility. Default: 1.0

  • patch_factor (int) – Factor to crop image patches for patch-based focal frequency loss. Default: 1

  • ave_spectrum (bool) – True to use mini-batch average spectrum otherwise False. Default: False

  • log_matrix (bool) – True to adjust the spectrum weight matrix by logarithm otherwise False. Default: False

  • batch_matrix (bool) – True to calculate the spectrum weight matrix using batch-based statistics otherwise False. Default: False

  • epsilon (float) – Small epsilon for safer weights scaling division. Default: 1e-6

  • spatial_output (bool) – True to output the loss values spatially. False as scalar per item. Default: True

References

https://arxiv.org/pdf/2012.12821.pdf https://github.com/EndlessSora/focal-frequency-loss

forward(y_true: Tensor, y_pred: Tensor) Tensor

Call the Focal Frequency Loss Function.

Parameters:
  • y_true (Tensor) – The ground truth batch of images

  • y_pred (Tensor) – The predicted batch of images

Return type:

The final loss value for each item in the batch

class lib.model.losses.loss.GeneralizedLoss(alpha: float = 1.0, beta: float = 0.00392156862745098, spatial_output: bool = True)

Generalized function used to return a large variety of mathematical loss functions.

The primary benefit is a smooth, differentiable version of L1 loss.

References

Barron, J. A General and Adaptive Robust Loss Function - https://arxiv.org/pdf/1701.03077.pdf

Example

>>> a=1.0, x>>c , c=1.0/255.0  # will give a smoothly differentiable version of L1 / MAE loss
>>> a=1.999999 (limit as a->2), beta=1.0/255.0 # will give L2 / RMSE loss
Parameters:
  • alpha (float) – Penalty factor. Larger number give larger weight to large deviations. Default: 1.0

  • beta (float) – Scale factor used to adjust to the input scale (i.e. inputs of mean 1e-4 or 256). Default: 1.0/255.0

  • spatial_output (bool) – True to output the loss values spatially. False as scalar per item. Default: True

forward(y_true: Tensor, y_pred: Tensor) Tensor

Call the Generalized Loss Function

Parameters:
  • y_true (Tensor) – The ground truth value

  • y_pred (Tensor) – The predicted value

Return type:

The final loss value for each item in the batch

class lib.model.losses.loss.GradientLoss(spatial_output: bool = True)

Gradient Loss Function.

Calculates the first and second order gradient difference between pixels of an image in the x and y dimensions. These gradients are then compared between the ground truth and the predicted image and the difference is taken. When used as a loss, its minimization will result in predicted images approaching the same level of sharpness / blurriness as the ground truth.

Parameters:

spatial_output (bool) – True to output the loss values spatially. False as scalar per item. Default: True

References

TV+TV2 Regularization with Non-Convex Sparseness-Inducing Penalty for Image Restoration, Chengwu Lu & Hua Huang, 2014 - http://downloads.hindawi.com/journals/mpe/2014/790547.pdf

forward(y_true: Tensor, y_pred: Tensor) Tensor

Call the gradient loss function.

Parameters:
  • y_true (Tensor) – The ground truth value

  • y_pred (Tensor) – The predicted value

Return type:

The final loss value for each item in the batch

class lib.model.losses.loss.LInfNorm(*args: Any, **kwargs: Any)

Calculate the L-inf norm as a loss function.

Parameters:
  • args (Any)

  • kwargs (Any)

forward(y_true: Tensor, y_pred: Tensor) Tensor

Call the L-inf norm loss function.

Parameters:
  • y_true (Tensor) – The ground truth value

  • y_pred (Tensor) – The predicted value

Return type:

The final loss value for each item in the batch

class lib.model.losses.loss.LaplacianPyramidLoss(max_levels: int = 5, gaussian_size: int = 5, gaussian_sigma: float = 1.0, spatial_output: bool = True)

Laplacian Pyramid Loss Function

Notes

Channels last implementation on square images only.

Parameters:
  • max_levels (int) – The max number of laplacian pyramid levels to use. Default: 5

  • gaussian_size (int) – The size of the gaussian kernel. Default: 5

  • gaussian_sigma (float) – The gaussian sigma. Default: 2.0

  • device – The device to place the variables onto. Default: “cpu”

  • spatial_output (bool) – True to output the loss values spatially. False as scalar per item. Default: True

References

https://arxiv.org/abs/1707.05776 https://github.com/nathanaelbosch/generative-latent-optimization/blob/master/utils.py

forward(y_true: Tensor, y_pred: Tensor) Tensor

Calculate the Laplacian Pyramid Loss.

Parameters:
  • y_true (Tensor) – The ground truth value

  • y_pred (Tensor) – The predicted value

Return type:

The final loss value for each item in the batch

class lib.model.losses.loss.LogCosh(spatial_output: bool = True)

Logarithm of the hyperbolic cosine of the prediction error. Ported from Keras implementation

Parameters:

spatial_output (bool) – True to output the loss values spatially. False as scalar per item. Default: True

forward(y_true: Tensor, y_pred: Tensor) Tensor

Call the LogCosh loss function.

Parameters:
  • y_true (Tensor) – The ground truth value

  • y_pred (Tensor) – The predicted value

Return type:

The final loss value for each item in the batch

Classes

FocalFrequencyLoss([alpha, patch_factor, ...])

Focal frequency Loss Function.

GeneralizedLoss([alpha, beta, spatial_output])

Generalized function used to return a large variety of mathematical loss functions.

GradientLoss([spatial_output])

Gradient Loss Function.

LInfNorm(*args, **kwargs)

Calculate the L-inf norm as a loss function.

LaplacianPyramidLoss([max_levels, ...])

Laplacian Pyramid Loss Function

LogCosh([spatial_output])

Logarithm of the hyperbolic cosine of the prediction error.

Class Inheritance Diagram

Inheritance diagram of lib.model.losses.loss.FocalFrequencyLoss, lib.model.losses.loss.GeneralizedLoss, lib.model.losses.loss.GradientLoss, lib.model.losses.loss.LInfNorm, lib.model.losses.loss.LaplacianPyramidLoss, lib.model.losses.loss.LogCosh

lib.model.losses.perceptual_loss Module

Keras implementation of Perceptual Loss Functions for faceswap.py

class lib.model.losses.perceptual_loss.GMSDLoss(spatial_output: bool = True)

Gradient Magnitude Similarity Deviation Loss.

Improved image quality metric over MS-SSIM with easier calculations

Parameters:

spatial_output (bool) – True to output the loss values spatially. False as scalar per item. Default: True

References

http://www4.comp.polyu.edu.hk/~cslzhang/IQA/GMSD/GMSD.htm https://arxiv.org/ftp/arxiv/papers/1308/1308.3052.pdf

forward(y_true: Tensor, y_pred: Tensor) Tensor

Return the Gradient Magnitude Similarity Deviation Loss.

Parameters:
  • y_true (Tensor) – The ground truth value

  • y_pred (Tensor) – The predicted value

Return type:

The final loss value for each item in the batch

class lib.model.losses.perceptual_loss.MSSIMLoss(max_val: float = 1.0, filter_size: int = 11, filter_sigma: float = 1.5, k1: float = 0.01, k2: float = 0.03, spatial_output: bool = True, power_factors: tuple[float, ...] = (0.0448, 0.2856, 0.3001, 0.2363, 0.1333))

Computes the MS-SSIM between img1 and img2.

This function assumes that img1 and img2 are image batches, i.e. the last three dimensions are [height, width, channels].

Note: The true SSIM is only defined on grayscale. This function does not perform any color-space transform. (If the input is already YUV, then it will compute YUV SSIM average.)

Original paper: Wang, Zhou, Eero P. Simoncelli, and Alan C. Bovik. “Multiscale structural similarity for image quality assessment.” Signals, Systems and Computers, 2004.

Details:
  • 11x11 Gaussian filter of width 1.5 is used.

  • k1 = 0.01, k2 = 0.03 as in the original paper.

The filter is reduced in size if the smallest image is smaller than 11x11.

Parameters:
  • max_val (float) – The dynamic range of the images (i.e., the difference between the maximum the and minimum allowed values). Default 1.0 (0.0 - 1.0)

  • filter_size (int) – Size of gaussian filter. Default: 11

  • filter_sigma (float) – Width of gaussian filter. Default: 1.5

  • k1 (float) – The K1 value. Default: 0.01

  • k2 (float) – The K2 value. Default: 0.03 (SSIM is less sensitivity to K2 for lower values, so it would be better if we took the values in the range of 0 < K2 < 0.4).

  • spatial_output (bool) – True to output the loss values spatially. False as scalar per item. Default: True

  • power_factors (tuple[float, ...]) – Iterable of weights for each of the scales. The number of scales used is the length of the list. Index 0 is the unscaled resolution’s weight and each increasing scale corresponds to the image being downsampled by 2. Defaults to the values obtained in the original paper. Default: (0.0448, 0.2856, 0.3001, 0.2363, 0.1333)

  • Reference

  • ---------

  • https (//github.com/tensorflow/tensorflow/blob/v2.16.1/tensorflow/python/ops/image_ops_impl.py)

forward(y_true: Tensor, y_pred: Tensor) Tensor

Call the MS-SSIM Loss Function.

Parameters:
  • y_true (Tensor) – The ground truth value

  • y_pred (Tensor) – The predicted value

Return type:

The MS-SSIM Loss value

class lib.model.losses.perceptual_loss.SSIMLoss(max_val: float = 1.0, filter_size: int = 11, filter_sigma: float = 1.5, k1: float = 0.01, k2: float = 0.03, spatial_output: bool = True)

Computes SSIM index between img1 and img2.

This function is based on the standard SSIM implementation from: Wang, Z., Bovik, A. C., Sheikh, H. R., & Simoncelli, E. P. (2004). Image quality assessment: from error visibility to structural similarity. IEEE transactions on image processing.

Note: The true SSIM is only defined on grayscale. This function does not perform any color-space transform. (If the input is already YUV, then it will compute YUV SSIM average.)

Details:
  • 11x11 Gaussian filter of width 1.5 is used.

  • k1 = 0.01, k2 = 0.03 as in the original paper.

The filter is reduced in size of the image is smaller than 11x11.

Reference

https://github.com/tensorflow/tensorflow/blob/v2.16.1/tensorflow/python/ops/image_ops_impl.py

forward(y_true: Tensor, y_pred: Tensor) Tensor

Call the SSIM Loss Function.

Parameters:
  • y_true (Tensor) – The input batch of ground truth images

  • y_pred (Tensor) – The input batch of predicted images

Return type:

The final SSIM for each item in the batch

Parameters:
  • max_val (float)

  • filter_size (int)

  • filter_sigma (float)

  • k1 (float)

  • k2 (float)

  • spatial_output (bool)

Classes

GMSDLoss([spatial_output])

Gradient Magnitude Similarity Deviation Loss.

MSSIMLoss([max_val, filter_size, ...])

Computes the MS-SSIM between img1 and img2.

SSIMLoss([max_val, filter_size, ...])

Computes SSIM index between img1 and img2.

Class Inheritance Diagram

Inheritance diagram of lib.model.losses.perceptual_loss.GMSDLoss, lib.model.losses.perceptual_loss.MSSIMLoss, lib.model.losses.perceptual_loss.SSIMLoss

networks package

lib.model.networks.clip Module

CLIP: https://github.com/openai/CLIP. This implementation only ports the visual transformer part of the model.

class lib.model.networks.clip.AttentionPool2d(spatial_dim: int, embed_dim: int, num_heads: int, output_dim: int | None = None, name='AttentionPool2d')

An Attention Pooling layer that applies a multi-head self-attention mechanism over a spatial grid of features.

Parameters:
  • spatial_dim (int) – The dimensionality of the spatial grid of features.

  • embed_dim (int) – The dimensionality of the feature embeddings.

  • num_heads (int) – The number of attention heads.

  • output_dim (int) – The output dimensionality of the attention layer. If None, it defaults to embed_dim.

  • name (str) – The name of the layer.

__call__(inputs: KerasTensor) KerasTensor

Performs the attention pooling operation on the input tensor.

Parameters:

inputs (keras.KerasTensor:) – The input tensor of shape [batch_size, height, width, embed_dim].

Return type:

keras.KerasTensor:: The result of the attention pooling operation

class lib.model.networks.clip.Bottleneck(inplanes: int, planes: int, stride: int = 1, name: str = 'bottleneck')

A ResNet bottleneck block that performs a sequence of convolutions, batch normalization, and ReLU activation operations on an input tensor.

Parameters:
  • inplanes (int) – The number of input channels.

  • planes (int) – The number of output channels.

  • stride (int, optional) – The stride of the bottleneck block. Default: 1

  • name (str, optional) – The name of the bottleneck block. Default: “bottleneck”

__call__(inputs: KerasTensor) KerasTensor

Performs the forward pass for a Bottleneck block.

All conv layers have stride 1. an avgpool is performed after the second convolution when stride > 1

Parameters:

inputs (keras.KerasTensor) – The input tensor to the Bottleneck block.

Returns:

The result of the forward pass through the Bottleneck block.

Return type:

keras.KerasTensor

expansion = 4

The factor by which the number of input channels is expanded to get the number of output channels.

Type:

int

class lib.model.networks.clip.ClassEmbedding(*args, **kwargs)

Trainable Class Embedding layer

Parameters:
  • input_shape (tuple[int, ...])

  • scale (int)

  • name (str)

call(inputs: KerasTensor, *args, **kwargs) KerasTensor

Get the Class Embedding layer

Parameters:

inputs (keras.KerasTensor) – Input tensor to the embedding layer

Returns:

The class embedding layer shaped for the input tensor

Return type:

keras.KerasTensor

class lib.model.networks.clip.EmbeddingLayer(*args, **kwargs)

Parent class for trainable embedding variables

Parameters:
  • input_shape (tuple[int, ...]) – The shape of the variable

  • scale (int) – Amount to scale the random initialization by

  • name (str) – The name of the layer

  • dtype (str, optional) – The datatype for the layer. Mixed precision can mess up the embeddings. Default: “float32”

build(input_shape: tuple[int, ...]) None

Add the weights

Parameters:

input_shape (tuple[int, ...) – The input shape of the incoming tensor

Return type:

None

get_config() dict[str, Any]

Get the config dictionary for the layer

Returns:

The config dictionary for the layer

Return type:

dict[str, Any]

class lib.model.networks.clip.ModifiedResNet(input_resolution: int, width: int, layer_config: tuple[int, int, int, int], output_dim: int, heads: int, name='ModifiedResNet')

A ResNet class that is similar to torchvision’s but contains the following changes:

  • There are now 3 “stem” convolutions as opposed to 1, with an average pool instead of a max pool.

  • Performs anti-aliasing strided convolutions, where an avgpool is prepended to convolutions with stride > 1

  • The final pooling layer is a QKV attention instead of an average pool

Parameters:
  • input_resolution (int) – The input resolution of the model. Default is 224.

  • width (int) – The width of the model. Default is 64.

  • layer_config (list) – A list containing the number of Bottleneck blocks for each layer.

  • output_dim (int) – The output dimension of the model.

  • heads (int) – The number of heads for the QKV attention.

  • name (str) – The name of the model. Default is “ModifiedResNet”.

__call__() Model

Implements the forward pass of the ModifiedResNet model.

Returns:

The modified resnet model.

Return type:

keras.models.Model

class lib.model.networks.clip.PositionalEmbedding(*args, **kwargs)

Trainable Positional Embedding layer

Parameters:
  • input_shape (tuple[int, ...])

  • scale (int)

  • name (str)

call(inputs: KerasTensor, *args, **kwargs) KerasTensor

Get the Positional Embedding layer

Parameters:

inputs (keras.KerasTensor) – Input tensor to the embedding layer

Returns:

The positional embedding layer shaped for the input tensor

Return type:

keras.KerasTensor

class lib.model.networks.clip.Projection(*args, **kwargs)

Trainable Projection Embedding Layer

Parameters:
  • input_shape (tuple[int, ...])

  • scale (int)

  • name (str)

call(inputs: KerasTensor, *args, **kwargs) KerasTensor

Get the Projection layer

Parameters:

inputs (keras.KerasTensor) – Input tensor to the embedding layer

Returns:

The Projection layer expanded to the batch dimension and transposed for matmul

Return type:

keras.KerasTensor

class lib.model.networks.clip.Transformer(width: int, num_layers: int, heads: int, attn_mask: KerasTensor = None, name: str = 'transformer')

A class representing a Transformer model with attention mechanism and residual connections.

Parameters:
  • width (int) – The dimension of the input and output vectors.

  • num_layers (int) – The number of layers in the Transformer.

  • heads (int) – The number of attention heads.

  • attn_mask (keras.KerasTensor, optional) – The attention mask, by default None.

  • name (str, optional) – The name of the Transformer model, by default “transformer”.

__call__() :class:`keras.models.Model`:

Calls the Transformer layers.

Parameters:

inputs (KerasTensor)

Return type:

KerasTensor

__call__(inputs: KerasTensor) KerasTensor

Call the Transformer layers

Parameters:

inputs (keras.KerasTensor) – The input Tensor

Returns:

The return Tensor

Return type:

keras.KerasTensor

residual_attention_block(inputs: KerasTensor, key_dim: int, num_heads: int, attn_mask: KerasTensor, name: str = 'ResidualAttentionBlock') KerasTensor

Call the residual attention block

Parameters:
  • inputs (keras.KerasTensor) – The input Tensor

  • key_dim (int) – key dimension per head for MultiHeadAttention

  • num_heads (int) – Number of heads for MultiHeadAttention

  • attn_mask (keras.KerasTensor, optional) – Default: None

  • name (str, optional) – The name for the layer. Default: “ResidualAttentionBlock”

Returns:

The return Tensor

Return type:

keras.KerasTensor

class lib.model.networks.clip.ViT(name: Literal['RN50', 'RN101', 'RN50x4', 'RN50x16', 'RN50x64', 'ViT-B-16', 'ViT-B-32', 'ViT-L-14', 'ViT-L-14-336px', 'FaRL-B-16-16', 'FaRL-B-16-64'], input_size: int | None = None, load_weights: bool = False)

Visiual Transform from CLIP

A Convolutional Language-Image Pre-Training (CLIP) model that encodes images and text into a shared latent space.

Reference

https://arxiv.org/abs/2103.00020

param name:

“ViT-B-16”, “ViT-L-14”, “ViT-L-14-336px”, “FaRL-B_16-64”] The model configuration to use

type name:

[“RN50”, “RN101”, “RN50x4”, “RN50x16”, “RN50x64”, “ViT-B-32”,

param input_size:

The required resolution size for the model. None for default preset size

type input_size:

int, optional

param load_weights:

True to load pretrained weights. Default: False

type load_weights:

bool, optional

__call__() Model

Get the configured ViT model

Returns:

The requested Visual Transformer model

Return type:

keras.models.Model

Parameters:
  • name (TypeModels)

  • input_size (int | None)

  • load_weights (bool)

class lib.model.networks.clip.ViTConfig(embed_dim: int, resolution: int, layer_conf: int | tuple[int, int, int, int], width: int, patch: int, git_id: int = 0)

Configuration settings for ViT

Parameters:
  • embed_dim (int) – Dimensionality of the final shared embedding space

  • resolution (int) – Spatial resolution of the input images

  • layer_conf (tuple[int, int, int, int] | int) – Number of layers in the visual encoder, or a tuple of layer configurations for a custom ResNet visual encoder

  • width (int) – Width of the visual encoder layers

  • patch (int) – Size of the patches to be extracted from the images. Only used for Visual encoder.

  • git_id (int, optional) – The id of the model weights file stored in deepfakes_models repo if they exist. Default: 0

class lib.model.networks.clip.VisualTransformer(input_resolution: int, patch_size: int, width: int, num_layers: int, heads: int, output_dim: int, name: str = 'VisualTransformer')

A class representing a Visual Transformer model for image classification tasks.

Parameters:
  • input_resolution (int) – The input resolution of the images.

  • patch_size (int) – The size of the patches to be extracted from the images.

  • width (int) – The dimension of the input and output vectors.

  • num_layers (int) – The number of layers in the Transformer.

  • heads (int) – The number of attention heads.

  • output_dim (int) – The dimension of the output vector.

  • name (str, optional) – The name of the Visual Transformer model, Default: “VisualTransformer”.

__call__() :class:`keras.models.Model`:

Builds and returns the Visual Transformer model.

Return type:

Model

__call__() Model

Builds and returns the Visual Transformer model.

Returns:

The Visual Transformer model.

Return type:

keras.models.Model

Classes

AttentionPool2d(spatial_dim, embed_dim, ...)

An Attention Pooling layer that applies a multi-head self-attention mechanism over a spatial grid of features.

Bottleneck(inplanes, planes[, stride, name])

A ResNet bottleneck block that performs a sequence of convolutions, batch normalization, and ReLU activation operations on an input tensor.

ClassEmbedding(*args, **kwargs)

Trainable Class Embedding layer

EmbeddingLayer(*args, **kwargs)

Parent class for trainable embedding variables

ModifiedResNet(input_resolution, width, ...)

A ResNet class that is similar to torchvision's but contains the following changes:

PositionalEmbedding(*args, **kwargs)

Trainable Positional Embedding layer

Projection(*args, **kwargs)

Trainable Projection Embedding Layer

Transformer(width, num_layers, heads[, ...])

A class representing a Transformer model with attention mechanism and residual connections.

ViT(name[, input_size, load_weights])

Visiual Transform from CLIP

ViTConfig(embed_dim, resolution, layer_conf, ...)

Configuration settings for ViT

VisualTransformer(input_resolution, ...[, name])

A class representing a Visual Transformer model for image classification tasks.

Class Inheritance Diagram

Inheritance diagram of lib.model.networks.clip.AttentionPool2d, lib.model.networks.clip.Bottleneck, lib.model.networks.clip.ClassEmbedding, lib.model.networks.clip.EmbeddingLayer, lib.model.networks.clip.ModifiedResNet, lib.model.networks.clip.PositionalEmbedding, lib.model.networks.clip.Projection, lib.model.networks.clip.Transformer, lib.model.networks.clip.ViT, lib.model.networks.clip.ViTConfig, lib.model.networks.clip.VisualTransformer

lib.model.networks.insightface_resnet Module

InsightFace ResNet (IR) and InsightFace ResNet Squeeze + Excite (IRSE) for inference

From: https://github.com/deepinsight/insightface and https://github.com/HuangYG123/CurricularFace

Released under MIT License

class lib.model.networks.insightface_resnet.BasicBlockIR(in_channels: int, depth: int, stride: int, use_se: bool)

A Basic Block for InsightFace ResNet

Parameters:
  • in_channels (int) – The number of input channels to the layer

  • depth (int) – The depth of the layer

  • stride (int) – The Convolution stride

  • use_se (bool) – True to add squeeze and excite layer

forward(inputs: Tensor) Tensor

Forward pass through the IRNet basic block

Parameters:

inputs (Tensor) – The input to the IRNet Block

Return type:

The output from the IRNet Block

class lib.model.networks.insightface_resnet.BottleneckIR(in_channels: int, depth: int, stride: int, use_se: bool)

Bottleneck for IRNet

Parameters:
  • in_channels (int) – The number of input channels to the layer

  • depth (int) – The depth of the layer

  • stride (int) – The Convolution stride

  • use_se (bool) – True to add squeeze and excite layer

forward(inputs: Tensor) Tensor

Forward pass through the IRNet Bottleneck

Parameters:

inputs (Tensor) – The input to the IRNet Bottleneck

Return type:

The output from the IRNet Bottleneck

class lib.model.networks.insightface_resnet.Flatten(*args: Any, **kwargs: Any)

Flatten layer for IRNet

Parameters:
  • args (Any)

  • kwargs (Any)

forward(inputs: Tensor) Tensor

Flatten the inbound layer

Parameters:

inputs (Tensor) – The input layer to be flattened

Return type:

The flattened input layer

class lib.model.networks.insightface_resnet.IRNet(input_size: Literal[112, 224], block_filters: tuple[int, int, int, int], block_recursions: tuple[int, int, int, int], num_features: int = 512, use_se: bool = False, use_bottleneck: bool = False)

Implementation if InsightFace ResNet with Squeeze + Excite support

Parameters:
  • input_size (Literal[112, 224]) – The input size to the model. Must be 112 or 224

  • block_filters (tuple[int, int, int, int]) – The number of in_channels to each block layer for each pass

  • block_recursions (tuple[int, int, int, int]) – The number of recursions within each block

  • num_features (int) – The number of num_features to output. Default: 512

  • use_se (bool) – True to use Squeeze and Excite. False to use standard IR ResNet. Default: False

  • use_bottleneck (bool) – True to use the Bottleneck block. False to use the Basic block. Default: False

forward(inputs: Tensor) Tensor

Forward pass through IRNet

Parameters:

inputs (Tensor) – The input to IRNet

Return type:

The output from IRNet

class lib.model.networks.insightface_resnet.SEModule(in_channels: int, reduction: int)

Squeeze and Excite Block for IRNet

Parameters:
  • in_channels (int) – The number of input channels

  • reduction (int) – The reduction factor for squeeze and excite

forward(inputs: Tensor) Tensor

Forward pass through the IRNet Squeeze and Excite Block

Parameters:

inputs (Tensor)

Return type:

Tensor

lib.model.networks.insightface_resnet.ir_101(input_size: Literal[112, 224]) IRNet

Obtain an IRNet-101 model

Parameters:

input_size (Literal[112, 224]) – The input size to the model

Return type:

IRNet

lib.model.networks.insightface_resnet.ir_152(input_size: Literal[112, 224]) IRNet

Obtain an IRNet-152 model

Parameters:

input_size (Literal[112, 224]) – The input size to the model

Return type:

IRNet

lib.model.networks.insightface_resnet.ir_18(input_size: Literal[112, 224])

Obtain an IRNet-18 model

Parameters:

input_size (Literal[112, 224]) – The input size to the model

lib.model.networks.insightface_resnet.ir_200(input_size: Literal[112, 224]) IRNet

Obtain an IRNet-200 model

Parameters:

input_size (Literal[112, 224]) – The input size to the model

Return type:

IRNet

lib.model.networks.insightface_resnet.ir_34(input_size: Literal[112, 224]) IRNet

Obtain an IRNet-34 model

Parameters:

input_size (Literal[112, 224]) – The input size to the model

Return type:

IRNet

lib.model.networks.insightface_resnet.ir_50(input_size: Literal[112, 224]) IRNet

Obtain an IRNet-50 model

Parameters:

input_size (Literal[112, 224]) – The input size to the model

Return type:

IRNet

lib.model.networks.insightface_resnet.ir_se_101(input_size: Literal[112, 224]) IRNet

Obtain an IRNetSE101 model

Parameters:

input_size (Literal[112, 224]) – The input size to the model

Return type:

IRNet

lib.model.networks.insightface_resnet.ir_se_152(input_size: Literal[112, 224]) IRNet

Obtain an IRNetSE152 model

Parameters:

input_size (Literal[112, 224]) – The input size to the model

Return type:

IRNet

lib.model.networks.insightface_resnet.ir_se_200(input_size: Literal[112, 224]) IRNet

Obtain an IRNetSE200 model

Parameters:

input_size (Literal[112, 224]) – The input size to the model

Return type:

IRNet

lib.model.networks.insightface_resnet.ir_se_50(input_size: Literal[112, 224]) IRNet

Obtain an IRNetSE50 model

Parameters:

input_size (Literal[112, 224]) – The input size to the model

Return type:

IRNet

Functions

ir_101(input_size)

Obtain an IRNet-101 model

ir_152(input_size)

Obtain an IRNet-152 model

ir_18(input_size)

Obtain an IRNet-18 model

ir_200(input_size)

Obtain an IRNet-200 model

ir_34(input_size)

Obtain an IRNet-34 model

ir_50(input_size)

Obtain an IRNet-50 model

ir_se_101(input_size)

Obtain an IRNetSE101 model

ir_se_152(input_size)

Obtain an IRNetSE152 model

ir_se_200(input_size)

Obtain an IRNetSE200 model

ir_se_50(input_size)

Obtain an IRNetSE50 model

Classes

BasicBlockIR(in_channels, depth, stride, use_se)

A Basic Block for InsightFace ResNet

BottleneckIR(in_channels, depth, stride, use_se)

Bottleneck for IRNet

Flatten(*args, **kwargs)

Flatten layer for IRNet

IRNet(input_size, block_filters, ...[, ...])

Implementation if InsightFace ResNet with Squeeze + Excite support

SEModule(in_channels, reduction)

Squeeze and Excite Block for IRNet

Class Inheritance Diagram

Inheritance diagram of lib.model.networks.insightface_resnet.BasicBlockIR, lib.model.networks.insightface_resnet.BottleneckIR, lib.model.networks.insightface_resnet.Flatten, lib.model.networks.insightface_resnet.IRNet, lib.model.networks.insightface_resnet.SEModule

optimizers package

lib.model.optimizers.adabelief Module

AdaBelief optimizer for Torch

class lib.model.optimizers.adabelief.AdaBelief(params: Iterable, lr: float = 0.001, betas: tuple[float, float] = (0.9, 0.999), eps: float = 1e-16, weight_decay: float = 0.0, amsgrad: bool = False, weight_decouple: bool = True, fixed_decay: bool = False, rectify: bool = True, degenerated_to_sgd: bool = True)

Implements AdaBelief algorithm. Modified from Adam in PyTorch

Parameters:
  • params (Iterable) – Iterable of parameters to optimize or dicts defining parameter groups

  • lr (float) – Learning rate. Default: 1e-3

  • betas (tuple[float, float]) – Coefficients used for computing running averages of gradient and its square. Default: (0.9, 0.999)

  • eps (float) – Term added to the denominator to improve numerical stability. Default: 1e-16

  • weight_decay (float) – Weight decay (L2 penalty). Default: 0

  • amsgrad (bool) – Whether to use the AMSGrad variant of this algorithm from the paper On the Convergence of Adam and Beyond. Default: False

  • weight_decouple (bool) – If set as True, then the optimizer uses decoupled weight decay as in AdamW. Default: True

  • fixed_decay (bool) – This is used when weight_decouple is set as True. - When fixed_decay == True, the weight decay is performed as W_{new} = W_{old} - W_{old} * decay. - When fixed_decay == False, the weight decay is performed as W_{new} = W_{old} - W_{old} * decay * lr. Note that in this case, the weight decay ratio decreases with learning rate (lr). Default: False

  • rectify (bool) – If set as True, then perform the rectified update similar to RAdam. Default: True

  • degenerated_to_sgd (bool) – If set as True, then perform SGD update when variance of gradient is high. Default: True

  • Reference

  • ---------

  • Optimizer (AdaBelief)

  • gradients (adapting step sizes by the belief in observed)

  • 2020 (NeurIPS)

  • https (//github.com/juntang-zhuang/Adabelief-Optimizer)

reset() None

Reset parameters

Return type:

None

step(closure: Callable | None = None) Tensor

Performs a single optimization step.

Parameters:

closure (Callable | None) – A closure that reevaluates the model and returns the loss. Default: None

Return type:

Tensor

Classes

AdaBelief(params[, lr, betas, eps, ...])

Implements AdaBelief algorithm.

Class Inheritance Diagram

Inheritance diagram of lib.model.optimizers.adabelief.AdaBelief

lib.model.optimizers.lion Module

PyTorch implementation of the Lion optimizer.

class lib.model.optimizers.lion.Lion(params: Iterable, lr: float = 0.0001, betas: tuple[float, float] = (0.9, 0.99), weight_decay: float = 0.0)

Lion optimizer from Google

Parameters:
  • params (Iterable) – Iterable of parameters to optimize or dicts defining parameter groups

  • lr (float) – Learning rate. Default: 1e-4

  • betas (tuple[float, float]) – Coefficients used for computing running averages of gradient and its square. Default: (0.9, 0.99)

  • weight_decay (float) – Weight decay coefficient. Default: 0

  • Reference

  • ---------

  • https (//github.com/google/automl/blob/master/lion/lion_pytorch.py)

step(closure: Callable | None = None) Tensor

Performs a single optimization step.

Parameters:

closure (Callable | None) – A closure that reevaluates the model and returns the loss.

Return type:

The loss

Classes

Lion(params[, lr, betas, weight_decay])

Lion optimizer from Google

Class Inheritance Diagram

Inheritance diagram of lib.model.optimizers.lion.Lion

lib.model.optimizers.keras_legacy Module

Legacy keras Optimizers for weight migration

class lib.model.optimizers.keras_legacy.AdaBelief(*args, **kwargs)

Implementation of the AdaBelief Optimizer

Inherits from: keras.optimizers.Optimizer.

AdaBelief Optimizer is not a placement of the heuristic warmup, the settings should be kept if warmup has already been employed and tuned in the baseline method. You can enable warmup by setting total_steps and warmup_proportion (see examples)

Lookahead (see references) can be integrated with AdaBelief Optimizer, which is announced by Less Wright and the new combined optimizer can also be called “Ranger”. The mechanism can be enabled by using the lookahead wrapper. (See examples)

Parameters:
  • learning_rate (float) – The learning rate.

  • beta_1 (float) – The exponential decay rate for the 1st moment estimates.

  • beta_2 (float) – The exponential decay rate for the 2nd moment estimates.

  • epsilon (float) – A small constant for numerical stability.

  • amsgrad (bool) – Whether to apply AMSGrad variant of this algorithm from the paper “On the Convergence of Adam and beyond”.

  • rectify (bool) – Whether to enable rectification as in RectifiedAdam

  • sma_threshold (float) – The threshold for simple mean average.

  • total_steps (int) – Total number of training steps. Enable warmup by setting a positive value.

  • warmup_proportion (float) – The proportion of increasing steps.

  • min_lr – Minimum learning rate after warmup.

  • name – Name for the operations created when applying gradients. Default: "AdaBeliefOptimizer".

  • **kwargs – Standard Keras Optimizer keyword arguments. Allowed to be (weight_decay, clipnorm, clipvalue, global_clipnorm, use_ema, ema_momentum, ema_overwrite_frequency, loss_scale_factor, gradient_accumulation_steps)

  • min_learning_rate (float)

Examples

>>> from optimizers import AdaBelief
>>> opt = AdaBelief(lr=1e-3)

Example of serialization:

>>> optimizer = AdaBelief(learning_rate=lr_scheduler, weight_decay=wd_scheduler)
>>> config = keras.optimizers.serialize(optimizer)
>>> new_optimizer = keras.optimizers.deserialize(config,
...                                                 custom_objects=dict(AdaBelief=AdaBelief))

Example of warm up:

>>> opt = AdaBelief(lr=1e-3, total_steps=10000, warmup_proportion=0.1, min_lr=1e-5)

In the above example, the learning rate will increase linearly from 0 to lr in 1000 steps, then decrease linearly from lr to min_lr in 9000 steps.

Example of enabling Lookahead:

>>> adabelief = AdaBelief()
>>> ranger = tfa.optimizers.Lookahead(adabelief, sync_period=6, slow_step_size=0.5)

Notes

amsgrad is not described in the original paper. Use it with caution.

References

Juntang Zhuang et al. - AdaBelief Optimizer: Adapting step sizes by the belief in observed gradients - https://arxiv.org/abs/2010.07468.

Original implementation - https://github.com/juntang-zhuang/Adabelief-Optimizer

Michael R. Zhang et.al - Lookahead Optimizer: k steps forward, 1 step back - https://arxiv.org/abs/1907.08610v1

Adapted from https://github.com/juntang-zhuang/Adabelief-Optimizer

BSD 2-Clause License

Copyright (c) 2021, Juntang Zhuang All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

build(variables: list[Variable]) None

Initialize optimizer variables.

AdaBelief optimizer has 3 types of variables: momentums, velocities and velocity_hat (only set when amsgrad is applied),

Parameters:

variables (list[Variable]) – list of model variables to build AdaBelief variables on.

Return type:

None

get_config() dict[str, Any]

Returns the config of the optimizer.

Optimizer configuration for AdaBelief.

Returns:

The optimizer configuration.

Return type:

dict[str, Any]

update_step(gradient: Tensor, variable: Variable, learning_rate: Tensor) None

Update step given gradient and the associated model variable for AdaBelief.

Parameters:
  • gradient (Tensor) – The gradient to update

  • variable (Variable) – The variable to update

  • learning_rate (Tensor) – The learning rate

Return type:

None

Classes

AdaBelief(*args, **kwargs)

Implementation of the AdaBelief Optimizer

Class Inheritance Diagram

Inheritance diagram of lib.model.optimizers.keras_legacy.AdaBelief

model package

lib.model.autoclip Module

Auto clipper for clipping gradients.

class lib.model.autoclip.AutoClipper(clip_percentile: int, history_size: int = 10000)

AutoClip: Adaptive Gradient Clipping for Source Separation Networks

Parameters:
  • clip_percentile (int) – The percentile to clip the gradients at

  • history_size (int) – The number of iterations of data to use to calculate the norm Default: 10000

References

Adapted from: https://github.com/pseeth/autoclip original paper: https://arxiv.org/abs/2007.14469

__call__(parameters: list[Parameter], *args) None

Call the AutoClip function.

Parameters:
  • parameters (list[Parameter]) – The parameters to clip

  • args – Unused but for compatibility

Return type:

None

Classes

AutoClipper(clip_percentile[, history_size])

AutoClip: Adaptive Gradient Clipping for Source Separation Networks


lib.model.backup_restore Module

Functions for backing up, restoring and creating model snapshots.

class lib.model.backup_restore.Backup(model_dir: str, model_name: str)

Performs the back up of models at each save iteration, and the restoring of models from their back up location.

Parameters:
  • model_dir (str) – The folder that contains the model to be backed up

  • model_name (str) – The name of the model that is to be backed up

static backup_model(full_path: str) None

Backup a model file.

The backed up file is saved with the original filename in the original location with .bk appended to the end of the name.

Parameters:

full_path (str) – The full path to a .keras model file or a .json state file

Return type:

None

restore() None

Restores a model from backup.

The original model files are migrated into a folder within the original model folder named <model_name>_archived_<timestamp>. The .bk backup files are then moved to the location of the previously existing model files. Logs that were generated after the the last backup was taken are removed.

Return type:

None

snapshot_models(iterations: int) None

Take a snapshot of the model at the current state and back it up.

The snapshot is a copy of the model folder located in the same root location as the original model file, with the number of iterations appended to the end of the folder name.

Parameters:

iterations (int) – The number of iterations that the model has trained when performing the snapshot.

Return type:

None

Classes

Backup(model_dir, model_name)

Performs the back up of models at each save iteration, and the restoring of models from their back up location.


lib.model.initializers Module

Custom Initializers for faceswap.py

class lib.model.initializers.ConvolutionAware(eps_std: float = 0.05, seed: int | None = None, initialized: bool = False)

Initializer that generates orthogonal convolution filters in the Fourier space. If this initializer is passed a shape that is not 3D or 4D, orthogonal initialization will be used.

Adapted, fixed and optimized from: https://github.com/keras-team/keras-contrib/blob/master/keras_contrib/initializers/convaware.py

Parameters:
  • eps_std (float) – The Standard deviation for the random normal noise used to break symmetry in the inverse Fourier transform. Default: 0.05

  • seed (int | None) – Used to seed the random generator. Default: None

  • initialized (bool) – This should always be set to False. To avoid Keras re-calculating the values every time the model is loaded, this parameter is internally set on first time initialization. Default:False

Return type:

The modified kernel weights

References

Armen Aghajanyan, https://arxiv.org/abs/1702.06295

__call__(shape: list[int] | tuple[int, ...], dtype: str | None = None) Tensor

Call function for the ICNR initializer.

Parameters:
  • shape (list[int] | tuple[int, ...]) – The required shape for the output tensor

  • dtype (str | None) – The data type for the tensor

Return type:

The modified kernel weights

get_config() dict[str, Any]

Return the Convolutional Aware Initializer configuration.

Return type:

The configuration for Convolutional Aware Initialization

class lib.model.initializers.ICNR(initializer: dict[str, Any] | Initializer, scale: int = 2)

ICNR initializer for checkerboard artifact free sub pixel convolution

Parameters:
  • initializer (dict[str, T.Any] | initializers.Initializer) – The initializer used for sub kernels (orthogonal, glorot uniform, etc.)

  • scale (int) – scaling factor of sub pixel convolution (up sampling from 8x8 to 16x16 is scale 2). Default: 2

Return type:

The modified kernel weights

Example

>>> x = conv2d(... weights_initializer=ICNR(initializer=he_uniform(), scale=2))

References

Andrew Aitken et al. Checkerboard artifact free sub-pixel convolution https://arxiv.org/pdf/1707.02937.pdf, https://distill.pub/2016/deconv-checkerboard/ https://gist.github.com/A03ki/2305398458cb8e2155e8e81333f0a965

__call__(shape: list[int] | tuple[int, ...], dtype: str | None = 'float32') Tensor

Returns a tensor object initialized as specified by the initializer.

Parameters:
  • shape (list[int] | tuple[int, ...]) – Shape of the tensor.

  • dtype (str | None) – Optional dtype of the tensor.

Return type:

Tensor

get_config() dict[str, Any]

Return the ICNR Initializer configuration.

Return type:

The configuration for ICNR Initialization

Classes

ConvolutionAware([eps_std, seed, initialized])

Initializer that generates orthogonal convolution filters in the Fourier space.

ICNR(initializer[, scale])

ICNR initializer for checkerboard artifact free sub pixel convolution

Class Inheritance Diagram

Inheritance diagram of lib.model.initializers.ConvolutionAware, lib.model.initializers.ICNR

lib.model.layers Module

Custom Layers for faceswap.py.

class lib.model.layers.GlobalMinPooling2D(*args, **kwargs)

Global minimum pooling operation for spatial data.

Parameters:

data_format (str | None)

call(inputs: KerasTensor, *args, **kwargs) KerasTensor

This is where the layer’s logic lives.

Parameters:

inputs (keras.KerasTensor) – Input tensor, or list/tuple of input tensors

Returns:

A tensor or list/tuple of tensors

Return type:

keras.KerasTensor

class lib.model.layers.GlobalStdDevPooling2D(*args, **kwargs)

Global standard deviation pooling operation for spatial data.

Parameters:

data_format (str | None)

call(inputs: KerasTensor, *args, **kwargs) KerasTensor

This is where the layer’s logic lives.

Parameters:

inputs (keras.KerasTensor) – Input tensor, or list/tuple of input tensors

Returns:

A tensor or list/tuple of tensors

Return type:

keras.KerasTensor

class lib.model.layers.KResizeImages(*args, **kwargs)

A custom upscale function that uses keras.backend.resize_images to upsample.

Parameters:
  • size (int or float, optional) – The scale to upsample to. Default: 2

  • interpolation (["nearest", "bilinear"], optional) – The interpolation to use. Default: “nearest”

  • kwargs (dict) – The standard Keras Layer keyword arguments (if any)

call(inputs: KerasTensor, *args, **kwargs) KerasTensor

Call the upsample layer

Parameters:

inputs (keras.KerasTensor) – Input tensor, or list/tuple of input tensors

Returns:

A tensor or list/tuple of tensors

Return type:

keras.KerasTensor

compute_output_shape(input_shape: tuple[int, ...]) tuple[int, ...]

Computes the output shape of the layer.

This is the input shape with size dimensions multiplied by size

Parameters:

input_shape (tuple or list of tuples) – Shape tuple (tuple of integers) or list of shape tuples (one per output tensor of the layer). Shape tuples can include None for free dimensions, instead of an integer.

Returns:

An input shape tuple

Return type:

tuple

get_config() dict[str, Any]

Returns the config of the layer.

Returns:

A python dictionary containing the layer configuration

Return type:

dict

class lib.model.layers.L2Normalize(*args, **kwargs)

Normalizes a tensor w.r.t. the L2 norm alongside the specified axis.

Parameters:
  • axis (int) – The axis to perform normalization across

  • kwargs (dict) – The standard Keras Layer keyword arguments (if any)

call(inputs: KerasTensor, *args, **kwargs) KerasTensor

This is where the layer’s logic lives.

Parameters:

inputs (keras.KerasTensor) – Input tensor, or list/tuple of input tensors

Returns:

A tensor or list/tuple of tensors

Return type:

keras.KerasTensor

compute_output_shape(input_shape: tuple[int, ...]) tuple[int, ...]

Compute the output shape based on the input shape.

Parameters:

input_shape (tuple) – The input shape to the layer

Return type:

tuple[int, …]

get_config() dict[str, Any]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstated later (without its trained weights) from this configuration.

The configuration of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Returns:

A python dictionary containing the layer configuration

Return type:

dict

class lib.model.layers.PixelShuffler(*args, **kwargs)

PixelShuffler layer for Keras.

This layer requires a Convolution2D prior to it, having output filters computed according to the formula \(filters = k * (scale_factor * scale_factor)\) where k is a user defined number of filters (generally larger than 32) and scale_factor is the up-scaling factor (generally 2).

This layer performs the depth to space operation on the convolution filters, and returns a tensor with the size as defined below.

Notes

In practice, it is useful to have a second convolution layer after the PixelShuffler layer to speed up the learning process. However, if you are stacking multiple PixelShuffler blocks, it may increase the number of parameters greatly, so the Convolution layer after PixelShuffler layer can be removed.

Example

>>> # A standard sub-pixel up-scaling block
>>> x = Convolution2D(256, 3, 3, padding="same", activation="relu")(...)
>>> u = PixelShuffler(size=(2, 2))(x)
[Optional]
>>> x = Convolution2D(256, 3, 3, padding="same", activation="relu")(u)
Parameters:
  • size (tuple, optional) – The (h, w) scaling factor for up-scaling. Default: (2, 2)

  • data_format ([“channels_first”, “channels_last”, None], optional) – The data format for the input. Default: None

  • kwargs (dict) – The standard Keras Layer keyword arguments (if any)

References

https://gist.github.com/t-ae/6e1016cc188104d123676ccef3264981

call(inputs: KerasTensor, *args, **kwargs) KerasTensor

This is where the layer’s logic lives.

Parameters:

inputs (keras.KerasTensor) – Input tensor, or list/tuple of input tensors

Returns:

A tensor or list/tuple of tensors

Return type:

keras.KerasTensor

compute_output_shape(input_shape: tuple[int | None, ...]) tuple[int | None, ...]

Computes the output shape of the layer.

Assumes that the layer will be built to match that input shape provided.

Parameters:

input_shape (tuple or list of tuples) – Shape tuple (tuple of integers) or list of shape tuples (one per output tensor of the layer). Shape tuples can include None for free dimensions, instead of an integer.

Returns:

An input shape tuple

Return type:

tuple

get_config() dict[str, Any]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstated later (without its trained weights) from this configuration.

The configuration of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Returns:

A python dictionary containing the layer configuration

Return type:

dict

class lib.model.layers.QuickGELU(*args, **kwargs)

Applies GELU approximation that is fast but somewhat inaccurate.

Parameters:
  • name (str, optional) – The name for the layer. Default: “QuickGELU”

  • kwargs (dict) – The standard Keras Layer keyword arguments (if any)

call(inputs: KerasTensor, *args, **kwargs) KerasTensor

Call the QuickGELU layer

Parameters:

inputs (keras.KerasTensor) – The input Tensor

Returns:

The output Tensor

Return type:

keras.KerasTensor

compute_output_shape(input_shape: tuple[int, ...]) tuple[int, ...]

Compute the output shape based on the input shape.

Parameters:

input_shape (tuple) – The input shape to the layer

Return type:

tuple[int, …]

class lib.model.layers.ReflectionPadding2D(*args, **kwargs)

Reflection-padding layer for 2D input (e.g. picture).

This layer can add rows and columns at the top, bottom, left and right side of an image tensor.

Parameters:
  • stride (int, optional) – The stride of the following convolution. Default: 2

  • kernel_size (int, optional) – The kernel size of the following convolution. Default: 5

  • kwargs (dict) – The standard Keras Layer keyword arguments (if any)

build(input_shape: KerasTensor) None

Creates the layer weights.

Must be implemented on all layers that have weights.

Parameters:

input_shape (keras.KerasTensor) – Keras tensor (future input to layer) or list/tuple of Keras tensors to reference for weight shape computations.

Return type:

None

call(inputs: KerasTensor, *args, **kwargs) KerasTensor

This is where the layer’s logic lives.

Parameters:

inputs (keras.KerasTensor) – Input tensor, or list/tuple of input tensors

Returns:

A tensor or list/tuple of tensors

Return type:

keras.KerasTensor

compute_output_shape(*args, **kwargs) tuple[int | None, ...]

Computes the output shape of the layer.

Assumes that the layer will be built to match that input shape provided.

Returns:

An input shape tuple

Return type:

tuple

get_config() dict[str, Any]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstated later (without its trained weights) from this configuration.

The configuration of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Returns:

A python dictionary containing the layer configuration

Return type:

dict

class lib.model.layers.ScalarOp(*args, **kwargs)

A layer for scalar operations for migrating TFLambdaOps in Keras 2 models to Keras 3. This layer should not be used directly

Parameters:
  • operation (Literal["multiply", "truediv", "add", "subtract"]) – The scalar operation to perform

  • value (float) – The scalar value to use

call(inputs: KerasTensor, *args, **kwargs) KerasTensor

Call the Scalar operation function.

Parameters:

inputs (tensor) – Input tensor, or list/tuple of input tensors

Returns:

A tensor or list/tuple of tensors

Return type:

keras.KerasTensor

compute_output_shape(input_shape: tuple[int, ...]) tuple[int, ...]

Output shape is the same as the input shape.

Parameters:

input_shape (tuple) – The input shape to the layer

Return type:

tuple[int, …]

get_config()

Returns the config of the layer. :returns: A python dictionary containing the layer configuration :rtype: dict

class lib.model.layers.Swish(*args, **kwargs)

Swish Activation Layer implementation for Keras.

Parameters:
  • beta (float, optional) – The beta value to apply to the activation function. Default: 1.0

  • kwargs (dict) – The standard Keras Layer keyword arguments (if any)

References

Swish: a Self-Gated Activation Function: https://arxiv.org/abs/1710.05941v1

call(inputs: KerasTensor, *args, **kwargs) KerasTensor

Call the Swish Activation function.

Parameters:

inputs (tensor) – Input tensor, or list/tuple of input tensors

Returns:

A tensor or list/tuple of tensors

Return type:

keras.KerasTensor

compute_output_shape(input_shape: tuple[int, ...]) tuple[int, ...]

Compute the output shape based on the input shape.

Parameters:

input_shape (tuple) – The input shape to the layer

Return type:

tuple[int, …]

get_config()

Returns the config of the layer.

Adds the beta to config.

Returns:

A python dictionary containing the layer configuration

Return type:

dict

Classes

GlobalMinPooling2D(*args, **kwargs)

Global minimum pooling operation for spatial data.

GlobalStdDevPooling2D(*args, **kwargs)

Global standard deviation pooling operation for spatial data.

KResizeImages(*args, **kwargs)

A custom upscale function that uses keras.backend.resize_images to upsample.

L2Normalize(*args, **kwargs)

Normalizes a tensor w.r.t.

PixelShuffler(*args, **kwargs)

PixelShuffler layer for Keras.

QuickGELU(*args, **kwargs)

Applies GELU approximation that is fast but somewhat inaccurate.

ReflectionPadding2D(*args, **kwargs)

Reflection-padding layer for 2D input (e.g. picture).

ScalarOp(*args, **kwargs)

A layer for scalar operations for migrating TFLambdaOps in Keras 2 models to Keras 3.

Swish(*args, **kwargs)

Swish Activation Layer implementation for Keras.

Class Inheritance Diagram

Inheritance diagram of lib.model.layers.GlobalMinPooling2D, lib.model.layers.GlobalStdDevPooling2D, lib.model.layers.KResizeImages, lib.model.layers.L2Normalize, lib.model.layers.PixelShuffler, lib.model.layers.QuickGELU, lib.model.layers.ReflectionPadding2D, lib.model.layers.ScalarOp, lib.model.layers.Swish

lib.model.nn_blocks Module

Neural Network Blocks for faceswap.py.

class lib.model.nn_blocks.Conv2D(*args, padding: str = 'same', is_upscale: bool = False, **kwargs)

A standard Keras Convolution 2D layer with parameters updated to be more appropriate for Faceswap architecture.

Parameters are the same, with the same defaults, as a standard keras.layers.Conv2D except where listed below. The default initializer is updated to HeUniform or convolutional aware based on user configuration settings.

Parameters:
  • padding (str, optional) – One of “valid” or “same” (case-insensitive). Default: “same”. Note that “same” is slightly inconsistent across backends with strides != 1, as described here.

  • is_upscale (bool, optional) – True if the convolution is being called from an upscale layer. This causes the instance to check the user configuration options to see if ICNR initialization has been selected and should be applied. This should only be passed in as True from UpscaleBlock layers. Default: False

__call__(*args, **kwargs) KerasTensor

Call the Conv2D layer

Parameters:
  • args (tuple) – Standard Conv2D layer call arguments

  • kwargs (dict[str, Any]) – Standard Conv2D layer call keyword arguments

Returns:

The Tensor from the Conv2D layer

Return type:

class: keras.KerasTensor

class lib.model.nn_blocks.Conv2DBlock(filters: int, kernel_size: int | tuple[int, int] = 5, strides: int | tuple[int, int] = 2, padding: str = 'same', normalization: str | None = None, activation: str | None = 'leakyrelu', use_depthwise: bool = False, relu_alpha: float = 0.1, **kwargs)

A standard Convolution 2D layer which applies user specified configuration to the layer.

Adds reflection padding if it has been selected by the user, and other post-processing if requested by the plugin.

Adds instance normalization if requested. Adds a LeakyReLU if a residual block follows.

Parameters:
  • filters (int) – The dimensionality of the output space (i.e. the number of output filters in the convolution)

  • kernel_size (int, optional) – An integer or tuple/list of 2 integers, specifying the height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions. NB: If use_depthwise is True then a value must still be provided here, but it will be ignored. Default: 5

  • strides (tuple or int, optional) – An integer or tuple/list of 2 integers, specifying the strides of the convolution along the height and width. Can be a single integer to specify the same value for all spatial dimensions. Default: 2

  • padding (["valid", "same"], optional) – The padding to use. NB: If reflect padding has been selected in the user configuration options, then this argument will be ignored in favor of reflect padding. Default: “same”

  • normalization (str or None, optional) – Normalization to apply after the Convolution Layer. Select one of “batch” or “instance”. Set to None to not apply normalization. Default: None

  • activation (str or None, optional) – The activation function to use. This is applied at the end of the convolution block. Select one of “leakyrelu”, “prelu” or “swish”. Set to None to not apply an activation function. Default: “leakyrelu”

  • use_depthwise (bool, optional) – Set to True to use a Depthwise Convolution 2D layer rather than a standard Convolution 2D layer. Default: False

  • relu_alpha (float) – The alpha to use for LeakyRelu Activation. Default=`0.1`

  • kwargs (dict) – Any additional Keras standard layer keyword arguments to pass to the Convolutional 2D layer

__call__(inputs: KerasTensor) KerasTensor

Call the Faceswap Convolutional Layer.

Parameters:

inputs (keras.KerasTensor) – The input to the layer

Returns:

The output tensor from the Convolution 2D Layer

Return type:

keras.KerasTensor

class lib.model.nn_blocks.Conv2DOutput(filters: int, kernel_size: int | tuple[int], activation: str = 'sigmoid', padding: str = 'same', **kwargs)

A Convolution 2D layer that separates out the activation layer to explicitly set the data type on the activation to float 32 to fully support mixed precision training.

The Convolution 2D layer uses default parameters to be more appropriate for Faceswap architecture.

Parameters are the same, with the same defaults, as a standard keras.layers.Conv2D except where listed below. The default initializer is updated to HeUniform or convolutional aware based on user config settings.

Parameters:
  • filters (int) – The dimensionality of the output space (i.e. the number of output filters in the convolution)

  • kernel_size (int or tuple/list of 2 ints) – The height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions.

  • activation (str, optional) – The activation function to apply to the output. Default: “sigmoid”

  • padding (str, optional) –

    One of “valid” or “same” (case-insensitive). Default: “same”. Note that “same” is slightly inconsistent across backends with strides != 1, as described here.

  • kwargs (dict) – Any additional Keras standard layer keyword arguments to pass to the Convolutional 2D layer

__call__(inputs: KerasTensor) KerasTensor

Call the Faceswap Convolutional Output Layer.

Parameters:

inputs (keras.KerasTensor) – The input to the layer

Returns:

The output tensor from the Convolution 2D Layer

Return type:

keras.KerasTensor

class lib.model.nn_blocks.DepthwiseConv2D(*args, padding: str = 'same', is_upscale: bool = False, **kwargs)

A standard Keras Depthwise Convolution 2D layer with parameters updated to be more appropriate for Faceswap architecture.

Parameters are the same, with the same defaults, as a standard keras.layers.DepthwiseConv2D except where listed below. The default initializer is updated to HeUniform or convolutional aware based on user configuration settings.

Parameters:
  • padding (str, optional) –

    One of “valid” or “same” (case-insensitive). Default: “same”. Note that “same” is slightly inconsistent across backends with strides != 1, as described here.

  • is_upscale (bool, optional) – True if the convolution is being called from an upscale layer. This causes the instance to check the user configuration options to see if ICNR initialization has been selected and should be applied. This should only be passed in as True from UpscaleBlock layers. Default: False

__call__(*args, **kwargs) KerasTensor

Call the DepthwiseConv2D layer

Parameters:
  • args (tuple) – Standard DepthwiseConv2D layer call arguments

  • kwargs (dict[str, Any]) – Standard DepthwiseConv2D layer call keyword arguments

Returns:

The Tensor from the DepthwiseConv2D layer

Return type:

class: keras.KerasTensor

class lib.model.nn_blocks.ResidualBlock(filters: int, kernel_size: int | tuple[int, int] = 3, padding: str = 'same', **kwargs)

Residual block from dfaker.

Parameters:
  • filters (int) – The dimensionality of the output space (i.e. the number of output filters in the convolution)

  • kernel_size (int, optional) – An integer or tuple/list of 2 integers, specifying the height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions. Default: 3

  • padding (["valid", "same"], optional) – The padding to use. Default: “same”

  • kwargs (dict) – Any additional Keras standard layer keyword arguments to pass to the Convolutional 2D layer

Returns:

The output tensor from the Upscale layer

Return type:

tensor

__call__(inputs: KerasTensor) KerasTensor

Call the Faceswap Residual Block.

Parameters:

inputs (keras.KerasTensor) – The input to the layer

Returns:

The output tensor from the Upscale Layer

Return type:

keras.KerasTensor

class lib.model.nn_blocks.SeparableConv2DBlock(filters: int, kernel_size: int | tuple[int, int] = 5, strides: int | tuple[int, int] = 2, **kwargs)

Seperable Convolution Block.

Parameters:
  • filters (int) – The dimensionality of the output space (i.e. the number of output filters in the convolution)

  • kernel_size (int, optional) – An integer or tuple/list of 2 integers, specifying the height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions. Default: 5

  • strides (tuple or int, optional) – An integer or tuple/list of 2 integers, specifying the strides of the convolution along the height and width. Can be a single integer to specify the same value for all spatial dimensions. Default: 2

  • kwargs (dict) – Any additional Keras standard layer keyword arguments to pass to the Separable Convolutional 2D layer

__call__(inputs: KerasTensor) KerasTensor

Call the Faceswap Separable Convolutional 2D Block.

Parameters:

inputs (keras.KerasTensor) – The input to the layer

Returns:

The output tensor from the Upscale Layer

Return type:

keras.KerasTensor

class lib.model.nn_blocks.Upscale2xBlock(filters: int, kernel_size: int | tuple[int, int] = 3, padding: str = 'same', activation: str | None = 'leakyrelu', interpolation: str = 'bilinear', sr_ratio: float = 0.5, scale_factor: int = 2, fast: bool = False, **kwargs)

Custom hybrid upscale layer for sub-pixel up-scaling.

Most of up-scaling is approximating lighting gradients which can be accurately achieved using linear fitting. This layer attempts to improve memory consumption by splitting with bilinear and convolutional layers so that the sub-pixel update will get details whilst the bilinear filter will get lighting.

Adds reflection padding if it has been selected by the user, and other post-processing if requested by the plugin.

Parameters:
  • filters (int) – The dimensionality of the output space (i.e. the number of output filters in the convolution)

  • kernel_size (int, optional) – An integer or tuple/list of 2 integers, specifying the height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions. Default: 3

  • padding (["valid", "same"], optional) – The padding to use. Default: “same”

  • activation (str or None, optional) – The activation function to use. This is applied at the end of the convolution block. Select one of “leakyrelu”, “prelu” or “swish”. Set to None to not apply an activation function. Default: “leakyrelu”

  • interpolation (["nearest", "bilinear"], optional) – Interpolation to use for up-sampling. Default: “bilinear”

  • scale_factor (int, optional) – The amount to upscale the image. Default: 2

  • sr_ratio (float, optional) – The proportion of super resolution (pixel shuffler) filters to use. Non-fast mode only. Default: 0.5

  • fast (bool, optional) – Use a faster up-scaling method that may appear more rugged. Default: False

  • kwargs (dict) – Any additional Keras standard layer keyword arguments to pass to the Convolutional 2D layer

__call__(inputs: KerasTensor) KerasTensor

Call the Faceswap Upscale 2x Layer.

Parameters:

inputs (keras.KerasTensor) – The input to the layer

Returns:

The output tensor from the Upscale Layer

Return type:

keras.KerasTensor

class lib.model.nn_blocks.UpscaleBlock(filters: int, kernel_size: int | tuple[int, int] = 3, padding: str = 'same', scale_factor: int = 2, normalization: str | None = None, activation: str | None = 'leakyrelu', **kwargs)

An upscale layer for sub-pixel up-scaling.

Adds reflection padding if it has been selected by the user, and other post-processing if requested by the plugin.

Parameters:
  • filters (int) – The dimensionality of the output space (i.e. the number of output filters in the convolution)

  • kernel_size (int, optional) – An integer or tuple/list of 2 integers, specifying the height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions. Default: 3

  • padding (["valid", "same"], optional) – The padding to use. NB: If reflect padding has been selected in the user configuration options, then this argument will be ignored in favor of reflect padding. Default: “same”

  • scale_factor (int, optional) – The amount to upscale the image. Default: 2

  • normalization (str or None, optional) – Normalization to apply after the Convolution Layer. Select one of “batch” or “instance”. Set to None to not apply normalization. Default: None

  • activation (str or None, optional) – The activation function to use. This is applied at the end of the convolution block. Select one of “leakyrelu”, “prelu” or “swish”. Set to None to not apply an activation function. Default: “leakyrelu”

  • kwargs (dict) – Any additional Keras standard layer keyword arguments to pass to the Convolutional 2D layer

__call__(inputs: KerasTensor) KerasTensor

Call the Faceswap Convolutional Layer.

Parameters:

inputs (keras.KerasTensor) – The input to the layer

Returns:

The output tensor from the Upscale Layer

Return type:

keras.KerasTensor

class lib.model.nn_blocks.UpscaleDNYBlock(filters: int, kernel_size: int | tuple[int, int] = 3, padding: str = 'same', activation: str | None = 'leakyrelu', size: int = 2, interpolation: str = 'bilinear', **kwargs)

Upscale block that implements methodology similar to the Disney Research Paper using an upsampling2D block and 2 x convolutions

Adds reflection padding if it has been selected by the user, and other post-processing if requested by the plugin.

References

https://studios.disneyresearch.com/2020/06/29/high-resolution-neural-face-swapping-for-visual-effects/

Parameters:
  • filters (int) – The dimensionality of the output space (i.e. the number of output filters in the convolution)

  • kernel_size (int, optional) – An integer or tuple/list of 2 integers, specifying the height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions. Default: 3

  • activation (str or None, optional) – The activation function to use. This is applied at the end of the convolution block. Select one of “leakyrelu”, “prelu” or “swish”. Set to None to not apply an activation function. Default: “leakyrelu”

  • size (int, optional) – The amount to upscale the image. Default: 2

  • interpolation (["nearest", "bilinear"], optional) – Interpolation to use for up-sampling. Default: “bilinear”

  • kwargs (dict) – Any additional Keras standard layer keyword arguments to pass to the Convolutional 2D layers

  • padding (str)

__call__(inputs: KerasTensor) KerasTensor

Call the UpscaleDNY block

Parameters:

inputs (keras.KerasTensor) – The input to the block

Returns:

The output from the block

Return type:

keras.KerasTensor

class lib.model.nn_blocks.UpscaleResizeImagesBlock(filters: int, kernel_size: int | tuple[int, int] = 3, padding: str = 'same', activation: str | None = 'leakyrelu', scale_factor: int = 2, interpolation: Literal['nearest', 'bilinear'] = 'bilinear')

Upscale block that uses the Keras Backend function resize_images to perform the up scaling Similar in methodology to the Upscale2xBlock

Adds reflection padding if it has been selected by the user, and other post-processing if requested by the plugin.

Parameters:
  • filters (int) – The dimensionality of the output space (i.e. the number of output filters in the convolution)

  • kernel_size (int, optional) – An integer or tuple/list of 2 integers, specifying the height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions. Default: 3

  • padding (["valid", "same"], optional) – The padding to use. Default: “same”

  • activation (str or None, optional) – The activation function to use. This is applied at the end of the convolution block. Select one of “leakyrelu”, “prelu” or “swish”. Set to None to not apply an activation function. Default: “leakyrelu”

  • scale_factor (int, optional) – The amount to upscale the image. Default: 2

  • interpolation (["nearest", "bilinear"], optional) – Interpolation to use for up-sampling. Default: “bilinear”

  • kwargs (dict) – Any additional Keras standard layer keyword arguments to pass to the Convolutional 2D layer

__call__(inputs: KerasTensor) KerasTensor

Call the Faceswap Resize Images Layer.

Parameters:

inputs (keras.KerasTensor) – The input to the layer

Returns:

The output tensor from the Upscale Layer

Return type:

keras.KerasTensor

lib.model.nn_blocks.reset_naming() None

Reset the naming convention for nn_block layers to start from 0

Used when a model needs to be rebuilt and the names for each build should be identical

Return type:

None

Functions

reset_naming()

Reset the naming convention for nn_block layers to start from 0

Classes

Conv2D(*args[, padding, is_upscale])

A standard Keras Convolution 2D layer with parameters updated to be more appropriate for Faceswap architecture.

Conv2DBlock(filters[, kernel_size, strides, ...])

A standard Convolution 2D layer which applies user specified configuration to the layer.

Conv2DOutput(filters, kernel_size[, ...])

A Convolution 2D layer that separates out the activation layer to explicitly set the data type on the activation to float 32 to fully support mixed precision training.

DepthwiseConv2D(*args[, padding, is_upscale])

A standard Keras Depthwise Convolution 2D layer with parameters updated to be more appropriate for Faceswap architecture.

ResidualBlock(filters[, kernel_size, padding])

Residual block from dfaker.

SeparableConv2DBlock(filters[, kernel_size, ...])

Seperable Convolution Block.

Upscale2xBlock(filters[, kernel_size, ...])

Custom hybrid upscale layer for sub-pixel up-scaling.

UpscaleBlock(filters[, kernel_size, ...])

An upscale layer for sub-pixel up-scaling.

UpscaleDNYBlock(filters[, kernel_size, ...])

Upscale block that implements methodology similar to the Disney Research Paper using an upsampling2D block and 2 x convolutions

UpscaleResizeImagesBlock(filters[, ...])

Upscale block that uses the Keras Backend function resize_images to perform the up scaling Similar in methodology to the Upscale2xBlock


lib.model.normalization Module

Normalization methods for faceswap.py specific to Torch backend

class lib.model.normalization.AdaInstanceNormalization(*args, **kwargs)

Adaptive Instance Normalization Layer for Keras.

Parameters:
  • axis (int, optional) – The axis that should be normalized (typically the features axis). For instance, after a Conv2D layer with data_format=”channels_first”, set axis=1 in InstanceNormalization. Setting axis=None will normalize all values in each instance of the batch. Axis 0 is the batch dimension. axis cannot be set to 0 to avoid errors. Default: None

  • momentum (float, optional) – Momentum for the moving mean and the moving variance. Default: 0.99

  • epsilon (float, optional) – Small float added to variance to avoid dividing by zero. Default: 1e-3

  • center (bool, optional) – If True, add offset of beta to normalized tensor. If False, beta is ignored. Default: True

  • scale (bool, optional) – If True, multiply by gamma. If False, gamma is not used. When the next layer is linear (also e.g. relu), this can be disabled since the scaling will be done by the next layer. Default: True

References

Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization - https://arxiv.org/abs/1703.06868

build(input_shape: tuple[tuple[int, ...], ...]) None

Creates the layer weights.

Parameters:

input_shape (tuple[int, ...]) – Keras tensor (future input to layer) or list/tuple of Keras tensors to reference for weight shape computations.

Return type:

None

call(inputs: KerasTensor) KerasTensor

This is where the layer’s logic lives.

Parameters:

inputs (keras.KerasTensor) – Input tensor, or list/tuple of input tensors

Returns:

A tensor or list/tuple of tensors

Return type:

keras.KerasTensor

compute_output_shape(input_shape: tuple[int, ...]) int

Calculate the output shape from this layer.

Parameters:

input_shape (tuple) – The input shape to the layer

Returns:

The output shape to the layer

Return type:

int

get_config() dict[str, Any]

Returns the config of the layer.

The Keras configuration for the layer.

Returns:

A python dictionary containing the layer configuration

Return type:

dict[str, Any]

class lib.model.normalization.GroupNormalization(*args, **kwargs)

Group Normalization

Parameters:
  • axis (int, optional) – The axis that should be normalized (typically the features axis). For instance, after a Conv2D layer with data_format=”channels_first”, set axis=1 in InstanceNormalization. Setting axis=None will normalize all values in each instance of the batch. Axis 0 is the batch dimension. axis cannot be set to 0 to avoid errors. Default: None

  • gamma_init (str, optional) – Initializer for the gamma weight. Default: “one”

  • beta_init (str, optional) – Initializer for the beta weight. Default “zero”

  • gamma_regularizer (varies, optional) – Optional regularizer for the gamma weight. Default: None

  • beta_regularizer (varies, optional) – Optional regularizer for the beta weight. Default None

  • epsilon (float, optional) – Small float added to variance to avoid dividing by zero. Default: 1e-3

  • group (int, optional) – The group size. Default: 32

  • data_format (["channels_first", "channels_last"], optional) – The required data format. Optional. Default: None

  • kwargs (dict) – Any additional standard Keras Layer key word arguments

References

Shaoanlu GAN: https://github.com/shaoanlu/faceswap-GAN

build(input_shape: tuple[int, ...]) None

Creates the layer weights.

Parameters:

input_shape (tuple[int, ...]) – Keras tensor (future input to layer) or list/tuple of Keras tensors to reference for weight shape computations.

Return type:

None

call(inputs: KerasTensor, *args, **kwargs) KerasTensor

This is where the layer’s logic lives.

Parameters:

inputs (keras.KerasTensor) – Input tensor, or list/tuple of input tensors

Returns:

A tensor or list/tuple of tensors

Return type:

keras.KerasTensor

compute_output_shape(input_shape: tuple[int, ...]) tuple[int, ...]

Calculate the output shape from this layer.

Parameters:

input_shape (tuple) – The input shape to the layer

Returns:

The output shape to the layer

Return type:

int

get_config() dict[str, Any]

Returns the config of the layer.

The Keras configuration for the layer.

Returns:

A python dictionary containing the layer configuration

Return type:

dict[str, Any]

class lib.model.normalization.InstanceNormalization(*args, **kwargs)

Instance normalization layer (Lei Ba et al, 2016, Ulyanov et al., 2016).

Normalize the activations of the previous layer at each step, i.e. applies a transformation that maintains the mean activation close to 0 and the activation standard deviation close to 1.

Parameters:
  • axis (int, optional) – The axis that should be normalized (typically the features axis). For instance, after a Conv2D layer with data_format=”channels_first”, set axis=1 in InstanceNormalization. Setting axis=None will normalize all values in each instance of the batch. Axis 0 is the batch dimension. axis cannot be set to 0 to avoid errors. Default: None

  • epsilon (float, optional) – Small float added to variance to avoid dividing by zero. Default: 1e-3

  • center (bool, optional) – If True, add offset of beta to normalized tensor. If False, beta is ignored. Default: True

  • scale (bool, optional) – If True, multiply by gamma. If False, gamma is not used. When the next layer is linear (also e.g. relu), this can be disabled since the scaling will be done by the next layer. Default: True

  • beta_initializer (str, optional) – Initializer for the beta weight. Default: “zeros”

  • gamma_initializer (str, optional) – Initializer for the gamma weight. Default: “ones”

  • beta_regularizer (str, optional) – Optional regularizer for the beta weight. Default: None

  • gamma_regularizer (str, optional) – Optional regularizer for the gamma weight. Default: None

  • beta_constraint (float, optional) – Optional constraint for the beta weight. Default: None

  • gamma_constraint (float, optional) – Optional constraint for the gamma weight. Default: None

References

build(input_shape: tuple[int, ...]) None

Creates the layer weights.

Parameters:

input_shape (tuple[int, ...]) – Keras tensor (future input to layer) or list/tuple of Keras tensors to reference for weight shape computations.

Return type:

None

call(inputs: KerasTensor) KerasTensor

This is where the layer’s logic lives.

Parameters:

inputs (keras.KerasTensor) – Input tensor, or list/tuple of input tensors

Returns:

A tensor or list/tuple of tensors

Return type:

keras.KerasTensor

compute_output_shape(input_shape: tuple[int, ...]) tuple[int, ...]

Calculate the output shape from this layer.

Parameters:

input_shape (tuple) – The input shape to the layer

Returns:

The output shape to the layer

Return type:

int

get_config() dict[str, Any]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstated later (without its trained weights) from this configuration.

The configuration of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Returns:

A python dictionary containing the layer configuration

Return type:

dict[str, Any]

class lib.model.normalization.RMSNormalization(*args, **kwargs)

Root Mean Square Layer Normalization (Biao Zhang, Rico Sennrich, 2019)

RMSNorm is a simplification of the original layer normalization (LayerNorm). LayerNorm is a regularization technique that might handle the internal covariate shift issue so as to stabilize the layer activations and improve model convergence. It has been proved quite successful in NLP-based model. In some cases, LayerNorm has become an essential component to enable model optimization, such as in the SOTA NMT model Transformer.

RMSNorm simplifies LayerNorm by removing the mean-centering operation, or normalizing layer activations with RMS statistic.

Parameters:
  • axis (int) – The axis to normalize across. Typically this is the features axis. The left-out axes are typically the batch axis/axes. This argument defaults to -1, the last dimension in the input.

  • epsilon (float, optional) – Small float added to variance to avoid dividing by zero. Default: 1e-8

  • partial (float, optional) – Partial multiplier for calculating pRMSNorm. Valid values are between 0.0 and 1.0. Setting to 0.0 or 1.0 disables. Default: 0.0

  • bias (bool, optional) – Whether to use a bias term for RMSNorm. Disabled by default because RMSNorm does not enforce re-centering invariance. Default False

  • kwargs (dict) – Standard keras layer kwargs

References

build(input_shape: tuple[int, ...]) None

Validate and populate axis

Parameters:

input_shape (tuple[int, ...]) – Keras tensor (future input to layer) or list/tuple of Keras tensors to reference for weight shape computations.

Return type:

None

call(inputs: KerasTensor, *args, **kwargs) KerasTensor

Call Root Mean Square Layer Normalization

Parameters:

inputs (keras.KerasTensor) – Input tensor, or list/tuple of input tensors

Returns:

A tensor or list/tuple of tensors

Return type:

keras.KerasTensor

compute_output_shape(input_shape: tuple[int, ...]) tuple[int, ...]

The output shape of the layer is the same as the input shape.

Parameters:

input_shape (tuple[int, ...]) – The input shape to the layer

Returns:

The output shape to the layer

Return type:

tuple[int, …]

get_config() dict[str, Any]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstated later (without its trained weights) from this configuration.

The configuration of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Returns:

A python dictionary containing the layer configuration

Return type:

dict[str, Any]

Classes

AdaInstanceNormalization(*args, **kwargs)

Adaptive Instance Normalization Layer for Keras.

GroupNormalization(*args, **kwargs)

Group Normalization

InstanceNormalization(*args, **kwargs)

Instance normalization layer (Lei Ba et al, 2016, Ulyanov et al., 2016).

RMSNormalization(*args, **kwargs)

Root Mean Square Layer Normalization (Biao Zhang, Rico Sennrich, 2019)

Class Inheritance Diagram

Inheritance diagram of lib.model.normalization.AdaInstanceNormalization, lib.model.normalization.GroupNormalization, lib.model.normalization.InstanceNormalization, lib.model.normalization.RMSNormalization