lib.model package

The Model Package handles interfacing with the neural network backend and holds custom objects.

losses package 

lib.model.losses.feature_loss Module 

Custom Feature Map Loss Functions for faceswap.py

class lib.model.losses.feature_loss.LPIPSLoss(trunk_network: Literal['alex', 'squeeze', 'vgg16'], trunk_pretrained: bool = True, trunk_eval_mode: bool = True, linear_pretrained: bool = True, linear_eval_mode: bool = True, linear_use_dropout: bool = True, lpips: bool = True, spatial_output: bool = True, normalize: bool = True, ret_per_layer: bool = False, crop: bool = False, color_order: Literal['bgr', 'rgb'] = 'bgr')

LPIPS Loss Function.

A perceptual loss function that uses linear outputs from the feature layers of pretrained CNNs.

Notes

Channels Last implementation. All trunks implemented from the original paper.

References

https://richzhang.github.io/PerceptualSimilarity/

Parameters:

trunk_network (T.Literal['alex', 'squeeze', 'vgg16']) – The name of the trunk network to use. One of “alex”, “squeeze” or “vgg16”
trunk_pretrained (bool) – True to load the ImageNet pretrained weights for the trunk network. False to randomly initialize the trunk network. Default: True
trunk_eval_mode (bool) – True for running inference on the trunk network (standard mode), False for training the trunk network. Default: True
linear_pretrained (bool) – True loads the pretrained weights for the linear network layers. False randomly initializes the layers. Default: True
linear_eval_mode (bool) – True for running inference on the linear network (standard mode), False for training the linear network. Default: True
linear_use_dropout (bool) – True if a dropout layer should be used in the Linear network otherwise False. Default: True
lpips (bool) – True to use linear network on top of the trunk network. False to just average the output from the trunk network. Default: True
spatial_output (bool) – True to output the loss in the spatial domain (i.e. as a grayscale tensor of height and width of the input image). False to reduce the spatial dimensions for loss calculation. Default: True
normalize (bool) – True if the input Tensor needs to be normalized from the 0. to 1. range to the -1. to 1. range. Default: True
ret_per_layer (bool) – True to return the loss value per feature output layer otherwise False. Default: False
crop (bool) – True to crop the zero-padded borders from the feature maps. Can help reduce moiré patterns. Default: False
color_order (T.Literal['bgr', 'rgb']) – The RGB/BGR order of the input images. Default: “bgr”
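The normalize and color_order options describe two adjustments made to the input before it reaches the trunk network. A minimal sketch of what they imply for a single pixel, assuming plain Python lists and a trunk that expects RGB input. The helper name `prepare_pixel` is hypothetical and is not part of the library:

```python
def prepare_pixel(pixel, normalize=True, color_order="bgr"):
    """Return an RGB pixel, optionally rescaled from [0, 1] to [-1, 1].

    Hypothetical helper for illustration only.
    """
    if color_order == "bgr":
        pixel = pixel[::-1]  # reverse the channel axis: BGR -> RGB
    if normalize:
        pixel = [2.0 * value - 1.0 for value in pixel]  # [0, 1] -> [-1, 1]
    return list(pixel)


print(prepare_pixel([0.0, 0.5, 1.0]))  # [1.0, 0.0, -1.0]
```

The real loss operates on batched, channels-last tensors rather than single pixels.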

forward(y_true: Tensor, y_pred: Tensor) → Tensor | tuple[Tensor, list[Tensor]]

Perform the LPIPS Loss Function.

Parameters:

y_true (Tensor) – The ground truth batch of images
y_pred (Tensor) – The predicted batch of images

Return type:

The final loss value for each item in the batch

class lib.model.losses.feature_loss.NetInfo(model_id: int = 0, model_name: str = '', net: Callable | None = None, outputs: list[str] | list[int] = <factory>, pad_amount: list[int] | int = 0)

Data class for holding information about Trunk and Linear Layer nets.

Parameters:

model_id (int) – The model ID for the model stored in the deepfakes Model repo
model_name (str) – The filename of the decompressed model/weights file
net (Callable | None) – The net definition to load, if any. Default: None
outputs (list[str] | list[int]) – For trunk networks the name of the output feature layers. For linear networks the number of input channels to each layer
pad_amount (list[int] | int) – For trunk networks, the amount of zero padding applied to each feature output

Classes

`LPIPSLoss`(trunk_network[, trunk_pretrained, ...])	LPIPS Loss Function.
`NetInfo`(model_id, model_name, net, outputs, ...)	Data class for holding information about Trunk and Linear Layer nets.

Class Inheritance Diagram

Inheritance diagram of lib.model.losses.feature_loss.LPIPSLoss, lib.model.losses.feature_loss.NetInfo

lib.model.losses.loss Module 

Custom Loss Functions for faceswap.py

class lib.model.losses.loss.FocalFrequencyLoss(alpha: float = 1.0, patch_factor: int = 1, ave_spectrum: bool = False, log_matrix: bool = False, batch_matrix: bool = False, epsilon: float = 1e-06, spatial_output: bool = True)

Focal frequency Loss Function.

Parameters:

alpha (float) – Scaling factor of the spectrum weight matrix for flexibility. Default: 1.0
patch_factor (int) – Factor to crop image patches for patch-based focal frequency loss. Default: 1
ave_spectrum (bool) – True to use mini-batch average spectrum otherwise False. Default: False
log_matrix (bool) – True to adjust the spectrum weight matrix by logarithm otherwise False. Default: False
batch_matrix (bool) – True to calculate the spectrum weight matrix using batch-based statistics otherwise False. Default: False
epsilon (float) – Small epsilon for safer weights scaling division. Default: 1e-6
spatial_output (bool) – True to output the loss values spatially. False as scalar per item. Default: True

References

https://arxiv.org/pdf/2012.12821.pdf
https://github.com/EndlessSora/focal-frequency-loss

forward(y_true: Tensor, y_pred: Tensor) → Tensor

Call the Focal Frequency Loss Function.

Parameters:

y_true (Tensor) – The ground truth batch of images
y_pred (Tensor) – The predicted batch of images

Return type:

The final loss value for each item in the batch
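The core idea of the focal frequency loss can be sketched on a 1-D signal: transform both signals to the frequency domain, measure the squared distance per frequency, and weight each frequency by its own (normalized) error raised to alpha. This is a toy sketch under stated assumptions: it uses a naive DFT written here for illustration, works on plain lists, and does not model patch_factor, ave_spectrum, log_matrix, batch_matrix or spatial_output. The function names are hypothetical.

```python
import cmath
import math


def dft(signal):
    """Naive orthonormal discrete Fourier transform of a real 1-D signal."""
    size = len(signal)
    scale = 1.0 / math.sqrt(size)
    return [
        scale * sum(value * cmath.exp(-2j * math.pi * freq * idx / size)
                    for idx, value in enumerate(signal))
        for freq in range(size)
    ]


def focal_frequency_loss_1d(y_true, y_pred, alpha=1.0):
    """Toy 1-D focal frequency loss: error-weighted squared spectrum distance."""
    # Squared distance between the two spectra at every frequency
    distance = [abs(p - t) ** 2 for t, p in zip(dft(y_true), dft(y_pred))]
    # Frequencies with larger errors receive larger weights
    weights = [math.sqrt(d) ** alpha for d in distance]
    peak = max(weights)
    if peak > 0.0:
        weights = [w / peak for w in weights]  # normalize weights into [0, 1]
    return sum(w * d for w, d in zip(weights, distance)) / len(distance)


print(focal_frequency_loss_1d([0.0, 1.0, 0.0, 1.0], [0.0, 0.0, 0.0, 0.0]))
```

Identical inputs give a loss of zero, and a larger alpha concentrates the loss on the frequencies that are reconstructed worst.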

class lib.model.losses.loss.GeneralizedLoss(alpha: float = 1.0, beta: float = 0.00392156862745098, spatial_output: bool = True)

Generalized function used to return a large variety of mathematical loss functions.

The primary benefit is a smooth, differentiable version of L1 loss.

References

Barron, J. A General and Adaptive Robust Loss Function - https://arxiv.org/pdf/1701.03077.pdf

Example

>>> alpha=1.0, beta=1.0/255.0  # for errors x >> beta, gives a smoothly differentiable version of L1 / MAE loss
>>> alpha=1.999999 (limit as alpha->2), beta=1.0/255.0  # gives L2 / RMSE loss

Parameters:

alpha (float) – Penalty factor. Larger numbers give larger weight to large deviations. Default: 1.0
beta (float) – Scale factor used to adjust to the input scale (i.e. inputs of mean 1e-4 or 256). Default: 1.0/255.0
spatial_output (bool) – True to output the loss values spatially. False as scalar per item. Default: True

forward(y_true: Tensor, y_pred: Tensor) → Tensor

Call the Generalized Loss Function

Parameters:

y_true (Tensor) – The ground truth value
y_pred (Tensor) – The predicted value

Return type:

The final loss value for each item in the batch
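The behaviour described in the example can be checked against the general robust loss from the referenced Barron paper. The sketch below evaluates that published formula for a single error value, assuming alpha plays the role of the paper's shape parameter and beta the role of its scale. The function name is hypothetical, and the library may apply additional scaling or reduction that is not shown here.

```python
def general_loss(error, alpha=1.0, beta=1.0 / 255.0):
    """Barron's general robust loss for one error value (alpha not 0 or 2)."""
    scaled_sq = (error / beta) ** 2
    gap = abs(alpha - 2.0)
    return (gap / alpha) * ((scaled_sq / gap + 1.0) ** (alpha / 2.0) - 1.0)


# alpha = 1: behaves like |error| / beta for errors much larger than beta
print(general_loss(1.0, alpha=1.0))
# alpha close to 2: approaches 0.5 * (error / beta) ** 2
print(general_loss(1.0 / 255.0, alpha=1.999999))
```

With alpha = 1 the expression reduces to sqrt((error / beta)^2 + 1) - 1, which is smooth at zero and approximately linear for large errors.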

class lib.model.losses.loss.GradientLoss(spatial_output: bool = True)

Gradient Loss Function.

Calculates the first and second order gradient difference between pixels of an image in the x and y dimensions. These gradients are then compared between the ground truth and the predicted image and the difference is taken. When used as a loss, its minimization will result in predicted images approaching the same level of sharpness / blurriness as the ground truth.

Parameters:: spatial_output (bool) – True to output the loss values spatially. False as scalar per item. Default: True

References

TV+TV2 Regularization with Non-Convex Sparseness-Inducing Penalty for Image Restoration, Chengwu Lu & Hua Huang, 2014 - http://downloads.hindawi.com/journals/mpe/2014/790547.pdf

forward(y_true: Tensor, y_pred: Tensor) → Tensor

Call the gradient loss function.

Parameters:

y_true (Tensor) – The ground truth value
y_pred (Tensor) – The predicted value

Return type:

The final loss value for each item in the batch
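The description above can be sketched with finite differences on a small single-channel image stored as nested lists. This is an illustrative sketch only: the helper names are hypothetical, mixed (x-y) second order terms are omitted, and the weighting and reduction used by the library are not reproduced.

```python
def diff_x(image):
    """First-order difference between horizontally adjacent pixels."""
    return [[row[i + 1] - row[i] for i in range(len(row) - 1)] for row in image]


def diff_y(image):
    """First-order difference between vertically adjacent pixels."""
    return [[below[i] - above[i] for i in range(len(above))]
            for above, below in zip(image, image[1:])]


def mean_abs_diff(first, second):
    """Mean absolute element-wise difference between two 2-D lists."""
    values = [abs(a - b) for row_a, row_b in zip(first, second)
              for a, b in zip(row_a, row_b)]
    return sum(values) / len(values)


def gradient_loss(y_true, y_pred):
    """Compare first and second order gradients of two images (at least 3x3)."""
    total = 0.0
    for grad in (diff_x, diff_y):
        # First order gradients
        total += mean_abs_diff(grad(y_true), grad(y_pred))
        # Second order gradients: apply the same difference twice
        total += mean_abs_diff(grad(grad(y_true)), grad(grad(y_pred)))
    return total


ramp = [[0, 1, 2], [0, 1, 2], [0, 1, 2]]
flat = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
print(gradient_loss(ramp, flat))  # 1.0
```

A flat prediction of a ramp is penalised because its horizontal gradient is missing, even though no second order term differs.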

class lib.model.losses.loss.LInfNorm(*args: Any, **kwargs: Any)

Calculate the L-inf norm as a loss function.

Parameters:

args (Any)
kwargs (Any)

forward(y_true: Tensor, y_pred: Tensor) → Tensor

Call the L-inf norm loss function.

Parameters:

y_true (Tensor) – The ground truth value
y_pred (Tensor) – The predicted value

Return type:

The final loss value for each item in the batch
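The L-inf norm of an error is its largest absolute component. A minimal sketch on flat sequences follows. The function name is hypothetical, and which axes the library reduces over is not stated in this reference.

```python
def l_inf_norm(y_true, y_pred):
    """Largest absolute element-wise difference between two flat sequences."""
    return max(abs(t - p) for t, p in zip(y_true, y_pred))


print(l_inf_norm([0.1, 0.5, 0.9], [0.1, 0.7, 0.4]))
```

Only the single worst element contributes, so this loss is sensitive to outliers and ignores every other error.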

class lib.model.losses.loss.LaplacianPyramidLoss(max_levels: int = 5, gaussian_size: int = 5, gaussian_sigma: float = 1.0, spatial_output: bool = True)

Laplacian Pyramid Loss Function

Notes

Channels last implementation on square images only.

Parameters:

max_levels (int) – The max number of laplacian pyramid levels to use. Default: 5
gaussian_size (int) – The size of the gaussian kernel. Default: 5
gaussian_sigma (float) – The gaussian sigma. Default: 1.0
device – The device to place the variables onto. Default: “cpu”
spatial_output (bool) – True to output the loss values spatially. False as scalar per item. Default: True

References

https://arxiv.org/abs/1707.05776
https://github.com/nathanaelbosch/generative-latent-optimization/blob/master/utils.py

forward(y_true: Tensor, y_pred: Tensor) → Tensor

Calculate the Laplacian Pyramid Loss.

Parameters:

y_true (Tensor) – The ground truth value
y_pred (Tensor) – The predicted value

Return type:

The final loss value for each item in the batch

class lib.model.losses.loss.LogCosh(spatial_output: bool = True)

Logarithm of the hyperbolic cosine of the prediction error. Ported from the Keras implementation.

Parameters:: spatial_output (bool) – True to output the loss values spatially. False as scalar per item. Default: True

forward(y_true: Tensor, y_pred: Tensor) → Tensor

Call the LogCosh loss function.

Parameters:

y_true (Tensor) – The ground truth value
y_pred (Tensor) – The predicted value

Return type:

The final loss value for each item in the batch
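Evaluating log(cosh(x)) directly overflows for large errors, so implementations commonly use the identity log(cosh(x)) = x + softplus(-2x) - log(2). A minimal scalar sketch of that identity, using only the standard library. The function names are hypothetical, and the library's batching and reduction are not shown.

```python
import math


def softplus(value):
    """Numerically stable log(1 + exp(value))."""
    return max(value, 0.0) + math.log1p(math.exp(-abs(value)))


def log_cosh(y_true, y_pred):
    """log(cosh(error)) via the identity x + softplus(-2x) - log(2)."""
    error = y_pred - y_true
    return error + softplus(-2.0 * error) - math.log(2.0)


print(log_cosh(0.0, 0.5))     # matches math.log(math.cosh(0.5))
print(log_cosh(0.0, 1000.0))  # no overflow; close to 1000 - log(2)
```

The loss behaves like 0.5 * error^2 for small errors and like |error| - log(2) for large ones.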

Classes

`FocalFrequencyLoss`([alpha, patch_factor, ...])	Focal frequency Loss Function.
`GeneralizedLoss`([alpha, beta, spatial_output])	Generalized function used to return a large variety of mathematical loss functions.
`GradientLoss`([spatial_output])	Gradient Loss Function.
`LInfNorm`(args, *kwargs)	Calculate the L-inf norm as a loss function.
`LaplacianPyramidLoss`([max_levels, ...])	Laplacian Pyramid Loss Function
`LogCosh`([spatial_output])	Logarithm of the hyperbolic cosine of the prediction error.

Class Inheritance Diagram

Inheritance diagram of lib.model.losses.loss.FocalFrequencyLoss, lib.model.losses.loss.GeneralizedLoss, lib.model.losses.loss.GradientLoss, lib.model.losses.loss.LInfNorm, lib.model.losses.loss.LaplacianPyramidLoss, lib.model.losses.loss.LogCosh

lib.model.losses.perceptual_loss Module 

Keras implementation of Perceptual Loss Functions for faceswap.py

class lib.model.losses.perceptual_loss.GMSDLoss(spatial_output: bool = True)

Gradient Magnitude Similarity Deviation Loss.

Improved image quality metric over MS-SSIM with easier calculations

Parameters:: spatial_output (bool) – True to output the loss values spatially. False as scalar per item. Default: True

References

http://www4.comp.polyu.edu.hk/~cslzhang/IQA/GMSD/GMSD.htm
https://arxiv.org/ftp/arxiv/papers/1308/1308.3052.pdf

forward(y_true: Tensor, y_pred: Tensor) → Tensor

Return the Gradient Magnitude Similarity Deviation Loss.

Parameters:

y_true (Tensor) – The ground truth value
y_pred (Tensor) – The predicted value

Return type:

The final loss value for each item in the batch

class lib.model.losses.perceptual_loss.MSSIMLoss(max_val: float = 1.0, filter_size: int = 11, filter_sigma: float = 1.5, k1: float = 0.01, k2: float = 0.03, spatial_output: bool = True, power_factors: tuple[float, ...] = (0.0448, 0.2856, 0.3001, 0.2363, 0.1333))

Computes the MS-SSIM between img1 and img2.

This function assumes that img1 and img2 are image batches, i.e. the last three dimensions are [height, width, channels].

Note: The true SSIM is only defined on grayscale. This function does not perform any color-space transform. (If the input is already YUV, then it will compute YUV SSIM average.)

Original paper: Wang, Zhou, Eero P. Simoncelli, and Alan C. Bovik. “Multiscale structural similarity for image quality assessment.” Signals, Systems and Computers, 2004.

Details:

11x11 Gaussian filter of width 1.5 is used.
k1 = 0.01, k2 = 0.03 as in the original paper.

The filter is reduced in size if the smallest image is smaller than 11x11.

Parameters:

max_val (float) – The dynamic range of the images (i.e., the difference between the maximum and the minimum allowed values). Default: 1.0 (0.0 - 1.0)
filter_size (int) – Size of gaussian filter. Default: 11
filter_sigma (float) – Width of gaussian filter. Default: 1.5
k1 (float) – The K1 value. Default: 0.01
k2 (float) – The K2 value. Default: 0.03 (SSIM is less sensitive to K2 for lower values, so it would be better if we took the values in the range of 0 < K2 < 0.4).
spatial_output (bool) – True to output the loss values spatially. False as scalar per item. Default: True
power_factors (tuple[float, ...]) – Iterable of weights for each of the scales. The number of scales used is the length of the list. Index 0 is the unscaled resolution’s weight and each increasing scale corresponds to the image being downsampled by 2. Defaults to the values obtained in the original paper. Default: (0.0448, 0.2856, 0.3001, 0.2363, 0.1333)

Reference

https://github.com/tensorflow/tensorflow/blob/v2.16.1/tensorflow/python/ops/image_ops_impl.py

forward(y_true: Tensor, y_pred: Tensor) → Tensor

Call the MS-SSIM Loss Function.

Parameters:

y_true (Tensor) – The ground truth value
y_pred (Tensor) – The predicted value

Return type:

The MS-SSIM Loss value
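The role of power_factors can be sketched in isolation. In the referenced multiscale formulation, the contrast-structure term of every scale except the coarsest, together with the full SSIM of the coarsest scale, is raised to its weight and the results are multiplied. The sketch below assumes those per-scale values have already been computed. The function name and the sample values are hypothetical.

```python
import math

POWER_FACTORS = (0.0448, 0.2856, 0.3001, 0.2363, 0.1333)


def combine_scales(cs_per_scale, ssim_coarsest, power_factors=POWER_FACTORS):
    """Combine per-scale terms into a single MS-SSIM value.

    cs_per_scale holds the contrast-structure term for every scale except the
    coarsest one; ssim_coarsest is the full SSIM at the coarsest scale.
    """
    terms = list(cs_per_scale) + [ssim_coarsest]
    if len(terms) != len(power_factors):
        raise ValueError("need exactly one term per power factor")
    return math.prod(term ** weight for term, weight in zip(terms, power_factors))


print(combine_scales([0.9, 0.9, 0.9, 0.9], 0.8))
```

Because the number of scales equals the length of power_factors, passing a shorter tuple reduces the number of times the image is downsampled.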

class lib.model.losses.perceptual_loss.SSIMLoss(max_val: float = 1.0, filter_size: int = 11, filter_sigma: float = 1.5, k1: float = 0.01, k2: float = 0.03, spatial_output: bool = True)

Computes SSIM index between img1 and img2.

This function is based on the standard SSIM implementation from: Wang, Z., Bovik, A. C., Sheikh, H. R., & Simoncelli, E. P. (2004). Image quality assessment: from error visibility to structural similarity. IEEE transactions on image processing.

Note: The true SSIM is only defined on grayscale. This function does not perform any color-space transform. (If the input is already YUV, then it will compute YUV SSIM average.)

Details:

11x11 Gaussian filter of width 1.5 is used.
k1 = 0.01, k2 = 0.03 as in the original paper.

The filter is reduced in size if the image is smaller than 11x11.

Reference

https://github.com/tensorflow/tensorflow/blob/v2.16.1/tensorflow/python/ops/image_ops_impl.py

forward(y_true: Tensor, y_pred: Tensor) → Tensor

Call the SSIM Loss Function.

Parameters:

y_true (Tensor) – The input batch of ground truth images
y_pred (Tensor) – The input batch of predicted images

Return type:

The final SSIM for each item in the batch
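How max_val, k1 and k2 enter the SSIM formula can be shown with a simplified version that computes the statistics over a whole flat sequence instead of sliding an 11x11 Gaussian window. This is a sketch of the standard SSIM formula under that simplification, not the library's implementation. The function name is hypothetical.

```python
def global_ssim(img1, img2, max_val=1.0, k1=0.01, k2=0.03):
    """SSIM computed over whole flat sequences instead of Gaussian windows."""
    size = len(img1)
    mean1, mean2 = sum(img1) / size, sum(img2) / size
    var1 = sum((a - mean1) ** 2 for a in img1) / size
    var2 = sum((b - mean2) ** 2 for b in img2) / size
    covar = sum((a - mean1) * (b - mean2) for a, b in zip(img1, img2)) / size
    # Stabilizing constants derived from the dynamic range
    c1, c2 = (k1 * max_val) ** 2, (k2 * max_val) ** 2
    luminance = (2.0 * mean1 * mean2 + c1) / (mean1 ** 2 + mean2 ** 2 + c1)
    contrast_structure = (2.0 * covar + c2) / (var1 + var2 + c2)
    return luminance * contrast_structure


print(global_ssim([0.1, 0.4, 0.7, 0.9], [0.1, 0.4, 0.7, 0.9]))  # identical inputs
print(global_ssim([0.1, 0.4, 0.7, 0.9], [0.9, 0.7, 0.4, 0.1]))  # reversed inputs
```

Identical inputs score 1, and the constants c1 and c2 keep the ratios finite when the means or variances are close to zero.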

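Parameters:

max_val (float) – The dynamic range of the images (i.e., the difference between the maximum and the minimum allowed values). Default: 1.0
filter_size (int) – Size of gaussian filter. Default: 11
filter_sigma (float) – Width of gaussian filter. Default: 1.5
k1 (float) – The K1 value. Default: 0.01
k2 (float) – The K2 value. Default: 0.03
spatial_output (bool) – True to output the loss values spatially. False as scalar per item. Default: True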

Classes

`GMSDLoss`([spatial_output])	Gradient Magnitude Similarity Deviation Loss.
`MSSIMLoss`([max_val, filter_size, ...])	Computes the MS-SSIM between img1 and img2.
`SSIMLoss`([max_val, filter_size, ...])	Computes SSIM index between img1 and img2.

Class Inheritance Diagram

Inheritance diagram of lib.model.losses.perceptual_loss.GMSDLoss, lib.model.losses.perceptual_loss.MSSIMLoss, lib.model.losses.perceptual_loss.SSIMLoss

networks package 

lib.model.networks.clip Module 

CLIP: https://github.com/openai/CLIP. This implementation only ports the visual transformer part of the model.

class lib.model.networks.clip.AttentionPool2d(spatial_dim: int, embed_dim: int, num_heads: int, output_dim: int | None = None, name='AttentionPool2d')

An Attention Pooling layer that applies a multi-head self-attention mechanism over a spatial grid of features.

Parameters:

spatial_dim (int) – The dimensionality of the spatial grid of features.
embed_dim (int) – The dimensionality of the feature embeddings.
num_heads (int) – The number of attention heads.
output_dim (int) – The output dimensionality of the attention layer. If None, it defaults to embed_dim.
name (str) – The name of the layer.

__call__(inputs: KerasTensor) → KerasTensor

Performs the attention pooling operation on the input tensor.

Parameters:: inputs (keras.KerasTensor) – The input tensor of shape [batch_size, height, width, embed_dim].
Returns:: The result of the attention pooling operation
Return type:: keras.KerasTensor

class lib.model.networks.clip.Bottleneck(inplanes: int, planes: int, stride: int = 1, name: str = 'bottleneck')

A ResNet bottleneck block that performs a sequence of convolutions, batch normalization, and ReLU activation operations on an input tensor.

Parameters:

inplanes (int) – The number of input channels.
planes (int) – The number of output channels.
stride (int, optional) – The stride of the bottleneck block. Default: 1
name (str, optional) – The name of the bottleneck block. Default: “bottleneck”

__call__(inputs: KerasTensor) → KerasTensor

Performs the forward pass for a Bottleneck block.

All conv layers have stride 1. An avgpool is performed after the second convolution when stride > 1.

Parameters:: inputs (keras.KerasTensor) – The input tensor to the Bottleneck block.
Returns:: The result of the forward pass through the Bottleneck block.
Return type:: keras.KerasTensor

expansion = 4

The factor by which the number of input channels is expanded to get the number of output channels.

Type:: int

class lib.model.networks.clip.ClassEmbedding(*args, **kwargs)

Trainable Class Embedding layer

Parameters:

input_shape (tuple[int, ...])
scale (int)
name (str)

call(inputs: KerasTensor, *args, **kwargs) → KerasTensor

Get the Class Embedding layer

Parameters:: inputs (keras.KerasTensor) – Input tensor to the embedding layer
Returns:: The class embedding layer shaped for the input tensor
Return type:: keras.KerasTensor

class lib.model.networks.clip.EmbeddingLayer(*args, **kwargs)

Parent class for trainable embedding variables

Parameters:

input_shape (tuple[int, ...]) – The shape of the variable
scale (int) – Amount to scale the random initialization by
name (str) – The name of the layer
dtype (str, optional) – The datatype for the layer. Mixed precision can mess up the embeddings. Default: “float32”

build(input_shape: tuple[int, ...]) → None

Add the weights

Parameters:: input_shape (tuple[int, ...]) – The input shape of the incoming tensor
Return type:: None

get_config() → dict[str, Any]

Get the config dictionary for the layer

Returns:: The config dictionary for the layer
Return type:: dict[str, Any]

class lib.model.networks.clip.ModifiedResNet(input_resolution: int, width: int, layer_config: tuple[int, int, int, int], output_dim: int, heads: int, name='ModifiedResNet')

A ResNet class that is similar to torchvision’s but contains the following changes:

There are now 3 “stem” convolutions as opposed to 1, with an average pool instead of a max pool.
Performs anti-aliasing strided convolutions, where an avgpool is prepended to convolutions with stride > 1
The final pooling layer is a QKV attention instead of an average pool

Parameters:

input_resolution (int) – The input resolution of the model. Default is 224.
width (int) – The width of the model. Default is 64.
layer_config (tuple[int, int, int, int]) – A tuple containing the number of Bottleneck blocks for each layer.
output_dim (int) – The output dimension of the model.
heads (int) – The number of heads for the QKV attention.
name (str) – The name of the model. Default is “ModifiedResNet”.

__call__() → Model

Implements the forward pass of the ModifiedResNet model.

Returns:: The modified resnet model.
Return type:: keras.models.Model

class lib.model.networks.clip.PositionalEmbedding(*args, **kwargs)

Trainable Positional Embedding layer

Parameters:

input_shape (tuple[int, ...])
scale (int)
name (str)

call(inputs: KerasTensor, *args, **kwargs) → KerasTensor

Get the Positional Embedding layer

Parameters:: inputs (keras.KerasTensor) – Input tensor to the embedding layer
Returns:: The positional embedding layer shaped for the input tensor
Return type:: keras.KerasTensor

class lib.model.networks.clip.Projection(*args, **kwargs)

Trainable Projection Embedding Layer

Parameters:

input_shape (tuple[int, ...])
scale (int)
name (str)

call(inputs: KerasTensor, *args, **kwargs) → KerasTensor

Get the Projection layer

Parameters:: inputs (keras.KerasTensor) – Input tensor to the embedding layer
Returns:: The Projection layer expanded to the batch dimension and transposed for matmul
Return type:: keras.KerasTensor

class lib.model.networks.clip.Transformer(width: int, num_layers: int, heads: int, attn_mask: KerasTensor = None, name: str = 'transformer')

A class representing a Transformer model with attention mechanism and residual connections.

Parameters:

width (int) – The dimension of the input and output vectors.
num_layers (int) – The number of layers in the Transformer.
heads (int) – The number of attention heads.
attn_mask (keras.KerasTensor, optional) – The attention mask, by default None.
name (str, optional) – The name of the Transformer model, by default “transformer”.

__call__(inputs: KerasTensor) → KerasTensor

Call the Transformer layers

Parameters:: inputs (keras.KerasTensor) – The input Tensor
Returns:: The return Tensor
Return type:: keras.KerasTensor

residual_attention_block(inputs: KerasTensor, key_dim: int, num_heads: int, attn_mask: KerasTensor, name: str = 'ResidualAttentionBlock') → KerasTensor

Call the residual attention block

Parameters:

inputs (keras.KerasTensor) – The input Tensor
key_dim (int) – key dimension per head for MultiHeadAttention
num_heads (int) – Number of heads for MultiHeadAttention
attn_mask (keras.KerasTensor, optional) – Default: None
name (str, optional) – The name for the layer. Default: “ResidualAttentionBlock”

Returns:

The return Tensor

Return type:

keras.KerasTensor

class lib.model.networks.clip.ViT(name: Literal['RN50', 'RN101', 'RN50x4', 'RN50x16', 'RN50x64', 'ViT-B-16', 'ViT-B-32', 'ViT-L-14', 'ViT-L-14-336px', 'FaRL-B-16-16', 'FaRL-B-16-64'], input_size: int | None = None, load_weights: bool = False)

Visual Transformer from CLIP

A Contrastive Language-Image Pre-Training (CLIP) model that encodes images and text into a shared latent space.

Reference

https://arxiv.org/abs/2103.00020

Parameters:

name (Literal['RN50', 'RN101', 'RN50x4', 'RN50x16', 'RN50x64', 'ViT-B-16', 'ViT-B-32', 'ViT-L-14', 'ViT-L-14-336px', 'FaRL-B-16-16', 'FaRL-B-16-64']) – The model configuration to use
input_size (int, optional) – The required resolution size for the model. None for default preset size
load_weights (bool, optional) – True to load pretrained weights. Default: False

__call__() → Model

Get the configured ViT model

Returns:: The requested Visual Transformer model
Return type:: keras.models.Model

class lib.model.networks.clip.ViTConfig(embed_dim: int, resolution: int, layer_conf: int | tuple[int, int, int, int], width: int, patch: int, git_id: int = 0)

Configuration settings for ViT

Parameters:

embed_dim (int) – Dimensionality of the final shared embedding space
resolution (int) – Spatial resolution of the input images
layer_conf (tuple[int, int, int, int] | int) – Number of layers in the visual encoder, or a tuple of layer configurations for a custom ResNet visual encoder
width (int) – Width of the visual encoder layers
patch (int) – Size of the patches to be extracted from the images. Only used for Visual encoder.
git_id (int, optional) – The id of the model weights file stored in deepfakes_models repo if they exist. Default: 0
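How these settings relate to each other can be illustrated with a stand-in data class that mirrors the documented signature. `ViTConfigSketch` and `token_count` are hypothetical names and not the library's own objects. The sample values follow the published CLIP ViT-B/32 configuration, and the library's preset may differ. The extra class token is an assumption based on the ClassEmbedding layer documented above.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass
class ViTConfigSketch:
    """Stand-in mirroring the documented ViTConfig signature (not the library class)."""
    embed_dim: int
    resolution: int
    layer_conf: Union[int, Tuple[int, int, int, int]]
    width: int
    patch: int
    git_id: int = 0


def token_count(config):
    """Patches per image plus one class token, for a patch-based encoder."""
    per_side = config.resolution // config.patch
    return per_side * per_side + 1


vit_b_32 = ViTConfigSketch(embed_dim=512, resolution=224, layer_conf=12,
                           width=768, patch=32)
print(token_count(vit_b_32))  # 7 * 7 patches + 1 class token = 50
```

For the ResNet-based presets, layer_conf would instead be a 4-tuple of block counts and patch would not be used.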

class lib.model.networks.clip.VisualTransformer(input_resolution: int, patch_size: int, width: int, num_layers: int, heads: int, output_dim: int, name: str = 'VisualTransformer')

A class representing a Visual Transformer model for image classification tasks.

Parameters:

input_resolution (int) – The input resolution of the images.
patch_size (int) – The size of the patches to be extracted from the images.
width (int) – The dimension of the input and output vectors.
num_layers (int) – The number of layers in the Transformer.
heads (int) – The number of attention heads.
output_dim (int) – The dimension of the output vector.
name (str, optional) – The name of the Visual Transformer model, Default: “VisualTransformer”.

__call__() → Model

Builds and returns the Visual Transformer model.

Returns:: The Visual Transformer model.
Return type:: keras.models.Model

Classes

`AttentionPool2d`(spatial_dim, embed_dim, ...)	An Attention Pooling layer that applies a multi-head self-attention mechanism over a spatial grid of features.
`Bottleneck`(inplanes, planes[, stride, name])	A ResNet bottleneck block that performs a sequence of convolutions, batch normalization, and ReLU activation operations on an input tensor.
`ClassEmbedding`(args, *kwargs)	Trainable Class Embedding layer
`EmbeddingLayer`(args, *kwargs)	Parent class for trainable embedding variables
`ModifiedResNet`(input_resolution, width, ...)	A ResNet class that is similar to torchvision's but contains the following changes:
`PositionalEmbedding`(args, *kwargs)	Trainable Positional Embedding layer
`Projection`(args, *kwargs)	Trainable Projection Embedding Layer
`Transformer`(width, num_layers, heads[, ...])	A class representing a Transformer model with attention mechanism and residual connections.
`ViT`(name[, input_size, load_weights])	Visual Transformer from CLIP
`ViTConfig`(embed_dim, resolution, layer_conf, ...)	Configuration settings for ViT
`VisualTransformer`(input_resolution, ...[, name])	A class representing a Visual Transformer model for image classification tasks.

Class Inheritance Diagram

Inheritance diagram of lib.model.networks.clip.AttentionPool2d, lib.model.networks.clip.Bottleneck, lib.model.networks.clip.ClassEmbedding, lib.model.networks.clip.EmbeddingLayer, lib.model.networks.clip.ModifiedResNet, lib.model.networks.clip.PositionalEmbedding, lib.model.networks.clip.Projection, lib.model.networks.clip.Transformer, lib.model.networks.clip.ViT, lib.model.networks.clip.ViTConfig, lib.model.networks.clip.VisualTransformer

lib.model.networks.insightface_resnet Module 

InsightFace ResNet (IR) and InsightFace ResNet Squeeze + Excite (IRSE) for inference

From: https://github.com/deepinsight/insightface and https://github.com/HuangYG123/CurricularFace

Released under MIT License

class lib.model.networks.insightface_resnet.BasicBlockIR(in_channels: int, depth: int, stride: int, use_se: bool)

A Basic Block for InsightFace ResNet

Parameters:

in_channels (int) – The number of input channels to the layer
depth (int) – The depth of the layer
stride (int) – The Convolution stride
use_se (bool) – True to add squeeze and excite layer

forward(inputs: Tensor) → Tensor

Forward pass through the IRNet basic block

Parameters:: inputs (Tensor) – The input to the IRNet Block
Return type:: The output from the IRNet Block

class lib.model.networks.insightface_resnet.BottleneckIR(in_channels: int, depth: int, stride: int, use_se: bool)

Bottleneck for IRNet

Parameters:

in_channels (int) – The number of input channels to the layer
depth (int) – The depth of the layer
stride (int) – The Convolution stride
use_se (bool) – True to add squeeze and excite layer

forward(inputs: Tensor) → Tensor

Forward pass through the IRNet Bottleneck

Parameters:: inputs (Tensor) – The input to the IRNet Bottleneck
Return type:: The output from the IRNet Bottleneck

class lib.model.networks.insightface_resnet.Flatten(*args: Any, **kwargs: Any)

Flatten layer for IRNet

Parameters:

args (Any)
kwargs (Any)

forward(inputs: Tensor) → Tensor

Flatten the inbound layer

Parameters:: inputs (Tensor) – The input layer to be flattened
Return type:: The flattened input layer

class lib.model.networks.insightface_resnet.IRNet(input_size: Literal[112, 224], block_filters: tuple[int, int, int, int], block_recursions: tuple[int, int, int, int], num_features: int = 512, use_se: bool = False, use_bottleneck: bool = False)

Implementation of InsightFace ResNet with Squeeze + Excite support

Parameters:

input_size (Literal[112, 224]) – The input size to the model. Must be 112 or 224
block_filters (tuple[int, int, int, int]) – The number of in_channels to each block layer for each pass
block_recursions (tuple[int, int, int, int]) – The number of recursions within each block
num_features (int) – The number of features to output. Default: 512
use_se (bool) – True to use Squeeze and Excite. False to use standard IR ResNet. Default: False
use_bottleneck (bool) – True to use the Bottleneck block. False to use the Basic block. Default: False

forward(inputs: Tensor) → Tensor

Forward pass through IRNet

Parameters:: inputs (Tensor) – The input to IRNet
Return type:: The output from IRNet

class lib.model.networks.insightface_resnet.SEModule(in_channels: int, reduction: int)

Squeeze and Excite Block for IRNet

Parameters:

in_channels (int) – The number of input channels
reduction (int) – The reduction factor for squeeze and excite

forward(inputs: Tensor) → Tensor

Forward pass through the IRNet Squeeze and Excite Block

Parameters:: inputs (Tensor)
Return type:: Tensor

lib.model.networks.insightface_resnet.ir_101(input_size: Literal[112, 224]) → IRNet

Obtain an IRNet-101 model

Parameters:: input_size (Literal[112, 224]) – The input size to the model
Return type:: IRNet

lib.model.networks.insightface_resnet.ir_152(input_size: Literal[112, 224]) → IRNet

Obtain an IRNet-152 model

Parameters:: input_size (Literal[112, 224]) – The input size to the model
Return type:: IRNet

lib.model.networks.insightface_resnet.ir_18(input_size: Literal[112, 224])

Obtain an IRNet-18 model

Parameters:: input_size (Literal[112, 224]) – The input size to the model

lib.model.networks.insightface_resnet.ir_200(input_size: Literal[112, 224]) → IRNet

Obtain an IRNet-200 model

Parameters:: input_size (Literal[112, 224]) – The input size to the model
Return type:: IRNet

lib.model.networks.insightface_resnet.ir_34(input_size: Literal[112, 224]) → IRNet

Obtain an IRNet-34 model

Parameters:: input_size (Literal[112, 224]) – The input size to the model
Return type:: IRNet

lib.model.networks.insightface_resnet.ir_50(input_size: Literal[112, 224]) → IRNet

Obtain an IRNet-50 model

Parameters:: input_size (Literal[112, 224]) – The input size to the model
Return type:: IRNet

lib.model.networks.insightface_resnet.ir_se_101(input_size: Literal[112, 224]) → IRNet

Obtain an IRNetSE101 model

Parameters:: input_size (Literal[112, 224]) – The input size to the model
Return type:: IRNet

lib.model.networks.insightface_resnet.ir_se_152(input_size: Literal[112, 224]) → IRNet

Obtain an IRNetSE152 model

Parameters:: input_size (Literal[112, 224]) – The input size to the model
Return type:: IRNet

lib.model.networks.insightface_resnet.ir_se_200(input_size: Literal[112, 224]) → IRNet

Obtain an IRNetSE200 model

Parameters:: input_size (Literal[112, 224]) – The input size to the model
Return type:: IRNet

lib.model.networks.insightface_resnet.ir_se_50(input_size: Literal[112, 224]) → IRNet

Obtain an IRNetSE50 model

Parameters:: input_size (Literal[112, 224]) – The input size to the model
Return type:: IRNet

Functions

`ir_101`(input_size)	Obtain an IRNet-101 model
`ir_152`(input_size)	Obtain an IRNet-152 model
`ir_18`(input_size)	Obtain an IRNet-18 model
`ir_200`(input_size)	Obtain an IRNet-200 model
`ir_34`(input_size)	Obtain an IRNet-34 model
`ir_50`(input_size)	Obtain an IRNet-50 model
`ir_se_101`(input_size)	Obtain an IRNetSE101 model
`ir_se_152`(input_size)	Obtain an IRNetSE152 model
`ir_se_200`(input_size)	Obtain an IRNetSE200 model
`ir_se_50`(input_size)	Obtain an IRNetSE50 model

Classes

`BasicBlockIR`(in_channels, depth, stride, use_se)	A Basic Block for InsightFace ResNet
`BottleneckIR`(in_channels, depth, stride, use_se)	Bottleneck for IRNet
`Flatten`(*args, **kwargs)	Flatten layer for IRNet
`IRNet`(input_size, block_filters, ...[, ...])	Implementation of InsightFace ResNet with Squeeze + Excite support
`SEModule`(in_channels, reduction)	Squeeze and Excite Block for IRNet

Class Inheritance Diagram

Inheritance diagram of lib.model.networks.insightface_resnet.BasicBlockIR, lib.model.networks.insightface_resnet.BottleneckIR, lib.model.networks.insightface_resnet.Flatten, lib.model.networks.insightface_resnet.IRNet, lib.model.networks.insightface_resnet.SEModule

optimizers package 

lib.model.optimizers.adabelief Module 

AdaBelief optimizer for Torch

class lib.model.optimizers.adabelief.AdaBelief(params: Iterable, lr: float = 0.001, betas: tuple[float, float] = (0.9, 0.999), eps: float = 1e-16, weight_decay: float = 0.0, amsgrad: bool = False, weight_decouple: bool = True, fixed_decay: bool = False, rectify: bool = True, degenerated_to_sgd: bool = True)

Implements the AdaBelief algorithm. Modified from Adam in PyTorch.

Parameters:

params (Iterable) – Iterable of parameters to optimize or dicts defining parameter groups
lr (float) – Learning rate. Default: 1e-3
betas (tuple[float, float]) – Coefficients used for computing running averages of gradient and its square. Default: (0.9, 0.999)
eps (float) – Term added to the denominator to improve numerical stability. Default: 1e-16
weight_decay (float) – Weight decay (L2 penalty). Default: 0
amsgrad (bool) – Whether to use the AMSGrad variant of this algorithm from the paper On the Convergence of Adam and Beyond. Default: False
weight_decouple (bool) – If set as True, then the optimizer uses decoupled weight decay as in AdamW. Default: True
fixed_decay (bool) – Only used when weight_decouple is True. When fixed_decay is True, the weight decay is performed as W_{new} = W_{old} - W_{old} * decay. When fixed_decay is False, the weight decay is performed as W_{new} = W_{old} - W_{old} * decay * lr; note that in this case the weight decay ratio decreases with the learning rate (lr). Default: False
rectify (bool) – If set as True, then perform the rectified update similar to RAdam. Default: True
degenerated_to_sgd (bool) – If set as True, then perform SGD update when variance of gradient is high. Default: True

References

AdaBelief Optimizer: adapting step sizes by the belief in observed gradients, NeurIPS 2020 - https://github.com/juntang-zhuang/Adabelief-Optimizer
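
The interaction between the parameters above can be sketched for a single scalar weight. This is a minimal, framework-free illustration and not the module's code: the function name `adabelief_step` and the `state` dictionary are hypothetical, and AMSGrad, rectification and the SGD fallback are deliberately left out.

```python
import math


def adabelief_step(param, grad, state, lr=1e-3, betas=(0.9, 0.999), eps=1e-16,
                   weight_decay=0.0, weight_decouple=True, fixed_decay=False):
    """One AdaBelief update for a scalar weight (hypothetical helper).

    ``state`` holds "step", "exp_avg" and "exp_avg_var" and is updated in place.
    """
    beta1, beta2 = betas
    state["step"] += 1

    if weight_decay != 0.0:
        if weight_decouple:
            # Decoupled decay (AdamW style): shrink the weight directly.
            # fixed_decay=True ignores the learning rate, False scales by it.
            factor = weight_decay if fixed_decay else weight_decay * lr
            param = param - param * factor
        else:
            # Classic L2 penalty: fold the decay into the gradient.
            grad = grad + weight_decay * param

    # Running average of the gradient.
    state["exp_avg"] = beta1 * state["exp_avg"] + (1.0 - beta1) * grad
    # Running average of the squared deviation of the gradient from that average.
    residual = grad - state["exp_avg"]
    state["exp_avg_var"] = (beta2 * state["exp_avg_var"]
                            + (1.0 - beta2) * residual ** 2 + eps)

    # Bias corrections for the zero-initialized running averages.
    bias1 = 1.0 - beta1 ** state["step"]
    bias2 = 1.0 - beta2 ** state["step"]
    denom = math.sqrt(state["exp_avg_var"]) / math.sqrt(bias2) + eps
    return param - (lr / bias1) * state["exp_avg"] / denom


if __name__ == "__main__":
    demo_state = {"step": 0, "exp_avg": 0.0, "exp_avg_var": 0.0}
    weight = 1.0
    for _ in range(3):
        weight = adabelief_step(weight, 2.0 * weight, demo_state, lr=0.1)
    print(weight)
```

The step is large when successive gradients agree with their running average (small `exp_avg_var`) and small when they do not, which is the "belief" the optimizer's name refers to.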

reset() → None

Reset parameters

Return type:: None

step(closure: Callable | None = None) → Tensor

Performs a single optimization step.

Parameters:: closure (Callable | None) – A closure that reevaluates the model and returns the loss. Default: None
Return type:: Tensor

Classes

AdaBelief(params[, lr, betas, eps, ...])

Implements AdaBelief algorithm.

Class Inheritance Diagram

Inheritance diagram of lib.model.optimizers.adabelief.AdaBelief

lib.model.optimizers.lion Module 

PyTorch implementation of the Lion optimizer.

class lib.model.optimizers.lion.Lion(params: Iterable, lr: float = 0.0001, betas: tuple[float, float] = (0.9, 0.99), weight_decay: float = 0.0)

Lion optimizer from Google

Parameters:

params (Iterable) – Iterable of parameters to optimize or dicts defining parameter groups
lr (float) – Learning rate. Default: 1e-4
betas (tuple[float, float]) – Coefficients used for computing running averages of gradient and its square. Default: (0.9, 0.99)
weight_decay (float) – Weight decay coefficient. Default: 0

References

https://github.com/google/automl/blob/master/lion/lion_pytorch.py
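
As a rough illustration of how `lr`, `betas` and `weight_decay` interact, the Lion update rule can be written for a single scalar weight. This sketch follows the rule as published for Lion; the function name `lion_step` is hypothetical and the code is not taken from this module.

```python
def lion_step(param, grad, exp_avg, lr=1e-4, betas=(0.9, 0.99), weight_decay=0.0):
    """One Lion update for a scalar weight (hypothetical helper).

    Returns the new weight and the new momentum value.
    """
    beta1, beta2 = betas
    # Decoupled weight decay is applied to the weight itself.
    param = param * (1.0 - lr * weight_decay)
    # The update direction is only the sign of a blend of momentum and gradient,
    # so every weight moves by exactly lr (or not at all when the blend is zero).
    blended = beta1 * exp_avg + (1.0 - beta1) * grad
    sign = (blended > 0) - (blended < 0)
    param = param - lr * sign
    # The momentum is tracked with the second coefficient.
    exp_avg = beta2 * exp_avg + (1.0 - beta2) * grad
    return param, exp_avg
```

Because the step magnitude does not depend on the gradient magnitude, Lion is typically run with a smaller learning rate than Adam-style optimizers, which is consistent with the lower default here.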

step(closure: Callable | None = None) → Tensor

Performs a single optimization step.

Parameters:: closure (Callable | None) – A closure that reevaluates the model and returns the loss.
Return type:: The loss

Classes

Lion(params[, lr, betas, weight_decay])

Lion optimizer from Google

Class Inheritance Diagram

Inheritance diagram of lib.model.optimizers.lion.Lion

lib.model.optimizers.keras_legacy Module 

Legacy keras Optimizers for weight migration

class lib.model.optimizers.keras_legacy.AdaBelief(*args, **kwargs)

Implementation of the AdaBelief Optimizer

Inherits from: keras.optimizers.Optimizer.

AdaBelief Optimizer is not a replacement for the heuristic warmup; the settings should be kept if warmup has already been employed and tuned in the baseline method. You can enable warmup by setting total_steps and warmup_proportion (see examples).

Lookahead (see references) can be integrated with the AdaBelief Optimizer. The combination was announced by Less Wright, and the new combined optimizer can also be called “Ranger”. The mechanism can be enabled by using the lookahead wrapper (see examples).

Parameters:

learning_rate (float) – The learning rate.
beta_1 (float) – The exponential decay rate for the 1st moment estimates.
beta_2 (float) – The exponential decay rate for the 2nd moment estimates.
epsilon (float) – A small constant for numerical stability.
amsgrad (bool) – Whether to apply AMSGrad variant of this algorithm from the paper “On the Convergence of Adam and beyond”.
rectify (bool) – Whether to enable rectification as in RectifiedAdam
sma_threshold (float) – The threshold for simple mean average.
total_steps (int) – Total number of training steps. Enable warmup by setting a positive value.
warmup_proportion (float) – The proportion of increasing steps.
min_lr – Minimum learning rate after warmup.
name – Name for the operations created when applying gradients. Default: "AdaBeliefOptimizer".
**kwargs – Standard Keras Optimizer keyword arguments. Allowed to be (weight_decay, clipnorm, clipvalue, global_clipnorm, use_ema, ema_momentum, ema_overwrite_frequency, loss_scale_factor, gradient_accumulation_steps)
min_learning_rate (float)

Examples

>>> from optimizers import AdaBelief
>>> opt = AdaBelief(lr=1e-3)

Example of serialization:

>>> optimizer = AdaBelief(learning_rate=lr_scheduler, weight_decay=wd_scheduler)
>>> config = keras.optimizers.serialize(optimizer)
>>> new_optimizer = keras.optimizers.deserialize(config,
...                                                 custom_objects=dict(AdaBelief=AdaBelief))

Example of warm up:

>>> opt = AdaBelief(lr=1e-3, total_steps=10000, warmup_proportion=0.1, min_lr=1e-5)

In the above example, the learning rate will increase linearly from 0 to lr in 1000 steps, then decrease linearly from lr to min_lr in 9000 steps.
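
The schedule described above can be written out as a plain function. This is a sketch of the described behaviour only (linear ramp, then linear decay); the function name `warmup_learning_rate` is hypothetical and the optimizer's internal implementation may differ in detail.

```python
def warmup_learning_rate(step, lr, total_steps, warmup_proportion, min_lr):
    """Learning rate at ``step`` for a linear warm up followed by a linear decay."""
    warmup_steps = total_steps * warmup_proportion
    decay_steps = max(total_steps - warmup_steps, 1)
    if warmup_steps > 0 and step <= warmup_steps:
        # Ramp linearly from 0 up to lr.
        return lr * step / warmup_steps
    # Decay linearly from lr down to min_lr, then hold at min_lr.
    progress = min(step - warmup_steps, decay_steps) / decay_steps
    return lr + (min_lr - lr) * progress


if __name__ == "__main__":
    for current_step in (0, 500, 1000, 5500, 10000):
        print(current_step,
              warmup_learning_rate(current_step, 1e-3, 10000, 0.1, 1e-5))
```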

Example of enabling Lookahead:

>>> adabelief = AdaBelief()
>>> ranger = tfa.optimizers.Lookahead(adabelief, sync_period=6, slow_step_size=0.5)

Notes

amsgrad is not described in the original paper. Use it with caution.

References

Juntang Zhuang et al. - AdaBelief Optimizer: Adapting step sizes by the belief in observed gradients - https://arxiv.org/abs/2010.07468.

Original implementation - https://github.com/juntang-zhuang/Adabelief-Optimizer

Michael R. Zhang et al. - Lookahead Optimizer: k steps forward, 1 step back - https://arxiv.org/abs/1907.08610v1

Adapted from https://github.com/juntang-zhuang/Adabelief-Optimizer

BSD 2-Clause License

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

build(variables: list[Variable]) → None

Initialize optimizer variables.

AdaBelief optimizer has 3 types of variables: momentums, velocities and velocity_hat (only set when amsgrad is applied).

Parameters:: variables (list[Variable]) – list of model variables to build AdaBelief variables on.
Return type:: None

get_config() → dict[str, Any]

Returns the config of the optimizer.

Optimizer configuration for AdaBelief.

Returns:: The optimizer configuration.
Return type:: dict[str, Any]

update_step(gradient: Tensor, variable: Variable, learning_rate: Tensor) → None

Update step given gradient and the associated model variable for AdaBelief.

Parameters:

gradient (Tensor) – The gradient to update
variable (Variable) – The variable to update
learning_rate (Tensor) – The learning rate

Return type:

None

Classes

AdaBelief(*args, **kwargs)

Implementation of the AdaBelief Optimizer

Class Inheritance Diagram

Inheritance diagram of lib.model.optimizers.keras_legacy.AdaBelief

model package 

lib.model.autoclip Module 

Auto clipper for clipping gradients.

class lib.model.autoclip.AutoClipper(clip_percentile: int, history_size: int = 10000)

AutoClip: Adaptive Gradient Clipping for Source Separation Networks

Parameters:

clip_percentile (int) – The percentile to clip the gradients at
history_size (int) – The number of iterations of data to use to calculate the norm. Default: 10000

References

Adapted from: https://github.com/pseeth/autoclip. Original paper: https://arxiv.org/abs/2007.14469
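
The idea behind AutoClip can be sketched without any framework: record the gradient norm at every iteration and clip to a chosen percentile of that history. The class name `AutoClipSketch`, the `percentile` helper (linear interpolation between ranks) and the use of plain lists of floats in place of parameter tensors are all assumptions made for illustration.

```python
import math


def percentile(values, pct):
    """Percentile with linear interpolation between the two closest ranks."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100.0
    lower, upper = math.floor(rank), math.ceil(rank)
    frac = rank - lower
    return ordered[lower] * (1.0 - frac) + ordered[upper] * frac


class AutoClipSketch:
    """Clip gradients to a percentile of the gradient norms seen so far."""

    def __init__(self, clip_percentile, history_size=10000):
        self.clip_percentile = clip_percentile
        self.history_size = history_size
        self.history = []

    def __call__(self, grads):
        # Global L2 norm of the current gradients.
        norm = math.sqrt(sum(g * g for g in grads))
        # Keep only the most recent history_size norms.
        self.history.append(norm)
        self.history = self.history[-self.history_size:]
        clip_value = percentile(self.history, self.clip_percentile)
        # Rescale only when the current norm exceeds the adaptive threshold.
        scale = clip_value / norm if norm > clip_value else 1.0
        return [g * scale for g in grads]
```

The threshold adapts as training progresses, so no fixed clipping value has to be tuned by hand.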

__call__(parameters: list[Parameter], *args) → None

Call the AutoClip function.

Parameters:

parameters (list[Parameter]) – The parameters to clip
args – Unused but for compatibility

Return type:

None

Classes

AutoClipper(clip_percentile[, history_size])

AutoClip: Adaptive Gradient Clipping for Source Separation Networks

lib.model.backup_restore Module 

Functions for backing up, restoring and creating model snapshots.

class lib.model.backup_restore.Backup(model_dir: str, model_name: str)

Performs the back up of models at each save iteration, and the restoring of models from their back up location.

Parameters:

model_dir (str) – The folder that contains the model to be backed up
model_name (str) – The name of the model that is to be backed up

static backup_model(full_path: str) → None

Backup a model file.

The backed up file is saved with the original filename in the original location with .bk appended to the end of the name.

Parameters:: full_path (str) – The full path to a .keras model file or a .json state file
Return type:: None
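
The naming convention described above can be illustrated with the standard library. This sketch copies the file; whether the real method copies or renames the original is not stated here, so treat that choice, and the helper name `backup_file`, as assumptions.

```python
import os
import shutil
import tempfile


def backup_file(full_path):
    """Copy ``full_path`` to ``full_path + '.bk'`` in the same folder (hypothetical helper)."""
    backup_path = full_path + ".bk"
    # Replace any stale backup from a previous save iteration.
    if os.path.exists(backup_path):
        os.remove(backup_path)
    if os.path.exists(full_path):
        shutil.copy2(full_path, backup_path)
    return backup_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        model_file = os.path.join(folder, "model.keras")
        with open(model_file, "w", encoding="utf-8") as handle:
            handle.write("placeholder weights")
        print(os.path.basename(backup_file(model_file)))
```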

restore() → None

Restores a model from backup.

The original model files are migrated into a folder within the original model folder named <model_name>_archived_<timestamp>. The .bk backup files are then moved to the location of the previously existing model files. Logs that were generated after the last backup was taken are removed.

Return type:: None

snapshot_models(iterations: int) → None

Take a snapshot of the model at the current state and back it up.

The snapshot is a copy of the model folder located in the same root location as the original model file, with the number of iterations appended to the end of the folder name.

Parameters:: iterations (int) – The number of iterations that the model has trained when performing the snapshot.
Return type:: None
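
A snapshot of this kind amounts to copying the model folder to a sibling folder whose name carries the iteration count. The sketch below shows that pattern with the standard library; the helper name `snapshot_folder` and the exact suffix format are hypothetical, as the text above only says that the number of iterations is appended to the folder name.

```python
import os
import shutil
import tempfile


def snapshot_folder(model_dir, iterations):
    """Copy ``model_dir`` to a sibling folder tagged with the iteration count."""
    source = os.path.normpath(model_dir)
    # Hypothetical suffix format, chosen only for illustration.
    target = f"{source}_snapshot_{iterations}_iters"
    # Remove an existing snapshot for the same iteration count before copying.
    if os.path.isdir(target):
        shutil.rmtree(target)
    shutil.copytree(source, target)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        model_folder = os.path.join(root, "my_model")
        os.makedirs(model_folder)
        print(os.path.basename(snapshot_folder(model_folder, 5000)))
```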

Classes

Backup(model_dir, model_name)

Performs the back up of models at each save iteration, and the restoring of models from their back up location.

lib.model.initializers Module 

Custom Initializers for faceswap.py

class lib.model.initializers.ConvolutionAware(eps_std: float = 0.05, seed: int | None = None, initialized: bool = False)

Initializer that generates orthogonal convolution filters in the Fourier space. If this initializer is passed a shape that is not 3D or 4D, orthogonal initialization will be used.

Adapted, fixed and optimized from: https://github.com/keras-team/keras-contrib/blob/master/keras_contrib/initializers/convaware.py

Parameters:

eps_std (float) – The Standard deviation for the random normal noise used to break symmetry in the inverse Fourier transform. Default: 0.05
seed (int | None) – Used to seed the random generator. Default: None
initialized (bool) – This should always be set to False. To avoid Keras re-calculating the values every time the model is loaded, this parameter is internally set on first time initialization. Default: False

Return type:

The modified kernel weights

References

Armen Aghajanyan, https://arxiv.org/abs/1702.06295

__call__(shape: list[int] | tuple[int, ...], dtype: str | None = None) → Tensor

Call function for the Convolution Aware initializer.

Parameters:

shape (list[int] | tuple[int, ...]) – The required shape for the output tensor
dtype (str | None) – The data type for the tensor

Return type:

The modified kernel weights

get_config() → dict[str, Any]

Return the Convolutional Aware Initializer configuration.

Return type:: The configuration for Convolutional Aware Initialization

class lib.model.initializers.ICNR(initializer: dict[str, Any] | Initializer, scale: int = 2)

ICNR initializer for checkerboard artifact free sub pixel convolution

Parameters:

initializer (dict[str, T.Any] | initializers.Initializer) – The initializer used for sub kernels (orthogonal, glorot uniform, etc.)
scale (int) – scaling factor of sub pixel convolution (up sampling from 8x8 to 16x16 is scale 2). Default: 2

Return type:

The modified kernel weights

Example

>>> x = conv2d(... weights_initializer=ICNR(initializer=he_uniform(), scale=2))

References

Andrew Aitken et al. - Checkerboard artifact free sub-pixel convolution - https://arxiv.org/pdf/1707.02937.pdf

https://distill.pub/2016/deconv-checkerboard/

https://gist.github.com/A03ki/2305398458cb8e2155e8e81333f0a965
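
The principle behind ICNR can be shown on a single pixel: if every group of channels that ends up in the same upscaled block starts with identical weights, the block is constant after the depth-to-space step and no checkerboard pattern is present at initialization. In this sketch one scalar per output channel stands in for a full kernel, and the channel layout (sub-pixel position major, output channel minor) is an assumption; both helper names are hypothetical.

```python
def icnr_expand(sub_kernel, scale):
    """Repeat the sub-kernel once per sub-pixel position (scale * scale times)."""
    return [weight for _ in range(scale * scale) for weight in sub_kernel]


def depth_to_space_pixel(channels, scale):
    """Rearrange one pixel's channels into a (scale, scale, k) block.

    Assumed layout: channel index = (row * scale + col) * k + out_channel.
    """
    k = len(channels) // (scale * scale)
    return [[[channels[(row * scale + col) * k + c] for c in range(k)]
             for col in range(scale)]
            for row in range(scale)]


if __name__ == "__main__":
    expanded = icnr_expand([0.5, -1.0], scale=2)
    for block_row in depth_to_space_pixel(expanded, scale=2):
        print(block_row)
```

Every sub-pixel of the printed block carries the same values, which is the property the initializer is designed to provide.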

__call__(shape: list[int] | tuple[int, ...], dtype: str | None = 'float32') → Tensor

Returns a tensor object initialized as specified by the initializer.

Parameters:

shape (list[int] | tuple[int, ...]) – Shape of the tensor.
dtype (str | None) – Optional dtype of the tensor.

Return type:

Tensor

get_config() → dict[str, Any]

Return the ICNR Initializer configuration.

Return type:: The configuration for ICNR Initialization

Classes

`ConvolutionAware`([eps_std, seed, initialized])	Initializer that generates orthogonal convolution filters in the Fourier space.
`ICNR`(initializer[, scale])	ICNR initializer for checkerboard artifact free sub pixel convolution

Class Inheritance Diagram

Inheritance diagram of lib.model.initializers.ConvolutionAware, lib.model.initializers.ICNR

lib.model.layers Module 

Custom Layers for faceswap.py.

class lib.model.layers.GlobalMinPooling2D(*args, **kwargs)

Global minimum pooling operation for spatial data.

Parameters:: data_format (str | None)

call(inputs: KerasTensor, *args, **kwargs) → KerasTensor

This is where the layer’s logic lives.

Parameters:: inputs (keras.KerasTensor) – Input tensor, or list/tuple of input tensors
Returns:: A tensor or list/tuple of tensors
Return type:: keras.KerasTensor

class lib.model.layers.GlobalStdDevPooling2D(*args, **kwargs)

Global standard deviation pooling operation for spatial data.

Parameters:: data_format (str | None)

call(inputs: KerasTensor, *args, **kwargs) → KerasTensor

This is where the layer’s logic lives.

Parameters:: inputs (keras.KerasTensor) – Input tensor, or list/tuple of input tensors
Returns:: A tensor or list/tuple of tensors
Return type:: keras.KerasTensor
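
Both global pooling layers documented above reduce the height and width of a feature map to a single value per channel. A framework-free sketch on nested lists, assuming a channels-last layout and a population standard deviation (both assumptions; the function names are hypothetical):

```python
import statistics


def global_pool(feature_map, reducer):
    """Reduce a height x width x channels nested list to one value per channel."""
    channels = len(feature_map[0][0])
    return [reducer([pixel[c] for row in feature_map for pixel in row])
            for c in range(channels)]


def global_min_pool(feature_map):
    """Smallest activation per channel."""
    return global_pool(feature_map, min)


def global_stddev_pool(feature_map):
    """Population standard deviation of the activations per channel."""
    return global_pool(feature_map, statistics.pstdev)
```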

class lib.model.layers.KResizeImages(*args, **kwargs)

A custom upscale function that uses keras.backend.resize_images to upsample.

Parameters:

size (int or float, optional) – The scale to upsample to. Default: 2
interpolation (["nearest", "bilinear"], optional) – The interpolation to use. Default: “nearest”
kwargs (dict) – The standard Keras Layer keyword arguments (if any)

call(inputs: KerasTensor, *args, **kwargs) → KerasTensor

Call the upsample layer

Parameters:: inputs (keras.KerasTensor) – Input tensor, or list/tuple of input tensors
Returns:: A tensor or list/tuple of tensors
Return type:: keras.KerasTensor

compute_output_shape(input_shape: tuple[int, ...]) → tuple[int, ...]

Computes the output shape of the layer.

This is the input shape with size dimensions multiplied by size

Parameters:: input_shape (tuple or list of tuples) – Shape tuple (tuple of integers) or list of shape tuples (one per output tensor of the layer). Shape tuples can include None for free dimensions, instead of an integer.
Returns:: An output shape tuple
Return type:: tuple
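
The shape rule can be illustrated as follows, assuming a channels-last (batch, height, width, channels) layout; the function name `resize_output_shape` is hypothetical and unknown (None) dimensions are passed through unchanged.

```python
def resize_output_shape(input_shape, size=2):
    """Scale the two spatial dimensions of a channels-last shape by ``size``."""
    batch, height, width, channels = input_shape

    def scale(dim):
        # Free dimensions stay free.
        return None if dim is None else int(dim * size)

    return (batch, scale(height), scale(width), channels)
```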

get_config() → dict[str, Any]

Returns the config of the layer.

Returns:: A python dictionary containing the layer configuration
Return type:: dict

class lib.model.layers.L2Normalize(*args, **kwargs)

Normalizes a tensor w.r.t. the L2 norm along the specified axis.

Parameters:

axis (int) – The axis to perform normalization across
kwargs (dict) – The standard Keras Layer keyword arguments (if any)
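
For a single vector, L2 normalization divides every element by the vector's Euclidean length. A minimal sketch; the function name and the `epsilon` guard against division by zero are assumptions:

```python
import math


def l2_normalize(values, epsilon=1e-12):
    """Scale ``values`` to unit Euclidean length."""
    # Clamp the squared norm so an all-zero vector does not divide by zero.
    norm = math.sqrt(max(sum(v * v for v in values), epsilon))
    return [v / norm for v in values]
```

In the layer, the same operation is applied independently along the chosen axis of the tensor.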

call(inputs: KerasTensor, *args, **kwargs) → KerasTensor

This is where the layer’s logic lives.

Parameters:: inputs (keras.KerasTensor) – Input tensor, or list/tuple of input tensors
Returns:: A tensor or list/tuple of tensors
Return type:: keras.KerasTensor

compute_output_shape(input_shape: tuple[int, ...]) → tuple[int, ...]

Compute the output shape based on the input shape.

Parameters:: input_shape (tuple) – The input shape to the layer
Return type:: tuple[int, …]

get_config() → dict[str, Any]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstated later (without its trained weights) from this configuration.

The configuration of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Returns:: A python dictionary containing the layer configuration
Return type:: dict

class lib.model.layers.PixelShuffler(*args, **kwargs)

PixelShuffler layer for Keras.

This layer requires a Convolution2D prior to it, having output filters computed according to the formula \(filters = k * (scale_factor * scale_factor)\) where k is a user defined number of filters (generally larger than 32) and scale_factor is the up-scaling factor (generally 2).

This layer performs the depth to space operation on the convolution filters, and returns a tensor with the size as defined below.

Notes

In practice, it is useful to have a second convolution layer after the PixelShuffler layer to speed up the learning process. However, if you are stacking multiple PixelShuffler blocks, it may increase the number of parameters greatly, so the Convolution layer after PixelShuffler layer can be removed.

Example

>>> # A standard sub-pixel up-scaling block
>>> x = Convolution2D(256, (3, 3), padding="same", activation="relu")(...)
>>> u = PixelShuffler(size=(2, 2))(x)
>>> # [Optional]
>>> x = Convolution2D(256, (3, 3), padding="same", activation="relu")(u)

Parameters:

size (tuple, optional) – The (h, w) scaling factor for up-scaling. Default: (2, 2)
data_format ([“channels_first”, “channels_last”, None], optional) – The data format for the input. Default: None
kwargs (dict) – The standard Keras Layer keyword arguments (if any)

References

https://gist.github.com/t-ae/6e1016cc188104d123676ccef3264981
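
The depth-to-space rearrangement can be sketched on nested lists. The sketch assumes a channels-last height x width x channels input and a channel layout in which the sub-pixel position is the major index; the function name `pixel_shuffle` and that layout are assumptions, not a statement of how this layer orders channels.

```python
def pixel_shuffle(feature_map, size=(2, 2)):
    """Turn h x w x (k * rh * rw) into (h * rh) x (w * rw) x k."""
    scale_h, scale_w = size
    height, width = len(feature_map), len(feature_map[0])
    in_channels = len(feature_map[0][0])
    if in_channels % (scale_h * scale_w) != 0:
        raise ValueError("channels must be divisible by scale_h * scale_w")
    k = in_channels // (scale_h * scale_w)

    output = [[None] * (width * scale_w) for _ in range(height * scale_h)]
    for y in range(height):
        for x in range(width):
            pixel = feature_map[y][x]
            for i in range(scale_h):
                for j in range(scale_w):
                    # Each group of k channels becomes one upscaled pixel.
                    start = (i * scale_w + j) * k
                    output[y * scale_h + i][x * scale_w + j] = pixel[start:start + k]
    return output
```

With k = 1 and a scale factor of 2, a single pixel with 4 channels becomes a 2 x 2 patch with 1 channel, matching the filters = k * (scale_factor * scale_factor) relationship given above.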

call(inputs: KerasTensor, *args, **kwargs) → KerasTensor

This is where the layer’s logic lives.

Parameters:: inputs (keras.KerasTensor) – Input tensor, or list/tuple of input tensors
Returns:: A tensor or list/tuple of tensors
Return type:: keras.KerasTensor

compute_output_shape(input_shape: tuple[int | None, ...]) → tuple[int | None, ...]

Computes the output shape of the layer.

Assumes that the layer will be built to match the input shape provided.

Parameters:: input_shape (tuple or list of tuples) – Shape tuple (tuple of integers) or list of shape tuples (one per output tensor of the layer). Shape tuples can include None for free dimensions, instead of an integer.
Returns:: An output shape tuple
Return type:: tuple

get_config() → dict[str, Any]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstated later (without its trained weights) from this configuration.

The configuration of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Returns:: A python dictionary containing the layer configuration
Return type:: dict

class lib.model.layers.QuickGELU(*args, **kwargs)

Applies GELU approximation that is fast but somewhat inaccurate.

Parameters:

name (str, optional) – The name for the layer. Default: “QuickGELU”
kwargs (dict) – The standard Keras Layer keyword arguments (if any)
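
A widely used fast GELU approximation is x * sigmoid(1.702 * x). The sketch below assumes that this is the approximation meant here and compares it against the exact, erf-based GELU; both function names are hypothetical.

```python
import math


def quick_gelu(value):
    """Sigmoid-based GELU approximation: x * sigmoid(1.702 * x)."""
    return value / (1.0 + math.exp(-1.702 * value))


def gelu_exact(value):
    """Reference GELU using the Gaussian error function."""
    return 0.5 * value * (1.0 + math.erf(value / math.sqrt(2.0)))


if __name__ == "__main__":
    for sample in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print(sample, quick_gelu(sample), gelu_exact(sample))
```

The two curves agree closely around zero and drift apart slightly for moderate magnitudes, which is the "somewhat inaccurate" trade-off noted above. As written, `quick_gelu` uses `math.exp` directly and will overflow for very large negative inputs.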

call(inputs: KerasTensor, *args, **kwargs) → KerasTensor

Call the QuickGELU layer

Parameters:: inputs (keras.KerasTensor) – The input Tensor
Returns:: The output Tensor
Return type:: keras.KerasTensor

compute_output_shape(input_shape: tuple[int, ...]) → tuple[int, ...]

Compute the output shape based on the input shape.

Parameters:: input_shape (tuple) – The input shape to the layer
Return type:: tuple[int, …]

class lib.model.layers.ReflectionPadding2D(*args, **kwargs)

Reflection-padding layer for 2D input (e.g. picture).

This layer can add rows and columns at the top, bottom, left and right side of an image tensor.

Parameters:

stride (int, optional) – The stride of the following convolution. Default: 2
kernel_size (int, optional) – The kernel size of the following convolution. Default: 5
kwargs (dict) – The standard Keras Layer keyword arguments (if any)
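
Reflection padding mirrors the image around its border pixels without repeating the border itself. The sketch below takes explicit padding amounts rather than deriving them from stride and kernel_size, because how the layer computes those amounts is not described here; the function names are hypothetical.

```python
def reflect_pad_1d(values, before, after):
    """Pad a sequence by mirroring around its first and last elements."""
    size = len(values)
    if before >= size or after >= size:
        raise ValueError("reflection padding must be smaller than the input size")
    left = [values[idx] for idx in range(before, 0, -1)]
    right = [values[size - 2 - idx] for idx in range(after)]
    return left + list(values) + right


def reflect_pad_2d(image, top, bottom, left, right):
    """Reflection-pad a height x width nested list on all four sides."""
    rows = [reflect_pad_1d(row, left, right) for row in image]
    # Copy each row so that mirrored rows do not share list objects.
    return [list(row) for row in reflect_pad_1d(rows, top, bottom)]
```

Compared with zero padding, this avoids introducing an artificial dark border that the following convolution would otherwise see.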

build(input_shape: KerasTensor) → None

Creates the layer weights.

Must be implemented on all layers that have weights.

Parameters:: input_shape (keras.KerasTensor) – Keras tensor (future input to layer) or list/tuple of Keras tensors to reference for weight shape computations.
Return type:: None

call(inputs: KerasTensor, *args, **kwargs) → KerasTensor

This is where the layer’s logic lives.

Parameters:: inputs (keras.KerasTensor) – Input tensor, or list/tuple of input tensors
Returns:: A tensor or list/tuple of tensors
Return type:: keras.KerasTensor

compute_output_shape(*args, **kwargs) → tuple[int | None, ...]

Computes the output shape of the layer.

Assumes that the layer will be built to match the input shape provided.

Returns:: An output shape tuple
Return type:: tuple

get_config() → dict[str, Any]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstated later (without its trained weights) from this configuration.

The configuration of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Returns:: A python dictionary containing the layer configuration
Return type:: dict

class lib.model.layers.ScalarOp(*args, **kwargs)

A layer for scalar operations for migrating TFLambdaOps in Keras 2 models to Keras 3. This layer should not be used directly.

Parameters:

operation (Literal["multiply", "truediv", "add", "subtract"]) – The scalar operation to perform
value (float) – The scalar value to use
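
The four operation names map naturally onto Python's `operator` module. A sketch of that dispatch on plain lists of floats; the function name `scalar_op` and the lookup table are assumptions for illustration:

```python
import operator

# Hypothetical lookup table from operation name to a binary function.
_OPERATIONS = {
    "multiply": operator.mul,
    "truediv": operator.truediv,
    "add": operator.add,
    "subtract": operator.sub,
}


def scalar_op(operation, value):
    """Return a function that applies ``operation`` with ``value`` to each input."""
    func = _OPERATIONS[operation]
    return lambda inputs: [func(item, value) for item in inputs]
```

Storing the operation by name rather than as a function keeps the configuration serializable, which suits a layer that exists to migrate saved models.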

call(inputs: KerasTensor, *args, **kwargs) → KerasTensor

Call the Scalar operation function.

Parameters:: inputs (tensor) – Input tensor, or list/tuple of input tensors
Returns:: A tensor or list/tuple of tensors
Return type:: keras.KerasTensor

compute_output_shape(input_shape: tuple[int, ...]) → tuple[int, ...]

Output shape is the same as the input shape.

Parameters:: input_shape (tuple) – The input shape to the layer
Return type:: tuple[int, …]

get_config(): Returns the config of the layer. :returns: A python dictionary containing the layer configuration :rtype: dict

class lib.model.layers.Swish(*args, **kwargs)

Swish Activation Layer implementation for Keras.

Parameters:

beta (float, optional) – The beta value to apply to the activation function. Default: 1.0
kwargs (dict) – The standard Keras Layer keyword arguments (if any)

References

Swish: a Self-Gated Activation Function: https://arxiv.org/abs/1710.05941v1
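
For reference, Swish as defined in the paper above is x * sigmoid(beta * x). A scalar sketch (the function name is hypothetical):

```python
import math


def swish(value, beta=1.0):
    """Swish activation: x * sigmoid(beta * x)."""
    return value / (1.0 + math.exp(-beta * value))
```

With beta = 0 the function reduces to x / 2, and as beta grows it approaches ReLU. As written, the sketch uses `math.exp` directly and will overflow for very large negative products of beta and x.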

call(inputs: KerasTensor, *args, **kwargs) → KerasTensor

Call the Swish Activation function.

Parameters:: inputs (tensor) – Input tensor, or list/tuple of input tensors
Returns:: A tensor or list/tuple of tensors
Return type:: keras.KerasTensor

compute_output_shape(input_shape: tuple[int, ...]) → tuple[int, ...]

Compute the output shape based on the input shape.

Parameters:: input_shape (tuple) – The input shape to the layer
Return type:: tuple[int, …]

get_config()

Returns the config of the layer.

Adds the beta to config.

Returns:: A python dictionary containing the layer configuration
Return type:: dict

Classes

`GlobalMinPooling2D`(*args, **kwargs)	Global minimum pooling operation for spatial data.
`GlobalStdDevPooling2D`(*args, **kwargs)	Global standard deviation pooling operation for spatial data.
`KResizeImages`(*args, **kwargs)	A custom upscale function that uses `keras.backend.resize_images` to upsample.
`L2Normalize`(*args, **kwargs)	Normalizes a tensor w.r.t. the L2 norm along the specified axis.
`PixelShuffler`(*args, **kwargs)	PixelShuffler layer for Keras.
`QuickGELU`(*args, **kwargs)	Applies GELU approximation that is fast but somewhat inaccurate.
`ReflectionPadding2D`(*args, **kwargs)	Reflection-padding layer for 2D input (e.g. picture).
`ScalarOp`(*args, **kwargs)	A layer for scalar operations for migrating TFLambdaOps in Keras 2 models to Keras 3.
`Swish`(*args, **kwargs)	Swish Activation Layer implementation for Keras.

Class Inheritance Diagram

Inheritance diagram of lib.model.layers.GlobalMinPooling2D, lib.model.layers.GlobalStdDevPooling2D, lib.model.layers.KResizeImages, lib.model.layers.L2Normalize, lib.model.layers.PixelShuffler, lib.model.layers.QuickGELU, lib.model.layers.ReflectionPadding2D, lib.model.layers.ScalarOp, lib.model.layers.Swish

lib.model.nn_blocks Module 

Neural Network Blocks for faceswap.py.

class lib.model.nn_blocks.Conv2D(*args, padding: str = 'same', is_upscale: bool = False, **kwargs)

A standard Keras Convolution 2D layer with parameters updated to be more appropriate for Faceswap architecture.

Parameters are the same, with the same defaults, as a standard keras.layers.Conv2D except where listed below. The default initializer is updated to HeUniform or convolutional aware based on user configuration settings.

Parameters:

padding (str, optional) – One of “valid” or “same” (case-insensitive). Default: “same”. Note that “same” is slightly inconsistent across backends with strides != 1, as described here.
is_upscale (bool, optional) – True if the convolution is being called from an upscale layer. This causes the instance to check the user configuration options to see if ICNR initialization has been selected and should be applied. This should only be passed in as True from UpscaleBlock layers. Default: False

__call__(*args, **kwargs) → KerasTensor

Call the Conv2D layer

Parameters:

args (tuple) – Standard Conv2D layer call arguments
kwargs (dict[str, Any]) – Standard Conv2D layer call keyword arguments

Returns:

The Tensor from the Conv2D layer

Return type:

keras.KerasTensor

class lib.model.nn_blocks.Conv2DBlock(filters: int, kernel_size: int | tuple[int, int] = 5, strides: int | tuple[int, int] = 2, padding: str = 'same', normalization: str | None = None, activation: str | None = 'leakyrelu', use_depthwise: bool = False, relu_alpha: float = 0.1, **kwargs)

A standard Convolution 2D layer which applies user specified configuration to the layer.

Adds reflection padding if it has been selected by the user, and other post-processing if requested by the plugin.

Adds instance normalization if requested. Adds a LeakyReLU if a residual block follows.

Parameters:

filters (int) – The dimensionality of the output space (i.e. the number of output filters in the convolution)
kernel_size (int, optional) – An integer or tuple/list of 2 integers, specifying the height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions. NB: If use_depthwise is True then a value must still be provided here, but it will be ignored. Default: 5
strides (tuple or int, optional) – An integer or tuple/list of 2 integers, specifying the strides of the convolution along the height and width. Can be a single integer to specify the same value for all spatial dimensions. Default: 2
padding (["valid", "same"], optional) – The padding to use. NB: If reflect padding has been selected in the user configuration options, then this argument will be ignored in favor of reflect padding. Default: “same”
normalization (str or None, optional) – Normalization to apply after the Convolution Layer. Select one of “batch” or “instance”. Set to None to not apply normalization. Default: None
activation (str or None, optional) – The activation function to use. This is applied at the end of the convolution block. Select one of “leakyrelu”, “prelu” or “swish”. Set to None to not apply an activation function. Default: “leakyrelu”
use_depthwise (bool, optional) – Set to True to use a Depthwise Convolution 2D layer rather than a standard Convolution 2D layer. Default: False
relu_alpha (float) – The alpha to use for LeakyRelu Activation. Default: 0.1
kwargs (dict) – Any additional Keras standard layer keyword arguments to pass to the Convolutional 2D layer

__call__(inputs: KerasTensor) → KerasTensor

Call the Faceswap Convolutional Layer.

Parameters:: inputs (keras.KerasTensor) – The input to the layer
Returns:: The output tensor from the Convolution 2D Layer
Return type:: keras.KerasTensor

class lib.model.nn_blocks.Conv2DOutput(filters: int, kernel_size: int | tuple[int], activation: str = 'sigmoid', padding: str = 'same', **kwargs)

A Convolution 2D layer that separates out the activation layer to explicitly set the data type on the activation to float 32 to fully support mixed precision training.

The Convolution 2D layer uses default parameters to be more appropriate for Faceswap architecture.

Parameters are the same, with the same defaults, as a standard keras.layers.Conv2D except where listed below. The default initializer is updated to HeUniform or convolutional aware based on user config settings.

Parameters:

filters (int) – The dimensionality of the output space (i.e. the number of output filters in the convolution)
kernel_size (int or tuple/list of 2 ints) – The height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions.
activation (str, optional) – The activation function to apply to the output. Default: “sigmoid”
padding (str, optional) – One of “valid” or “same” (case-insensitive). Default: “same”. Note that “same” is slightly inconsistent across backends with strides != 1, as described here.
kwargs (dict) – Any additional Keras standard layer keyword arguments to pass to the Convolutional 2D layer

__call__(inputs: KerasTensor) → KerasTensor

Call the Faceswap Convolutional Output Layer.

Parameters:: inputs (keras.KerasTensor) – The input to the layer
Returns:: The output tensor from the Convolution 2D Layer
Return type:: keras.KerasTensor

class lib.model.nn_blocks.DepthwiseConv2D(*args, padding: str = 'same', is_upscale: bool = False, **kwargs)

A standard Keras Depthwise Convolution 2D layer with parameters updated to be more appropriate for Faceswap architecture.

Parameters are the same, with the same defaults, as a standard keras.layers.DepthwiseConv2D except where listed below. The default initializer is updated to HeUniform or convolutional aware based on user configuration settings.

Parameters:

padding (str, optional) – One of “valid” or “same” (case-insensitive). Default: “same”. Note that “same” is slightly inconsistent across backends with strides != 1, as described in the Keras documentation.
is_upscale (bool, optional) – True if the convolution is being called from an upscale layer. This causes the instance to check the user configuration options to see if ICNR initialization has been selected and should be applied. This should only be passed in as True from UpscaleBlock layers. Default: False

__call__(*args, **kwargs) → KerasTensor

Call the DepthwiseConv2D layer

Parameters:

args (tuple) – Standard DepthwiseConv2D layer call arguments
kwargs (dict[str, Any]) – Standard DepthwiseConv2D layer call keyword arguments

Returns:

The Tensor from the DepthwiseConv2D layer

Return type:

keras.KerasTensor
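The effect of the padding option on spatial output size can be worked through with the standard convolution arithmetic. The sketch below is illustrative only: `conv_output_size` is a hypothetical helper, not part of lib.model.nn_blocks, and it ignores dilation.

```python
import math


def conv_output_size(size: int, kernel: int, stride: int, padding: str) -> int:
    """Spatial output size of a convolution along one dimension."""
    padding = padding.lower()  # the option is case-insensitive
    if padding == "same":
        # The input is padded so that only the stride reduces the size
        return math.ceil(size / stride)
    if padding == "valid":
        # No padding: the kernel must fit entirely inside the input
        return (size - kernel) // stride + 1
    raise ValueError(f"Unknown padding: {padding}")


same_size = conv_output_size(64, kernel=5, stride=2, padding="same")
valid_size = conv_output_size(64, kernel=5, stride=2, padding="valid")
print(same_size, valid_size)
```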

class lib.model.nn_blocks.ResidualBlock(filters: int, kernel_size: int | tuple[int, int] = 3, padding: str = 'same', **kwargs)

Residual block from dfaker.

Parameters:

filters (int) – The dimensionality of the output space (i.e. the number of output filters in the convolution)
kernel_size (int, optional) – An integer or tuple/list of 2 integers, specifying the height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions. Default: 3
padding (["valid", "same"], optional) – The padding to use. Default: “same”
kwargs (dict) – Any additional Keras standard layer keyword arguments to pass to the Convolutional 2D layer

Returns:

The output tensor from the Residual Block

Return type:

keras.KerasTensor

__call__(inputs: KerasTensor) → KerasTensor

Call the Faceswap Residual Block.

Parameters: inputs (keras.KerasTensor) – The input to the layer
Returns: The output tensor from the Residual Block
Return type: keras.KerasTensor
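The general residual pattern (transform the input, then add the untouched input back in) can be sketched with NumPy. This is an illustration of the skip connection only and is not the exact dfaker layer stack: the 1x1 matrix products stand in for real convolutions, and `residual_block`, `leaky_relu` and the 0.2 slope are assumptions.

```python
import numpy as np


def leaky_relu(x: np.ndarray, alpha: float = 0.2) -> np.ndarray:
    """Leaky ReLU with an assumed negative slope of ``alpha``."""
    return np.where(x > 0, x, alpha * x)


def residual_block(inputs: np.ndarray, w1: np.ndarray, w2: np.ndarray) -> np.ndarray:
    """Two 1x1 'convolutions' with a skip connection around them.

    inputs has shape (H, W, C); w1 and w2 have shape (C, C) so that the
    channel count is unchanged and the addition is valid.
    """
    x = leaky_relu(inputs @ w1)   # first convolution + activation
    x = x @ w2                    # second convolution
    return leaky_relu(x + inputs)  # add the original input back in


sample = np.array([[[1.0, -1.0]]])  # shape (1, 1, 2)
zeros = np.zeros((2, 2))
# With zero weights the block reduces to the activation of its input
identity_out = residual_block(sample, zeros, zeros)
print(identity_out)
```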

class lib.model.nn_blocks.SeparableConv2DBlock(filters: int, kernel_size: int | tuple[int, int] = 5, strides: int | tuple[int, int] = 2, **kwargs)

Separable Convolution Block.

Parameters:

filters (int) – The dimensionality of the output space (i.e. the number of output filters in the convolution)
kernel_size (int, optional) – An integer or tuple/list of 2 integers, specifying the height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions. Default: 5
strides (tuple or int, optional) – An integer or tuple/list of 2 integers, specifying the strides of the convolution along the height and width. Can be a single integer to specify the same value for all spatial dimensions. Default: 2
kwargs (dict) – Any additional Keras standard layer keyword arguments to pass to the Separable Convolutional 2D layer

__call__(inputs: KerasTensor) → KerasTensor

Call the Faceswap Separable Convolutional 2D Block.

Parameters: inputs (keras.KerasTensor) – The input to the layer
Returns: The output tensor from the Separable Convolutional 2D Block
Return type: keras.KerasTensor
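A separable convolution factors a standard convolution into a per-channel (depthwise) step and a 1x1 (pointwise) step, which reduces the number of weights. The arithmetic below is a sketch that ignores bias terms; the helper names and the channel counts are chosen for illustration only.

```python
def conv2d_params(kernel: int, in_channels: int, out_channels: int) -> int:
    """Weights in a standard 2D convolution (bias excluded)."""
    return kernel * kernel * in_channels * out_channels


def separable_conv2d_params(kernel: int, in_channels: int, out_channels: int) -> int:
    """Weights in a depthwise separable 2D convolution (bias excluded)."""
    depthwise = kernel * kernel * in_channels   # one k x k filter per input channel
    pointwise = in_channels * out_channels      # 1x1 convolution mixing channels
    return depthwise + pointwise


standard = conv2d_params(5, 128, 256)
separable = separable_conv2d_params(5, 128, 256)
print(standard, separable)
```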

class lib.model.nn_blocks.Upscale2xBlock(filters: int, kernel_size: int | tuple[int, int] = 3, padding: str = 'same', activation: str | None = 'leakyrelu', interpolation: str = 'bilinear', sr_ratio: float = 0.5, scale_factor: int = 2, fast: bool = False, **kwargs)

Custom hybrid upscale layer for sub-pixel up-scaling.

Most of up-scaling consists of approximating lighting gradients, which can be accurately achieved using linear fitting. This layer attempts to reduce memory consumption by splitting the work between bilinear and convolutional layers, so that the sub-pixel update captures details whilst the bilinear filter captures lighting.

Adds reflection padding if it has been selected by the user, and other post-processing if requested by the plugin.

Parameters:

filters (int) – The dimensionality of the output space (i.e. the number of output filters in the convolution)
kernel_size (int, optional) – An integer or tuple/list of 2 integers, specifying the height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions. Default: 3
padding (["valid", "same"], optional) – The padding to use. Default: “same”
activation (str or None, optional) – The activation function to use. This is applied at the end of the convolution block. Select one of “leakyrelu”, “prelu” or “swish”. Set to None to not apply an activation function. Default: “leakyrelu”
interpolation (["nearest", "bilinear"], optional) – Interpolation to use for up-sampling. Default: “bilinear”
scale_factor (int, optional) – The amount to upscale the image. Default: 2
sr_ratio (float, optional) – The proportion of super resolution (pixel shuffler) filters to use. Non-fast mode only. Default: 0.5
fast (bool, optional) – Use a faster up-scaling method that may appear more rugged. Default: False
kwargs (dict) – Any additional Keras standard layer keyword arguments to pass to the Convolutional 2D layer

__call__(inputs: KerasTensor) → KerasTensor

Call the Faceswap Upscale 2x Layer.

Parameters: inputs (keras.KerasTensor) – The input to the layer
Returns: The output tensor from the Upscale Layer
Return type: keras.KerasTensor

class lib.model.nn_blocks.UpscaleBlock(filters: int, kernel_size: int | tuple[int, int] = 3, padding: str = 'same', scale_factor: int = 2, normalization: str | None = None, activation: str | None = 'leakyrelu', **kwargs)

An upscale layer for sub-pixel up-scaling.

Adds reflection padding if it has been selected by the user, and other post-processing if requested by the plugin.

Parameters:

filters (int) – The dimensionality of the output space (i.e. the number of output filters in the convolution)
kernel_size (int, optional) – An integer or tuple/list of 2 integers, specifying the height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions. Default: 3
padding (["valid", "same"], optional) – The padding to use. NB: If reflect padding has been selected in the user configuration options, then this argument will be ignored in favor of reflect padding. Default: “same”
scale_factor (int, optional) – The amount to upscale the image. Default: 2
normalization (str or None, optional) – Normalization to apply after the Convolution Layer. Select one of “batch” or “instance”. Set to None to not apply normalization. Default: None
activation (str or None, optional) – The activation function to use. This is applied at the end of the convolution block. Select one of “leakyrelu”, “prelu” or “swish”. Set to None to not apply an activation function. Default: “leakyrelu”
kwargs (dict) – Any additional Keras standard layer keyword arguments to pass to the Convolutional 2D layer

__call__(inputs: KerasTensor) → KerasTensor

Call the Faceswap Upscale Layer.

Parameters: inputs (keras.KerasTensor) – The input to the layer
Returns: The output tensor from the Upscale Layer
Return type: keras.KerasTensor
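Sub-pixel up-scaling is commonly implemented as a convolution that produces filters * scale_factor ** 2 channels, followed by a depth-to-space rearrangement (often called a pixel shuffle). The NumPy sketch below shows only that rearrangement for a single channels-last image. `pixel_shuffle` is a hypothetical helper, and whether UpscaleBlock uses exactly this channel ordering is an assumption.

```python
import numpy as np


def pixel_shuffle(x: np.ndarray, scale: int = 2) -> np.ndarray:
    """Rearrange (H, W, C * scale**2) into (H * scale, W * scale, C)."""
    height, width, channels = x.shape
    if channels % (scale * scale):
        raise ValueError("channels must be divisible by scale ** 2")
    out_channels = channels // (scale * scale)
    # Split the channel axis into (row offset, column offset, output channel)
    x = x.reshape(height, width, scale, scale, out_channels)
    # Interleave the offsets with the spatial axes: (H, scale, W, scale, C)
    x = x.transpose(0, 2, 1, 3, 4)
    return x.reshape(height * scale, width * scale, out_channels)


# One pixel with four channels becomes a 2 x 2 single-channel patch
pixel = np.arange(4).reshape(1, 1, 4)
patch = pixel_shuffle(pixel, scale=2)
print(patch[:, :, 0])
```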

class lib.model.nn_blocks.UpscaleDNYBlock(filters: int, kernel_size: int | tuple[int, int] = 3, padding: str = 'same', activation: str | None = 'leakyrelu', size: int = 2, interpolation: str = 'bilinear', **kwargs)

Upscale block that implements a methodology similar to the Disney Research Paper, using an UpSampling2D block and two convolutions.

Adds reflection padding if it has been selected by the user, and other post-processing if requested by the plugin.

References

https://studios.disneyresearch.com/2020/06/29/high-resolution-neural-face-swapping-for-visual-effects/

Parameters:

filters (int) – The dimensionality of the output space (i.e. the number of output filters in the convolution)
kernel_size (int, optional) – An integer or tuple/list of 2 integers, specifying the height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions. Default: 3
activation (str or None, optional) – The activation function to use. This is applied at the end of the convolution block. Select one of “leakyrelu”, “prelu” or “swish”. Set to None to not apply an activation function. Default: “leakyrelu”
size (int, optional) – The amount to upscale the image. Default: 2
interpolation (["nearest", "bilinear"], optional) – Interpolation to use for up-sampling. Default: “bilinear”
kwargs (dict) – Any additional Keras standard layer keyword arguments to pass to the Convolutional 2D layers
padding (["valid", "same"], optional) – The padding to use. Default: “same”

__call__(inputs: KerasTensor) → KerasTensor

Call the UpscaleDNY block.

Parameters: inputs (keras.KerasTensor) – The input to the block
Returns: The output from the block
Return type: keras.KerasTensor
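The up-sampling step that precedes the convolutions can be sketched for the “nearest” interpolation option, which simply repeats each pixel `size` times along both spatial axes. This is a NumPy illustration of the idea, not the Keras layer, and `upsample_nearest` is a hypothetical helper. The default “bilinear” option would blend neighbouring pixels instead of repeating them.

```python
import numpy as np


def upsample_nearest(x: np.ndarray, size: int = 2) -> np.ndarray:
    """Nearest-neighbour up-sampling of a channels-last (H, W, C) image."""
    # Repeat rows, then columns; the channel axis is left untouched
    return x.repeat(size, axis=0).repeat(size, axis=1)


image = np.array([[1, 2], [3, 4]]).reshape(2, 2, 1)
upsampled = upsample_nearest(image, size=2)
print(upsampled[:, :, 0])
```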

class lib.model.nn_blocks.UpscaleResizeImagesBlock(filters: int, kernel_size: int | tuple[int, int] = 3, padding: str = 'same', activation: str | None = 'leakyrelu', scale_factor: int = 2, interpolation: Literal['nearest', 'bilinear'] = 'bilinear')

Upscale block that uses the Keras Backend function resize_images to perform the up-scaling. Similar in methodology to the Upscale2xBlock.

Adds reflection padding if it has been selected by the user, and other post-processing if requested by the plugin.

Parameters:

filters (int) – The dimensionality of the output space (i.e. the number of output filters in the convolution)
kernel_size (int, optional) – An integer or tuple/list of 2 integers, specifying the height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions. Default: 3
padding (["valid", "same"], optional) – The padding to use. Default: “same”
activation (str or None, optional) – The activation function to use. This is applied at the end of the convolution block. Select one of “leakyrelu”, “prelu” or “swish”. Set to None to not apply an activation function. Default: “leakyrelu”
scale_factor (int, optional) – The amount to upscale the image. Default: 2
interpolation (["nearest", "bilinear"], optional) – Interpolation to use for up-sampling. Default: “bilinear”
kwargs (dict) – Any additional Keras standard layer keyword arguments to pass to the Convolutional 2D layer

__call__(inputs: KerasTensor) → KerasTensor

Call the Faceswap Resize Images Layer.

Parameters: inputs (keras.KerasTensor) – The input to the layer
Returns: The output tensor from the Upscale Layer
Return type: keras.KerasTensor

lib.model.nn_blocks.reset_naming() → None

Reset the naming convention for nn_block layers to start from 0

Used when a model needs to be rebuilt and the names for each build should be identical

Return type: None
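Why resetting the naming matters can be shown with a small counter-based naming scheme. The sketch below is an assumption about how such a scheme might work: `_name_counts`, `get_name` and `reset_counters` are hypothetical names, not the module's internals. Without the reset, a second build would continue numbering from where the first build stopped.

```python
# Hypothetical per-prefix counters used to generate unique layer names
_name_counts: dict[str, int] = {}


def get_name(prefix: str) -> str:
    """Return the next unique name for ``prefix``, e.g. 'conv2d_0'."""
    index = _name_counts.get(prefix, 0)
    _name_counts[prefix] = index + 1
    return f"{prefix}_{index}"


def reset_counters() -> None:
    """Start numbering from 0 again for every prefix."""
    _name_counts.clear()


first_build = [get_name("conv2d"), get_name("conv2d"), get_name("upscale")]
reset_counters()
# After the reset, a rebuild produces identical names
second_build = [get_name("conv2d"), get_name("conv2d"), get_name("upscale")]
print(first_build, second_build)
```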

Functions

reset_naming()

Reset the naming convention for nn_block layers to start from 0

Classes

`Conv2D`(*args[, padding, is_upscale])	A standard Keras Convolution 2D layer with parameters updated to be more appropriate for Faceswap architecture.
`Conv2DBlock`(filters[, kernel_size, strides, ...])	A standard Convolution 2D layer which applies user specified configuration to the layer.
`Conv2DOutput`(filters, kernel_size[, ...])	A Convolution 2D layer that separates out the activation layer so that the data type of the activation can be explicitly set to float32, to fully support mixed precision training.
`DepthwiseConv2D`(*args[, padding, is_upscale])	A standard Keras Depthwise Convolution 2D layer with parameters updated to be more appropriate for Faceswap architecture.
`ResidualBlock`(filters[, kernel_size, padding])	Residual block from dfaker.
`SeparableConv2DBlock`(filters[, kernel_size, ...])	Separable Convolution Block.
`Upscale2xBlock`(filters[, kernel_size, ...])	Custom hybrid upscale layer for sub-pixel up-scaling.
`UpscaleBlock`(filters[, kernel_size, ...])	An upscale layer for sub-pixel up-scaling.
`UpscaleDNYBlock`(filters[, kernel_size, ...])	Upscale block that implements a methodology similar to the Disney Research Paper, using an UpSampling2D block and two convolutions.
`UpscaleResizeImagesBlock`(filters[, ...])	Upscale block that uses the Keras Backend function resize_images to perform the up-scaling. Similar in methodology to the `Upscale2xBlock`.

lib.model.normalization Module 

Normalization methods for faceswap.py specific to Torch backend

class lib.model.normalization.AdaInstanceNormalization(*args, **kwargs)

Adaptive Instance Normalization Layer for Keras.

Parameters:

axis (int, optional) – The axis that should be normalized (typically the features axis). For instance, after a Conv2D layer with data_format=“channels_first”, set axis=1 in InstanceNormalization. Setting axis=None will normalize all values in each instance of the batch. Axis 0 is the batch dimension. axis cannot be set to 0 to avoid errors. Default: None
momentum (float, optional) – Momentum for the moving mean and the moving variance. Default: 0.99
epsilon (float, optional) – Small float added to variance to avoid dividing by zero. Default: 1e-3
center (bool, optional) – If True, add offset of beta to normalized tensor. If False, beta is ignored. Default: True
scale (bool, optional) – If True, multiply by gamma. If False, gamma is not used. When the next layer is linear (also e.g. relu), this can be disabled since the scaling will be done by the next layer. Default: True

References

Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization - https://arxiv.org/abs/1703.06868

build(input_shape: tuple[tuple[int, ...], ...]) → None

Creates the layer weights.

Parameters: input_shape (tuple[int, ...]) – Keras tensor (future input to layer) or list/tuple of Keras tensors to reference for weight shape computations.
Return type: None

call(inputs: KerasTensor) → KerasTensor

This is where the layer’s logic lives.

Parameters: inputs (keras.KerasTensor) – Input tensor, or list/tuple of input tensors
Returns: A tensor or list/tuple of tensors
Return type: keras.KerasTensor
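The operation from the referenced paper can be sketched in NumPy: the content features are normalized per channel and then rescaled to the per-channel statistics of the style features. This follows the paper's formula and is not taken from the layer's code. The function name `adain`, the single-image (H, W, C) layout and the way the style statistics are supplied are all assumptions.

```python
import numpy as np


def adain(content: np.ndarray, style: np.ndarray, epsilon: float = 1e-3) -> np.ndarray:
    """Adaptive instance normalization of (H, W, C) feature maps."""
    axes = (0, 1)  # statistics are taken over the spatial axes, per channel
    c_mean = content.mean(axis=axes, keepdims=True)
    c_std = np.sqrt(content.var(axis=axes, keepdims=True) + epsilon)
    s_mean = style.mean(axis=axes, keepdims=True)
    s_std = np.sqrt(style.var(axis=axes, keepdims=True) + epsilon)
    # Whiten the content statistics, then apply the style statistics
    return s_std * (content - c_mean) / c_std + s_mean


rng = np.random.default_rng(0)
content_feats = rng.normal(2.0, 3.0, size=(8, 8, 4))
style_feats = rng.normal(-1.0, 0.5, size=(8, 8, 4))
stylized = adain(content_feats, style_feats)
```

After the call, the per-channel mean and standard deviation of `stylized` approximately match those of `style_feats`.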

compute_output_shape(input_shape: tuple[int, ...]) → int

Calculate the output shape from this layer.

Parameters: input_shape (tuple) – The input shape to the layer
Returns: The output shape to the layer
Return type: int

get_config() → dict[str, Any]

Returns the config of the layer.

The Keras configuration for the layer.

Returns: A python dictionary containing the layer configuration
Return type: dict[str, Any]

class lib.model.normalization.GroupNormalization(*args, **kwargs)

Group Normalization

Parameters:

axis (int, optional) – The axis that should be normalized (typically the features axis). For instance, after a Conv2D layer with data_format=“channels_first”, set axis=1 in InstanceNormalization. Setting axis=None will normalize all values in each instance of the batch. Axis 0 is the batch dimension. axis cannot be set to 0 to avoid errors. Default: None
gamma_init (str, optional) – Initializer for the gamma weight. Default: “one”
beta_init (str, optional) – Initializer for the beta weight. Default: “zero”
gamma_regularizer (varies, optional) – Optional regularizer for the gamma weight. Default: None
beta_regularizer (varies, optional) – Optional regularizer for the beta weight. Default: None
epsilon (float, optional) – Small float added to variance to avoid dividing by zero. Default: 1e-3
group (int, optional) – The group size. Default: 32
data_format (["channels_first", "channels_last"], optional) – The required data format. Default: None
kwargs (dict) – Any additional standard Keras Layer keyword arguments

References

Shaoanlu GAN: https://github.com/shaoanlu/faceswap-GAN

build(input_shape: tuple[int, ...]) → None

Creates the layer weights.

Parameters: input_shape (tuple[int, ...]) – Keras tensor (future input to layer) or list/tuple of Keras tensors to reference for weight shape computations.
Return type: None

call(inputs: KerasTensor, *args, **kwargs) → KerasTensor

This is where the layer’s logic lives.

Parameters: inputs (keras.KerasTensor) – Input tensor, or list/tuple of input tensors
Returns: A tensor or list/tuple of tensors
Return type: keras.KerasTensor
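Group normalization splits the channels into groups and normalizes each group with its own mean and variance. The NumPy sketch below covers a single channels-last image and is not the layer's code. It assumes the `groups` argument is the number of groups, which may differ from how the layer interprets its `group` parameter, and `group_norm` is a hypothetical helper.

```python
import numpy as np


def group_norm(x: np.ndarray, groups: int, gamma: np.ndarray, beta: np.ndarray,
               epsilon: float = 1e-3) -> np.ndarray:
    """Group normalization of an (H, W, C) tensor with C divisible by groups."""
    height, width, channels = x.shape
    if channels % groups:
        raise ValueError("channels must be divisible by groups")
    # Expose the groups as their own axis: (H, W, groups, C // groups)
    grouped = x.reshape(height, width, groups, channels // groups)
    # One mean and variance per group, taken over space and in-group channels
    mean = grouped.mean(axis=(0, 1, 3), keepdims=True)
    var = grouped.var(axis=(0, 1, 3), keepdims=True)
    normed = (grouped - mean) / np.sqrt(var + epsilon)
    # Restore the original layout, then apply the per-channel scale and offset
    return normed.reshape(height, width, channels) * gamma + beta


rng = np.random.default_rng(0)
features = rng.normal(5.0, 2.0, size=(4, 4, 8))
group_normed = group_norm(features, groups=2, gamma=np.ones(8), beta=np.zeros(8))
```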

compute_output_shape(input_shape: tuple[int, ...]) → tuple[int, ...]

Calculate the output shape from this layer.

Parameters: input_shape (tuple) – The input shape to the layer
Returns: The output shape to the layer
Return type: tuple[int, ...]

get_config() → dict[str, Any]

Returns the config of the layer.

The Keras configuration for the layer.

Returns: A python dictionary containing the layer configuration
Return type: dict[str, Any]

class lib.model.normalization.InstanceNormalization(*args, **kwargs)

Instance normalization layer (Lei Ba et al, 2016, Ulyanov et al., 2016).

Normalize the activations of the previous layer at each step, i.e. applies a transformation that maintains the mean activation close to 0 and the activation standard deviation close to 1.

Parameters:

axis (int, optional) – The axis that should be normalized (typically the features axis). For instance, after a Conv2D layer with data_format=“channels_first”, set axis=1 in InstanceNormalization. Setting axis=None will normalize all values in each instance of the batch. Axis 0 is the batch dimension. axis cannot be set to 0 to avoid errors. Default: None
epsilon (float, optional) – Small float added to variance to avoid dividing by zero. Default: 1e-3
center (bool, optional) – If True, add offset of beta to normalized tensor. If False, beta is ignored. Default: True
scale (bool, optional) – If True, multiply by gamma. If False, gamma is not used. When the next layer is linear (also e.g. relu), this can be disabled since the scaling will be done by the next layer. Default: True
beta_initializer (str, optional) – Initializer for the beta weight. Default: “zeros”
gamma_initializer (str, optional) – Initializer for the gamma weight. Default: “ones”
beta_regularizer (str, optional) – Optional regularizer for the beta weight. Default: None
gamma_regularizer (str, optional) – Optional regularizer for the gamma weight. Default: None
beta_constraint (float, optional) – Optional constraint for the beta weight. Default: None
gamma_constraint (float, optional) – Optional constraint for the gamma weight. Default: None

References

Layer Normalization - https://arxiv.org/abs/1607.06450
Instance Normalization: The Missing Ingredient for Fast Stylization - https://arxiv.org/abs/1607.08022

build(input_shape: tuple[int, ...]) → None

Creates the layer weights.

Parameters: input_shape (tuple[int, ...]) – Keras tensor (future input to layer) or list/tuple of Keras tensors to reference for weight shape computations.
Return type: None

call(inputs: KerasTensor) → KerasTensor

This is where the layer’s logic lives.

Parameters: inputs (keras.KerasTensor) – Input tensor, or list/tuple of input tensors
Returns: A tensor or list/tuple of tensors
Return type: keras.KerasTensor
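The transformation described for this layer (mean close to 0, standard deviation close to 1) can be sketched in NumPy for a channels-last batch, where each sample and each channel is normalized over its own spatial positions. This is an illustration rather than the layer's code. `instance_norm` is a hypothetical helper, and the scalar `gamma` and `beta` stand in for the learned per-channel weights.

```python
import numpy as np


def instance_norm(x: np.ndarray, gamma: float = 1.0, beta: float = 0.0,
                  epsilon: float = 1e-3) -> np.ndarray:
    """Instance normalization of a (batch, H, W, C) tensor."""
    # Statistics per sample and per channel, over the spatial axes only
    mean = x.mean(axis=(1, 2), keepdims=True)
    var = x.var(axis=(1, 2), keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + epsilon) + beta


rng = np.random.default_rng(0)
batch = rng.normal(4.0, 3.0, size=(2, 8, 8, 3))
instance_normed = instance_norm(batch)
```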

compute_output_shape(input_shape: tuple[int, ...]) → tuple[int, ...]

Calculate the output shape from this layer.

Parameters: input_shape (tuple) – The input shape to the layer
Returns: The output shape to the layer
Return type: tuple[int, ...]

get_config() → dict[str, Any]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstated later (without its trained weights) from this configuration.

The configuration of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Returns: A python dictionary containing the layer configuration
Return type: dict[str, Any]
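The round trip described for get_config can be sketched with a plain Python class. `ExampleLayer` is a hypothetical stand-in, not a Keras layer, and its `from_config` class method mirrors the Keras convention only in spirit. The point is that the configuration holds constructor arguments, not trained weights, and survives serialization.

```python
import json
from typing import Any


class ExampleLayer:
    """Hypothetical layer-like object holding only constructor arguments."""

    def __init__(self, axis: int = -1, epsilon: float = 1e-3) -> None:
        self.axis = axis
        self.epsilon = epsilon

    def get_config(self) -> dict[str, Any]:
        """Return a serializable dictionary of the constructor arguments."""
        return {"axis": self.axis, "epsilon": self.epsilon}

    @classmethod
    def from_config(cls, config: dict[str, Any]) -> "ExampleLayer":
        """Re-create an equivalent (untrained) layer from its configuration."""
        return cls(**config)


original = ExampleLayer(axis=1, epsilon=1e-5)
# The config is plain data, so it can pass through JSON unchanged
serialized = json.dumps(original.get_config())
restored = ExampleLayer.from_config(json.loads(serialized))
```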

class lib.model.normalization.RMSNormalization(*args, **kwargs)

Root Mean Square Layer Normalization (Biao Zhang, Rico Sennrich, 2019)

RMSNorm is a simplification of the original layer normalization (LayerNorm). LayerNorm is a regularization technique that might handle the internal covariate shift issue so as to stabilize the layer activations and improve model convergence. It has proven quite successful in NLP models. In some cases, LayerNorm has become an essential component for model optimization, such as in the state-of-the-art neural machine translation (NMT) Transformer model.

RMSNorm simplifies LayerNorm by removing the mean-centering operation and normalizing layer activations with the RMS statistic alone.

Parameters:

axis (int) – The axis to normalize across. Typically this is the features axis. The left-out axes are typically the batch axis/axes. This argument defaults to -1, the last dimension in the input.
epsilon (float, optional) – Small float added to variance to avoid dividing by zero. Default: 1e-8
partial (float, optional) – Partial multiplier for calculating pRMSNorm. Valid values are between 0.0 and 1.0. Setting to 0.0 or 1.0 disables. Default: 0.0
bias (bool, optional) – Whether to use a bias term for RMSNorm. Disabled by default because RMSNorm does not enforce re-centering invariance. Default: False
kwargs (dict) – Standard keras layer kwargs

References

RMS Normalization - https://arxiv.org/abs/1910.07467
Official implementation - https://github.com/bzhangGo/rmsnorm

build(input_shape: tuple[int, ...]) → None

Validate and populate axis

Parameters: input_shape (tuple[int, ...]) – Keras tensor (future input to layer) or list/tuple of Keras tensors to reference for weight shape computations.
Return type: None

call(inputs: KerasTensor, *args, **kwargs) → KerasTensor

Call Root Mean Square Layer Normalization

Parameters: inputs (keras.KerasTensor) – Input tensor, or list/tuple of input tensors
Returns: A tensor or list/tuple of tensors
Return type: keras.KerasTensor
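The computation can be sketched in NumPy: activations are divided by their root mean square along the chosen axis, with no mean subtraction. The sketch is not the layer's code. `rms_norm` is a hypothetical helper, the bias term is omitted, and treating `partial` as the fraction of leading features used to estimate the RMS (pRMSNorm) is an assumption based on the referenced paper.

```python
import numpy as np


def rms_norm(x: np.ndarray, scale: float = 1.0, axis: int = -1,
             epsilon: float = 1e-8, partial: float = 0.0) -> np.ndarray:
    """Root mean square normalization along ``axis`` (no mean-centering)."""
    size = x.shape[axis]
    if 0.0 < partial < 1.0:
        # pRMSNorm: estimate the RMS from the leading fraction of features
        count = max(1, int(size * partial))
        subset = np.take(x, np.arange(count), axis=axis)
    else:
        # 0.0 or 1.0 disables the partial estimate: use every feature
        subset = x
    rms = np.sqrt(np.mean(np.square(subset), axis=axis, keepdims=True))
    return scale * x / (rms + epsilon)


activations = np.array([[3.0, 4.0, 0.0, 0.0]])
full_rms = rms_norm(activations)
partial_rms = rms_norm(activations, partial=0.5)
```

With the full estimate, the mean square of the output along the normalized axis is approximately 1. With `partial=0.5`, only the first two features determine the divisor.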

compute_output_shape(input_shape: tuple[int, ...]) → tuple[int, ...]

The output shape of the layer is the same as the input shape.

Parameters: input_shape (tuple[int, ...]) – The input shape to the layer
Returns: The output shape to the layer
Return type: tuple[int, ...]

get_config() → dict[str, Any]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstated later (without its trained weights) from this configuration.

The configuration of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Returns: A python dictionary containing the layer configuration
Return type: dict[str, Any]

Classes

`AdaInstanceNormalization`(*args, **kwargs)	Adaptive Instance Normalization Layer for Keras.
`GroupNormalization`(*args, **kwargs)	Group Normalization
`InstanceNormalization`(*args, **kwargs)	Instance normalization layer (Lei Ba et al, 2016, Ulyanov et al., 2016).
`RMSNormalization`(*args, **kwargs)	Root Mean Square Layer Normalization (Biao Zhang, Rico Sennrich, 2019)

Class Inheritance Diagram

Inheritance diagram of lib.model.normalization.AdaInstanceNormalization, lib.model.normalization.GroupNormalization, lib.model.normalization.InstanceNormalization, lib.model.normalization.RMSNormalization