lib.training package

The training package holds libraries that assist with training a model

lib.training.loss Module 

Handles the collation, weighting, masking and calculation of the selected Loss functions for training Faceswap models

class lib.training.loss.BatchLoss(unweighted: list[dict[str, Tensor]], weighted: list[dict[str, Tensor]], mask: Tensor | None = None)

Dataclass for holding Loss values for a batch of data

Parameters:

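unweighted (list[dict[str, Tensor]]) – For each side output, the unweighted loss scalars for each function for each item in the batch
weighted (list[dict[str, Tensor]]) – For each side output, the weighted loss scalars for each function for each item in the batch
mask (Tensor | None) – The loss scalar for the mask for each item in the batch if learn_mask is selected otherwise None. Default: None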

mask: Tensor | None = None: The loss scalar for the mask for each item in the batch if learn_mask is selected otherwise None. Default: None

to_cpu() → Self

Detaches all contained loss values and moves them to CPU

Return type:: This object with all tensors detached and moved to CPU

property total: Tensor: The total single weighted loss scalar for all items in the batch for backprop

unweighted: list[dict[str, Tensor]]: For each side output, the unweighted loss scalars for each function for each item in the batch

weighted: list[dict[str, Tensor]]: For each side output, the weighted loss scalars for each function for each item in the batch

class lib.training.loss.LossCollator(functions: list[str], weights: list[float], color_order: Literal['bgr', 'rgb'], use_mask: bool, eye_multiplier: float, mouth_multiplier: float, smallest_output: int, mask_loss: str | None = None)

Compiles the chosen loss functions and calculates the values in the training loop

Parameters:

functions (list[str]) – List of loss function names from configuration file to collate for loss calculation
weights (list[float]) – List of weights, corresponding to the list of functions, to apply to each loss function
color_order (T.Literal['bgr', 'rgb']) – The color order that the model is training in
use_mask (bool) – True if the loss should be masked because penalized mask loss has been selected
eye_multiplier (float) – The amount of extra weighting to apply to the eye area
mouth_multiplier (float) – The amount of extra weighting to apply to the mouth area
smallest_output (int) – The smallest output from the model. Required for initializing some loss functions
mask_loss (str | None) – The loss function to use if learn_mask is enabled. Default: None (not enabled)

forward(y_true_all: list[torch.Tensor], y_pred_all: list[torch.Tensor], meta: BatchMeta) → BatchLoss

Call the loss functions, reduce to batch dimension, apply masks and weighting and obtain the weighted and unweighted per function values and the weighted total loss scalar

Parameters:

y_true_all (list[torch.Tensor]) – The ground truth batch of images for all outputs for a side of the model
y_pred_all (list[torch.Tensor]) – The batch of model predictions for all outputs for a side of the model
meta (BatchMeta) – The meta information for the batch

Return type:

The loss scalars for the batch
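
The relationship between the unweighted values, the weighted values and the total can be sketched in plain Python. This is a minimal illustration rather than the library's implementation: floats stand in for tensors, `SketchBatchLoss` and `collate_losses` are hypothetical names, and summing the batch mean of each weighted loss to obtain the total is an assumption.

```python
from dataclasses import dataclass


@dataclass
class SketchBatchLoss:
    """Hypothetical stand-in for BatchLoss, using floats instead of tensors."""
    unweighted: list[dict[str, list[float]]]
    weighted: list[dict[str, list[float]]]

    @property
    def total(self) -> float:
        """Assumed reduction: sum of the batch mean of every weighted loss."""
        return sum(sum(vals) / len(vals)
                   for output in self.weighted
                   for vals in output.values())


def collate_losses(per_output: list[dict[str, list[float]]],
                   weights: dict[str, float]) -> SketchBatchLoss:
    """Apply one weight per loss function to the per-item loss values."""
    weighted = [{name: [val * weights[name] for val in vals]
                 for name, vals in output.items()}
                for output in per_output]
    return SketchBatchLoss(unweighted=per_output, weighted=weighted)


# One model output, two loss functions, a batch of two items
raw = [{"mae": [0.25, 0.75], "ssim": [0.5, 0.5]}]
loss = collate_losses(raw, {"mae": 1.0, "ssim": 0.5})
```

The unweighted values are kept alongside the weighted ones so that each function can be reported on its own scale, while only the weighted total would be used for backprop.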

Classes 

`BatchLoss`(unweighted, weighted[, mask])	Dataclass for holding Loss values for a batch of data
`LossCollator`(functions, weights, ...[, ...])	Compiles the chosen loss functions and calculates the values in the training loop

Class Inheritance Diagram 

Inheritance diagram of lib.training.loss.BatchLoss, lib.training.loss.LossCollator

lib.training.lr_finder Module 

Learning Rate Finder for faceswap.py.

class lib.training.lr_finder.LRStrength(*values): Enum for how aggressively to set the optimal learning rate

class lib.training.lr_finder.LearningRateFinder(trainer: train.Trainer, scheduler: ExponentialLR, steps: int, strength: T.Literal['default', 'aggressive', 'extreme'], mode: T.Literal['set', 'graph_and_set', 'graph_and_exit'], stop_factor: int = 4, beta: float = 0.98)

Learning Rate Finder

Parameters:

trainer (train.Trainer) – The training loop with the loaded training plugin
scheduler (ExponentialLR) – The LRFinder scheduler
steps (int) – The number of steps to run the finder for
strength (T.Literal['default', 'aggressive', 'extreme']) – How aggressively to set the optimal learning rate
mode (T.Literal['set', 'graph_and_set', 'graph_and_exit']) – The mode to run the Learning Rate Finder in
stop_factor (int) – When to stop finding the optimal learning rate
beta (float) – Amount to smooth loss by, for graphing purposes

property best_lr: None | float: The discovered best learning rate or None if not found

find() → None

Find the optimal learning rate

Return type:: None
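
A learning rate range test of this kind can be sketched as follows. This is an illustration under stated assumptions, not the LearningRateFinder code: the exponential sweep, the bias-corrected moving average controlled by beta and the "stop when the smoothed loss exceeds stop_factor times the best loss" rule are common conventions assumed here, and `find_lr` and `toy_loss` are hypothetical names.

```python
import math


def find_lr(loss_fn, start_lr: float, end_lr: float, steps: int,
            stop_factor: int = 4, beta: float = 0.98):
    """Sweep the learning rate exponentially and record smoothed losses.

    loss_fn is a hypothetical callable returning the loss for one training
    step at the given learning rate.
    """
    multiplier = (end_lr / start_lr) ** (1.0 / steps)
    lr, avg_loss, best_loss = start_lr, 0.0, math.inf
    history = []
    for step in range(1, steps + 1):
        loss = loss_fn(lr)
        # Exponential moving average with bias correction
        avg_loss = beta * avg_loss + (1 - beta) * loss
        smoothed = avg_loss / (1 - beta ** step)
        # Assumed stop rule: the loss has blown up relative to the best seen
        if step > 1 and smoothed > stop_factor * best_loss:
            break
        best_loss = min(best_loss, smoothed)
        history.append((lr, smoothed))
        lr *= multiplier
    return history


def toy_loss(lr: float) -> float:
    """Synthetic loss curve: improves until lr is near 1e-2, then worsens."""
    return (math.log10(lr) + 2) ** 2 + 0.1


history = find_lr(toy_loss, start_lr=1e-6, end_lr=1.0, steps=100)
best_lr = min(history, key=lambda item: item[1])[0]
```

How the recorded curve is turned into a final learning rate (the strength setting) is not shown; taking the minimum of the smoothed loss is only a placeholder.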

Classes 

`LRStrength`(*values)	Enum for how aggressively to set the optimal learning rate
`LearningRateFinder`(trainer, scheduler, ...)	Learning Rate Finder

Class Inheritance Diagram 

Inheritance diagram of lib.training.lr_finder.LRStrength, lib.training.lr_finder.LearningRateFinder

lib.training.lr_warmup Module 

Handles Learning Rate Warmup when training a model

class lib.training.lr_warmup.WarmupScheduler(optimizer: Optimizer, steps: int, last_epoch: int = -1)

Handles the updating of the model’s learning rate during Learning Rate Warmup

Parameters:

optimizer (Optimizer) – The torch optimizer in use
steps (int) – The number of iterations to warmup the learning rate for
last_epoch (int) – The last step that was run (last_epoch is a misnomer inherited from PyTorch and actually refers to steps in our use case). Default: -1 (not yet started)

get_lr() → list[float | Tensor]

Get the learning rate for the current step

Return type:: The learning rate for each parameter group for the next step

step(epoch=None) → None

If a learning rate update is required, update the model’s learning rate, otherwise do nothing

Parameters:: epoch – Deprecated argument from PyTorch that should always be None. Default: None
Return type:: None

steps: The total number of steps to warmup the LR for

Classes 

`WarmupScheduler`(optimizer, steps[, last_epoch])	Handles the updating of the model's learning rate during Learning Rate Warmup

lib.training.optimizer Module 

Wraps the selected Torch optimizer and handles optimizer related functions such as loss scaling, clipping and gradient accumulation

class lib.training.optimizer.GradClip(method: Literal['autoclip', 'global_norm', 'norm', 'value'], value: float, autoclip_history: int = 10000)

Handles the clipping of gradients based on user supplied parameters

Parameters:

method (T.Literal['autoclip', 'global_norm', 'norm', 'value']) – The clipping method to use
value (float) – The clipping value to use. For autoclip this is the percentile to clip at (a value of 1.0 will clip at the 10th percentile, a value of 2.5 will clip at the 25th percentile, etc.)
autoclip_history (int) – The history length for auto clipping. Default: 10000

__call__(parameters: list[Parameter]) → None

Clip the given parameters by the chosen method

Parameters:: parameters (list[Parameter]) – The parameters to clip
Return type:: None
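
The autoclip method can be sketched in plain Python: keep a history of gradient norms and clip to a percentile of that history. This is a simplified illustration, not GradClip itself. `AutoClipSketch` is a hypothetical name, a flat list of floats stands in for the parameters' gradients, and the sketch returns the clipped values whereas the documented `__call__` returns None.

```python
import math
from collections import deque


class AutoClipSketch:
    """Clip gradients to a percentile of recently observed gradient norms."""

    def __init__(self, value: float, history: int = 10000) -> None:
        self.percentile = value * 10.0  # 1.0 -> 10th percentile, per the docs
        self.norms: deque[float] = deque(maxlen=history)

    def _threshold(self) -> float:
        """Linearly interpolated percentile of the stored norms."""
        ordered = sorted(self.norms)
        pos = (len(ordered) - 1) * self.percentile / 100.0
        low, high = math.floor(pos), math.ceil(pos)
        return ordered[low] + (ordered[high] - ordered[low]) * (pos - low)

    def __call__(self, grads: list[float]) -> list[float]:
        """Record the norm of `grads` and rescale them if it is too large."""
        norm = math.sqrt(sum(g * g for g in grads))
        self.norms.append(norm)
        limit = self._threshold()
        if norm > limit > 0.0:
            return [g * limit / norm for g in grads]
        return grads


clip = AutoClipSketch(value=5.0, history=100)  # clip at the 50th percentile
for scale in (1.0, 2.0, 3.0):
    clipped = clip([3.0 * scale, 4.0 * scale])  # norms 5, 10, 15
```

On the third call the history holds norms of 5, 10 and 15, so the 50th percentile is 10 and the gradient with norm 15 is scaled down to norm 10.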

class lib.training.optimizer.Optimizer(model: Model, config: type[OptConfig], mixed_precision: bool = False, warmup_steps: int = 0)

Object for managing the selected Torch optimizer

Parameters:

model (Model) – The model that is to be trained
config (type[OptConfig]) – The optimizer user configuration options
mixed_precision (bool) – True to train using mixed precision. Default: False
warmup_steps (int) – The number of steps to warmup the learning rate for. Default: 0

backward(loss: Tensor) → None

Perform the optimizer’s backward pass

Parameters:: loss (Tensor) – The loss scalar from the forward pass
Return type:: None

find_learning_rate(trainer: Trainer, steps: int, start_lr: float, end_lr: float, strength: T.Literal['default', 'aggressive', 'extreme'], mode: T.Literal['set', 'graph_and_set', 'graph_and_exit']) → bool

Use the Learning Rate Finder to discover the optimal learning rate

Parameters:

trainer (Trainer) – The training loop with the loaded training plugin
steps (int) – The number of iterations to run the learning rate finder for
start_lr (float) – The learning rate to start scanning from
end_lr (float) – The final learning rate to scan until
strength (T.Literal['default', 'aggressive', 'extreme']) – How aggressively to set the optimal learning rate
mode (T.Literal['set', 'graph_and_set', 'graph_and_exit']) – The mode to run the Learning Rate Finder in

Return type:

True if an optimal learning rate was discovered.

load_state_dict(state_dict: dict[str, Any]) → None

Load the serialized data from a state dict into this object

Parameters:: state_dict (dict[str, Any]) – The serialized data to load
Return type:: None

set_lr(lr: float) → None

Manually assign the optimizer’s learning rate with the given value

Parameters:: lr (float) – The learning rate to apply to the optimizer
Return type:: None

state_dict() → dict[str, Any]

Serialized data as a dict for relevant options contained in this class

Return type:: The serialized data for this object for saving and loading

step() → None

Perform the optimizer step if valid and zero the gradients.

Handles gradient accumulation, scaling for mixed precision and gradient clipping

Return type:: None
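
The "step if valid" behaviour under gradient accumulation can be illustrated with a toy stepper. This is a sketch only: `AccumulatingStepper` is hypothetical, a single float stands in for the model weights, averaging gradients over the window is an assumption, and the sketch returns a bool for demonstration where the documented step() returns None.

```python
class AccumulatingStepper:
    """Apply an update only once every `accumulate` calls to step()."""

    def __init__(self, accumulate: int) -> None:
        self.accumulate = accumulate
        self.calls = 0
        self.grad = 0.0      # running gradient since the last update
        self.weight = 1.0    # a single toy parameter
        self.updates = 0

    def backward(self, grad: float) -> None:
        """Accumulate a gradient, averaged over the accumulation window."""
        self.grad += grad / self.accumulate

    def step(self, lr: float = 0.5) -> bool:
        """Update the weight and zero the gradient if the window is full."""
        self.calls += 1
        if self.calls % self.accumulate != 0:
            return False
        self.weight -= lr * self.grad
        self.grad = 0.0
        self.updates += 1
        return True


stepper = AccumulatingStepper(accumulate=4)
applied = []
for grad in (0.5, 0.5, 1.0, 2.0, 1.0):
    stepper.backward(grad)
    applied.append(stepper.step())
```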

to(device: torch.Device) → None

Place the optimizer onto the given device

Parameters:: device (torch.Device) – The device to place the optimizer on to
Return type:: None

lib.training.optimizer.get_parameter_group_ids(trainable_variables: list[Variable]) → dict[int, T.Literal['decay', 'no_decay']]

Obtain the index of each item in the Keras model’s trainable weights that belongs to each of the optimizer’s parameter groups (i.e. split by weights that take decay and weights that don’t take decay)

Parameters:: trainable_variables (list[Variable]) – List of trainable variables from the Keras model
Return type:: Dictionary of the Keras model’s trainable weight index to the name of the parameter group
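
The shape of the returned mapping can be illustrated with a small sketch. The rule used to decide which weights take decay is not given above, so the one used here (only weights with two or more dimensions decay, which excludes biases and normalization scales) is an assumption, and `ToyVariable` and `group_ids` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Literal


@dataclass
class ToyVariable:
    """Hypothetical stand-in for a trainable variable."""
    name: str
    ndim: int


def group_ids(variables: list[ToyVariable]
              ) -> dict[int, Literal["decay", "no_decay"]]:
    """Map each variable's index to a parameter group name.

    Assumed rule: only weights with 2 or more dimensions take decay.
    """
    return {idx: "decay" if var.ndim >= 2 else "no_decay"
            for idx, var in enumerate(variables)}


variables = [ToyVariable("conv/kernel", 4),
             ToyVariable("conv/bias", 1),
             ToyVariable("norm/gamma", 1),
             ToyVariable("dense/kernel", 2)]
groups = group_ids(variables)
```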

Functions 

`get_parameter_group_ids`(trainable_variables)	Obtain the index of each item in the Keras model's trainable weights that belongs to each of the optimizer's parameter groups (i.e. split by weights that take decay and weights that don't take decay)

Classes 

`GradClip`(method, value[, autoclip_history])	Handles the clipping of gradients based on user supplied parameters
`Optimizer`(model, config[, mixed_precision, ...])	Object for managing the selected Torch optimizer

lib.training.preview Module 

Handles the creation of display images for preview window and timelapses

class lib.training.preview.Samples(coverage_ratio: float, has_mask: bool, mask_opacity: int, mask_color: str)

Compile samples for display for preview and time-lapse

Parameters:

coverage_ratio (float) – Ratio of face to be cropped out of the training image.
has_mask (bool) – True if the model was trained with a mask
mask_opacity (int) – The opacity (as a percentage) to use for the mask overlay
mask_color (str) – The hex RGB value to use for the mask overlay
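
To show how a hex RGB string and a percentage opacity could translate into an overlay on BGR image data, here is a minimal sketch. It is not the Samples implementation: `hex_to_bgr` and `blend_pixel` are hypothetical helpers, and simple alpha blending of a single pixel is an assumption.

```python
def hex_to_bgr(color: str) -> tuple[int, int, int]:
    """Convert a hex RGB string such as '#ff0000' to a (B, G, R) tuple."""
    digits = color.lstrip("#")
    red, green, blue = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    return blue, green, red


def blend_pixel(pixel: tuple[int, int, int],
                overlay: tuple[int, int, int],
                opacity: int) -> tuple[int, ...]:
    """Alpha blend `overlay` onto `pixel` at `opacity` percent."""
    alpha = opacity / 100.0
    return tuple(round(p * (1 - alpha) + o * alpha)
                 for p, o in zip(pixel, overlay))


mask_bgr = hex_to_bgr("#ff0000")  # red in RGB becomes (0, 0, 255) in BGR
blended = blend_pixel((100, 100, 100), mask_bgr, opacity=40)
```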

get_preview(predictions: npt.NDArray[np.float32], targets: npt.NDArray[np.float32]) → npt.NDArray[np.uint8]

Compile a preview image.

predictions: The (BGR) predictions in shape (src_side, dst_side, batch_size, height, width, channels)
targets: Full size BGR face patches at 100% coverage for patching predictions into, in (A, B, …) order

Return type:

A compiled preview image ready for display or saving

Parameters:

predictions (npt.NDArray[np.float32])
targets (npt.NDArray[np.float32])

toggle_mask_display() → None

Toggle the mask overlay on or off depending on user input.

Return type:: None

Classes 

`Samples`(coverage_ratio, has_mask, ...)	Compile samples for display for preview and time-lapse

lib.training.preview_cv Module 

The pop up preview window for Faceswap.

If Tkinter is installed, then this will be used to manage the preview image, otherwise we fall back to OpenCV’s imshow

class lib.training.preview_cv.PreviewBase(preview_buffer: PreviewBuffer, triggers: dict[Literal['toggle_mask', 'refresh', 'save', 'quit', 'shutdown'], Event] | None = None)

Parent class for OpenCV and Tkinter Preview Windows

Parameters:

preview_buffer (PreviewBuffer) – The thread safe object holding the preview images
triggers (dict, optional) – Dictionary of event triggers for pop-up preview. Not required when running inside the GUI. Default: None

class lib.training.preview_cv.PreviewBuffer

A thread safe class for holding preview images

add_image(name: str, image: np.ndarray) → None

Add an image to the preview buffer in a thread safe way

Parameters:

name (str)
image (np.ndarray)

Return type:

None

get_images() → Generator[tuple[str, np.ndarray], None, None]

Get the latest images from the preview buffer. When the iterator is exhausted, the updated event is cleared.

Yields:

name (str) – The name of the image
numpy.ndarray – The image in BGR format

Return type:

Generator[tuple[str, np.ndarray], None, None]

property is_updated: bool

True when new images have been loaded into the preview buffer

Type:: bool
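
The behaviour described for the buffer (thread safe writes, an "updated" flag, and clearing that flag once the images have been read) can be sketched with the standard library. This is an illustration, not the PreviewBuffer source: `BufferSketch` is a hypothetical class, bytes stand in for image arrays, and the use of a Lock plus an Event is an assumption.

```python
import threading
from collections.abc import Iterator


class BufferSketch:
    """Minimal thread safe image store with an 'updated' flag."""

    def __init__(self) -> None:
        self._images: dict[str, bytes] = {}
        self._lock = threading.Lock()
        self._updated = threading.Event()

    @property
    def is_updated(self) -> bool:
        """True when images have been added since the last full read."""
        return self._updated.is_set()

    def add_image(self, name: str, image: bytes) -> None:
        """Store or replace an image under the lock and flag the update."""
        with self._lock:
            self._images[name] = image
        self._updated.set()

    def get_images(self) -> Iterator[tuple[str, bytes]]:
        """Yield the latest images, clearing the flag once exhausted."""
        with self._lock:
            snapshot = list(self._images.items())
        yield from snapshot
        self._updated.clear()


buffer = BufferSketch()
writer = threading.Thread(target=buffer.add_image, args=("preview", b"\x00"))
writer.start()
writer.join()
was_updated = buffer.is_updated
images = list(buffer.get_images())
```

Taking a snapshot under the lock keeps the lock from being held while a consumer is busy displaying images.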

class lib.training.preview_cv.PreviewCV(preview_buffer: PreviewBuffer, triggers: dict[Literal['toggle_mask', 'refresh', 'save', 'quit', 'shutdown'], Event])

Simple fallback preview viewer using OpenCV for when Tkinter is not available

Parameters:

preview_buffer (PreviewBuffer) – The thread safe object holding the preview images
triggers (dict) – Dictionary of event triggers for pop-up preview.

Classes 

`PreviewBase`(preview_buffer[, triggers])	Parent class for OpenCV and Tkinter Preview Windows
`PreviewBuffer`()	A thread safe class for holding preview images
`PreviewCV`(preview_buffer, triggers)	Simple fallback preview viewer using OpenCV for when Tkinter is not available

Class Inheritance Diagram 

Inheritance diagram of lib.training.preview_cv.PreviewBase, lib.training.preview_cv.PreviewBuffer, lib.training.preview_cv.PreviewCV

lib.training.preview_tk Module 

The pop up preview window for Faceswap.

If Tkinter is installed, then this will be used to manage the preview image, otherwise we fall back to OpenCV’s imshow

class lib.training.preview_tk.PreviewTk(preview_buffer: PreviewBuffer, parent: tk.Widget | None = None, taskbar: ttk.Frame | None = None, triggers: TriggerType | None = None)

Holds a preview window for displaying the pop out preview.

Parameters:

preview_buffer (PreviewBuffer) – The thread safe object holding the preview images
parent (tk.Widget | None) – If this viewer is being called from the GUI the parent widget should be passed in here. If this is a standalone pop-up window then pass None. Default: None
taskbar (ttk.Frame | None) – If this viewer is being called from the GUI the parent’s option frame should be passed in here. If this is a standalone pop-up window then pass None. Default: None
triggers (TriggerType | None) – Dictionary of event triggers for pop-up preview. Not required when running inside the GUI. Default: None

property master_frame: Frame: The master frame that holds the preview window

pack(*args, **kwargs)

Redirect calls to pack the widget to pack the actual _master_frame.

Takes standard tkinter.Frame pack arguments

remove_option_controls() → None

Remove the taskbar options controls when the preview is disabled in the GUI

Return type:: None

save(location: str) → None

Save action to be performed when save button pressed from the GUI.

Parameters:: location (str) – Full path to the folder to save the preview image to
Return type:: None

lib.training.preview_tk.main()

Load image from first given argument and display

python -m lib.training.preview_tk <filename>

Functions 

`main`()	Load image from first given argument and display

Classes 

`PreviewTk`(preview_buffer[, parent, taskbar, ...])	Holds a preview window for displaying the pop out preview.

Class Inheritance Diagram 

Inheritance diagram of lib.training.preview_tk.PreviewTk

lib.training.tensorboard Module 

TensorBoard callback for PyTorch logging. Hopefully temporary until a native Keras version is implemented

class lib.training.tensorboard.RecordIterator(log_file, is_live: bool = False)

A replacement for tensorflow’s compat.v1.io.tf_record_iterator()

Parameters:

log_file – The event log file to obtain records from
is_live (bool) – True if the log file is for a live training session that will constantly provide data. Default: False

__next__() → bytes

Get the next event log from a Tensorboard event file

Return type:: A Tensorboard event log
Raises:: StopIteration – When the event log is fully consumed
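
For context, TensorBoard event files use TFRecord framing: each record is a little-endian uint64 length, a uint32 CRC of the length, the payload, and a uint32 CRC of the payload. The sketch below iterates over such frames using only the standard library. It is not the RecordIterator implementation: `iter_records` and `frame` are hypothetical helpers, CRC checks are skipped, and the handling of live files is not modelled.

```python
import io
import struct
from collections.abc import Iterator


def iter_records(stream: io.BufferedIOBase) -> Iterator[bytes]:
    """Yield the payload of each record in a TFRecord-framed stream.

    Frame layout: uint64 length, uint32 CRC of the length, payload,
    uint32 CRC of the payload (all little-endian). CRCs are skipped here.
    """
    while True:
        header = stream.read(12)  # 8 byte length + 4 byte length CRC
        if len(header) < 12:
            return  # end of file or an incomplete record
        (length,) = struct.unpack("<Q", header[:8])
        payload = stream.read(length)
        footer = stream.read(4)  # payload CRC
        if len(payload) < length or len(footer) < 4:
            return  # record not fully written yet
        yield payload


def frame(payload: bytes) -> bytes:
    """Build a frame with zeroed CRC fields, for demonstration only."""
    return struct.pack("<Q", len(payload)) + b"\x00" * 4 + payload + b"\x00" * 4


# Two complete records followed by one truncated record
log = io.BytesIO(frame(b"first") + frame(b"second") + frame(b"partial")[:-3])
records = list(iter_records(log))
```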

class lib.training.tensorboard.TorchTensorBoard(log_dir: str = 'logs', write_graph: bool = True, update_freq: Literal['batch', 'epoch'] | int = 'epoch')

Enable visualizations for TensorBoard. Adapted from Keras’ Tensorboard Callback keeping only the parts we need, and using Torch rather than TensorFlow

Parameters:

log_dir (str) – The path of the directory where to save the log files to be parsed by TensorBoard. e.g., log_dir = os.path.join(working_dir, 'logs'). This directory should not be reused by any other callbacks.
write_graph (bool) – Whether to visualize the graph in TensorBoard. Note that the log file can become quite large when write_graph is set to True. Note: Not supported at this time
update_freq (T.Literal['batch', 'epoch'] | int) – When using “epoch”, writes the losses and metrics to TensorBoard after every epoch. If using an integer, let’s say 1000, all metrics and losses (including custom ones added by Model.compile) will be logged to TensorBoard every 1000 batches. “batch” is a synonym for 1, meaning that they will be written every batch. Note however that writing too frequently to TensorBoard can slow down your training, especially when used with distribution strategies as it will incur additional synchronization overhead. Batch-level summary writing is also available via a train_step override. Please see the [TensorBoard Scalars tutorial](https://www.tensorflow.org/tensorboard/scalars_and_keras#batch-level_logging)
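
The update_freq semantics described above can be summarised in a few lines. This is a sketch, not the callback's code: `should_write` is a hypothetical helper, and treating the batch number as a 1-based iteration count is an assumption.

```python
def should_write(batch: int, update_freq) -> bool:
    """Return True if batch-level logs should be written for `batch`.

    Assumes `batch` is a 1-based iteration count. "epoch" never writes at
    batch level in this sketch.
    """
    if update_freq == "epoch":
        return False
    every = 1 if update_freq == "batch" else int(update_freq)
    return batch % every == 0


# With update_freq=1000, only every thousandth batch is written
written = [b for b in range(1, 3001) if should_write(b, 1000)]
```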

on_save() → None

Flush data to disk on save

Return type:: None

on_train_batch_end(batch: int, logs: dict[str, float | dict[str, float]] | None = None) → None

Update Tensorboard logs on batch end

Parameters:

batch (int) – The current iteration count
logs (dict[str, float | dict[str, float]] | None) – The logs to write

Return type:

None

on_train_begin(logs=None) → None

Initialize the callback on train start

Parameters:: logs – Unused
Return type:: None

on_train_end(logs=None) → None

Close the writer on train completion

Parameters:: logs – Unused
Return type:: None

set_model(model: Model) → None

Sets Keras model and writes graph if specified.

Parameters:: model (Model) – The model that is being trained
Return type:: None

Classes 

`RecordIterator`(log_file[, is_live])	A replacement for tensorflow's `compat.v1.io.tf_record_iterator()`
`TorchTensorBoard`([log_dir, write_graph, ...])	Enable visualizations for TensorBoard.

Class Inheritance Diagram 

Inheritance diagram of lib.training.tensorboard.RecordIterator, lib.training.tensorboard.TorchTensorBoard

lib.training.train Module 

Run the training loop for a training plugin

class lib.training.train.Trainer(plugin: TrainerBase, preview: bool, warmup_steps: int = 0, timelapse_folders: list[str] | None = None, timelapse_output: str = '')

Handles the feeding of training images to Faceswap models, the generation of Tensorboard logs and the creation of sample/time-lapse preview images.

All Trainer plugins must inherit from this class.

Parameters:

plugin (TrainerBase) – The plugin that will be processing each batch
preview (bool) – True to generate previews
warmup_steps (int) – The number of steps to warmup the learning rate for. Default: 0
timelapse_folders (list[str] | None) – The input folders to create timelapse images from. Default: None (no timelapse)
timelapse_output (str) – The folder to output timelapse images. Default: “” (no timelapse)

property exit_early: bool: True if the trainer should exit early, without performing any training steps

save(is_exit: bool = False) → None

Save the model

Parameters:: is_exit (bool) – True if save has been called on model exit. Default: False
Return type:: None

toggle_mask() → None

Toggle the mask overlay on or off based on user input.

Return type:: None

train_one_batch() → list[BatchLoss]

Process a single batch through the model and obtain the loss

Return type:: The collated loss values detached and moved to CPU in order (A, B, …)

train_one_step(viewer: Callable[[np.ndarray, str], None] | None, do_timelapse: bool = False) → None

Run training on a batch of images for each side.

Triggered from the training cycle in scripts.train.Train.

Runs a training batch through the model.
Outputs the iteration’s loss values to the console
Logs loss to Tensorboard, if logging is requested.
If a preview or time-lapse has been requested, then pushes sample images through the model to generate the previews
Creates a snapshot if the total iterations trained so far meet the requested snapshot criteria

Notes

As every iteration is called explicitly, the Parameters defined should always be None except on save iterations.

Parameters:

viewer (Callable[[np.ndarray, str], None] | None) – The function that will display the preview image
do_timelapse (bool) – True to generate a timelapse preview image

Return type:

None

Classes 

`Trainer`(plugin, preview[, warmup_steps, ...])	Handles the feeding of training images to Faceswap models, the generation of Tensorboard logs and the creation of sample/time-lapse preview images.

Class Inheritance Diagram 

Inheritance diagram of lib.training.train.Trainer

data package

lib.training.data.augmentation Module 

Processes the augmentation of images for feeding into a Faceswap model.

class lib.training.data.augmentation.ConstantsAugmentation(color: ConstantsColor, transform: ConstantsTransform, warp: ConstantsWarp)

Dataclass for holding constants for Image Augmentation.

Parameters:

color (ConstantsColor)
transform (ConstantsTransform)
warp (ConstantsWarp)

color

The constants for adjusting color/contrast in an image

Type:: lib.training.data.augmentation.ConstantsColor

transform

The constants for image transformation

Type:: lib.training.data.augmentation.ConstantsTransform

warp

The constants for image warping

Type:: lib.training.data.augmentation.ConstantsWarp

Dataclass should be initialized using its from_config() method

Example

>>> constants = ConstantsAugmentation.from_config(processing_size=256,
...                                               batch_size=16)

color: ConstantsColor: The constants for adjusting color/contrast in an image

classmethod from_config(processing_size: int, batch_size: int) → ConstantsAugmentation

Create a new dataclass instance from user config

Parameters:

processing_size (int) – The size of image to augment the data for
batch_size (int) – The batch size that augmented data is being prepared for

Return type:

ConstantsAugmentation

transform: ConstantsTransform: The constants for image transformation

warp: ConstantsWarp: The constants for image warping

class lib.training.data.augmentation.ConstantsColor(clahe_base_contrast: int, clahe_chance: float, clahe_max_size: int, lab_adjust: ndarray)

Dataclass for holding constants for enhancing an image (ie contrast/color adjustment)

Parameters:

clahe_base_contrast (int) – The base number for Contrast Limited Adaptive Histogram Equalization
clahe_chance (float) – Probability to perform Contrast Limited Adaptive Histogram Equalization
clahe_max_size (int) – Maximum clahe window size
lab_adjust (numpy.ndarray) – Adjustment amounts for L*A*B augmentation

clahe_base_contrast: int: The base number for Contrast Limited Adaptive Histogram Equalization

clahe_chance: float: Probability to perform Contrast Limited Adaptive Histogram Equalization

clahe_max_size: int: Maximum clahe window size

lab_adjust: ndarray: Adjustment amounts for L*A*B augmentation

class lib.training.data.augmentation.ConstantsTransform(rotation: int, zoom: float, shift: float, flip: float)

Dataclass for holding constants for transforming an image

Parameters:

rotation (int) – Rotation range for transformations
zoom (float) – Zoom range for transformations
shift (float) – Shift range for transformations
flip (float) – The chance to flip an image

flip: float: The chance to flip an image

rotation: int: Rotation range for transformations

shift: float: Shift range for transformations

zoom: float: Zoom range for transformations

class lib.training.data.augmentation.ConstantsWarp(maps: ndarray, pad: tuple[int, int], slices: slice, scale: float, lm_edge_anchors: ndarray, lm_grids: ndarray, lm_scale: float)

Dataclass for holding constants for warping an image

Parameters:

maps (numpy.ndarray) – The stacked (x, y) mappings for image warping
pad (tuple[int, int]) – The padding to apply for image warping
slices (slice) – The slices for extracting a warped image
lm_edge_anchors (numpy.ndarray) – The edge anchors for landmark based warping
lm_grids (numpy.ndarray) – The grids for landmark based warping
scale (float) – The scaling to apply to standard warping
lm_scale (float) – The scaling to apply to landmark based warping

lm_edge_anchors: ndarray: The edge anchors for landmark based warping

lm_grids: ndarray: The grids for landmark based warping

lm_scale: float: The scaling to apply to landmark based warping

maps: ndarray: The stacked (x, y) mappings for image warping

pad: tuple[int, int]: The padding to apply for image warping

scale: float: The scaling to apply to standard warping

slices: slice: The slices for extracting a warped image

class lib.training.data.augmentation.ImageAugmentation(batch_size: int, processing_size: int)

Performs augmentation on batches of training images.

Parameters:

batch_size (int) – The number of images that will be fed through the augmentation functions at once.
processing_size (int) – The largest input or output size of the model. This is the size that images are processed at.

color_adjust(batch: ndarray) → ndarray

Perform color augmentation on the passed in batch.

The color adjustment parameters are set in config.train.ini

Parameters:: batch (ndarray) – The batch should be a 4-dimensional array of shape (batchsize, height, width, 3) and in BGR format of uint8 dtype.
Return type:: A 4-dimensional array of the same shape as batch with color augmentation applied.

random_flip(batch: npt.NDArray[np.uint8], points: npt.NDArray[np.float32] | None) → None

Perform random horizontal flipping on the passed in batch.

The probability of flipping an image is set in config.train.ini

Parameters:

batch (npt.NDArray[np.uint8]) – The batch should be a 4-dimensional array of shape (batchsize, height, width, channels) and in BGR format.
points (npt.NDArray[np.float32] | None) – Any (x, y) points to transform. Can be in any shape but the final dimension should be shape 2. None if there are no points to transform

Return type:

None
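
A horizontal flip that also mirrors (x, y) points can be sketched with nested lists. This is an illustration under stated assumptions rather than the ImageAugmentation code: single-channel images are used for brevity, the flip chance is passed in directly instead of being read from configuration, and mirroring x as (width - 1) - x assumes pixel-index coordinates.

```python
import random
from typing import Optional


def random_flip(batch: list[list[list[int]]],
                points: Optional[list[list[list[float]]]],
                chance: float,
                rng: random.Random) -> None:
    """Flip images (and their points) horizontally in place.

    batch: images as rows of pixel values, shape (batch_size, height, width)
    points: optional (x, y) points per image, shape (batch_size, n, 2)
    """
    for idx, image in enumerate(batch):
        if rng.random() >= chance:
            continue
        width = len(image[0])
        for row in image:
            row.reverse()
        if points is not None:
            for point in points[idx]:
                # Assumed convention: x is a pixel index in [0, width - 1]
                point[0] = (width - 1) - point[0]


images = [[[1, 2, 3], [4, 5, 6]]]
landmarks = [[[0.0, 1.0], [2.0, 0.0]]]
random_flip(images, landmarks, chance=1.0, rng=random.Random(0))
```

As with the documented method, the sketch modifies its inputs in place and returns None.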

transform(batch: npt.NDArray[np.uint8], points: npt.NDArray[np.float32] | None) → None

Perform random transformation on the passed in batch and optional (x, y) points.

The transformation parameters are set in config.train.ini

Parameters:

batch (npt.NDArray[np.uint8]) – The batch should be a 4-dimensional array of shape (batchsize, height, width, channels) and in BGR format.
points (npt.NDArray[np.float32] | None) – Any (x, y) points to transform, in shape (batch_size, num_sides, 68, 2). None if there are no points to transform

Return type:

None

warp(batch: ndarray, to_landmarks: bool = False, batch_src_points: ndarray | None = None, batch_dst_points: ndarray | None = None) → ndarray

Perform random warping on the passed in batch by one of two methods.

Parameters:

batch (ndarray) – The batch should be a 4-dimensional array of shape (batchsize, height, width, 3) and in BGR format.
to_landmarks (bool) – If False perform standard random warping of the input image. If True perform warping to semi-random similar corresponding landmarks from the other side. Default: False
batch_src_points (ndarray | None) – Only used when to_landmarks is True. A batch of 68 point landmarks for the source faces. This is a 3-dimensional array in the shape (batchsize, 68, 2). Default: None
batch_dst_points (ndarray | None) – Only used when to_landmarks is True. A batch of randomly chosen closest match destination face landmarks. This is a 3-dimensional array in the shape (batchsize, 68, 2). Default: None

Return type:

A 4-dimensional array of the same shape as batch with warping applied.

Classes 

`ConstantsAugmentation`(color, transform, warp)	Dataclass for holding constants for Image Augmentation.
`ConstantsColor`(clahe_base_contrast, ...)	Dataclass for holding constants for enhancing an image (ie contrast/color adjustment)
`ConstantsTransform`(rotation, zoom, shift, flip)	Dataclass for holding constants for transforming an image
`ConstantsWarp`(maps, pad, slices, scale, ...)	Dataclass for holding constants for warping an image
`ImageAugmentation`(batch_size, processing_size)	Performs augmentation on batches of training images.

Class Inheritance Diagram 

Inheritance diagram of lib.training.data.augmentation.ConstantsAugmentation, lib.training.data.augmentation.ConstantsColor, lib.training.data.augmentation.ConstantsTransform, lib.training.data.augmentation.ConstantsWarp, lib.training.data.augmentation.ImageAugmentation

lib.training.data.collate Module 

Handles collation of data for training faceswap models

class lib.training.data.collate.BatchMeta(mask_face: list[Tensor] | None = None, mask_eye: list[Tensor] | None = None, mask_mouth: list[Tensor] | None = None)

Dataclass that holds meta information required for training a batch of images

All lists are of len(number model outputs per side) with tensors in shape (batch_size, num_inputs, 1, H, W)

Parameters:

mask_face (list[Tensor] | None)
mask_eye (list[Tensor] | None)
mask_mouth (list[Tensor] | None)

mask_eye: list[Tensor] | None = None: The eye mask if eye loss multipliers > 1 for each output in NCHW order

mask_face: list[Tensor] | None = None: The selected face mask for penalized loss/learn mask for each output in NCHW order

mask_mouth: list[Tensor] | None = None: The mouth mask if mouth loss multipliers > 1 for each output in NCHW order

to(device: str | torch.Device) → T.Self

Place all contained tensors onto the given device

Parameters:: device (str | torch.Device) – The device to place the tensors on to
Return type:: This object with the tensors placed on the requested device

class lib.training.data.collate.Collate(input_size: int, output_sizes: tuple[int, ...], color_order: T.Literal['bgr', 'rgb'], config: TrainConfig, landmarks: LandmarkMatcher | None)

Collation function for processing a batch of samples into input and output tensors applying augmentation

Parameters:

input_size (int) – The pixel size of the model input
output_sizes (tuple[int, ...]) – The pixel sizes of the model output
color_order (T.Literal['bgr', 'rgb']) – The color order that the model expects
config (TrainConfig) – The training configuration for the model
landmarks (LandmarkMatcher | None) – The landmark matching object for the (A and B) sides of the model if warp_to_landmarks is enabled otherwise None

__call__(data: list[tuple[tuple[npt.NDArray[np.uint8], int], ...]]) → tuple[list[torch.Tensor], list[torch.Tensor], BatchMeta]

Prepare the loaded samples for feeding the model, creating targets and applying augmentation

Parameters:

data (list[tuple[tuple[npt.NDArray[np.uint8], int], ...]]) – Batch of data tuples with the loaded stacked image and masks from each loader in the first position and the image file index for each item in the batch in the second

Returns:

feed – List of len (num_inputs) of tensors of shape (batch_size, H, W, C) as inputs for the model
targets – List of len (num_outputs) of target images in shape (batch_size, num_inputs, height, width, 3) at all model output sizes as float32 0.0 - 1.0 range
meta – The meta information for the batch

Return type:

tuple[list[torch.Tensor], list[torch.Tensor], BatchMeta]

class lib.training.data.collate.LandmarkMatcher(folders: list[str], size: int, centering: CenteringType, coverage: float, y_offset: float, num_choices: int = 10)

Prepares landmarks when Warp-to-Landmarks is enabled.

2 sides (A/B) only.

For each side, stores the aligned landmarks and collates the nearest matches on the other side (10 by default, set by num_choices) for random warping

Parameters:

folders (list[str]) – Two training folders for sides A and B
size (int) – The aligned face size to transform the landmarks to
centering (CenteringType) – The aligned centering to transform the landmarks to
coverage (float) – Additional coverage ratio to be applied
y_offset (float) – Additional vertical offset to be applied
num_choices (int) – Number of choices from the opposite side to cache for each landmark. Default: 10

get_close_landmarks(indices: npt.NDArray[np.int64]) → npt.NDArray[np.float32]

For the given image indices, obtain randomly selected close-match landmarks from the other side

Parameters:

indices (npt.NDArray[np.int64]) – The (num_inputs, landmark_indices) image file indices to obtain the matches for

Returns:

2 sets of landmarks in shape (num_sides * batch_size, num_sides, 68, 2) stacked to a batch of landmark points for augmentation

Return type:

npt.NDArray[np.float32]
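
The caching and random selection described above could be sketched roughly as follows. This is a simplified illustration and not the LandmarkMatcher implementation: the function names, the summed squared distance metric and the flat (batch, 68, 2) return shape are all assumptions made for the example.

```python
import numpy as np


def nearest_indices(source: np.ndarray, target: np.ndarray, num_choices: int = 10) -> np.ndarray:
    """Hypothetical helper: for each landmark set in source (N, 68, 2), return the
    indices of the closest landmark sets in target (M, 68, 2)."""
    # Assumption: closeness is the summed squared distance over all 68 points
    diffs = source[:, None] - target[None]        # (N, M, 68, 2)
    dist = np.square(diffs).sum(axis=(2, 3))      # (N, M)
    k = min(num_choices, target.shape[0])
    return np.argsort(dist, axis=1)[:, :k]        # (N, k), nearest first


def pick_close(target: np.ndarray, cache: np.ndarray, indices: np.ndarray,
               rng: np.random.Generator) -> np.ndarray:
    """Hypothetical helper: pick one cached close match at random for each index."""
    choice = rng.integers(0, cache.shape[1], size=len(indices))
    return target[cache[indices, choice]]         # (len(indices), 68, 2)
```

Caching the candidate indices once means that only a cheap random lookup is needed per batch, which is presumably why the number of cached choices is configurable.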

Classes 

`BatchMeta`([mask_face, mask_eye, mask_mouth])	Dataclass that holds meta information required for training a batch of images
`Collate`(input_size, output_sizes, ...)	Collation function for processing a batch of samples into input and output tensors applying augmentation
`LandmarkMatcher`(folders, size, centering, ...)	Prepares landmarks when Warp-to-Landmarks is enabled.

Class Inheritance Diagram 

Inheritance diagram of lib.training.data.collate.BatchMeta, lib.training.data.collate.Collate, lib.training.data.collate.LandmarkMatcher

lib.training.data.data_set Module 

Handles Data loading and augmentation for feeding Faceswap Models

class lib.training.data.data_set.MultiDataset(datasets: tuple[_BaseSet, ...], is_random: bool = True)

Handles processing data for models with multiple inputs. The length is set as the largest dataset. Shuffling all datasets is handled internally at the end of each

Parameters:

datasets (tuple[_BaseSet, ...]) – The input specific datasets for feeding the model
is_random (bool) – True if data from each of the datasets should be read randomly. False if all datasets should return the item for the given index

shuffle() → None

Shuffle the data of all of the contained datasets

Return type:: None

class lib.training.data.data_set.PreviewSet(side: str, image_folder: str, input_size: int, output_size: int, color_order: Literal['bgr', 'rgb'], num_images: int = 0)

Preview dataset loader. The dataset loader is responsible for loading images from disk and preparing them for inference and display in the model preview

Parameters:

side (str) – The side of the model (“A”, “B” etc.)
image_folder (str) – Full path to a folder containing training images
input_size (int) – The input size to the model
output_size (int) – The largest output size of the model
color_order (T.Literal['bgr', 'rgb']) – The color order the model expects data in
num_images (int) – Set to 0 for random previews from the image folder. Set to a positive integer for this number of images to use for a static timelapse. Default: 0

class lib.training.data.data_set.TrainSet(side: str, image_folder: str, size: int)

Base class for Training and Preview dataset loaders to inherit from

Parameters:

side (str) – The side of the model (“A”, “B” etc.)
image_folder (str) – Full path to a folder containing training images
size (int) – The size to return samples at. This should be the maximum of the model input/output size for train sets or the model input size for preview sets

lib.training.data.data_set.get_label(index: int, num_identities: int, next_identity: bool = False) → str

Obtain the label for the given current index. Labels start at A at index 0. Values roll.

Parameters:

index (int) – The index of the current label
num_identities (int) – The number of identities that belong to the label set
next_identity (bool) – True to return the next identity for the given index. Default: False

Return type:

The current or next label. Labels go A-Z,0-9,a-z
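
One plausible reading of the labelling scheme is sketched below. It is not the library's code: get_label_sketch is a hypothetical name, and interpreting "values roll" as wrapping modulo num_identities is an assumption.

```python
import string

# Label alphabet in the documented order: A-Z, then 0-9, then a-z
LABELS = string.ascii_uppercase + string.digits + string.ascii_lowercase


def get_label_sketch(index: int, num_identities: int, next_identity: bool = False) -> str:
    """Hypothetical re-implementation; the real function may behave differently."""
    if not 0 < num_identities <= len(LABELS):
        raise ValueError(f"num_identities must be between 1 and {len(LABELS)}")
    if next_identity:
        index += 1
    # Assumption: the index wraps around once it passes the last identity
    return LABELS[index % num_identities]
```

Under these assumptions a two-identity model maps index 0 to "A", and asking for the next identity after index 1 wraps back to "A".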

lib.training.data.data_set.get_sorted_images(folder: str) → list[str]

For the given folder return the sorted list of potential training images

Parameters:: folder (str) – The folder containing faceswap training images
Return type:: The sorted list of full paths to the training images within the folder
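
A minimal sketch of what such a helper might do is shown below. The extension list and the name sorted_images_sketch are assumptions for illustration; the documentation does not say which file types count as potential training images.

```python
import os

# Assumption: which extensions are treated as images is not stated in the docs
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg")


def sorted_images_sketch(folder: str) -> list[str]:
    """Hypothetical helper: return sorted full paths of image files in folder."""
    return sorted(
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.lower().endswith(IMAGE_EXTENSIONS)
    )
```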

lib.training.data.data_set.to_float32(in_array: npt.NDArray[np.uint8]) → npt.NDArray[np.float32]

Cast a uint8 array in 0-255 range to float32 in 0.0-1.0 range.

Parameters:: in_array (npt.NDArray[np.uint8]) – The input uint8 array
Return type:: The array cast to 0.0 - 1.0 float32
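
The conversion amounts to a cast followed by a division by 255. The sketch below shows one way to write it with NumPy; to_float32_sketch is a hypothetical name and the library may implement the cast differently, for example with an optimized routine.

```python
import numpy as np


def to_float32_sketch(in_array: np.ndarray) -> np.ndarray:
    """Hypothetical equivalent: scale a uint8 array to float32 in the 0.0-1.0 range."""
    # Cast first so that the division is carried out in float32
    return in_array.astype(np.float32) / 255.0
```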

Functions 

`get_label`(index, num_identities[, next_identity])	Obtain the label for the given current index.
`get_sorted_images`(folder)	For the given folder return the sorted list of potential training images
`to_float32`(in_array)	Cast a uint8 array in 0-255 range to float32 in 0.0-1.0 range.

Classes 

`MultiDataset`(datasets[, is_random])	Handles processing data for models with multiple inputs.
`PreviewSet`(side, image_folder, input_size, ...)	Preview dataset loader.
`TrainSet`(side, image_folder, size)	Base class for Training and Preview dataset loaders to inherit from

Class Inheritance Diagram 

Inheritance diagram of lib.training.data.data_set.MultiDataset, lib.training.data.data_set.PreviewSet, lib.training.data.data_set.TrainSet

lib.training.data.loader Module 

Handles the loading of data for training and previews for faceswap models

class lib.training.data.loader.PreviewLoader(input_size: int, output_size: int, color_order: Literal['bgr', 'rgb'], input_folders: list[str], batch_size: int, sampler: None | type[RandomSampler | SequentialSampler] = None, num_samples: int = 0)

Generator for feeding faceswap models input data for generating preview images. Gets the next items from each of the configured loaders and collates them for feeding into a model

Parameters:

input_size (int) – The input size to the model
output_size (int) – The output size of the model
color_order (T.Literal['bgr', 'rgb']) – The color order of the model
input_folders (list[str]) – list of folders to read images from for each side being trained
batch_size (int) – The number of images being displayed in the preview
sampler (None | type[tch_data.RandomSampler | tch_data.SequentialSampler]) – The sampler to use for the data loaders. Default: None (RandomSampler)
num_samples (int) – Set to 0 for random previews from the image folder. Set to a positive integer for this number of images to use for a static timelapse. Default: 0

__next__() → tuple[Tensor, Tensor]

Obtain the next batch of data for each side for feeding the model

Returns:

inputs – The inputs to the model for each side of the model. The array is returned in (side, batch_size, *dims) where side 0 is “A” and side 1 is “B” etc.
targets – The full-sized source image with mask in the 4th channel for each side of the model in format (side, batch_size, *dims, 4) where side 0 is “A” and side 1 is “B” etc.

Return type:

tuple[Tensor, Tensor]
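
Because the mask travels in the 4th channel of the targets, a consumer would typically separate it from the color channels before display. The sketch below uses a NumPy array as a stand-in for the returned tensor and assumes that *dims is (height, width); split_preview_targets is a hypothetical helper, not part of the library. The same slicing syntax also works on torch tensors.

```python
import numpy as np


def split_preview_targets(targets: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Hypothetical helper: split (side, batch, H, W, 4) targets into image and mask."""
    image = targets[..., :3]   # first three channels hold the color image
    mask = targets[..., 3:]    # 4th channel holds the mask, kept as (..., 1)
    return image, mask
```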

get_loader() → DataLoader

Obtain the dataloaders for each input/output for the model

Return type:: The Training data loaders in side order

class lib.training.data.loader.TrainLoader(input_size: int, output_sizes: tuple[int, ...], color_order: T.Literal['bgr', 'rgb'], config: TrainConfig, sampler: None | type[tch_data.RandomSampler | tch_data.DistributedSampler] = None)

Generator for feeding faceswap models with multiple inputs and outputs. Gets the next items from each of the configured loaders and collates them for feeding into a model

Parameters:

input_size (int) – The input size to the model
output_sizes (tuple[int, ...]) – The output sizes to the model (list as some models have multi-scale outputs)
color_order (T.Literal['bgr', 'rgb']) – The color order of the model
config (TrainConfig) – The training configuration for feeding the model
sampler (None | type[tch_data.RandomSampler | tch_data.DistributedSampler]) – The sampler to use for the data loaders. Default: None (RandomSampler)

__next__() → tuple[list[torch.Tensor], list[torch.Tensor], BatchMeta]

Obtain the next outputs from the loader

Returns:

inputs – List of len (num_inputs) of tensors of shape (batch_size, H, W, C) as inputs for the model
targets – List of len (num_outputs) of target images in shape (batch_size, num_inputs, height, width, 3) at all model output sizes as float32 0.0 - 1.0 range
meta – The meta information for the batch

Return type:

tuple[list[torch.Tensor], list[torch.Tensor], BatchMeta]

get_loader() → DataLoader

Obtain the dataloaders for each input/output for the model

Return type:: The Training data loaders in side order

Classes 

`Collate`(input_size, output_sizes, ...)	Collation function for processing a batch of samples into input and output tensors applying augmentation
`DataLoader`(dataset[, batch_size, shuffle, ...])	Data loader combines a dataset and a sampler, and provides an iterable over the given dataset.
`LandmarkMatcher`(folders, size, centering, ...)	Prepares landmarks when Warp-to-Landmarks is enabled.
`MultiDataset`(datasets[, is_random])	Handles processing data for models with multiple inputs.
`PreviewLoader`(input_size, output_size, ...)	Generator for feeding faceswap models input data for generating preview images.
`PreviewSet`(side, image_folder, input_size, ...)	Preview dataset loader.
`TrainLoader`(input_size, output_sizes, ...[, ...])	Generator for feeding faceswap models with multiple inputs and outputs.
`TrainSet`(side, image_folder, size)	Base class for Training and Preview dataset loaders to inherit from

Variables 

`annotations`
`logger`	A standard `logging.logger` with additional "verbose" and "trace" levels added.