lib.training package

The training Package handles libraries to assist with training a model

lib.training.loss Module

Handles the collation, weighting, masking and calculation of the selected loss functions for training Faceswap models

class lib.training.loss.BatchLoss(unweighted: list[dict[str, Tensor]], weighted: list[dict[str, Tensor]], mask: Tensor | None = None)

Dataclass for holding Loss values for a batch of data

Parameters:
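  • unweighted (list[dict[str, Tensor]]) – For each side output, the unweighted loss scalars for each function for each item in the batch

  • weighted (list[dict[str, Tensor]]) – For each side output, the weighted loss scalars for each function for each item in the batch

  • mask (Tensor | None) – The loss scalar for the mask for each item in the batch if learn_mask is selected, otherwise None. Default: None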

mask: Tensor | None = None

The loss scalar for the mask for each item in the batch if learn_mask is selected, otherwise None. Default: None

to_cpu() Self

Detaches all contained loss values and moves them to CPU

Return type:

This object with all tensors detached and moved to CPU

property total: Tensor

The total single weighted loss scalar for all items in the batch for backprop

unweighted: list[dict[str, Tensor]]

For each side output, the unweighted loss scalars for each function for each item in the batch

weighted: list[dict[str, Tensor]]

For each side output, the weighted loss scalars for each function for each item in the batch
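
As a rough illustration of the layout described above, the following sketch mimics the container with plain floats instead of tensors. The class name BatchLossSketch and the way total is reduced (sum per item, then mean over the batch) are assumptions made for the example, not the library's confirmed behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BatchLossSketch:
    """Hypothetical stand-in for BatchLoss that holds floats rather than tensors."""
    # One dict per side output: loss function name -> one value per batch item
    unweighted: list[dict[str, list[float]]]
    weighted: list[dict[str, list[float]]]
    mask: Optional[list[float]] = None  # per-item mask loss, if any

    @property
    def total(self) -> float:
        """Assumed reduction: sum the weighted losses per item, then average."""
        batch_size = len(next(iter(self.weighted[0].values())))
        per_item = [0.0] * batch_size
        for output in self.weighted:
            for values in output.values():
                for idx, value in enumerate(values):
                    per_item[idx] += value
        if self.mask is not None:
            per_item = [loss + m for loss, m in zip(per_item, self.mask)]
        return sum(per_item) / batch_size


losses = BatchLossSketch(
    unweighted=[{"mae": [0.5, 0.25], "ssim": [0.5, 0.25]}],
    weighted=[{"mae": [0.5, 0.25], "ssim": [0.25, 0.125]}],
)
print(losses.total)
```

Keeping the unweighted values alongside the weighted ones lets each function's raw loss be logged, while only the weighted total drives backprop.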

class lib.training.loss.LossCollator(functions: list[str], weights: list[float], color_order: Literal['bgr', 'rgb'], use_mask: bool, eye_multiplier: float, mouth_multiplier: float, smallest_output: int, mask_loss: str | None = None)

Compiles the chosen loss functions and calculates the values in the training loop

Parameters:
  • functions (list[str]) – List of loss function names from configuration file to collate for loss calculation

  • weights (list[float]) – List of weights, corresponding to the list of functions, to apply to each loss function

  • color_order (T.Literal['bgr', 'rgb']) – The color order that the model is training in

  • use_mask (bool) – True if the loss should be masked because penalized mask loss has been selected

  • eye_multiplier (float) – The amount of extra weighting to apply to the eye area

  • mouth_multiplier (float) – The amount of extra weighting to apply to the mouth area

  • smallest_output (int) – The smallest output from the model. Required for initializing some loss functions

  • mask_loss (str | None) – The loss function to use if learn_mask is enabled. Default: None (not enabled)

forward(y_true_all: list[torch.Tensor], y_pred_all: list[torch.Tensor], meta: BatchMeta) BatchLoss

Call the loss functions, reduce to batch dimension, apply masks and weighting and obtain the weighted and unweighted per function values and the weighted total loss scalar

Parameters:
  • y_true_all (list[torch.Tensor]) – The ground truth batch of images for all outputs for a side of the model

  • y_pred_all (list[torch.Tensor]) – The batch of model predictions for all outputs for a side of the model

  • meta (BatchMeta) – The meta information for the batch

Return type:

The loss scalars for the batch
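
The pairing of functions with weights can be sketched in plain Python. This is a minimal sketch that assumes each item is a flat list of pixel values; the FUNCTIONS registry and collate_losses helper are hypothetical names, and masking and the eye/mouth multipliers are left out.

```python
def mae(y_true, y_pred):
    """Mean absolute error for one item (a flat list of pixel values)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error for one item."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


FUNCTIONS = {"mae": mae, "mse": mse}  # hypothetical name -> function registry


def collate_losses(names, weights, y_true_batch, y_pred_batch):
    """Return (unweighted, weighted) dicts holding one loss value per batch item."""
    unweighted, weighted = {}, {}
    for name, weight in zip(names, weights):
        func = FUNCTIONS[name]
        values = [func(t, p) for t, p in zip(y_true_batch, y_pred_batch)]
        unweighted[name] = values
        weighted[name] = [weight * value for value in values]
    return unweighted, weighted


y_true = [[0.0, 1.0], [1.0, 1.0]]
y_pred = [[0.5, 0.5], [1.0, 0.0]]
unweighted, weighted = collate_losses(["mae", "mse"], [1.0, 0.5], y_true, y_pred)
```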

Classes

BatchLoss(unweighted, weighted[, mask])

Dataclass for holding Loss values for a batch of data

LossCollator(functions, weights, ...[, ...])

Compiles the chosen loss functions and calculates the values in the training loop

Class Inheritance Diagram

Inheritance diagram of lib.training.loss.BatchLoss, lib.training.loss.LossCollator

lib.training.lr_finder Module

Learning Rate Finder for faceswap.py.

class lib.training.lr_finder.LRStrength(*values)

Enum for how aggressively to set the optimal learning rate

class lib.training.lr_finder.LearningRateFinder(trainer: train.Trainer, scheduler: ExponentialLR, steps: int, strength: T.Literal['default', 'aggressive', 'extreme'], mode: T.Literal['set', 'graph_and_set', 'graph_and_exit'], stop_factor: int = 4, beta: float = 0.98)

Learning Rate Finder

Parameters:
  • trainer (train.Trainer) – The training loop with the loaded training plugin

  • scheduler (ExponentialLR) – The LRFinder scheduler

  • steps (int) – The number of steps to run the finder for

  • strength (T.Literal['default', 'aggressive', 'extreme']) – How aggressively to set the optimal learning rate

  • mode (T.Literal['set', 'graph_and_set', 'graph_and_exit']) – The mode to run the Learning Rate Finder in

  • stop_factor (int) – When to stop finding the optimal learning rate

  • beta (float) – Amount to smooth loss by, for graphing purposes

property best_lr: None | float

The discovered best learning rate or None if not found

find() None

Find the optimal learning rate

Return type:

None
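
The general idea of a learning rate range test can be sketched as follows. This is an illustration of the common technique, not the class's actual implementation: the way beta and stop_factor are applied here is an assumption, loss_at is a hypothetical callable standing in for a training step, and the strength setting is not modelled.

```python
import math


def find_lr(loss_at, start_lr=1e-7, end_lr=1.0, steps=100, stop_factor=4, beta=0.98):
    """Sweep the learning rate exponentially and return (lrs, smoothed_losses)."""
    gamma = (end_lr / start_lr) ** (1 / (steps - 1))  # per-step LR multiplier
    lrs, smoothed = [], []
    avg, best = 0.0, math.inf
    lr = start_lr
    for step in range(1, steps + 1):
        loss = loss_at(lr)
        avg = beta * avg + (1 - beta) * loss   # exponential moving average
        smooth = avg / (1 - beta ** step)      # bias correction for early steps
        if step > 1 and smooth > stop_factor * best:
            break                              # loss has diverged, stop scanning
        best = min(best, smooth)
        lrs.append(lr)
        smoothed.append(smooth)
        lr *= gamma
    return lrs, smoothed


def toy_loss(lr):
    """Synthetic loss curve: lowest at lr = 1e-3 and rising on either side."""
    return 0.1 + (math.log10(lr) + 3) ** 2


lrs, smoothed = find_lr(toy_loss)
# Simplest possible pick: the learning rate at the lowest smoothed loss
best_lr = lrs[smoothed.index(min(smoothed))]
```

In practice a finder usually picks a rate somewhat below the point of lowest loss; how far below is the kind of choice a strength setting could control.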

Classes

LRStrength(*values)

Enum for how aggressively to set the optimal learning rate

LearningRateFinder(trainer, scheduler, ...)

Learning Rate Finder

Class Inheritance Diagram

Inheritance diagram of lib.training.lr_finder.LRStrength, lib.training.lr_finder.LearningRateFinder

lib.training.lr_warmup Module

Handles Learning Rate Warmup when training a model

class lib.training.lr_warmup.WarmupScheduler(optimizer: Optimizer, steps: int, last_epoch: int = -1)

Handles the updating of the model’s learning rate during Learning Rate Warmup

Parameters:
  • optimizer (Optimizer) – The torch optimizer in use

  • steps (int) – The number of iterations to warmup the learning rate for

  • last_epoch (int) – The last step that was run (last_epoch is a misnomer inherited from PyTorch and actually refers to steps in our use case). Default: -1 (not yet started)

get_lr() list[float | Tensor]

Get the learning rate for the current step

Return type:

The learning rate for each parameter group for the next step

step(epoch=None) None

If a learning rate update is required, update the model’s learning rate, otherwise do nothing

Parameters:

epoch – Deprecated argument from PyTorch that should always be None. Default: None

Return type:

None

steps

The total number of steps to warmup the LR for
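
A warmup schedule can be sketched as a function of the current step. The linear ramp below is an assumption chosen for illustration (the documentation above does not state the ramp shape), and warmup_lr is a hypothetical helper name.

```python
def warmup_lr(base_lr: float, step: int, warmup_steps: int) -> float:
    """Linearly ramp from base_lr / warmup_steps up to base_lr."""
    if warmup_steps <= 0 or step >= warmup_steps:
        return base_lr  # warmup disabled or already finished
    return base_lr * (step + 1) / warmup_steps


schedule = [warmup_lr(1.0, step, 4) for step in range(6)]
print(schedule)
```

Once the step count reaches the warmup length the function returns the base rate unchanged, which matches the "otherwise do nothing" behavior described for step().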

Classes

WarmupScheduler(optimizer, steps[, last_epoch])

Handles the updating of the model's learning rate during Learning Rate Warmup


lib.training.optimizer Module

Wraps the selected Torch optimizer and handles optimizer related functions such as loss scaling, clipping and gradient accumulation

class lib.training.optimizer.GradClip(method: Literal['autoclip', 'global_norm', 'norm', 'value'], value: float, autoclip_history: int = 10000)

Handles the clipping of gradients based on user supplied parameters

Parameters:
  • method (T.Literal['autoclip', 'global_norm', 'norm', 'value']) – The clipping method to use

  • value (float) – The clipping value to use. For autoclip this is the percentile to clip at (a value of 1.0 will clip at the 10th percentile, a value of 2.5 will clip at the 25th percentile, etc.)

  • autoclip_history (int) – The history length for auto clipping. Default: 10000

__call__(parameters: list[Parameter]) None

Clip the given parameters by the chosen method

Parameters:

parameters (list[Parameter]) – The parameters to clip

Return type:

None
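
The percentile idea behind autoclip can be illustrated with plain floats. This is a sketch under stated assumptions: AutoClipSketch is a hypothetical class, gradients are a flat list rather than torch parameters, and the nearest-rank percentile is one of several possible definitions.

```python
from collections import deque


class AutoClipSketch:
    """Hypothetical percentile-based gradient clipper working on plain floats."""

    def __init__(self, value: float, history: int = 10000) -> None:
        self.percentile = value * 10        # 1.0 -> 10th percentile, 2.5 -> 25th
        self.norms = deque(maxlen=history)  # rolling history of gradient norms

    def clip(self, grads: list[float]) -> list[float]:
        norm = sum(g * g for g in grads) ** 0.5
        self.norms.append(norm)
        ordered = sorted(self.norms)
        # Nearest-rank percentile of the gradient norms seen so far
        rank = max(0, round(self.percentile / 100 * len(ordered)) - 1)
        threshold = ordered[rank]
        if norm > threshold > 0:
            scale = threshold / norm
            return [g * scale for g in grads]  # rescale down to the threshold norm
        return grads


clipper = AutoClipSketch(value=5.0)          # clip at the 50th percentile
unchanged = clipper.clip([3.0, 4.0])         # first norm (5.0) sets the threshold
clipped = clipper.clip([30.0, 40.0])         # norm 50.0 is scaled down to 5.0
clipped_norm = sum(g * g for g in clipped) ** 0.5
```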

class lib.training.optimizer.Optimizer(model: Model, config: type[OptConfig], mixed_precision: bool = False, warmup_steps: int = 0)

Object for managing the selected Torch optimizer

Parameters:
  • model (Model) – The model that is to be trained

  • config (type[OptConfig]) – The optimizer user configuration options

  • mixed_precision (bool) – True to train using mixed precision. Default: False

  • warmup_steps (int) – The number of steps to warmup the learning rate for. Default: 0

backward(loss: Tensor) None

Perform the optimizer’s backward pass

Parameters:

loss (Tensor) – The loss scalar from the forward pass

Return type:

None

find_learning_rate(trainer: Trainer, steps: int, start_lr: float, end_lr: float, strength: T.Literal['default', 'aggressive', 'extreme'], mode: T.Literal['set', 'graph_and_set', 'graph_and_exit']) bool

Use the Learning Rate Finder to discover the optimal learning rate

Parameters:
  • trainer (Trainer) – The training loop with the loaded training plugin

  • steps (int) – The number of iterations to run the learning rate finder for

  • start_lr (float) – The learning rate to start scanning from

  • end_lr (float) – The final learning rate to scan until

  • strength (T.Literal['default', 'aggressive', 'extreme']) – How aggressively to set the optimal learning rate

  • mode (T.Literal['set', 'graph_and_set', 'graph_and_exit']) – The mode to run the Learning Rate Finder in

Return type:

True if an optimal learning rate was discovered.

load_state_dict(state_dict: dict[str, Any]) None

Load the serialized data from a state dict into this object

Parameters:

state_dict (dict[str, Any]) – The serialized data to load

Return type:

None

set_lr(lr: float) None

Manually assign the optimizer’s learning rate with the given value

Parameters:

lr (float) – The learning rate to apply to the optimizer

Return type:

None

state_dict() dict[str, Any]

Serialized data as a dict for relevant options contained in this class

Return type:

The serialized data for this object for saving and loading

step() None

Perform the optimizer step if valid and zero the gradients.

Handles gradient accumulation, scaling for mixed precision and gradient clipping

Return type:

None
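
Gradient accumulation, one of the things step() is described as handling, can be shown with a toy scalar optimizer. AccumulatingStepper and its attributes are hypothetical and only illustrate the "step if valid" pattern; loss scaling and clipping are omitted.

```python
class AccumulatingStepper:
    """Toy optimizer wrapper that applies an update every `accumulate` calls."""

    def __init__(self, weight: float, lr: float, accumulate: int) -> None:
        self.weight = weight
        self.lr = lr
        self.accumulate = accumulate
        self.grad = 0.0
        self._calls = 0

    def backward(self, grad: float) -> None:
        # Scale each micro-batch gradient so the accumulated sum is an average
        self.grad += grad / self.accumulate

    def step(self) -> bool:
        """Return True if the weight was updated on this call."""
        self._calls += 1
        if self._calls % self.accumulate:
            return False                     # not a valid step yet, keep accumulating
        self.weight -= self.lr * self.grad   # plain gradient descent update
        self.grad = 0.0                      # zero the gradient after the update
        return True


opt = AccumulatingStepper(weight=1.0, lr=0.5, accumulate=2)
opt.backward(1.0)
first = opt.step()
opt.backward(3.0)
second = opt.step()
```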

to(device: torch.Device) None

Place the optimizer onto the given device

Parameters:

device (torch.Device) – The device to place the optimizer on to

Return type:

None

lib.training.optimizer.get_parameter_group_ids(trainable_variables: list[Variable]) dict[int, T.Literal['decay', 'no_decay']]

Obtain the index of each item in the keras model’s trainable weights that belongs to each of the optimizer’s parameter groups (i.e. split by weights that take decay and weights that don’t)

Parameters:

trainable_variables (list[Variable]) – list of trainable variables from keras model

Return type:

dictionary of keras model’s trainable weight index to the name of the parameter group

Functions

get_parameter_group_ids(trainable_variables)

Obtain the index of each item in the keras model's trainable weights that belongs to each of the optimizer's parameter groups (i.e. split by weights that take decay and weights that don't)

Classes

GradClip(method, value[, autoclip_history])

Handles the clipping of gradients based on user supplied parameters

Optimizer(model, config[, mixed_precision, ...])

Object for managing the selected Torch optimizer


lib.training.preview Module

Handles the creation of display images for preview window and timelapses

class lib.training.preview.Samples(coverage_ratio: float, has_mask: bool, mask_opacity: int, mask_color: str)

Compile samples for display for preview and time-lapse

Parameters:
  • coverage_ratio (float) – Ratio of face to be cropped out of the training image.

  • has_mask (bool) – True if the model was trained with a mask

  • mask_opacity (int) – The opacity (as a percentage) to use for the mask overlay

  • mask_color (str) – The hex RGB value to use for the mask overlay

get_preview(predictions: npt.NDArray[np.float32], targets: npt.NDArray[np.float32]) npt.NDArray[np.uint8]

Compile a preview image.

Parameters:
  • predictions (npt.NDArray[np.float32]) – The (BGR) predictions in shape (src_side, dst_side, batch_size, height, width, channels)

  • targets (npt.NDArray[np.float32]) – Full size BGR face patches at 100% coverage, in (A, B, …) order, for patching predictions into

Return type:

A compiled preview image ready for display or saving

toggle_mask_display() None

Toggle the mask overlay on or off depending on user input.

Return type:

None

Classes

Samples(coverage_ratio, has_mask, ...)

Compile samples for display for preview and time-lapse


lib.training.preview_cv Module

The pop up preview window for Faceswap.

If Tkinter is installed, then this will be used to manage the preview image, otherwise we fall back to OpenCV’s imshow

class lib.training.preview_cv.PreviewBase(preview_buffer: PreviewBuffer, triggers: dict[Literal['toggle_mask', 'refresh', 'save', 'quit', 'shutdown'], Event] | None = None)

Parent class for OpenCV and Tkinter Preview Windows

Parameters:
  • preview_buffer (PreviewBuffer) – The thread safe object holding the preview images

  • triggers (dict, optional) – Dictionary of event triggers for pop-up preview. Not required when running inside the GUI. Default: None

class lib.training.preview_cv.PreviewBuffer

A thread safe class for holding preview images

add_image(name: str, image: np.ndarray) None

Add an image to the preview buffer in a thread safe way

Parameters:
  • name (str)

  • image (np.ndarray)

Return type:

None

get_images() Generator[tuple[str, np.ndarray], None, None]

Get the latest images from the preview buffer. When the iterator is exhausted, the updated event is cleared.

Yields:
  • name (str) – The name of the image

  • numpy.ndarray – The image in BGR format

Return type:

Generator[tuple[str, np.ndarray], None, None]

property is_updated: bool

True when new images have been loaded into the preview buffer

Type:

bool
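
A buffer with these semantics can be sketched with the standard library's threading primitives. PreviewBufferSketch is a hypothetical stand-in; the use of a Lock plus an Event is an assumption about one reasonable design, and images are treated as opaque objects.

```python
import threading


class PreviewBufferSketch:
    """Hypothetical thread-safe image store with an 'updated' flag."""

    def __init__(self) -> None:
        self._images: dict[str, object] = {}
        self._lock = threading.Lock()
        self._updated = threading.Event()

    @property
    def is_updated(self) -> bool:
        """True when new images have been added since the last full read."""
        return self._updated.is_set()

    def add_image(self, name: str, image: object) -> None:
        with self._lock:
            self._images[name] = image
        self._updated.set()

    def get_images(self):
        with self._lock:
            snapshot = list(self._images.items())  # copy, then release the lock
        yield from snapshot
        self._updated.clear()  # only reached once the iterator is exhausted


buffer = PreviewBufferSketch()
buffer.add_image("preview", [[0, 0, 0]])
was_updated = buffer.is_updated
images = list(buffer.get_images())
```

Taking a snapshot under the lock means a slow consumer never blocks the training thread that is adding images.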

class lib.training.preview_cv.PreviewCV(preview_buffer: PreviewBuffer, triggers: dict[Literal['toggle_mask', 'refresh', 'save', 'quit', 'shutdown'], Event])

Simple fallback preview viewer using OpenCV for when Tkinter is not available

Parameters:
  • preview_buffer (PreviewBuffer) – The thread safe object holding the preview images

  • triggers (dict) – Dictionary of event triggers for pop-up preview.

Classes

PreviewBase(preview_buffer[, triggers])

Parent class for OpenCV and Tkinter Preview Windows

PreviewBuffer()

A thread safe class for holding preview images

PreviewCV(preview_buffer, triggers)

Simple fallback preview viewer using OpenCV for when Tkinter is not available

Class Inheritance Diagram

Inheritance diagram of lib.training.preview_cv.PreviewBase, lib.training.preview_cv.PreviewBuffer, lib.training.preview_cv.PreviewCV

lib.training.preview_tk Module

The pop up preview window for Faceswap.

If Tkinter is installed, then this will be used to manage the preview image, otherwise we fall back to OpenCV’s imshow

class lib.training.preview_tk.PreviewTk(preview_buffer: PreviewBuffer, parent: tk.Widget | None = None, taskbar: ttk.Frame | None = None, triggers: TriggerType | None = None)

Holds a preview window for displaying the pop out preview.

Parameters:
  • preview_buffer (PreviewBuffer) – The thread safe object holding the preview images

  • parent (tk.Widget | None) – If this viewer is being called from the GUI the parent widget should be passed in here. If this is a standalone pop-up window then pass None. Default: None

  • taskbar (ttk.Frame | None) – If this viewer is being called from the GUI the parent’s option frame should be passed in here. If this is a standalone pop-up window then pass None. Default: None

  • triggers (TriggerType | None) – Dictionary of event triggers for pop-up preview. Not required when running inside the GUI. Default: None

property master_frame: Frame

The master frame that holds the preview window

pack(*args, **kwargs)

Redirect calls to pack the widget to pack the actual _master_frame.

Takes standard tkinter.Frame pack arguments

remove_option_controls() None

Remove the taskbar options controls when the preview is disabled in the GUI

Return type:

None

save(location: str) None

Save action to be performed when save button pressed from the GUI.

Parameters:

location (str) – Full path to the folder to save the preview image to

Return type:

None

lib.training.preview_tk.main()

Load image from first given argument and display

python -m lib.training.preview_tk <filename>

Functions

main()

Load image from first given argument and display

Classes

PreviewTk(preview_buffer[, parent, taskbar, ...])

Holds a preview window for displaying the pop out preview.

Class Inheritance Diagram

Inheritance diagram of lib.training.preview_tk.PreviewTk

lib.training.tensorboard Module

TensorBoard callback for PyTorch logging. Hopefully temporary until a native Keras version is implemented

class lib.training.tensorboard.RecordIterator(log_file, is_live: bool = False)

A replacement for tensorflow’s compat.v1.io.tf_record_iterator()

Parameters:
  • log_file – The event log file to obtain records from

  • is_live (bool) – True if the log file is for a live training session that will constantly provide data. Default: False

__next__() bytes

Get the next event log from a Tensorboard event file

Return type:

A Tensorboard event log

Raises:

StopIteration – When the event log is fully consumed
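
TensorBoard event files use TFRecord framing: an 8-byte little-endian length, a 4-byte CRC of the length, the payload, then a 4-byte CRC of the payload. The sketch below reads that framing from any binary stream. It is not the RecordIterator implementation; iter_records and frame are hypothetical helpers, CRC verification is skipped, and live-file handling is not modelled.

```python
import io
import struct


def iter_records(stream):
    """Yield the raw payload of each record in a TFRecord-framed stream."""
    while True:
        header = stream.read(12)   # 8-byte length + 4-byte CRC of the length
        if len(header) < 12:
            return                 # end of file, or a partially written record
        (length,) = struct.unpack("<Q", header[:8])
        payload = stream.read(length)
        footer = stream.read(4)    # 4-byte CRC of the payload (not verified here)
        if len(payload) < length or len(footer) < 4:
            return
        yield payload


def frame(payload: bytes) -> bytes:
    """Frame a payload for the demo, using zeroed placeholder CRC fields."""
    return struct.pack("<Q", len(payload)) + b"\x00" * 4 + payload + b"\x00" * 4


stream = io.BytesIO(frame(b"first") + frame(b"second"))
records = list(iter_records(stream))
```

A production reader would verify both CRC fields and, for a live log, wait for more data instead of returning when a record is incomplete.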

class lib.training.tensorboard.TorchTensorBoard(log_dir: str = 'logs', write_graph: bool = True, update_freq: Literal['batch', 'epoch'] | int = 'epoch')

Enable visualizations for TensorBoard. Adapted from Keras’ Tensorboard Callback keeping only the parts we need, and using Torch rather than TensorFlow

Parameters:
  • log_dir (str) – The path of the directory where to save the log files to be parsed by TensorBoard. e.g., log_dir = os.path.join(working_dir, 'logs'). This directory should not be reused by any other callbacks.

  • write_graph (bool) – Whether to visualize the graph in TensorBoard. Note that the log file can become quite large when write_graph is set to True. Note: Not supported at this time

  • update_freq (T.Literal['batch', 'epoch'] | int) – When using “epoch”, writes the losses and metrics to TensorBoard after every epoch. If using an integer, let’s say 1000, all metrics and losses (including custom ones added by Model.compile) will be logged to TensorBoard every 1000 batches. “batch” is a synonym for 1, meaning that they will be written every batch. Note however that writing too frequently to TensorBoard can slow down your training, especially when used with distribution strategies, as it will incur additional synchronization overhead. Batch-level summary writing is also available via train_step override. Please see [TensorBoard Scalars tutorial](https://www.tensorflow.org/tensorboard/scalars_and_keras#batch-level_logging)

on_save() None

Flush data to disk on save

Return type:

None

on_train_batch_end(batch: int, logs: dict[str, float | dict[str, float]] | None = None) None

Update Tensorboard logs on batch end

Parameters:
  • batch (int) – The current iteration count

  • logs (dict[str, float | dict[str, float]] | None) – The logs to write

Return type:

None

on_train_begin(logs=None) None

Initialize the callback on train start

Parameters:

logs – Unused

Return type:

None

on_train_end(logs=None) None

Close the writer on train completion

Parameters:

logs – Unused

Return type:

None

set_model(model: Model) None

Sets Keras model and writes graph if specified.

Parameters:

model (Model) – The model that is being trained

Return type:

None

Classes

RecordIterator(log_file[, is_live])

A replacement for tensorflow's compat.v1.io.tf_record_iterator()

TorchTensorBoard([log_dir, write_graph, ...])

Enable visualizations for TensorBoard.

Class Inheritance Diagram

Inheritance diagram of lib.training.tensorboard.RecordIterator, lib.training.tensorboard.TorchTensorBoard

lib.training.train Module

Run the training loop for a training plugin

class lib.training.train.Trainer(plugin: TrainerBase, preview: bool, warmup_steps: int = 0, timelapse_folders: list[str] | None = None, timelapse_output: str = '')

Handles the feeding of training images to Faceswap models, the generation of Tensorboard logs and the creation of sample/time-lapse preview images.

All Trainer plugins must inherit from this class.

Parameters:
  • plugin (TrainerBase) – The plugin that will be processing each batch

  • preview (bool) – True to generate previews

  • warmup_steps (int) – The number of steps to warmup the learning rate for. Default: 0

  • timelapse_folders (list[str] | None) – The input folders to create timelapse images from. Default: None (no timelapse)

  • timelapse_output (str) – The folder to output timelapse images. Default: "" (no timelapse)

property exit_early: bool

True if the trainer should exit early, without performing any training steps

save(is_exit: bool = False) None

Save the model

Parameters:

is_exit (bool) – True if save has been called on model exit. Default: False

Return type:

None

toggle_mask() None

Toggle the mask overlay on or off based on user input.

Return type:

None

train_one_batch() list[BatchLoss]

Process a single batch through the model and obtain the loss

Return type:

The collated loss values detached and moved to CPU in order (A, B, …)

train_one_step(viewer: Callable[[np.ndarray, str], None] | None, do_timelapse: bool = False) None

Run training on a batch of images for each side.

Triggered from the training cycle in scripts.train.Train.

  • Runs a training batch through the model.

  • Outputs the iteration’s loss values to the console

  • Logs loss to Tensorboard, if logging is requested.

  • If a preview or time-lapse has been requested, then pushes sample images through the model to generate the previews

  • Creates a snapshot if the total iterations trained so far meet the requested snapshot criteria

Notes

As every iteration is called explicitly, the Parameters defined should always be None except on save iterations.

Parameters:
  • viewer (Callable[[np.ndarray, str], None] | None) – The function that will display the preview image

  • do_timelapse (bool) – True to generate a timelapse preview image

Return type:

None

Classes

Trainer(plugin, preview[, warmup_steps, ...])

Handles the feeding of training images to Faceswap models, the generation of Tensorboard logs and the creation of sample/time-lapse preview images.

Class Inheritance Diagram

Inheritance diagram of lib.training.train.Trainer

data package

lib.training.data.augmentation Module

Processes the augmentation of images for feeding into a Faceswap model.

class lib.training.data.augmentation.ConstantsAugmentation(color: ConstantsColor, transform: ConstantsTransform, warp: ConstantsWarp)

Dataclass for holding constants for Image Augmentation.

Parameters:
  • color (lib.training.data.augmentation.ConstantsColor) – The constants for adjusting color/contrast in an image

  • transform (lib.training.data.augmentation.ConstantsTransform) – The constants for image transformation

  • warp (lib.training.data.augmentation.ConstantsWarp) – The constants for image warping

Dataclass should be initialized using its from_config method

Example

>>> constants = ConstantsAugmentation.from_config(processing_size=256,
...                                               batch_size=16)

color: ConstantsColor

The constants for adjusting color/contrast in an image

classmethod from_config(processing_size: int, batch_size: int) ConstantsAugmentation

Create a new dataclass instance from user config

Parameters:
  • processing_size (int) – The size of image to augment the data for

  • batch_size (int) – The batch size that augmented data is being prepared for

Return type:

ConstantsAugmentation

transform: ConstantsTransform

The constants for image transformation

warp: ConstantsWarp

The constants for image warping

class lib.training.data.augmentation.ConstantsColor(clahe_base_contrast: int, clahe_chance: float, clahe_max_size: int, lab_adjust: ndarray)

Dataclass for holding constants for enhancing an image (i.e. contrast/color adjustment)

Parameters:
  • clahe_base_contrast (int) – The base number for Contrast Limited Adaptive Histogram Equalization

  • clahe_chance (float) – Probability to perform Contrast Limited Adaptive Histogram Equalization

  • clahe_max_size (int) – Maximum clahe window size

  • lab_adjust (numpy.ndarray) – Adjustment amounts for L*A*B augmentation

clahe_base_contrast: int

The base number for Contrast Limited Adaptive Histogram Equalization

clahe_chance: float

Probability to perform Contrast Limited Adaptive Histogram Equalization

clahe_max_size: int

Maximum clahe window size

lab_adjust: ndarray

Adjustment amounts for L*A*B augmentation

class lib.training.data.augmentation.ConstantsTransform(rotation: int, zoom: float, shift: float, flip: float)

Dataclass for holding constants for transforming an image

Parameters:
  • rotation (int) – Rotation range for transformations

  • zoom (float) – Zoom range for transformations

  • shift (float) – Shift range for transformations

  • flip (float) – The chance to flip an image

flip: float

The chance to flip an image

rotation: int

Rotation range for transformations

shift: float

Shift range for transformations

zoom: float

Zoom range for transformations

class lib.training.data.augmentation.ConstantsWarp(maps: ndarray, pad: tuple[int, int], slices: slice, scale: float, lm_edge_anchors: ndarray, lm_grids: ndarray, lm_scale: float)

Dataclass for holding constants for warping an image

Parameters:
  • maps (numpy.ndarray) – The stacked (x, y) mappings for image warping

  • pad (tuple[int, int]) – The padding to apply for image warping

  • slices (slice) – The slices for extracting a warped image

  • lm_edge_anchors (numpy.ndarray) – The edge anchors for landmark based warping

  • lm_grids (numpy.ndarray) – The grids for landmark based warping

  • scale (float) – The scaling to apply to standard warping

  • lm_scale (float) – The scaling to apply to landmark based warping

lm_edge_anchors: ndarray

The edge anchors for landmark based warping

lm_grids: ndarray

The grids for landmark based warping

lm_scale: float

The scaling to apply to landmark based warping

maps: ndarray

The stacked (x, y) mappings for image warping

pad: tuple[int, int]

The padding to apply for image warping

scale: float

The scaling to apply to standard warping

slices: slice

The slices for extracting a warped image

class lib.training.data.augmentation.ImageAugmentation(batch_size: int, processing_size: int)

Performs augmentation on batches of training images.

Parameters:
  • batch_size (int) – The number of images that will be fed through the augmentation functions at once.

  • processing_size (int) – The largest input or output size of the model. This is the size that images are processed at.

color_adjust(batch: ndarray) ndarray

Perform color augmentation on the passed in batch.

The color adjustment parameters are set in config.train.ini

Parameters:

batch (ndarray) – The batch should be a 4-dimensional array of shape (batchsize, height, width, 3) and in BGR format of uint8 dtype.

Return type:

A 4-dimensional array of the same shape as batch with color augmentation applied.

random_flip(batch: npt.NDArray[np.uint8], points: npt.NDArray[np.float32] | None) None

Perform random horizontal flipping on the passed in batch.

The probability of flipping an image is set in config.train.ini

Parameters:
  • batch (npt.NDArray[np.uint8]) – The batch should be a 4-dimensional array of shape (batchsize, height, width, channels) and in BGR format.

  • points (npt.NDArray[np.float32] | None) – Any (x, y) points to transform. Can be in any shape but the final dimension should be of size 2. None if there are no points to transform

Return type:

None

transform(batch: npt.NDArray[np.uint8], points: npt.NDArray[np.float32] | None) None

Perform random transformation on the passed in batch and optional (x, y) points.

The transformation parameters are set in config.train.ini

Parameters:
  • batch (npt.NDArray[np.uint8]) – The batch should be a 4-dimensional array of shape (batchsize, height, width, channels) and in BGR format.

  • points (npt.NDArray[np.float32] | None) – Any (x, y) points to transform, in shape (batch_size, num_sides, 68, 2). None if there are no points to transform

Return type:

None

warp(batch: ndarray, to_landmarks: bool = False, batch_src_points: ndarray | None = None, batch_dst_points: ndarray | None = None) ndarray

Perform random warping on the passed in batch by one of two methods.

Parameters:
  • batch (ndarray) – The batch should be a 4-dimensional array of shape (batchsize, height, width, 3) and in BGR format.

  • to_landmarks (bool) – If False perform standard random warping of the input image. If True perform warping to semi-random similar corresponding landmarks from the other side. Default: False

  • batch_src_points (ndarray | None) – Only used when to_landmarks is True. A batch of 68 point landmarks for the source faces. This is a 3-dimensional array in the shape (batchsize, 68, 2). Default: None

  • batch_dst_points (ndarray | None) – Only used when to_landmarks is True. A batch of randomly chosen closest match destination face landmarks. This is a 3-dimensional array in the shape (batchsize, 68, 2). Default: None

Return type:

A 4-dimensional array of the same shape as batch with warping applied.

Classes

ConstantsAugmentation(color, transform, warp)

Dataclass for holding constants for Image Augmentation.

ConstantsColor(clahe_base_contrast, ...)

Dataclass for holding constants for enhancing an image (i.e. contrast/color adjustment)

ConstantsTransform(rotation, zoom, shift, flip)

Dataclass for holding constants for transforming an image

ConstantsWarp(maps, pad, slices, scale, ...)

Dataclass for holding constants for warping an image

ImageAugmentation(batch_size, processing_size)

Performs augmentation on batches of training images.

Class Inheritance Diagram

Inheritance diagram of lib.training.data.augmentation.ConstantsAugmentation, lib.training.data.augmentation.ConstantsColor, lib.training.data.augmentation.ConstantsTransform, lib.training.data.augmentation.ConstantsWarp, lib.training.data.augmentation.ImageAugmentation

lib.training.data.collate Module

Handles collation of data for training faceswap models

class lib.training.data.collate.BatchMeta(mask_face: list[Tensor] | None = None, mask_eye: list[Tensor] | None = None, mask_mouth: list[Tensor] | None = None)

Dataclass that holds meta information required for training a batch of images

All lists are of len(number model outputs per side) with tensors in shape (batch_size, num_inputs, 1, H, W)

Parameters:
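  • mask_face (list[Tensor] | None) – The selected face mask for penalized loss/learn mask for each output in NCHW order

  • mask_eye (list[Tensor] | None) – The eye mask if eye loss multipliers > 1 for each output in NCHW order

  • mask_mouth (list[Tensor] | None) – The mouth mask if mouth loss multipliers > 1 for each output in NCHW order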

mask_eye: list[Tensor] | None = None

The eye mask if eye loss multipliers > 1 for each output in NCHW order

mask_face: list[Tensor] | None = None

The selected face mask for penalized loss/learn mask for each output in NCHW order

mask_mouth: list[Tensor] | None = None

The mouth mask if mouth loss multipliers > 1 for each output in NCHW order

to(device: str | torch.Device) T.Self

Place all contained tensors onto the given device

Parameters:

device (str | torch.Device) – The device to place the tensors on to

Return type:

This object with the tensors placed on the requested device

class lib.training.data.collate.Collate(input_size: int, output_sizes: tuple[int, ...], color_order: T.Literal['bgr', 'rgb'], config: TrainConfig, landmarks: LandmarkMatcher | None)

Collation function for processing a batch of samples into input and output tensors applying augmentation

Parameters:
  • input_size (int) – The pixel size of the model input

  • output_sizes (tuple[int, ...]) – The pixel sizes of the model output

  • color_order (T.Literal['bgr', 'rgb']) – The color order that the model expects

  • config (TrainConfig) – The training configuration for the model

  • landmarks (LandmarkMatcher | None) – The landmark matching object for the (A and B) sides of the model if warp_to_landmarks is enabled otherwise None

__call__(data: list[tuple[tuple[npt.NDArray[np.uint8], int], ...]]) tuple[list[torch.Tensor], list[torch.Tensor], BatchMeta]

Prepare the loaded samples for feeding the model, creating targets and applying augmentation

Parameters:

data (list[tuple[tuple[npt.NDArray[np.uint8], int], ...]]) – Batch of data tuples, one per loader, with the loaded stacked image and masks in the first position and the image file index for that item in the second

Returns:

  • feed – List of length num_inputs of tensors of shape (batch_size, H, W, C) to feed the model

  • targets – List of length num_outputs of target images of shape (batch_size, num_inputs, height, width, 3) at each model output size, as float32 in the 0.0 - 1.0 range

  • meta – The meta information for the batch

Return type:

tuple[list[torch.Tensor], list[torch.Tensor], BatchMeta]
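
The nested structure of the data argument can be hard to picture. The sketch below shows only one plausible first step, regrouping the batch by model input using NumPy. The function name split_batch_sketch is hypothetical, and the augmentation, mask handling and tensor conversion performed by Collate are not shown.

```python
import numpy as np


def split_batch_sketch(data):
    """Regroup a batch by model input.

    ``data`` is a list (one entry per batch item) of tuples (one entry per
    loader) of ``(stacked_image, file_index)`` pairs.
    """
    num_inputs = len(data[0])
    # One (batch_size, H, W, C) array for each model input
    images = [np.stack([item[i][0] for item in data]) for i in range(num_inputs)]
    # File indices arranged as (num_inputs, batch_size)
    indices = np.array([[item[i][1] for item in data] for i in range(num_inputs)],
                       dtype=np.int64)
    return images, indices
```

In this sketch the index array has one row per input, which is one way to arrive at the (num_inputs, landmark_indices) layout that get_close_landmarks documents for its argument.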

class lib.training.data.collate.LandmarkMatcher(folders: list[str], size: int, centering: CenteringType, coverage: float, y_offset: float, num_choices: int = 10)

Prepares landmarks when Warp-to-Landmarks is enabled.

2 sides (A/B) only.

For each side, stores the aligned landmarks and collates the nearest matches on the other side (10 by default, see num_choices) for random warping

Parameters:
  • folders (list[str]) – Two training folders for sides A and B

  • size (int) – The aligned face size to transform the landmarks to

  • centering (CenteringType) – The aligned centering to transform the landmarks to

  • coverage (float) – Additional coverage ratio to be applied

  • y_offset (float) – Additional vertical offset to be applied

  • num_choices (int) – Number of choices from the opposite side to cache for each landmark. Default: 10

get_close_landmarks(indices: npt.NDArray[np.int64]) npt.NDArray[np.float32]

For the given image indices, obtain randomly selected close-match landmarks from the other side

Parameters:

indices (npt.NDArray[np.int64]) – The (num_inputs, landmark_indices) image file indices to obtain the matches for

Returns:

2 sets of landmarks in shape (num_sides * batch_size, num_sides, 68, 2) stacked to a batch of landmark points for augmentation

Return type:

npt.NDArray[np.float32]
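
The idea of caching close matches and then sampling from them can be sketched as follows. This is an illustration under stated assumptions, not the library's code. Both function names are hypothetical, and closeness is assumed to be the mean squared distance between corresponding landmark points. Only one direction (side A to side B) is shown, without the stacking into the documented return shape.

```python
import numpy as np


def nearest_matches_sketch(landmarks_a, landmarks_b, num_choices=10):
    """For each (68, 2) landmark set in A, return the indices of the
    ``num_choices`` closest landmark sets in B, nearest first."""
    # Pairwise differences between every A and B landmark set
    diffs = landmarks_a[:, None] - landmarks_b[None]  # (num_a, num_b, 68, 2)
    # Assumption: closeness is the mean squared point distance
    distances = np.mean(diffs ** 2, axis=(2, 3))      # (num_a, num_b)
    num_choices = min(num_choices, landmarks_b.shape[0])
    return np.argsort(distances, axis=1)[:, :num_choices]


def pick_close_landmarks_sketch(indices, matches, landmarks_b, rng):
    """Randomly pick one cached close match from B for each index into A."""
    choice = rng.integers(0, matches.shape[1], size=indices.shape[0])
    return landmarks_b[matches[indices, choice]]      # (len(indices), 68, 2)
```

Caching the candidate indices once means that each batch only needs a random column lookup instead of a fresh distance computation.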

Classes

BatchMeta([mask_face, mask_eye, mask_mouth])

Dataclass that holds meta information required for training a batch of images

Collate(input_size, output_sizes, ...)

Collation function for processing a batch of samples into input and output tensors applying augmentation

LandmarkMatcher(folders, size, centering, ...)

Prepares landmarks when Warp-to-Landmarks is enabled.

Class Inheritance Diagram

Inheritance diagram of lib.training.data.collate.BatchMeta, lib.training.data.collate.Collate, lib.training.data.collate.LandmarkMatcher

lib.training.data.data_set Module

Handles Data loading and augmentation for feeding Faceswap Models

class lib.training.data.data_set.MultiDataset(datasets: tuple[_BaseSet, ...], is_random: bool = True)

Handles processing data for models with multiple inputs. The length is set as the largest dataset. Shuffling all datasets is handled internally at the end of each

Parameters:
  • datasets (tuple[_BaseSet, ...]) – The input specific datasets for feeding the model

  • is_random (bool) – True if data from each of the datasets should be read randomly. False if all datasets should return the item for the given index

shuffle() None

Shuffle the data of all contained datasets

Return type:

None
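
The length and indexing rules described for MultiDataset can be sketched in plain Python. MultiDatasetSketch is a hypothetical stand-in. The wrap-around for datasets shorter than the largest one is an assumption made so the example is complete, and random reads here simply ignore the requested index.

```python
import random


class MultiDatasetSketch:
    """Hypothetical stand-in that combines several indexable datasets."""

    def __init__(self, datasets, is_random=True):
        self.datasets = datasets
        self.is_random = is_random

    def __len__(self):
        # The length is set as the largest contained dataset
        return max(len(dataset) for dataset in self.datasets)

    def __getitem__(self, index):
        items = []
        for dataset in self.datasets:
            if self.is_random:
                # Read a random item from each dataset
                item_index = random.randrange(len(dataset))
            else:
                # Assumption: shorter datasets wrap around to stay in range
                item_index = index % len(dataset)
            items.append(dataset[item_index])
        return tuple(items)
```

With is_random=False every dataset is read at the same position, which is the behaviour needed when outputs must line up item for item.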

class lib.training.data.data_set.PreviewSet(side: str, image_folder: str, input_size: int, output_size: int, color_order: Literal['bgr', 'rgb'], num_images: int = 0)

Preview dataset loader. The dataset loader is responsible for loading images from disk and preparing them for inference and display in the model preview

Parameters:
  • side (str) – The side of the model (“A”, “B” etc.)

  • image_folder (str) – Full path to a folder containing training images

  • input_size (int) – The input size to the model

  • output_size (int) – The largest output size of the model

  • color_order (T.Literal['bgr', 'rgb']) – The color order the model expects data in

  • num_images (int) – Set to 0 for random previews from the image folder. Set to a positive integer for this number of images to use for a static timelapse. Default: 0

class lib.training.data.data_set.TrainSet(side: str, image_folder: str, size: int)

Base class for Training and Preview dataset loaders to inherit from

Parameters:
  • side (str) – The side of the model (“A”, “B” etc.)

  • image_folder (str) – Full path to a folder containing training images

  • size (int) – The size to return samples at. This should be the maximum of the model input/output size for train sets or the model input size for preview sets

lib.training.data.data_set.get_label(index: int, num_identities: int, next_identity: bool = False) str

Obtain the label for the given current index. Labels start at A at index 0. Values roll.

Parameters:
  • index (int) – The index of the current label

  • num_identities (int) – The number of identities that belong to the label set

  • next_identity (bool) – True to return the next identity for the given index. Default: False

Return type:

The current or next label. Labels go A-Z,0-9,a-z
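
A minimal sketch of the labelling scheme follows. get_label_sketch is a hypothetical re-implementation, and it assumes that "values roll" means the label wraps back to the first label once num_identities is reached.

```python
import string

# Label alphabet in the documented order: A-Z, then 0-9, then a-z
LABELS = string.ascii_uppercase + string.digits + string.ascii_lowercase


def get_label_sketch(index: int, num_identities: int, next_identity: bool = False) -> str:
    """Return the label for ``index``, or for the identity that follows it."""
    if not 0 < num_identities <= len(LABELS):
        raise ValueError(f"num_identities must be between 1 and {len(LABELS)}")
    position = index + 1 if next_identity else index
    # Assumption: rolling means wrapping around within the label set
    return LABELS[position % num_identities]
```

For a two-sided model this gives "A" for index 0 and "B" for index 1, and asking for the next identity after index 1 wraps back to "A".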

lib.training.data.data_set.get_sorted_images(folder: str) list[str]

For the given folder return the sorted list of potential training images

Parameters:

folder (str) – The folder containing faceswap training images

Return type:

The sorted list of full paths to the training images within the folder
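
The sketch below shows one way such a listing could be built with the standard library. It is illustrative only. get_sorted_images_sketch is a hypothetical name, and the set of accepted extensions is an assumption, since the documentation does not say what counts as a potential training image.

```python
import os

# Assumption: these extensions are illustrative, not the library's actual filter
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg")


def get_sorted_images_sketch(folder: str) -> list[str]:
    """Return the sorted full paths of the image files found in ``folder``."""
    return sorted(
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.lower().endswith(IMAGE_EXTENSIONS)
    )
```

Sorting the paths gives every file a stable position, so an integer file index refers to the same image on every run.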

lib.training.data.data_set.to_float32(in_array: npt.NDArray[np.uint8]) npt.NDArray[np.float32]

Cast a uint8 array in 0-255 range to float32 in 0.0-1.0 range.

Parameters:

in_array (npt.NDArray[np.uint8]) – The input uint8 array

Return type:

The array cast to 0.0 - 1.0 float32
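
The conversion amounts to a cast followed by a division by 255. A minimal NumPy sketch, using the hypothetical name to_float32_sketch, could look like this:

```python
import numpy as np


def to_float32_sketch(in_array):
    """Cast a uint8 array in the 0-255 range to float32 in the 0.0-1.0 range."""
    # Cast first so the division is carried out in float32
    return in_array.astype(np.float32) / np.float32(255.0)
```

Casting before dividing keeps the result in float32 rather than promoting it to float64.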

Functions

get_label(index, num_identities[, next_identity])

Obtain the label for the given current index.

get_sorted_images(folder)

For the given folder return the sorted list of potential training images

to_float32(in_array)

Cast a uint8 array in 0-255 range to float32 in 0.0-1.0 range.

Classes

MultiDataset(datasets[, is_random])

Handles processing data for models with multiple inputs.

PreviewSet(side, image_folder, input_size, ...)

Preview dataset loader.

TrainSet(side, image_folder, size)

Base class for Training and Preview dataset loaders to inherit from

Class Inheritance Diagram

Inheritance diagram of lib.training.data.data_set.MultiDataset, lib.training.data.data_set.PreviewSet, lib.training.data.data_set.TrainSet

lib.training.data.loader Module

Handles the loading of data for training and previews for faceswap models

class lib.training.data.loader.PreviewLoader(input_size: int, output_size: int, color_order: Literal['bgr', 'rgb'], input_folders: list[str], batch_size: int, sampler: None | type[RandomSampler | SequentialSampler] = None, num_samples: int = 0)

Generator for feeding faceswap models input data for generating preview images. Gets the next items from each of the configured loaders and collates them for feeding into a model

Parameters:
  • input_size (int) – The input size to the model

  • output_size (int) – The output size of the model

  • color_order (T.Literal['bgr', 'rgb']) – The color order of the model

  • input_folders (list[str]) – list of folders to read images from for each side being trained

  • batch_size (int) – The number of images being displayed in the preview

  • sampler (None | type[tch_data.RandomSampler | tch_data.SequentialSampler]) – The sampler to use for the data loaders. Default: None (RandomSampler)

  • num_samples (int) – Set to 0 for random previews from the image folder. Set to a positive integer for this number of images to use for a static timelapse. Default: 0

__next__() tuple[Tensor, Tensor]

Obtain the next batch of data for each side for feeding the model

Returns:

  • inputs – The inputs to the model for each side of the model. The array is returned in (side, batch_size, *dims) where side 0 is “A” and side 1 is “B” etc.

  • targets – The full sized source image with mask in 4th channel for each side of the model in format (side, batch_size, *dims, 4) where side 0 is “A” and side 1 is “B” etc.

Return type:

tuple[Tensor, Tensor]

get_loader() DataLoader

Obtain the dataloaders for each input/output for the model

Return type:

The Training data loaders in side order

class lib.training.data.loader.TrainLoader(input_size: int, output_sizes: tuple[int, ...], color_order: T.Literal['bgr', 'rgb'], config: TrainConfig, sampler: None | type[tch_data.RandomSampler | tch_data.DistributedSampler] = None)

Generator for feeding faceswap models with multiple inputs and outputs. Gets the next items from each of the configured loaders and collates them for feeding into a model

Parameters:
  • input_size (int) – The input size to the model

  • output_sizes (tuple[int, ...]) – The output sizes to the model (list as some models have multi-scale outputs)

  • color_order (T.Literal['bgr', 'rgb']) – The color order of the model

  • config (TrainConfig) – The training configuration for feeding the model

  • sampler (None | type[tch_data.RandomSampler | tch_data.DistributedSampler]) – The sampler to use for the data loaders. Default: None (RandomSampler)

__next__() tuple[list[torch.Tensor], list[torch.Tensor], BatchMeta]

Obtain the next outputs from the loader

Returns:

  • inputs – List of length num_inputs of tensors of shape (batch_size, H, W, C) to feed the model

  • targets – List of length num_outputs of target images of shape (batch_size, num_inputs, height, width, 3) at each model output size, as float32 in the 0.0 - 1.0 range

  • meta – The meta information for the batch

Return type:

tuple[list[torch.Tensor], list[torch.Tensor], BatchMeta]

get_loader() DataLoader

Obtain the dataloaders for each input/output for the model

Return type:

The Training data loaders in side order

Classes

Collate(input_size, output_sizes, ...)

Collation function for processing a batch of samples into input and output tensors applying augmentation

DataLoader(dataset[, batch_size, shuffle, ...])

Data loader combines a dataset and a sampler, and provides an iterable over the given dataset.

LandmarkMatcher(folders, size, centering, ...)

Prepares landmarks when Warp-to-Landmarks is enabled.

MultiDataset(datasets[, is_random])

Handles processing data for models with multiple inputs.

PreviewLoader(input_size, output_size, ...)

Generator for feeding faceswap models input data for generating preview images.

PreviewSet(side, image_folder, input_size, ...)

Preview dataset loader.

TrainLoader(input_size, output_sizes, ...[, ...])

Generator for feeding faceswap models with multiple inputs and outputs.

TrainSet(side, image_folder, size)

Base class for Training and Preview dataset loaders to inherit from

Variables

annotations

logger

A standard logging.logger with additional "verbose" and "trace" levels added.

Class Inheritance Diagram

Inheritance diagram of lib.training.data.loader.PreviewLoader, lib.training.data.loader.TrainLoader