training package

The training package handles the processing of faces for feeding into a Faceswap model.

training.augmentation module

Processes the augmentation of images for feeding into a Faceswap model.

class lib.training.augmentation.AugConstants(config: dict[str, ConfigValueType], processing_size: int, batch_size: int)

Bases: object

Dataclass for holding constants for Image Augmentation.

Parameters:
  • config (dict[str, ConfigValueType]) – The user training configuration options

  • processing_size (int) – The size of image to augment the data for

  • batch_size (int) – The batch size that augmented data is being prepared for

clahe_base_contrast: int

The base number for Contrast Limited Adaptive Histogram Equalization

Type:

int

clahe_chance: float

Probability to perform Contrast Limited Adaptive Histogram Equalization

Type:

float

clahe_max_size: int

Maximum clahe window size

Type:

int

lab_adjust: ndarray

Adjustment amounts for L*A*B augmentation

Type:

numpy.ndarray

transform_rotation: int

Rotation range for transformations

Type:

int

transform_shift: float

Shift range for transformations

Type:

float

transform_zoom: float

Zoom range for transformations

Type:

float

warp_lm_edge_anchors: ndarray

The edge anchors for landmark based warping

Type:

numpy.ndarray

warp_lm_grids: ndarray

The grids for landmark based warping

Type:

numpy.ndarray

warp_maps: ndarray

The stacked (x, y) mappings for image warping

Type:

numpy.ndarray

warp_pad: tuple[int, int]

The padding to apply for image warping

Type:

tuple[int, int]

warp_slices: slice

The slices for extracting a warped image

Type:

slice

class lib.training.augmentation.ImageAugmentation(batch_size: int, processing_size: int, config: dict[str, ConfigValueType])

Bases: object

Performs augmentation on batches of training images.

Parameters:
  • batch_size (int) – The number of images that will be fed through the augmentation functions at once.

  • processing_size (int) – The largest input or output size of the model. This is the size that images are processed at.

  • config (dict) – The configuration dict generated from config.train.ini containing the trainer plugin configuration options.

color_adjust(batch: ndarray) ndarray

Perform color augmentation on the passed in batch.

The color adjustment parameters are set in config.train.ini

Parameters:

batch (numpy.ndarray) – The batch should be a 4-dimensional array of shape (batchsize, height, width, 3) and in BGR format.

Returns:

A 4-dimensional array of the same shape as batch with color augmentation applied.

Return type:

numpy.ndarray

random_flip(batch: ndarray)

Perform random horizontal flipping on the passed in batch.

The probability of flipping an image is set in config.train.ini

Parameters:

batch (numpy.ndarray) – The batch should be a 4-dimensional array of shape (batchsize, height, width, channels) and in BGR format.

transform(batch: ndarray)

Perform random transformation on the passed in batch.

The transformation parameters are set in config.train.ini

Parameters:

batch (numpy.ndarray) – The batch should be a 4-dimensional array of shape (batchsize, height, width, channels) and in BGR format.

warp(batch: ndarray, to_landmarks: bool = False, **kwargs) ndarray

Perform random warping on the passed in batch by one of two methods.

Parameters:
  • batch (numpy.ndarray) – The batch should be a 4-dimensional array of shape (batchsize, height, width, 3) and in BGR format.

  • to_landmarks (bool, optional) – If False perform standard random warping of the input image. If True perform warping to semi-random similar corresponding landmarks from the other side. Default: False

  • kwargs (dict) –

    If to_landmarks is True the following additional kwargs must be passed in:

    • batch_src_points (numpy.ndarray) - A batch of 68 point landmarks for the source faces. This is a 3-dimensional array in the shape (batchsize, 68, 2).

    • batch_dst_points (numpy.ndarray) - A batch of randomly chosen closest match destination faces landmarks. This is a 3-dimensional array in the shape (batchsize, 68, 2).

Returns:

A 4-dimensional array of the same shape as batch with warping applied.

Return type:

numpy.ndarray
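
A minimal usage sketch, assuming config is the trainer configuration dict loaded from config.train.ini (not constructed here); transform and random_flip are assumed to operate in place since no return value is documented:

import numpy as np
from lib.training.augmentation import ImageAugmentation

batch_size, size = 16, 256
augmenter = ImageAugmentation(batch_size, size, config)

# Dummy uint8 BGR batch in the documented (batchsize, height, width, 3) layout
batch = np.random.randint(0, 256, (batch_size, size, size, 3), dtype="uint8")

batch = augmenter.color_adjust(batch)  # returns the color augmented batch
augmenter.transform(batch)             # random rotation/shift/zoom (assumed in place)
augmenter.random_flip(batch)           # random horizontal flips (assumed in place)
warped = augmenter.warp(batch)         # standard random warping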

training.cache module

Holds the data cache for training data generators

class lib.training.cache.RingBuffer(batch_size: int, image_shape: tuple[int, int, int], buffer_size: int = 2, dtype: str = 'uint8')

Bases: object

Rolling buffer for holding training/preview batches

Parameters:
  • batch_size (int) – The batch size to create the buffer for

  • image_shape (tuple) – The height/width/channels shape of a single image in the batch

  • buffer_size (int, optional) – The number of arrays to hold in the rolling buffer. Default: 2

  • dtype (str, optional) – The datatype to create the buffer as. Default: “uint8”
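
An instantiation sketch using the documented constructor; the values below are illustrative:

from lib.training.cache import RingBuffer

# Hold two rolling uint8 buffers of 16 images at 256x256 with 3 channels
buffer = RingBuffer(batch_size=16, image_shape=(256, 256, 3), buffer_size=2, dtype="uint8")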

lib.training.cache.get_cache(side: T.Literal['a', 'b'], filenames: list[str] | None = None, config: dict[str, ConfigValueType] | None = None, size: int | None = None, coverage_ratio: float | None = None) _Cache

Obtain a _Cache object for the given side. If the object does not pre-exist then create it.

Parameters:
  • side (str) – “a” or “b”. The side of the model to obtain the cache for

  • filenames (list) – The filenames of all the images. This can either be the full path or the base name. If the full paths are passed in, they are stripped to base name for use as the cache key. Must be passed for the first call of this function for each side. For subsequent calls this parameter is ignored. Default: None

  • config (dict, optional) – The user selected training configuration options. Must be passed for the first call of this function for each side. For subsequent calls this parameter is ignored. Default: None

  • size (int, optional) – The largest output size of the model. Must be passed for the first call of this function for each side. For subsequent calls this parameter is ignored. Default: None

  • coverage_ratio (float, optional) – The coverage ratio that the model is using. Must be passed for the first call of this function for each side. For subsequent calls this parameter is ignored. Default: None

Returns:

The face meta information cache for the requested side

Return type:

_Cache
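
An illustrative call pattern; the file names, size and coverage ratio below are hypothetical, and config is assumed to be the user's training configuration dict:

from lib.training.cache import get_cache

# The first call for a side must supply the metadata
cache_a = get_cache("a",
                    filenames=["face_0001.png", "face_0002.png"],
                    config=config,
                    size=256,
                    coverage_ratio=0.875)

# Subsequent calls for the same side return the existing cache and ignore
# the optional arguments
cache_a = get_cache("a")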

training.generator module

Handles Data Augmentation for feeding Faceswap Models

class lib.training.generator.DataGenerator(config: dict[str, ConfigValueType], model: ModelBase, side: T.Literal['a', 'b'], images: list[str], batch_size: int)

Bases: object

Parent class for Training and Preview Data Generators.

This class is called from plugins.train.trainer._base and launches a background iterator that compiles augmented data, target data and sample data.

Parameters:
  • model (ModelBase) – The model that this data generator is feeding

  • config (dict) – The configuration dict generated from config.train.ini containing the trainer plugin configuration options.

  • side ({'a' or 'b'}) – The side of the model that this iterator is for.

  • images (list) – A list of image paths that will be used to compile the final augmented data from.

  • batch_size (int) – The batch size for this iterator. Images will be returned in numpy.ndarray objects of this size from the iterator.

minibatch_ab(do_shuffle: bool = True) Generator[BatchType, None, None]

A Background iterator to return augmented images, samples and targets.

The exit point from this class and the sole attribute that should be referenced. Called from plugins.train.trainer._base. Returns an iterator that yields images for training, preview and time-lapses.

Parameters:

do_shuffle (bool, optional) – Whether data should be shuffled prior to loading from disk. If True, each time the full list of filenames has been processed, the data is reshuffled to make sure images are not returned in the same order. Default: True

Yields:
  • feed (list) – 4-dimensional array of faces to feed to the model for training (the x parameter for keras.models.model.train_on_batch()). The array returned is in the format (batch size, height, width, channels).

  • targets (list) – List of 4-dimensional numpy.ndarray objects in the order and size of each output of the model. The format of these arrays will be (batch size, height, width, x). This is the y parameter for keras.models.model.train_on_batch(). The number of channels here will vary. The first 3 channels are (rgb/bgr). The 4th channel is the face mask. Any subsequent channels are area masks (e.g. eye/mouth masks)
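
A sketch of consuming the iterator from a concrete subclass, assuming config, a built model and a list of image paths already exist; the (feed, targets) unpacking follows the documented yields:

from lib.training.generator import TrainingDataGenerator

generator = TrainingDataGenerator(config, model, "a", image_paths, batch_size=16)
iterator = generator.minibatch_ab(do_shuffle=True)

feed, targets = next(iterator)  # faces to feed the model and the per-output target arrays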

process_batch(filenames: list[str], images: ndarray, detected_faces: list[lib.align.detected_face.DetectedFace], batch: ndarray) tuple[numpy.ndarray, list[numpy.ndarray]]

Override for processing the batch for the current generator.

Parameters:
  • filenames (list) – List of full paths to image file names for a single batch

  • images (numpy.ndarray) – The batch of faces corresponding to the filenames

  • detected_faces (list) – List of DetectedFace objects with aligned data and masks loaded for the current batch

  • batch (numpy.ndarray) – The pre-allocated batch with images and masks populated for the selected coverage and centering

Returns:

  • list – 4-dimensional array of faces to feed to the model for training.

  • list – List of 4-dimensional numpy.ndarray. The number of channels here will vary. The first 3 channels are (rgb/bgr). The 4th channel is the face mask. Any subsequent channels are area masks (e.g. eye/mouth masks)

class lib.training.generator.Feeder(images: dict[T.Literal['a', 'b'], list[str]], model: ModelBase, batch_size: int, config: dict[str, ConfigValueType], include_preview: bool = True)

Bases: object

Handles the processing of a Batch for training the model and generating samples.

Parameters:
  • images (dict) – The list of full paths to the training images for this _Feeder for each side

  • model (plugin from plugins.train.model) – The selected model that will be running this trainer

  • batch_size (int) – The size of the batch to be processed for each side at each iteration

  • config (dict) – The configuration for this trainer

  • include_preview (bool, optional) – True to create a feeder for generating previews. Default: True

compile_sample(image_count: int, feed: dict[Literal['a', 'b'], numpy.ndarray], samples: dict[Literal['a', 'b'], numpy.ndarray], masks: dict[Literal['a', 'b'], numpy.ndarray]) dict[Literal['a', 'b'], list[numpy.ndarray]]

Compile the preview samples for display.

Parameters:
  • image_count (int) – The number of images to limit the sample output to.

  • feed (dict) – Dictionary for side “a”, “b” of numpy.ndarray. The images that should be fed into the model for obtaining a prediction

  • samples (dict) – Dictionary for side “a”, “b” of numpy.ndarray. The 100% coverage target images that should be used for creating the preview.

  • masks (dict) – Dictionary for side “a”, “b” of numpy.ndarray. The masks that should be used for creating the preview.

Returns:

The list of samples, targets and masks as numpy.ndarrays for creating a preview image

Return type:

list

generate_preview(is_timelapse: bool = False) dict[Literal['a', 'b'], list[numpy.ndarray]]

Generate the images for preview window or timelapse

Parameters:
  • is_timelapse (bool, optional) – True if the preview is to be generated for a Timelapse, otherwise False. Default: False

Returns:

Dictionary for side A and B of list of numpy arrays corresponding to the samples, targets and masks for this preview

Return type:

dict
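
For example, assuming feeder is an existing Feeder:

previews = feeder.generate_preview(is_timelapse=False)
samples_a = previews["a"]  # list of arrays: samples, targets and masks for side A
samples_b = previews["b"]  # list of arrays: samples, targets and masks for side B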

get_batch() tuple[list[list[numpy.ndarray]], ...]

Get the feed data and the targets for each training side for feeding into the model’s train function.

Returns:

  • model_inputs (list) – The inputs to the model for each side A and B

  • model_targets (list) – The targets for the model for each side A and B
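
A sketch of the documented return structure, assuming feeder is an existing Feeder; how the values reach keras is indicated only in the comment:

model_inputs, model_targets = feeder.get_batch()
inputs_a, inputs_b = model_inputs      # feed data for sides A and B
targets_a, targets_b = model_targets   # target data for sides A and B
# Each side's inputs and targets form the x / y arguments to the model's train function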

set_timelapse_feed(images: dict[Literal['a', 'b'], list[str]], batch_size: int) None

Set the time-lapse feed for this feeder.

Creates a generator from lib.training_data.PreviewDataGenerator specifically for generating time-lapse previews for the feeder.

Parameters:
  • images (dict) – The list of full paths to the images for creating the time-lapse for each side

  • batch_size (int) – The number of images to be used to create the time-lapse preview.

class lib.training.generator.PreviewDataGenerator(config: dict[str, ConfigValueType], model: ModelBase, side: T.Literal['a', 'b'], images: list[str], batch_size: int)

Bases: DataGenerator

Generator for compiling images for generating previews.

This class is called from plugins.train.trainer._base and launches a background iterator that compiles sample preview data for feeding the model’s predict function and for display.

Parameters:
  • model (ModelBase) – The model that this data generator is feeding

  • config (dict) – The configuration dict generated from config.train.ini containing the trainer plugin configuration options.

  • side ({'a' or 'b'}) – The side of the model that this iterator is for.

  • images (list) – A list of image paths that will be used to compile the final images.

  • batch_size (int) – The batch size for this iterator. Images will be returned in numpy.ndarray objects of this size from the iterator.

process_batch(filenames: list[str], images: ndarray, detected_faces: list[lib.align.detected_face.DetectedFace], batch: ndarray) tuple[numpy.ndarray, list[numpy.ndarray]]

Creates the full size preview images and the sub-cropped images for feeding the model’s predict function.

Parameters:
  • filenames (list) – List of full paths to image file names for a single batch

  • images (numpy.ndarray) – The batch of faces corresponding to the filenames

  • detected_faces (list) – List of DetectedFace objects with aligned data and masks loaded for the current batch

  • batch (numpy.ndarray) – The pre-allocated batch with images and masks populated for the selected coverage and centering

Returns:

  • feed (numpy.ndarray) – List of 4-dimensional numpy.ndarray objects at model output size for feeding the model’s predict function. The first 3 channels are (rgb/bgr). The 4th channel is the face mask.

  • samples (list) – 4-dimensional array containing the 100% coverage images at the model’s centering for generating previews. The array returned is in the format (batch size, height, width, channels).

class lib.training.generator.TrainingDataGenerator(config: dict[str, ConfigValueType], model: ModelBase, side: T.Literal['a', 'b'], images: list[str], batch_size: int)

Bases: DataGenerator

A Training Data Generator for compiling data for feeding to a model.

This class is called from plugins.train.trainer._base and launches a background iterator that compiles augmented data, target data and sample data.

Parameters:
  • model (ModelBase) – The model that this data generator is feeding

  • config (dict) – The configuration dict generated from config.train.ini containing the trainer plugin configuration options.

  • side ({'a' or 'b'}) – The side of the model that this iterator is for.

  • images (list) – A list of image paths that will be used to compile the final augmented data from.

  • batch_size (int) – The batch size for this iterator. Images will be returned in numpy.ndarray objects of this size from the iterator.

process_batch(filenames: list[str], images: ndarray, detected_faces: list[lib.align.detected_face.DetectedFace], batch: ndarray) tuple[numpy.ndarray, list[numpy.ndarray]]

Performs the augmentation and compiles target images and samples.

Parameters:
  • filenames (list) – List of full paths to image file names for a single batch

  • images (numpy.ndarray) – The batch of faces corresponding to the filenames

  • detected_faces (list) – List of DetectedFace objects with aligned data and masks loaded for the current batch

  • batch (numpy.ndarray) – The pre-allocated batch with images and masks populated for the selected coverage and centering

Returns:

  • feed (numpy.ndarray) – 4-dimensional array of faces to feed to the model for training (the x parameter for keras.models.model.train_on_batch()). The array returned is in the format (batch size, height, width, channels).

  • targets (list) – List of 4-dimensional numpy.ndarray objects in the order and size of each output of the model. The format of these arrays will be (batch size, height, width, x). This is the y parameter for keras.models.model.train_on_batch(). The number of channels here will vary. The first 3 channels are (rgb/bgr). The 4th channel is the face mask. Any subsequent channels are area masks (e.g. eye/mouth masks)

training.lr_finder module

Learning Rate Finder for faceswap.py.

class lib.training.lr_finder.LRStrength(value)

Bases: Enum

Enum for how aggressively to set the optimal learning rate

AGGRESSIVE = 5

DEFAULT = 10

EXTREME = 2.5

class lib.training.lr_finder.LearningRateFinder(model: ModelBase, config: dict[str, ConfigValueType], feeder: Feeder, stop_factor: int = 4, beta: float = 0.98)

Bases: object

Learning Rate Finder

Parameters:
  • model (tensorflow.keras.models.Model) – The keras model to find the optimal learning rate for

  • config (dict) – The configuration options for the model

  • feeder (Feeder) – The feeder for training the model

  • stop_factor (int) – When to stop finding the optimal learning rate

  • beta (float) – Amount to smooth loss by, for graphing purposes

find() bool

Find the optimal learning rate

Returns:

True if the learning rate was successfully discovered otherwise False

Return type:

bool
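
A minimal sketch, assuming model, config and feeder already exist:

from lib.training.lr_finder import LearningRateFinder

finder = LearningRateFinder(model, config, feeder, stop_factor=4, beta=0.98)
found = finder.find()  # True if an optimal learning rate was discovered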

training.preview_cv module

The pop up preview window for Faceswap.

If Tkinter is installed, then this will be used to manage the preview image, otherwise we fall back to OpenCV’s imshow

class lib.training.preview_cv.PreviewBase(preview_buffer: PreviewBuffer, triggers: dict[Literal['toggle_mask', 'refresh', 'save', 'quit', 'shutdown'], threading.Event] | None = None)

Bases: object

Parent class for OpenCV and Tkinter Preview Windows

Parameters:
  • preview_buffer (PreviewBuffer) – The thread safe object holding the preview images

  • triggers (dict, optional) – Dictionary of event triggers for pop-up preview. Not required when running inside the GUI. Default: None

class lib.training.preview_cv.PreviewBuffer

Bases: object

A thread safe class for holding preview images

add_image(name: str, image: np.ndarray) None

Add an image to the preview buffer in a thread safe way

get_images() Generator[tuple[str, np.ndarray], None, None]

Get the latest images from the preview buffer. When iterator is exhausted clears the updated event.

Yields:
  • name (str) – The name of the image

  • numpy.ndarray – The image in BGR format

property is_updated: bool

True when new images have been loaded into the preview buffer

Type:

bool
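
A small usage sketch using only the documented methods:

import numpy as np
from lib.training.preview_cv import PreviewBuffer

buffer = PreviewBuffer()
buffer.add_image("side_a", np.zeros((256, 256, 3), dtype="uint8"))  # BGR image

if buffer.is_updated:
    for name, image in buffer.get_images():  # clears the updated event once exhausted
        print(name, image.shape)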

class lib.training.preview_cv.PreviewCV(preview_buffer: PreviewBuffer, triggers: dict[Literal['toggle_mask', 'refresh', 'save', 'quit', 'shutdown'], threading.Event])

Bases: PreviewBase

Simple fallback preview viewer using OpenCV for when Tkinter is not available

Parameters:
  • preview_buffer (PreviewBuffer) – The thread safe object holding the preview images

  • triggers (dict) – Dictionary of event triggers for pop-up preview.

training.preview_tk module

The pop up preview window for Faceswap.

If Tkinter is installed, then this will be used to manage the preview image, otherwise we fall back to OpenCV’s imshow

class lib.training.preview_tk.PreviewTk(preview_buffer: PreviewBuffer, parent: tk.Widget | None = None, taskbar: ttk.Frame | None = None, triggers: TriggerType | None = None)

Bases: PreviewBase

Holds a preview window for displaying the pop out preview.

Parameters:
  • preview_buffer (PreviewBuffer) – The thread safe object holding the preview images

  • parent (tkinter widget, optional) – If this viewer is being called from the GUI the parent widget should be passed in here. If this is a standalone pop-up window then pass None. Default: None

  • taskbar (tkinter.ttk.Frame, optional) – If this viewer is being called from the GUI the parent’s option frame should be passed in here. If this is a standalone pop-up window then pass None. Default: None

  • triggers (dict, optional) – Dictionary of event triggers for pop-up preview. Not required when running inside the GUI. Default: None

property master_frame: Frame

The master frame that holds the preview window

Type:

tkinter.Frame

pack(*args, **kwargs)

Redirect calls to pack this widget so that the actual _master_frame is packed.

Takes standard tkinter.Frame pack arguments

remove_option_controls() None

Remove the taskbar options controls when the preview is disabled in the GUI

save(location: str) None

Save action to be performed when the save button is pressed in the GUI.

Parameters:

location (str) – Full path to the folder to save the preview image to

lib.training.preview_tk.main()

Load image from first given argument and display

python -m lib.training.preview_tk <filename>