training package

The training package handles the processing of faces for feeding into a Faceswap model.

training.augmentation module 

Processes the augmentation of images for feeding into a Faceswap model.

class lib.training.augmentation.AugConstants(config: dict[str, ConfigValueType], processing_size: int, batch_size: int)

Bases: object

Dataclass for holding constants for Image Augmentation.

Parameters:

config (dict[str, ConfigValueType]) – The user training configuration options
processing_size (int) – The size of image to augment the data for
batch_size (int) – The batch size that augmented data is being prepared for

clahe_base_contrast: int

The base number for Contrast Limited Adaptive Histogram Equalization

Type:: int

clahe_chance: float

Probability to perform Contrast Limited Adaptive Histogram Equalization

Type:: float

clahe_max_size: int

Maximum clahe window size

Type:: int

lab_adjust: ndarray

Adjustment amounts for L*A*B augmentation

Type:: numpy.ndarray

transform_rotation: int

Rotation range for transformations

Type:: int

transform_shift: float

Shift range for transformations

Type:: float

transform_zoom: float

Zoom range for transformations

Type:: float

warp_lm_edge_anchors: ndarray

The edge anchors for landmark based warping

Type:: numpy.ndarray

warp_lm_grids: ndarray

The grids for landmark based warping

Type:: numpy.ndarray

warp_maps: ndarray: :class:`numpy.ndarray`The stacked (x, y) mappings for image warping

warp_pad: tuple[int, int]

The padding to apply for image warping

Type:: tuple[int, int]

warp_slices: slice

The slices for extracting a warped image

Type:: slice

class lib.training.augmentation.ImageAugmentation(batch_size: int, processing_size: int, config: dict[str, ConfigValueType])

Bases: object

Performs augmentation on batches of training images.

Parameters:

batch_size (int) – The number of images that will be fed through the augmentation functions at once.
processing_size (int) – The largest input or output size of the model. This is the size that images are processed at.
config (dict) – The configuration dict generated from config.train.ini containing the trainer plugin configuration options.

color_adjust(batch: ndarray) → ndarray

Perform color augmentation on the passed in batch.

The color adjustment parameters are set in config.train.ini

Parameters:: batch (numpy.ndarray) – The batch should be a 4-dimensional array of shape (batchsize, height, width, 3) and in BGR format.
Returns:: A 4-dimensional array of the same shape as batch with color augmentation applied.
Return type:: numpy.ndarray

random_flip(batch: ndarray)

Perform random horizontal flipping on the passed in batch.

The probability of flipping an image is set in config.train.ini

Parameters:: batch (numpy.ndarray) – The batch should be a 4-dimensional array of shape (batchsize, height, width, channels) and in BGR format.
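
As an illustration of the idea only, random horizontal flipping of a batch can be sketched with NumPy alone. The function name random_flip_sketch, the flip_chance parameter (taken here as a probability between 0 and 1) and the seeded generator are assumptions made for this example and are not part of Faceswap. Unlike the method documented above, the sketch returns a copy rather than relying on any particular in-place behaviour.

```python
import numpy as np


def random_flip_sketch(batch: np.ndarray,
                       flip_chance: float,
                       rng: np.random.Generator) -> np.ndarray:
    """Mirror a random subset of a (batchsize, height, width, channels) batch.

    flip_chance is a hypothetical probability in the range 0 to 1.
    """
    # One random draw per image decides whether that image is flipped
    do_flip = rng.random(batch.shape[0]) < flip_chance
    out = batch.copy()
    # Reversing the width axis mirrors each selected image horizontally
    out[do_flip] = out[do_flip][:, :, ::-1]
    return out


rng = np.random.default_rng(0)
batch = np.arange(2 * 2 * 3 * 3, dtype="uint8").reshape(2, 2, 3, 3)
flipped = random_flip_sketch(batch, flip_chance=1.0, rng=rng)
```

With flip_chance set to 1.0 every image is mirrored, and with 0.0 the batch is returned unchanged.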

transform(batch: ndarray)

Perform random transformation on the passed in batch.

The transformation parameters are set in config.train.ini

Parameters:: batch (numpy.ndarray) – The batch should be a 4-dimensional array of shape (batchsize, height, width, channels) and in BGR format.
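
The following sketch shows one way a per-image random rotation, zoom and shift could be combined into 2x3 affine matrices. It is an assumption-laden illustration: the function name random_affine_matrices, the interpretation of the ranges (rotation in degrees, zoom and shift as fractions of the image size) and the matrix layout (the same convention as OpenCV's cv2.getRotationMatrix2D) are choices made for this example, not a description of how Faceswap reads transform_rotation, transform_zoom and transform_shift.

```python
import numpy as np


def random_affine_matrices(batch_size: int,
                           size: int,
                           rotation_range: float,
                           zoom_range: float,
                           shift_range: float,
                           rng: np.random.Generator) -> np.ndarray:
    """Return a (batch_size, 2, 3) array of random affine matrices.

    All range semantics here are hypothetical.
    """
    # Sample one rotation (radians), scale and (x, y) shift per image
    angle = np.deg2rad(rng.uniform(-rotation_range, rotation_range, batch_size))
    scale = rng.uniform(1 - zoom_range, 1 + zoom_range, batch_size)
    shift = rng.uniform(-shift_range, shift_range, (batch_size, 2)) * size

    alpha = scale * np.cos(angle)
    beta = scale * np.sin(angle)
    centre = size / 2  # rotate and zoom around the image centre

    mats = np.empty((batch_size, 2, 3), dtype="float32")
    mats[:, 0, 0] = alpha
    mats[:, 0, 1] = beta
    mats[:, 0, 2] = (1 - alpha) * centre - beta * centre + shift[:, 0]
    mats[:, 1, 0] = -beta
    mats[:, 1, 1] = alpha
    mats[:, 1, 2] = beta * centre + (1 - alpha) * centre + shift[:, 1]
    return mats


rng = np.random.default_rng(0)
matrices = random_affine_matrices(batch_size=4, size=64, rotation_range=10,
                                  zoom_range=0.05, shift_range=0.05, rng=rng)
```

Each matrix could then be applied to its image with an affine warp such as cv2.warpAffine. When all three ranges are zero, every matrix is the identity transform.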

warp(batch: ndarray, to_landmarks: bool = False, **kwargs) → ndarray

Perform random warping on the passed in batch by one of two methods.

Parameters:

batch (numpy.ndarray) – The batch should be a 4-dimensional array of shape (batchsize, height, width, 3) and in BGR format.
to_landmarks (bool, optional) – If False perform standard random warping of the input image. If True perform warping to semi-random similar corresponding landmarks from the other side. Default: False
kwargs (dict) –
If to_landmarks is True the following additional kwargs must be passed in:
- batch_src_points (numpy.ndarray) - A batch of 68 point landmarks for the source faces. This is a 3-dimensional array in the shape (batchsize, 68, 2).
- batch_dst_points (numpy.ndarray) - A batch of randomly chosen closest match destination faces landmarks. This is a 3-dimensional array in the shape (batchsize, 68, 2).

Returns:

A 4-dimensional array of the same shape as batch with warping applied.

Return type:

numpy.ndarray

training.cache module 

Holds the data cache for training data generators

class lib.training.cache.RingBuffer(batch_size: int, image_shape: tuple[int, int, int], buffer_size: int = 2, dtype: str = 'uint8')

Bases: object

Rolling buffer for holding training/preview batches

Parameters:

batch_size (int) – The batch size to create the buffer for
image_shape (tuple) – The height/width/channels shape of a single image in the batch
buffer_size (int, optional) – The number of arrays to hold in the rolling buffer. Default: 2
dtype (str, optional) – The datatype to create the buffer as. Default: “uint8”
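
The sketch below illustrates the general idea of a rolling buffer of pre-allocated batch arrays. The constructor arguments mirror the parameters listed above, but the class name RingBufferSketch and the next_array method are hypothetical: this reference does not document how the real RingBuffer is called.

```python
import numpy as np


class RingBufferSketch:
    """Hypothetical rolling buffer of pre-allocated batch arrays."""

    def __init__(self, batch_size, image_shape, buffer_size=2, dtype="uint8"):
        # Allocate every array up front so nothing is allocated per batch
        self._buffers = [np.empty((batch_size, *image_shape), dtype=dtype)
                         for _ in range(buffer_size)]
        self._index = 0

    def next_array(self):
        """Return the next array in the ring, wrapping around at the end."""
        array = self._buffers[self._index]
        self._index = (self._index + 1) % len(self._buffers)
        return array


ring = RingBufferSketch(batch_size=4, image_shape=(64, 64, 3))
first, second, third = ring.next_array(), ring.next_array(), ring.next_array()
```

With the default buffer_size of 2, the third request hands back the same array object as the first, so a consumer must be finished with a batch before the ring wraps around to it.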

lib.training.cache.get_cache(side: T.Literal['a', 'b'], filenames: list[str] | None = None, config: dict[str, ConfigValueType] | None = None, size: int | None = None, coverage_ratio: float | None = None) → _Cache

Obtain a _Cache object for the given side. If the object does not pre-exist then create it.

Parameters:

side (str) – “a” or “b”. The side of the model to obtain the cache for
filenames (list) – The filenames of all the images. This can either be the full path or the base name. If the full paths are passed in, they are stripped to base name for use as the cache key. Must be passed for the first call of this function for each side. For subsequent calls this parameter is ignored. Default: None
config (dict, optional) – The user selected training configuration options. Must be passed for the first call of this function for each side. For subsequent calls this parameter is ignored. Default: None
size (int, optional) – The largest output size of the model. Must be passed for the first call of this function for each side. For subsequent calls this parameter is ignored. Default: None
coverage_ratio (float, optional) – The coverage ratio that the model is using. Must be passed for the first call of this function for each side. For subsequent calls this parameter is ignored. Default: None

Returns:

The face meta information cache for the requested side

Return type:

_Cache
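
The "create on first call, reuse afterwards" behaviour described above can be sketched as a small per-side registry. Everything in this example is illustrative: CacheSketch, get_cache_sketch and the module-level _CACHES dictionary are hypothetical names, the config parameter is omitted for brevity, and raising ValueError when the first call lacks its arguments is an assumption rather than documented behaviour.

```python
import os

_CACHES = {}  # hypothetical registry holding one cache object per side


class CacheSketch:
    """Stand-in for the real cache object; stores only what the sketch needs."""

    def __init__(self, filenames, size, coverage_ratio):
        # Full paths are stripped to base names for use as cache keys
        self.keys = [os.path.basename(name) for name in filenames]
        self.size = size
        self.coverage_ratio = coverage_ratio


def get_cache_sketch(side, filenames=None, size=None, coverage_ratio=None):
    """Return the cache for a side, creating it on the first call only."""
    if side not in _CACHES:
        if filenames is None or size is None or coverage_ratio is None:
            raise ValueError(f"First call for side '{side}' needs all arguments")
        _CACHES[side] = CacheSketch(filenames, size, coverage_ratio)
    # On later calls the extra arguments are ignored
    return _CACHES[side]


cache_a = get_cache_sketch("a", filenames=["/faces/a/img_0001.png"],
                           size=128, coverage_ratio=0.875)
same_cache = get_cache_sketch("a")
```

The second call passes only the side and receives the object created by the first call.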

training.generator module 

Handles Data Augmentation for feeding Faceswap Models

class lib.training.generator.DataGenerator(config: dict[str, ConfigValueType], model: ModelBase, side: T.Literal['a', 'b'], images: list[str], batch_size: int)

Bases: object

Parent class for Training and Preview Data Generators.

This class is called from plugins.train.trainer._base and launches a background iterator that compiles augmented data, target data and sample data.

Parameters:

model (ModelBase) – The model that this data generator is feeding
config (dict) – The configuration dict generated from config.train.ini containing the trainer plugin configuration options.
side ({'a' or 'b'}) – The side of the model that this iterator is for.
images (list) – A list of image paths that will be used to compile the final augmented data from.
batch_size (int) – The batch size for this iterator. Images will be returned in numpy.ndarray objects of this size from the iterator.

minibatch_ab(do_shuffle: bool = True) → Generator[BatchType, None, None]

A Background iterator to return augmented images, samples and targets.

The exit point from this class and the sole attribute that should be referenced. Called from plugins.train.trainer._base. Returns an iterator that yields images for training, preview and time-lapses.

Parameters:

do_shuffle (bool, optional) – Whether data should be shuffled prior to loading from disk. If true, each time the full list of filenames are processed, the data will be reshuffled to make sure they are not returned in the same order. Default: True

Yields:

feed (list) – 4-dimensional array of faces to feed to the model for training (the x parameter for keras.models.model.train_on_batch()). The array returned is in the format (batch size, height, width, channels).
targets (list) – List of 4-dimensional numpy.ndarray objects in the order and size of each output of the model. The format of these arrays will be (batch size, height, width, x). This is the y parameter for keras.models.model.train_on_batch(). The number of channels here will vary. The first 3 channels are (rgb/bgr). The 4th channel is the face mask. Any subsequent channels are area masks (e.g. eye/mouth masks)
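
The reshuffling behaviour of do_shuffle can be illustrated with a plain generator over filenames. This is a sketch under stated assumptions: filename_batches and its seed parameter are hypothetical, it yields lists of names rather than image arrays, and letting a batch wrap around the end of the file list is a design choice for this example, not a statement about the real iterator.

```python
import random
from itertools import islice
from typing import Iterator, Optional


def filename_batches(filenames: list[str],
                     batch_size: int,
                     do_shuffle: bool = True,
                     seed: Optional[int] = None) -> Iterator[list[str]]:
    """Yield batches of filenames forever, reshuffling after each full pass."""
    if not filenames:
        raise ValueError("At least one filename is required")
    rng = random.Random(seed)
    order = list(filenames)
    if do_shuffle:
        rng.shuffle(order)
    position = 0
    while True:
        batch = []
        while len(batch) < batch_size:
            if position == len(order):
                # Every filename has been used once: start a new pass
                position = 0
                if do_shuffle:
                    rng.shuffle(order)
            batch.append(order[position])
            position += 1
        yield batch


ordered = list(islice(filename_batches(["a", "b", "c"], 2, do_shuffle=False), 2))
```

With shuffling disabled and three names in batches of two, the second batch wraps around to the start of the list.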

process_batch(filenames: list[str], images: ndarray, detected_faces: list[lib.align.detected_face.DetectedFace], batch: ndarray) → tuple[numpy.ndarray, list[numpy.ndarray]]

Override for processing the batch for the current generator.

Parameters:

filenames (list) – List of full paths to image file names for a single batch
images (numpy.ndarray) – The batch of faces corresponding to the filenames
detected_faces (list) – List of DetectedFace objects with aligned data and masks loaded for the current batch
batch (numpy.ndarray) – The pre-allocated batch with images and masks populated for the selected coverage and centering

Returns:

list – 4-dimensional array of faces to feed to the model for training.
list – List of 4-dimensional numpy.ndarray. The number of channels here will vary. The first 3 channels are (rgb/bgr). The 4th channel is the face mask. Any subsequent channels are area masks (e.g. eye/mouth masks)

class lib.training.generator.Feeder(images: dict[T.Literal['a', 'b'], list[str]], model: ModelBase, batch_size: int, config: dict[str, ConfigValueType], include_preview: bool = True)

Bases: object

Handles the processing of a Batch for training the model and generating samples.

Parameters:

images (dict) – The lists of full paths to the training images for this Feeder, keyed by side
model (plugin from plugins.train.model) – The selected model that will be running this trainer
batch_size (int) – The size of the batch to be processed for each side at each iteration
config (dict) – The configuration for this trainer
include_preview (bool, optional) – True to create a feeder for generating previews. Default: True

compile_sample(image_count: int, feed: dict[Literal['a', 'b'], numpy.ndarray], samples: dict[Literal['a', 'b'], numpy.ndarray], masks: dict[Literal['a', 'b'], numpy.ndarray]) → dict[Literal['a', 'b'], list[numpy.ndarray]]

Compile the preview samples for display.

Parameters:

image_count (int) – The number of images to limit the sample output to.
feed (dict) – Dictionary for side “a”, “b” of numpy.ndarray. The images that should be fed into the model for obtaining a prediction
samples (dict) – Dictionary for side “a”, “b” of numpy.ndarray. The 100% coverage target images that should be used for creating the preview.
masks (dict) – Dictionary for side “a”, “b” of numpy.ndarray. The masks that should be used for creating the preview.

Returns:

The list of samples, targets and masks as numpy.ndarrays for creating a preview image

Return type:

list

generate_preview(is_timelapse: bool = False) → dict[Literal['a', 'b'], list[numpy.ndarray]]

Generate the images for preview window or timelapse

Parameters:

is_timelapse (bool, optional) – True if preview is to be generated for a Timelapse otherwise False. Default: False

Returns:

Dictionary for side A and B of list of numpy arrays corresponding to the samples, targets and masks for this preview

Return type:

dict

get_batch() → tuple[list[list[numpy.ndarray]], ...]

Get the feed data and the targets for each training side for feeding into the model’s train function.

Returns:

model_inputs (list) – The inputs to the model for each side A and B
model_targets (list) – The targets for the model for each side A and B

set_timelapse_feed(images: dict[Literal['a', 'b'], list[str]], batch_size: int) → None

Set the time-lapse feed for this feeder.

Creates a generator from lib.training.generator.PreviewDataGenerator specifically for generating time-lapse previews for the feeder.

Parameters:

images (dict) – The list of full paths to the images for creating the time-lapse for each side
batch_size (int) – The number of images to be used to create the time-lapse preview.

class lib.training.generator.PreviewDataGenerator(config: dict[str, ConfigValueType], model: ModelBase, side: T.Literal['a', 'b'], images: list[str], batch_size: int)

Bases: DataGenerator

Generator for compiling images for generating previews.

This class is called from plugins.train.trainer._base and launches a background iterator that compiles sample preview data for feeding the model’s predict function and for display.

Parameters:

model (ModelBase) – The model that this data generator is feeding
config (dict) – The configuration dict generated from config.train.ini containing the trainer plugin configuration options.
side ({'a' or 'b'}) – The side of the model that this iterator is for.
images (list) – A list of image paths that will be used to compile the final images.
batch_size (int) – The batch size for this iterator. Images will be returned in numpy.ndarray objects of this size from the iterator.

process_batch(filenames: list[str], images: ndarray, detected_faces: list[lib.align.detected_face.DetectedFace], batch: ndarray) → tuple[numpy.ndarray, list[numpy.ndarray]]

Creates the full size preview images and the sub-cropped images for feeding the model’s predict function.

Parameters:

filenames (list) – List of full paths to image file names for a single batch
images (numpy.ndarray) – The batch of faces corresponding to the filenames
detected_faces (list) – List of DetectedFace objects with aligned data and masks loaded for the current batch
batch (numpy.ndarray) – The pre-allocated batch with images and masks populated for the selected coverage and centering

Returns:

feed (numpy.ndarray) – List of 4-dimensional numpy.ndarray objects at model output size for feeding the model’s predict function. The first 3 channels are (rgb/bgr). The 4th channel is the face mask.
samples (list) – 4-dimensional array containing the 100% coverage images at the model’s centering for generating previews. The array returned is in the format (batch size, height, width, channels).

class lib.training.generator.TrainingDataGenerator(config: dict[str, ConfigValueType], model: ModelBase, side: T.Literal['a', 'b'], images: list[str], batch_size: int)

Bases: DataGenerator

A Training Data Generator for compiling data for feeding to a model.

This class is called from plugins.train.trainer._base and launches a background iterator that compiles augmented data, target data and sample data.

Parameters:

model (ModelBase) – The model that this data generator is feeding
config (dict) – The configuration dict generated from config.train.ini containing the trainer plugin configuration options.
side ({'a' or 'b'}) – The side of the model that this iterator is for.
images (list) – A list of image paths that will be used to compile the final augmented data from.
batch_size (int) – The batch size for this iterator. Images will be returned in numpy.ndarray objects of this size from the iterator.

process_batch(filenames: list[str], images: ndarray, detected_faces: list[lib.align.detected_face.DetectedFace], batch: ndarray) → tuple[numpy.ndarray, list[numpy.ndarray]]

Performs the augmentation and compiles target images and samples.

Parameters:

filenames (list) – List of full paths to image file names for a single batch
images (numpy.ndarray) – The batch of faces corresponding to the filenames
detected_faces (list) – List of DetectedFace objects with aligned data and masks loaded for the current batch
batch (numpy.ndarray) – The pre-allocated batch with images and masks populated for the selected coverage and centering

Returns:

feed (numpy.ndarray) – 4-dimensional array of faces to feed to the model for training (the x parameter for keras.models.model.train_on_batch()). The array returned is in the format (batch size, height, width, channels).
targets (list) – List of 4-dimensional numpy.ndarray objects in the order and size of each output of the model. The format of these arrays will be (batch size, height, width, x). This is the y parameter for keras.models.model.train_on_batch(). The number of channels here will vary. The first 3 channels are (rgb/bgr). The 4th channel is the face mask. Any subsequent channels are area masks (e.g. eye/mouth masks)

training.lr_finder module 

Learning Rate Finder for faceswap.py.

class lib.training.lr_finder.LRStrength(value)

Bases: Enum

Enum for how aggressively to set the optimal learning rate

AGGRESSIVE = 5

DEFAULT = 10

EXTREME = 2.5

class lib.training.lr_finder.LearningRateFinder(model: ModelBase, config: dict[str, ConfigValueType], feeder: Feeder, stop_factor: int = 4, beta: float = 0.98)

Bases: object

Learning Rate Finder

Parameters:

model (tensorflow.keras.models.Model) – The keras model to find the optimal learning rate for
config (dict) – The configuration options for the model
feeder (Feeder) – The feeder for training the model
stop_factor (int) – When to stop finding the optimal learning rate
beta (float) – Amount to smooth loss by, for graphing purposes

find() → bool

Find the optimal learning rate

Returns:: True if the learning rate was successfully discovered otherwise False
Return type:: bool
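
For orientation, a generic learning rate range test can be sketched as follows. This is not Faceswap's implementation: find_lr_sketch, loss_fn, start_lr, end_lr and iterations are hypothetical, and the way stop_factor (stop once the smoothed loss exceeds stop_factor times the best loss seen) and beta (exponential smoothing with bias correction) are used is an assumption based on the parameter descriptions above.

```python
import math


def find_lr_sketch(loss_fn, start_lr=1e-10, end_lr=1e-1, iterations=100,
                   stop_factor=4, beta=0.98):
    """Sweep the learning rate upwards and record the smoothed loss.

    loss_fn is a hypothetical callable standing in for one training step:
    it takes a learning rate and returns the loss observed at that rate.
    """
    # Multiply the learning rate by a constant factor each step
    multiplier = (end_lr / start_lr) ** (1.0 / iterations)
    lr = start_lr
    avg_loss = 0.0
    best_loss = math.inf
    history = []
    for step in range(1, iterations + 1):
        loss = loss_fn(lr)
        # Exponentially weighted average with bias correction
        avg_loss = beta * avg_loss + (1 - beta) * loss
        smoothed = avg_loss / (1 - beta ** step)
        # Stop once the loss has clearly diverged
        if step > 1 and smoothed > stop_factor * best_loss:
            break
        best_loss = min(best_loss, smoothed)
        history.append((lr, smoothed))
        lr *= multiplier
    return history


# Toy loss: flat until the learning rate becomes too large, then it explodes
history = find_lr_sketch(lambda lr: 1.0 if lr < 1e-3 else 1e6)
```

The recorded (learning rate, smoothed loss) pairs are what would typically be graphed. How a final learning rate is then chosen from them is not covered by this sketch.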

training.preview_cv module 

The pop up preview window for Faceswap.

If Tkinter is installed, then this will be used to manage the preview image, otherwise we fall back to OpenCV’s imshow

class lib.training.preview_cv.PreviewBase(preview_buffer: PreviewBuffer, triggers: dict[Literal['toggle_mask', 'refresh', 'save', 'quit', 'shutdown'], threading.Event] | None = None)

Bases: object

Parent class for OpenCV and Tkinter Preview Windows

Parameters:

preview_buffer (PreviewBuffer) – The thread safe object holding the preview images
triggers (dict, optional) – Dictionary of event triggers for pop-up preview. Not required when running inside the GUI. Default: None

class lib.training.preview_cv.PreviewBuffer

Bases: object

A thread safe class for holding preview images

add_image(name: str, image: np.ndarray) → None

Add an image to the preview buffer in a thread safe way

get_images() → Generator[tuple[str, np.ndarray], None, None]

Get the latest images from the preview buffer. When iterator is exhausted clears the updated event.

Yields:

name (str) – The name of the image
numpy.ndarray – The image in BGR format

property is_updated: bool

True when new images have been loaded into the preview buffer

Type:: bool
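
A minimal sketch of how a thread safe buffer with an "updated" flag might be put together from the standard library is shown below. The class name PreviewBufferSketch and its internals (a dictionary guarded by a threading.Lock, plus a threading.Event) are assumptions for illustration. Only the method and property names follow the interface documented above.

```python
import threading


class PreviewBufferSketch:
    """Hypothetical thread safe holder for named preview images."""

    def __init__(self):
        self._images = {}
        self._lock = threading.Lock()
        self._updated = threading.Event()

    @property
    def is_updated(self):
        """True when new images have been added since the last read."""
        return self._updated.is_set()

    def add_image(self, name, image):
        # Hold the lock only while the dictionary is modified
        with self._lock:
            self._images[name] = image
        self._updated.set()

    def get_images(self):
        # Snapshot under the lock so iteration never sees a partial update
        with self._lock:
            items = list(self._images.items())
        for name, image in items:
            yield name, image
        # Reached only when the iterator has been fully consumed
        self._updated.clear()


buffer = PreviewBufferSketch()
buffer.add_image("Training", [[0, 0, 0]])
was_updated = buffer.is_updated
images = list(buffer.get_images())
```

Because the event is cleared only after the generator is exhausted, a consumer that stops iterating early leaves is_updated set.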

class lib.training.preview_cv.PreviewCV(preview_buffer: PreviewBuffer, triggers: dict[Literal['toggle_mask', 'refresh', 'save', 'quit', 'shutdown'], threading.Event])

Bases: PreviewBase

Simple fallback preview viewer using OpenCV for when Tkinter is not available

Parameters:

preview_buffer (PreviewBuffer) – The thread safe object holding the preview images
triggers (dict) – Dictionary of event triggers for pop-up preview.

training.preview_tk module 

The pop up preview window for Faceswap.

If Tkinter is installed, then this will be used to manage the preview image, otherwise we fall back to OpenCV’s imshow

class lib.training.preview_tk.PreviewTk(preview_buffer: PreviewBuffer, parent: tk.Widget | None = None, taskbar: ttk.Frame | None = None, triggers: TriggerType | None = None)

Bases: PreviewBase

Holds a preview window for displaying the pop out preview.

Parameters:

preview_buffer (PreviewBuffer) – The thread safe object holding the preview images
parent (tkinter widget, optional) – If this viewer is being called from the GUI the parent widget should be passed in here. If this is a standalone pop-up window then pass None. Default: None
taskbar (tkinter.ttk.Frame, optional) – If this viewer is being called from the GUI the parent’s option frame should be passed in here. If this is a standalone pop-up window then pass None. Default: None
triggers (dict, optional) – Dictionary of event triggers for pop-up preview. Not required when running inside the GUI. Default: None

property master_frame: Frame

The master frame that holds the preview window

Type:: tkinter.Frame

pack(*args, **kwargs)

Redirect calls to pack the widget to pack the actual _master_frame.

Takes standard tkinter.Frame pack arguments

remove_option_controls() → None

Remove the taskbar options controls when the preview is disabled in the GUI

save(location: str) → None

Save action to be performed when save button pressed from the GUI.

Parameters:: location (str) – Full path to the folder to save the preview image to

lib.training.preview_tk.main()

Load image from first given argument and display

python -m lib.training.preview_tk <filename>
