training package

The training package handles the processing of faces for feeding into a Faceswap model.

augmentation module

Processes the augmentation of images for feeding into a Faceswap model.

class lib.training.augmentation.ImageAugmentation(batchsize, is_display, input_size, output_shapes, coverage_ratio, config)

Bases: object

Performs augmentation on batches of training images.

Parameters:
  • batchsize (int) – The number of images that will be fed through the augmentation functions at once.
  • is_display (bool) – Whether the images being fed through will be used for Preview or Time-lapse. Disables the “warp” augmentation for these images.
  • input_size (int) – The expected input size for the model. It is assumed that the input to the model is always a square image. This is the size, in pixels, of the width and the height of the input to the model.
  • output_shapes (list) – A list of tuples defining the output shapes from the model, in the order that the outputs are returned. The tuples should be in (height, width, channels) format.
  • coverage_ratio (float) – The ratio of the training image to be trained on. Dictates how much of the image will be cropped out. E.g. a coverage ratio of 0.625 will result in cropping a 160px box from a 256px image (\(256 * 0.625 = 160\)).
  • config (dict) – The configuration dict generated from config.train.ini containing the trainer plugin configuration options.
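
The coverage arithmetic described for coverage_ratio can be sketched as below. This is a minimal illustration and not the library's implementation: the helper name coverage_crop_box is hypothetical, and centering the crop is an assumption made for the example.

```python
def coverage_crop_box(training_size: int, coverage_ratio: float) -> tuple:
    """Return the (start, stop) pixel bounds of a square crop.

    Hypothetical helper: centering the crop is an assumption.
    """
    if not 0.0 < coverage_ratio <= 1.0:
        raise ValueError("coverage_ratio must be in the range (0, 1]")
    crop_size = int(training_size * coverage_ratio)
    start = (training_size - crop_size) // 2
    return start, start + crop_size


start, stop = coverage_crop_box(256, 0.625)
print(stop - start)  # 160
```
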
initialized

Flag to indicate whether ImageAugmentation has been initialized with the training image size in order to cache certain augmentation operations (see initialize())

Type:bool
is_display

Flag to indicate whether these augmentations are for time-lapses/preview images (True) or standard training data (False)

Type:bool
color_adjust(batch)

Perform color augmentation on the passed in batch.

The color adjustment parameters are set in config.train.ini.

Parameters:batch (numpy.ndarray) – The batch should be a 4-dimensional array of shape (batchsize, height, width, 3) and in BGR format.
Returns:A 4-dimensional array of the same shape as batch with color augmentation applied.
Return type:numpy.ndarray
get_targets(batch)

Returns the target images, and masks, if required.

Parameters:batch (numpy.ndarray) –

This should be a 4-dimensional array of training images in the format (batchsize, height, width, channels), with 4 or more channels. Targets should be requested after performing image transformations but prior to performing warps.

The 4th channel should be the mask. Any channels beyond the 4th should hold any additional masks that were requested.

Returns:The following keys will be within the returned dictionary:
  • targets (list) - A list of 4-dimensional numpy.ndarray objects in the order and size of each output of the model as defined in output_shapes. The format of these arrays will be (batchsize, height, width, 3). NB: masks are not included in the targets list. If masks are to be included in the output they will be returned as their own item from the masks key.
  • masks (numpy.ndarray) - A 4-dimensional array containing the target masks in the format (batchsize, height, width, 1).
Return type:dict
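
The channel layout described for get_targets() can be illustrated with plain NumPy slicing. This is a sketch under stated assumptions, not the method itself: split_targets_and_mask is a hypothetical name, and the resizing of targets to each entry of output_shapes is deliberately left out.

```python
import numpy as np


def split_targets_and_mask(batch: np.ndarray) -> dict:
    """Split a (batchsize, height, width, channels) batch into image and mask.

    Hypothetical helper: no resizing to the model output shapes is done here.
    """
    if batch.ndim != 4 or batch.shape[-1] < 4:
        raise ValueError("expected a 4-dimensional batch with 4 or more channels")
    return {
        "targets": [batch[..., :3]],  # first 3 channels: the image
        "masks": batch[..., 3:4],     # 4th channel: the mask, kept as 1 channel
    }


batch = np.zeros((2, 64, 64, 4), dtype="float32")
out = split_targets_and_mask(batch)
print(out["targets"][0].shape, out["masks"].shape)  # (2, 64, 64, 3) (2, 64, 64, 1)
```
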
initialize(training_size)

Initializes the caching of constants for use in various image augmentations.

The training image size is not known prior to loading the images from disk and commencing training, so it cannot be set in the __init__() method. When the first training batch is loaded this function should be called to initialize the class and perform various calculations based on this input size to cache certain constants for image augmentation calculations.

Parameters:training_size (int) – The size of the training images stored on disk that are to be fed into ImageAugmentation. The training images should always be square and of the same size. This is the size, in pixels, of the width and the height of the training images.
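
The deferred initialization described above might be used as follows. LazyAugmenter is a hypothetical stand-in written for this example, not the real ImageAugmentation class; only the initialized flag and the initialize(training_size) call mirror the documented interface.

```python
import numpy as np


class LazyAugmenter:
    """Hypothetical stand-in illustrating deferred initialization."""

    def __init__(self, coverage_ratio: float) -> None:
        self.coverage_ratio = coverage_ratio
        self.initialized = False
        self.crop_size = None  # unknown until the training size is seen

    def initialize(self, training_size: int) -> None:
        # Cache values that depend only on the on-disk image size.
        self.crop_size = int(training_size * self.coverage_ratio)
        self.initialized = True


augmenter = LazyAugmenter(coverage_ratio=0.625)
first_batch = np.zeros((4, 256, 256, 3), dtype="uint8")  # placeholder batch

if not augmenter.initialized:
    # Training images are square, so the height alone gives the size.
    augmenter.initialize(first_batch.shape[1])

print(augmenter.initialized, augmenter.crop_size)  # True 160
```
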
random_flip(batch)

Perform random horizontal flipping on the passed in batch.

The probability of flipping an image is set in config.train.ini.

Parameters:batch (numpy.ndarray) – The batch should be a 4-dimensional array of shape (batchsize, height, width, channels) and in BGR format.
Returns:A 4-dimensional array of the same shape as batch with transformation applied.
Return type:numpy.ndarray
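
A per-image random horizontal flip can be sketched in NumPy as below. This is an illustration of the general technique only: the function name, the flip_chance parameter and the rng argument are hypothetical and are not taken from the library or from config.train.ini.

```python
import numpy as np


def random_flip_sketch(batch: np.ndarray, flip_chance: float = 0.5, rng=None) -> np.ndarray:
    """Flip each image in the batch left-to-right with probability flip_chance."""
    rng = np.random.default_rng() if rng is None else rng
    # One independent draw per image in the batch.
    flip = rng.random(batch.shape[0]) < flip_chance
    out = batch.copy()
    # Reverse the width axis (axis 2) of the selected images only.
    out[flip] = out[flip, :, ::-1]
    return out


batch = np.arange(2 * 2 * 3 * 1).reshape(2, 2, 3, 1)
flipped = random_flip_sketch(batch, flip_chance=1.0)
print(flipped.shape)  # (2, 2, 3, 1)
```
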
skip_warp(batch)

Returns the images resized and cropped for feeding the model, if warping has been disabled.

Parameters:batch (numpy.ndarray) – The batch should be a 4-dimensional array of shape (batchsize, height, width, 3) and in BGR format.
Returns:The given batch cropped and resized for feeding the model
Return type:numpy.ndarray
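
The crop-and-resize step for an unwarped batch could look like the sketch below. It rests on several assumptions: the function name is hypothetical, the crop is centered, and nearest-neighbour index sampling is swapped in for whatever interpolation the library actually uses, so that the example needs only NumPy.

```python
import numpy as np


def crop_and_resize_sketch(batch: np.ndarray, coverage_ratio: float, input_size: int) -> np.ndarray:
    """Center-crop a square batch by coverage_ratio, then resize to input_size."""
    size = batch.shape[1]
    crop = int(size * coverage_ratio)
    start = (size - crop) // 2
    cropped = batch[:, start:start + crop, start:start + crop]
    # Nearest-neighbour resize: choose input_size evenly spaced source pixels.
    idx = np.arange(input_size) * crop // input_size
    return cropped[:, idx][:, :, idx]


batch = np.zeros((2, 256, 256, 3), dtype="uint8")
resized = crop_and_resize_sketch(batch, coverage_ratio=0.625, input_size=64)
print(resized.shape)  # (2, 64, 64, 3)
```
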
transform(batch)

Perform random transformation on the passed in batch.

The transformation parameters are set in config.train.ini.

Parameters:batch (numpy.ndarray) – The batch should be a 4-dimensional array of shape (batchsize, height, width, channels) and in BGR format.
Returns:A 4-dimensional array of the same shape as batch with transformation applied.
Return type:numpy.ndarray
warp(batch, to_landmarks=False, **kwargs)

Perform random warping on the passed in batch by one of two methods.

Parameters:
  • batch (numpy.ndarray) – The batch should be a 4-dimensional array of shape (batchsize, height, width, 3) and in BGR format.
  • to_landmarks (bool, optional) – If False, perform standard random warping of the input image. If True, warp towards semi-random, similar landmarks from the other side. Default: False
  • kwargs (dict) –

    If to_landmarks is True the following additional kwargs must be passed in:

    • batch_src_points (numpy.ndarray) - A batch of 68 point landmarks for the source faces. This is a 3-dimensional array in the shape (batchsize, 68, 2).
    • batch_dst_points (numpy.ndarray) - A batch of 68 point landmarks for randomly chosen closest-match destination faces. This is a 3-dimensional array in the shape (batchsize, 68, 2).
Returns:

A 4-dimensional array of the same shape as batch with warping applied.

Return type:

numpy.ndarray
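
Before calling warp() with to_landmarks set to True, the extra keyword arguments could be checked against the documented shapes. The helper below is hypothetical and is not part of the library; it only encodes the (batchsize, 68, 2) requirement stated above.

```python
import numpy as np


def check_landmark_kwargs(batch: np.ndarray, **kwargs) -> None:
    """Hypothetical check that both landmark arrays exist and are well shaped."""
    expected = (batch.shape[0], 68, 2)
    for key in ("batch_src_points", "batch_dst_points"):
        if key not in kwargs:
            raise KeyError(f"{key} is required when to_landmarks is True")
        if kwargs[key].shape != expected:
            raise ValueError(f"{key} must have shape {expected}, got {kwargs[key].shape}")


batch = np.zeros((8, 128, 128, 3), dtype="uint8")
points = np.zeros((8, 68, 2), dtype="float32")
# Passes silently when both arrays match the batch size.
check_landmark_kwargs(batch, batch_src_points=points, batch_dst_points=points)
```
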

generator module

Handles data augmentation for feeding Faceswap models.

class lib.training.generator.TrainingDataGenerator(model_input_size, model_output_shapes, coverage_ratio, color_order, augment_color, no_flip, no_warp, warp_to_landmarks, config)

Bases: object

A Training Data Generator for compiling data for feeding to a model.

This class is called from plugins.train.trainer._base and launches a background iterator that compiles augmented data, target data and sample data.

Parameters:
  • model_input_size (int) – The expected input size for the model. It is assumed that the input to the model is always a square image. This is the size, in pixels, of the width and the height of the input to the model.
  • model_output_shapes (list) – A list of tuples defining the output shapes from the model, in the order that the outputs are returned. The tuples should be in (height, width, channels) format.
  • coverage_ratio (float) – The ratio of the training image to be trained on. Dictates how much of the image will be cropped out. E.g. a coverage ratio of 0.625 will result in cropping a 160px box from a 256px image (\(256 * 0.625 = 160\)).
  • color_order (["rgb", "bgr"]) – The color order that the model expects as input.
  • augment_color (bool) – True if color is to be augmented, otherwise False
  • no_flip (bool) – True if the image shouldn’t be randomly flipped as part of augmentation, otherwise False
  • no_warp (bool) – True if the image shouldn’t be warped as part of augmentation, otherwise False
  • warp_to_landmarks (bool) – True if the random warp method should warp to similar landmarks from the other side, False if the standard random warp method should be used.
  • face_cache (dict) – A thread safe dictionary containing a cache of information relating to all faces being trained on
  • config (dict) – The configuration dict generated from config.train.ini containing the trainer plugin configuration options.
minibatch_ab(images, batchsize, side, do_shuffle=True, is_preview=False, is_timelapse=False)

A Background iterator to return augmented images, samples and targets.

The exit point from this class and the sole attribute that should be referenced. Called from plugins.train.trainer._base. Returns an iterator that yields images for training, preview and time-lapses.

Parameters:
  • images (list) – A list of image paths that will be used to compile the final augmented data from.
  • batchsize (int) – The batchsize for this iterator. Images will be returned in numpy.ndarray objects of this size from the iterator.
  • side ({'a' or 'b'}) – The side of the model that this iterator is for.
  • do_shuffle (bool, optional) – Whether data should be shuffled prior to loading from disk. If True, each time the full list of filenames has been processed, the data will be reshuffled so that images are not returned in the same order. Default: True
  • is_preview (bool, optional) – Indicates whether this iterator is generating preview images. If True then certain augmentations will not be performed. Default: False
  • is_timelapse (bool, optional) – Indicates whether this iterator is generating time-lapse images. If True, then certain augmentations will not be performed. Default: False
Yields:

dict – The following items are contained in each dict yielded from this iterator:

  • feed (numpy.ndarray) - The feed for the model. The array returned is in the format (batchsize, height, width, channels). This is the x parameter for keras.models.Model.train_on_batch().
  • targets (list) - A list of 4-dimensional numpy.ndarray objects in the order and size of each output of the model as defined in model_output_shapes. The format of these arrays will be (batchsize, height, width, 3). This is the y parameter for keras.models.Model.train_on_batch(). NB: masks are not included in the targets list. If required for feeding into the Keras model, they will need to be added to this list in plugins.train.trainer._base from the masks key.
  • masks (numpy.ndarray) - A 4-dimensional array containing the target masks in the format (batchsize, height, width, 1).
  • samples (numpy.ndarray) - A 4-dimensional array containing the samples for feeding to the model’s predict function for generating preview and time-lapse samples. The array will be in the format (batchsize, height, width, channels). NB: This item will only exist in the dict if is_preview or is_timelapse is True.
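
Consuming the yielded dicts might look like the sketch below. fake_minibatch is a hypothetical stand-in that only imitates the documented keys and shapes; it does not load or augment images. Appending the mask to the targets list is one possible way of combining them and is an assumption, not a description of what plugins.train.trainer._base does.

```python
import numpy as np


def fake_minibatch(batchsize: int, size: int = 64, is_preview: bool = False):
    """Hypothetical stand-in yielding dicts with the documented keys."""
    while True:
        item = {
            "feed": np.zeros((batchsize, size, size, 3), dtype="float32"),
            "targets": [np.zeros((batchsize, size, size, 3), dtype="float32")],
            "masks": np.ones((batchsize, size, size, 1), dtype="float32"),
        }
        if is_preview:
            # Only present for preview or time-lapse iterators.
            item["samples"] = np.zeros((batchsize, size, size, 3), dtype="float32")
        yield item


item = next(fake_minibatch(batchsize=4))
x = item["feed"]
# Assumption: the mask is passed to the model as one extra target.
y = item["targets"] + [item["masks"]]
print(x.shape, len(y), "samples" in item)  # (4, 64, 64, 3) 2 False
```
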