training package
The training package handles the processing of faces for feeding into a Faceswap model.
training.augmentation module
Processes the augmentation of images for feeding into a Faceswap model.
- class lib.training.augmentation.AugConstants(config: dict[str, ConfigValueType], processing_size: int, batch_size: int)
Bases:
object
Dataclass for holding constants for Image Augmentation.
- Parameters:
config (dict[str, ConfigValueType]) – The user training configuration options
processing_size (int) – The size of image to augment the data for
batch_size (int) – The batch size that augmented data is being prepared for
- clahe_base_contrast: int
The base number for Contrast Limited Adaptive Histogram Equalization
- Type:
int
- clahe_chance: float
Probability to perform Contrast Limited Adaptive Histogram Equalization
- Type:
float
- clahe_max_size: int
Maximum clahe window size
- Type:
int
- lab_adjust: ndarray
Adjustment amounts for L*A*B augmentation
- Type:
numpy.ndarray
- transform_rotation: int
Rotation range for transformations
- Type:
int
- transform_shift: float
Shift range for transformations
- Type:
float
- transform_zoom: float
Zoom range for transformations
- Type:
float
- warp_lm_edge_anchors: ndarray
The edge anchors for landmark based warping
- Type:
numpy.ndarray
- warp_lm_grids: ndarray
The grids for landmark based warping
- Type:
numpy.ndarray
- warp_pad: tuple[int, int]
The padding to apply for image warping
- Type:
tuple[int, int]
- warp_slices: slice
The slices for extracting a warped image
- Type:
slice
- class lib.training.augmentation.ImageAugmentation(batch_size: int, processing_size: int, config: dict[str, ConfigValueType])
Bases:
object
Performs augmentation on batches of training images.
- Parameters:
batch_size (int) – The number of images that will be fed through the augmentation functions at once.
processing_size (int) – The largest input or output size of the model. This is the size that images are processed at.
config (dict) – The configuration dict generated from config.train.ini containing the trainer plugin configuration options.
- color_adjust(batch: ndarray) ndarray
Perform color augmentation on the passed in batch.
The color adjustment parameters are set in config.train.ini
- Parameters:
batch (numpy.ndarray) – The batch should be a 4-dimensional array of shape (batchsize, height, width, 3) and in BGR format.
- Returns:
A 4-dimensional array of the same shape as batch with color augmentation applied.
- Return type:
numpy.ndarray
- random_flip(batch: ndarray)
Perform random horizontal flipping on the passed in batch.
The probability of flipping an image is set in config.train.ini
- Parameters:
batch (numpy.ndarray) – The batch should be a 4-dimensional array of shape (batchsize, height, width, channels) and in BGR format.
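As a rough illustration of what random horizontal flipping over a batch involves, the sketch below mirrors a random subset of images along the width axis with NumPy. It is not the Faceswap implementation: the function name random_flip_sketch, the flip_chance argument (a fraction here, whereas the real probability comes from the training configuration) and the seeded generator are assumptions made for the example, and this version returns a copy rather than reproducing the library's behaviour.

```python
import numpy as np


def random_flip_sketch(batch: np.ndarray, flip_chance: float,
                       rng: np.random.Generator) -> np.ndarray:
    """Mirror a random subset of a (batchsize, height, width, channels) batch.

    Hypothetical helper for illustration only.
    """
    # One independent draw per image decides whether that image is flipped
    do_flip = rng.random(batch.shape[0]) < flip_chance
    flipped = batch.copy()
    # The selected block has shape (n, height, width, channels), so reversing
    # its third axis mirrors each chosen image left to right
    flipped[do_flip] = batch[do_flip][:, :, ::-1]
    return flipped


rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(4, 8, 8, 3), dtype=np.uint8)
always = random_flip_sketch(batch, 1.0, rng)  # every image mirrored
never = random_flip_sketch(batch, 0.0, rng)   # batch left untouched
```

Drawing one random number per image, rather than one per batch, keeps the flip decision independent for each face.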
- transform(batch: ndarray)
Perform random transformation on the passed in batch.
The transformation parameters are set in config.train.ini
- Parameters:
batch (numpy.ndarray) – The batch should be a 4-dimensional array of shape (batchsize, height, width, channels) and in BGR format.
- warp(batch: ndarray, to_landmarks: bool = False, **kwargs) ndarray
Perform random warping on the passed in batch by one of two methods.
- Parameters:
batch (numpy.ndarray) – The batch should be a 4-dimensional array of shape (batchsize, height, width, 3) and in BGR format.
to_landmarks (bool, optional) – If False perform standard random warping of the input image. If True perform warping to semi-random similar corresponding landmarks from the other side. Default: False
kwargs (dict) – If to_landmarks is True the following additional kwargs must be passed in:
batch_src_points (numpy.ndarray) – A batch of 68 point landmarks for the source faces. This is a 3-dimensional array in the shape (batchsize, 68, 2).
batch_dst_points (numpy.ndarray) – A batch of randomly chosen closest match destination face landmarks. This is a 3-dimensional array in the shape (batchsize, 68, 2).
- Returns:
A 4-dimensional array of the same shape as batch with warping applied.
- Return type:
numpy.ndarray
training.cache module
Holds the data cache for training data generators
- class lib.training.cache.RingBuffer(batch_size: int, image_shape: tuple[int, int, int], buffer_size: int = 2, dtype: str = 'uint8')
Bases:
object
Rolling buffer for holding training/preview batches
- Parameters:
batch_size (int) – The batch size to create the buffer for
image_shape (tuple) – The height/width/channels shape of a single image in the batch
buffer_size (int, optional) – The number of arrays to hold in the rolling buffer. Default: 2
dtype (str, optional) – The datatype to create the buffer as. Default: “uint8”
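The idea of a rolling buffer can be sketched as a fixed set of pre-allocated arrays that are handed out in turn. The class below is a minimal stand-in written for this page, not the library's code: the name RingBufferSketch and the choice to fetch the next array by calling the object are assumptions.

```python
import numpy as np


class RingBufferSketch:
    """Cycle through a fixed set of pre-allocated batch arrays (illustrative)."""

    def __init__(self, batch_size, image_shape, buffer_size=2, dtype="uint8"):
        # Allocate every array once so no new memory is requested per batch
        self._buffer = [np.empty((batch_size, *image_shape), dtype=dtype)
                        for _ in range(buffer_size)]
        self._index = 0

    def __call__(self):
        """Return the next array in the ring, wrapping round at the end."""
        array = self._buffer[self._index]
        self._index = (self._index + 1) % len(self._buffer)
        return array


ring = RingBufferSketch(batch_size=4, image_shape=(64, 64, 3))
first, second, third = ring(), ring(), ring()
# With buffer_size=2 the third request reuses the first array
```

With two slots, one array can be filled while the previously returned one is still being consumed, which is the usual motivation for this pattern.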
- lib.training.cache.get_cache(side: T.Literal['a', 'b'], filenames: list[str] | None = None, config: dict[str, ConfigValueType] | None = None, size: int | None = None, coverage_ratio: float | None = None) _Cache
Obtain a _Cache object for the given side. If the object does not already exist then create it.
- Parameters:
side (str) – “a” or “b”. The side of the model to obtain the cache for
filenames (list, optional) – The filenames of all the images. This can either be the full path or the base name. If the full paths are passed in, they are stripped to base name for use as the cache key. Must be passed for the first call of this function for each side. For subsequent calls this parameter is ignored. Default: None
config (dict, optional) – The user selected training configuration options. Must be passed for the first call of this function for each side. For subsequent calls this parameter is ignored. Default: None
size (int, optional) – The largest output size of the model. Must be passed for the first call of this function for each side. For subsequent calls this parameter is ignored. Default: None
coverage_ratio (float, optional) – The coverage ratio that the model is using. Must be passed for the first call of this function for each side. For subsequent calls this parameter is ignored. Default: None
- Returns:
The face meta information cache for the requested side
- Return type:
_Cache
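The "required on first call, ignored afterwards" behaviour described above is a common create-on-demand pattern. The following sketch shows one way it could look. CacheSketch, get_cache_sketch and the module-level _CACHES dictionary are hypothetical names used only for this illustration, and the real _Cache object does considerably more than record its arguments.

```python
import os

_CACHES = {}  # hypothetical module-level store, one entry per side


class CacheSketch:
    """Stand-in for the real cache object; it only records its arguments."""

    def __init__(self, filenames, config, size, coverage_ratio):
        # Full paths are reduced to base names, which act as the cache keys
        self.keys = [os.path.basename(name) for name in filenames]
        self.config = config
        self.size = size
        self.coverage_ratio = coverage_ratio


def get_cache_sketch(side, filenames=None, config=None, size=None,
                     coverage_ratio=None):
    """Return the cache for ``side``, creating it on the first request."""
    if side not in _CACHES:
        args = (filenames, config, size, coverage_ratio)
        if any(arg is None for arg in args):
            raise ValueError(f"All arguments are required on the first call for side '{side}'")
        _CACHES[side] = CacheSketch(*args)
    # On later calls the extra arguments are simply ignored
    return _CACHES[side]


cache_a = get_cache_sketch("a", ["/faces/a/img_001.png"], {}, 128, 0.875)
same_cache = get_cache_sketch("a")  # no arguments needed the second time
```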
training.generator module
Handles Data Augmentation for feeding Faceswap Models
- class lib.training.generator.DataGenerator(config: dict[str, ConfigValueType], model: ModelBase, side: T.Literal['a', 'b'], images: list[str], batch_size: int)
Bases:
object
Parent class for Training and Preview Data Generators.
This class is called from plugins.train.trainer._base and launches a background iterator that compiles augmented data, target data and sample data.
- Parameters:
model (ModelBase) – The model that this data generator is feeding
config (dict) – The configuration dict generated from config.train.ini containing the trainer plugin configuration options.
side ({'a' or 'b'}) – The side of the model that this iterator is for.
images (list) – A list of image paths that will be used to compile the final augmented data from.
batch_size (int) – The batch size for this iterator. Images will be returned in numpy.ndarray objects of this size from the iterator.
- minibatch_ab(do_shuffle: bool = True) Generator[BatchType, None, None]
A background iterator to return augmented images, samples and targets.
The exit point from this class and the sole attribute that should be referenced. Called from plugins.train.trainer._base. Returns an iterator that yields images for training, preview and time-lapses.
- Parameters:
do_shuffle (bool, optional) – Whether data should be shuffled prior to loading from disk. If True, each time the full list of filenames has been processed, the data will be reshuffled to make sure they are not returned in the same order. Default: True
- Yields:
feed (list) – 4-dimensional array of faces to feed into the model for training (the x parameter for keras.models.model.train_on_batch()). The array returned is in the format (batch size, height, width, channels).
targets (list) – List of 4-dimensional numpy.ndarray objects in the order and size of each output of the model. The format of these arrays will be (batch size, height, width, x). This is the y parameter for keras.models.model.train_on_batch(). The number of channels here will vary. The first 3 channels are (rgb/bgr). The 4th channel is the face mask. Any subsequent channels are area masks (e.g. eye/mouth masks)
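The reshuffle-per-pass behaviour controlled by do_shuffle can be sketched with a plain Python generator over filenames. This is a simplified illustration under stated assumptions, not the library's iterator: minibatch_sketch and its seed argument are hypothetical, it yields filenames rather than image arrays, and it drops any incomplete final batch, which may differ from how Faceswap handles the end of a pass.

```python
import random
from itertools import islice


def minibatch_sketch(filenames, batch_size, do_shuffle=True, seed=None):
    """Yield batches of filenames forever, reshuffling before each full pass."""
    rng = random.Random(seed)
    names = list(filenames)
    while True:
        if do_shuffle:
            rng.shuffle(names)  # a new order for every pass over the data
        # Simplification: an incomplete batch at the end of a pass is skipped
        for start in range(0, len(names) - batch_size + 1, batch_size):
            yield names[start:start + batch_size]


files = [f"face_{idx:03d}.png" for idx in range(6)]
# Six batches of two cover the six files exactly twice
batches = list(islice(minibatch_sketch(files, 2, seed=1), 6))
ordered = list(islice(minibatch_sketch(files, 2, do_shuffle=False), 3))
```

Because the generator never terminates, the caller decides how many batches to draw, here with itertools.islice.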
- process_batch(filenames: list[str], images: ndarray, detected_faces: list[lib.align.detected_face.DetectedFace], batch: ndarray) tuple[numpy.ndarray, list[numpy.ndarray]]
Override for processing the batch for the current generator.
- Parameters:
filenames (list) – List of full paths to image file names for a single batch
images (numpy.ndarray) – The batch of faces corresponding to the filenames
detected_faces (list) – List of DetectedFace objects with aligned data and masks loaded for the current batch
batch (numpy.ndarray) – The pre-allocated batch with images and masks populated for the selected coverage and centering
- Returns:
list – 4-dimensional array of faces to feed into the model for training.
list – List of 4-dimensional numpy.ndarray. The number of channels here will vary. The first 3 channels are (rgb/bgr). The 4th channel is the face mask. Any subsequent channels are area masks (e.g. eye/mouth masks)
- class lib.training.generator.Feeder(images: dict[T.Literal['a', 'b'], list[str]], model: ModelBase, batch_size: int, config: dict[str, ConfigValueType], include_preview: bool = True)
Bases:
object
Handles the processing of a Batch for training the model and generating samples.
- Parameters:
images (dict) – The list of full paths to the training images for this Feeder for each side
model (plugin from plugins.train.model) – The selected model that will be running this trainer
batch_size (int) – The size of the batch to be processed for each side at each iteration
config (dict) – The configuration for this trainer
include_preview (bool, optional) – True to create a feeder for generating previews. Default: True
- compile_sample(image_count: int, feed: dict[Literal['a', 'b'], numpy.ndarray], samples: dict[Literal['a', 'b'], numpy.ndarray], masks: dict[Literal['a', 'b'], numpy.ndarray]) dict[Literal['a', 'b'], list[numpy.ndarray]]
Compile the preview samples for display.
- Parameters:
image_count (int) – The number of images to limit the sample output to.
feed (dict) – Dictionary for side “a”, “b” of numpy.ndarray. The images that should be fed into the model for obtaining a prediction
samples (dict) – Dictionary for side “a”, “b” of numpy.ndarray. The 100% coverage target images that should be used for creating the preview.
masks (dict) – Dictionary for side “a”, “b” of numpy.ndarray. The masks that should be used for creating the preview.
- Returns:
The list of samples, targets and masks as numpy.ndarrays for creating a preview image
- Return type:
list
- generate_preview(is_timelapse: bool = False) dict[Literal['a', 'b'], list[numpy.ndarray]]
Generate the images for preview window or timelapse
- Parameters:
is_timelapse (bool, optional) – True if preview is to be generated for a Timelapse otherwise False. Default: False
- Returns:
Dictionary for side A and B of list of numpy arrays corresponding to the samples, targets and masks for this preview
- Return type:
dict
- get_batch() tuple[list[list[numpy.ndarray]], ...]
Get the feed data and the targets for each training side for feeding into the model’s train function.
- Returns:
model_inputs (list) – The inputs to the model for each side A and B
model_targets (list) – The targets for the model for each side A and B
- set_timelapse_feed(images: dict[Literal['a', 'b'], list[str]], batch_size: int) None
Set the time-lapse feed for this feeder.
Creates a generator from lib.training.generator.PreviewDataGenerator specifically for generating time-lapse previews for the feeder.
- Parameters:
images (dict) – The list of full paths to the images for creating the time-lapse for each side
batch_size (int) – The number of images to be used to create the time-lapse preview.
- class lib.training.generator.PreviewDataGenerator(config: dict[str, ConfigValueType], model: ModelBase, side: T.Literal['a', 'b'], images: list[str], batch_size: int)
Bases:
DataGenerator
Generator for compiling images for generating previews.
This class is called from plugins.train.trainer._base and launches a background iterator that compiles sample preview data for feeding the model’s predict function and for display.
- Parameters:
model (ModelBase) – The model that this data generator is feeding
config (dict) – The configuration dict generated from config.train.ini containing the trainer plugin configuration options.
side ({'a' or 'b'}) – The side of the model that this iterator is for.
images (list) – A list of image paths that will be used to compile the final images.
batch_size (int) – The batch size for this iterator. Images will be returned in numpy.ndarray objects of this size from the iterator.
- process_batch(filenames: list[str], images: ndarray, detected_faces: list[lib.align.detected_face.DetectedFace], batch: ndarray) tuple[numpy.ndarray, list[numpy.ndarray]]
Creates the full size preview images and the sub-cropped images for feeding the model’s predict function.
- Parameters:
filenames (list) – List of full paths to image file names for a single batch
images (numpy.ndarray) – The batch of faces corresponding to the filenames
detected_faces (list) – List of DetectedFace objects with aligned data and masks loaded for the current batch
batch (numpy.ndarray) – The pre-allocated batch with images and masks populated for the selected coverage and centering
- Returns:
feed (numpy.ndarray) – List of 4-dimensional numpy.ndarray objects at model output size for feeding the model’s predict function. The first 3 channels are (rgb/bgr). The 4th channel is the face mask.
samples (list) – 4-dimensional array containing the 100% coverage images at the model’s centering for generating previews. The array returned is in the format (batch size, height, width, channels).
- class lib.training.generator.TrainingDataGenerator(config: dict[str, ConfigValueType], model: ModelBase, side: T.Literal['a', 'b'], images: list[str], batch_size: int)
Bases:
DataGenerator
A Training Data Generator for compiling data for feeding to a model.
This class is called from plugins.train.trainer._base and launches a background iterator that compiles augmented data, target data and sample data.
- Parameters:
model (ModelBase) – The model that this data generator is feeding
config (dict) – The configuration dict generated from config.train.ini containing the trainer plugin configuration options.
side ({'a' or 'b'}) – The side of the model that this iterator is for.
images (list) – A list of image paths that will be used to compile the final augmented data from.
batch_size (int) – The batch size for this iterator. Images will be returned in numpy.ndarray objects of this size from the iterator.
- process_batch(filenames: list[str], images: ndarray, detected_faces: list[lib.align.detected_face.DetectedFace], batch: ndarray) tuple[numpy.ndarray, list[numpy.ndarray]]
Performs the augmentation and compiles target images and samples.
- Parameters:
filenames (list) – List of full paths to image file names for a single batch
images (numpy.ndarray) – The batch of faces corresponding to the filenames
detected_faces (list) – List of DetectedFace objects with aligned data and masks loaded for the current batch
batch (numpy.ndarray) – The pre-allocated batch with images and masks populated for the selected coverage and centering
- Returns:
feed (numpy.ndarray) – 4-dimensional array of faces to feed into the model for training (the x parameter for keras.models.model.train_on_batch()). The array returned is in the format (batch size, height, width, channels).
targets (list) – List of 4-dimensional numpy.ndarray objects in the order and size of each output of the model. The format of these arrays will be (batch size, height, width, x). This is the y parameter for keras.models.model.train_on_batch(). The number of channels here will vary. The first 3 channels are (rgb/bgr). The 4th channel is the face mask. Any subsequent channels are area masks (e.g. eye/mouth masks)
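The channel layout described for each target array (3 colour channels, then the face mask, then any area masks) can be unpacked with ordinary NumPy slicing. The helper below is a hypothetical illustration only: split_target_sketch is not part of the library, and the 6-channel example array is made up.

```python
import numpy as np


def split_target_sketch(target: np.ndarray):
    """Split a (batch size, height, width, x) target into its channel groups."""
    image = target[..., :3]       # colour channels (rgb/bgr)
    face_mask = target[..., 3:4]  # 4th channel, kept as a 1-channel array
    area_masks = target[..., 4:]  # zero or more extra masks (e.g. eye/mouth)
    return image, face_mask, area_masks


# Made-up target with a face mask plus two area masks
target = np.zeros((4, 64, 64, 6), dtype="float32")
image, face_mask, area_masks = split_target_sketch(target)
```

Slicing with 3:4 rather than indexing with 3 keeps the mask 4-dimensional, so it stays compatible with the rest of the batch.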
training.lr_finder module
Learning Rate Finder for faceswap.py.
- class lib.training.lr_finder.LRStrength(value)
Bases:
Enum
Enum for how aggressively to set the optimal learning rate
- AGGRESSIVE = 5
- DEFAULT = 10
- EXTREME = 2.5
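The page does not say how the enum values are applied. One plausible reading, which is an assumption here and not confirmed by this documentation, is that the value acts as a divisor on the learning rate at which the lowest loss was observed, so a smaller value gives a more aggressive result. The sketch below re-declares the enum locally and uses a hypothetical pick_learning_rate helper to show that reading.

```python
from enum import Enum


class LRStrength(Enum):
    """Local copy of the documented values, for illustration."""
    DEFAULT = 10
    AGGRESSIVE = 5
    EXTREME = 2.5


def pick_learning_rate(best_lr: float, strength_name: str) -> float:
    """ASSUMPTION: the enum value divides the best learning rate found."""
    return best_lr / LRStrength[strength_name.upper()].value


rates = {name: pick_learning_rate(1e-3, name)
         for name in ("default", "aggressive", "extreme")}
# Under this assumption: default < aggressive < extreme
```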
- class lib.training.lr_finder.LearningRateFinder(model: ModelBase, config: dict[str, ConfigValueType], feeder: Feeder, stop_factor: int = 4, beta: float = 0.98)
Bases:
object
Learning Rate Finder
- Parameters:
model (tensorflow.keras.models.Model) – The Keras model to find the optimal learning rate for
config (dict) – The configuration options for the model
feeder (Feeder) – The feeder for training the model
stop_factor (int) – When to stop finding the optimal learning rate
beta (float) – Amount to smooth loss by, for graphing purposes
- find() bool
Find the optimal learning rate
- Returns:
True if the learning rate was successfully discovered otherwise False
- Return type:
bool
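The stop_factor and beta parameters match the widely used learning rate range test, in which the rate is raised exponentially while an exponentially smoothed loss is tracked, and the sweep ends once that loss exceeds stop_factor times the best value seen. The sketch below shows that general technique on a toy problem. It is not taken from the Faceswap source, and find_lr_sketch, make_quadratic_step, start_lr, end_lr and steps are all hypothetical names and settings.

```python
def find_lr_sketch(train_step, start_lr=1e-5, end_lr=10.0, steps=100,
                   stop_factor=4, beta=0.98):
    """Sweep the learning rate upwards and record the smoothed loss.

    ``train_step`` is a hypothetical callable that trains one batch at the
    given learning rate and returns the loss for that batch.
    """
    growth = (end_lr / start_lr) ** (1 / (steps - 1))
    avg_loss, best_loss = 0.0, float("inf")
    rates, losses = [], []
    lr = start_lr
    for step in range(1, steps + 1):
        loss = train_step(lr)
        # Exponential moving average, bias-corrected for the early steps
        avg_loss = beta * avg_loss + (1 - beta) * loss
        smoothed = avg_loss / (1 - beta ** step)
        if step > 1 and smoothed > stop_factor * best_loss:
            break  # the loss has blown up, so end the sweep early
        best_loss = min(best_loss, smoothed)
        rates.append(lr)
        losses.append(smoothed)
        lr *= growth
    return rates, losses


def make_quadratic_step(weight=1.0):
    """Toy 'model': plain gradient descent on loss = weight squared."""
    state = {"w": weight}

    def train_step(lr):
        loss = state["w"] * state["w"]
        state["w"] -= lr * 2 * state["w"]  # the gradient of w^2 is 2w
        return loss

    return train_step


rates, losses = find_lr_sketch(make_quadratic_step())
best_rate = rates[losses.index(min(losses))]
```

The smoothing exists because per-batch losses are noisy on real data. On this noise-free toy it mainly delays the reaction to changes in the raw loss.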
training.preview_cv module
The pop up preview window for Faceswap.
If Tkinter is installed, then this will be used to manage the preview image, otherwise we fall back to OpenCV’s imshow
- class lib.training.preview_cv.PreviewBase(preview_buffer: PreviewBuffer, triggers: dict[Literal['toggle_mask', 'refresh', 'save', 'quit', 'shutdown'], threading.Event] | None = None)
Bases:
object
Parent class for OpenCV and Tkinter Preview Windows
- Parameters:
preview_buffer (PreviewBuffer) – The thread safe object holding the preview images
triggers (dict, optional) – Dictionary of event triggers for pop-up preview. Not required when running inside the GUI. Default: None
- class lib.training.preview_cv.PreviewBuffer
Bases:
object
A thread safe class for holding preview images
- add_image(name: str, image: np.ndarray) None
Add an image to the preview buffer in a thread safe way
- get_images() Generator[tuple[str, np.ndarray], None, None]
Get the latest images from the preview buffer. When the iterator is exhausted, clears the updated event.
- Yields:
name (str) – The name of the image
numpy.ndarray – The image in BGR format
- property is_updated: bool
True when new images have been loaded into the preview buffer
- Type:
bool
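A thread safe buffer with an "updated" flag can be built from a lock and a threading.Event. The class below is a minimal sketch of that pattern using the method names documented above. The internals are assumptions and do not describe the actual PreviewBuffer, and a bytes object stands in for an image array to keep the example dependency-free.

```python
import threading


class PreviewBufferSketch:
    """Hold named images and flag when new ones arrive (illustrative)."""

    def __init__(self):
        self._images = {}
        self._lock = threading.Lock()
        self._updated = threading.Event()

    @property
    def is_updated(self):
        return self._updated.is_set()

    def add_image(self, name, image):
        with self._lock:
            self._images[name] = image
        self._updated.set()

    def get_images(self):
        # Copy under the lock, then yield without holding it
        with self._lock:
            snapshot = list(self._images.items())
        for name, image in snapshot:
            yield name, image
        # Only reached once the caller has consumed every image
        self._updated.clear()


buffer = PreviewBufferSketch()
worker = threading.Thread(target=buffer.add_image,
                          args=("Training", b"fake-image-bytes"))
worker.start()
worker.join()
was_updated = buffer.is_updated     # set by the worker thread
images = list(buffer.get_images())  # exhausting the iterator clears the flag
```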
- class lib.training.preview_cv.PreviewCV(preview_buffer: PreviewBuffer, triggers: dict[Literal['toggle_mask', 'refresh', 'save', 'quit', 'shutdown'], threading.Event])
Bases:
PreviewBase
Simple fallback preview viewer using OpenCV for when Tkinter is not available
- Parameters:
preview_buffer (PreviewBuffer) – The thread safe object holding the preview images
triggers (dict) – Dictionary of event triggers for pop-up preview.
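The triggers dictionary maps the five names in the type signature to threading.Event objects. The sketch below shows how such a dictionary could be built and how a key press might set one of the events. The handle_key function and its key bindings are invented for illustration and should not be read as the viewer's real bindings.

```python
import threading

TRIGGER_NAMES = ("toggle_mask", "refresh", "save", "quit", "shutdown")

# One event per trigger name, all initially unset
triggers = {name: threading.Event() for name in TRIGGER_NAMES}


def handle_key(key, triggers):
    """Hypothetical key handler: set the trigger mapped to a key press."""
    # These bindings are assumptions made for the example
    key_map = {"m": "toggle_mask", "r": "refresh", "s": "save", "q": "quit"}
    name = key_map.get(key.lower())
    if name is not None:
        triggers[name].set()
    return name


pressed = handle_key("S", triggers)  # sets the "save" event
ignored = handle_key("x", triggers)  # unmapped key, nothing is set
```

Another thread, such as the training loop, can then poll or wait on these events and clear each one after acting on it.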
training.preview_tk module
The pop up preview window for Faceswap.
If Tkinter is installed, then this will be used to manage the preview image, otherwise we fall back to OpenCV’s imshow
- class lib.training.preview_tk.PreviewTk(preview_buffer: PreviewBuffer, parent: tk.Widget | None = None, taskbar: ttk.Frame | None = None, triggers: TriggerType | None = None)
Bases:
PreviewBase
Holds a preview window for displaying the pop out preview.
- Parameters:
preview_buffer (PreviewBuffer) – The thread safe object holding the preview images
parent (tkinter widget, optional) – If this viewer is being called from the GUI the parent widget should be passed in here. If this is a standalone pop-up window then pass None. Default: None
taskbar (tkinter.ttk.Frame, optional) – If this viewer is being called from the GUI the parent’s option frame should be passed in here. If this is a standalone pop-up window then pass None. Default: None
triggers (dict, optional) – Dictionary of event triggers for pop-up preview. Not required when running inside the GUI. Default: None
- property master_frame: Frame
The master frame that holds the preview window
- Type:
tkinter.Frame
- pack(*args, **kwargs)
Redirect calls to pack the widget to pack the actual _master_frame.
Takes standard tkinter.Frame pack arguments
- remove_option_controls() None
Remove the taskbar options controls when the preview is disabled in the GUI
- save(location: str) None
Save action to be performed when save button pressed from the GUI.
- location: str
Full path to the folder to save the preview image to
- lib.training.preview_tk.main()
Load image from first given argument and display
python -m lib.training.preview_tk <filename>