train package
The Train Package handles the Model and Trainer plugins for training models in Faceswap.
model package
This package contains various helper functions that plugins can inherit from
plugins.train.model._base.inference Module
Handles the recompilation of a Faceswap model into a version that can be used for inference
- class plugins.train.model._base.inference.Inference(saved_model: Model, switch_sides: bool)
Calculates required layers and compiles a saved model for inference.
- Parameters:
saved_model (keras.Model) – The saved trained Faceswap model
switch_sides (bool) – True if the swap should be performed “B” > “A”, False if the swap should be “A” > “B”
- __call__() Model
Obtain the inference model.
- Return type:
The built Keras inference model for the requested swap side
Classes
|
Calculates required layers and compiles a saved model for inference. |
plugins.train.model._base.io Module
IO handling for the model base plugin.
The objects in this module should not be called directly, but are called from
ModelBase
- This module handles:
The loading, saving and backing up of keras models to and from disk.
The loading and freezing of weights for model plugins.
- class plugins.train.model._base.io.IO(plugin: ModelBase, model_dir: str, is_predict: bool, save_optimizer: T.Literal['never', 'always', 'exit'])
Model saving and loading functions.
Handles the loading and saving of the plugin model from disk as well as the model backup and snapshot functions.
- Parameters:
plugin (ModelBase) – The parent plugin class that owns the IO functions.
model_dir (str) – The full path to the model save location
is_predict (bool) – True if the model is being loaded for inference. False if the model is being loaded for training.
save_optimizer (T.Literal['never', 'always', 'exit']) – When to save the optimizer weights. “never” never saves the optimizer weights. “always” always saves the optimizer weights. “exit” only saves the optimizer weights on an exit request.
- property filename: str
The filename for this model.
- property history: list[float]
list of loss history for the current save iteration.
- load() Model
Loads the model from disk
If the predict function is to be called and the model cannot be found in the model folder then an error is logged and the process exits.
When loading the model, the plugin model folder is scanned for custom layers which are added to Keras’ custom objects.
- Return type:
The saved model loaded from disk
- load_optimizer() dict[str, Any] | None
Load the optimizer’s state_dict from the .keras model file
- Return type:
The saved optimizer state_dict or None if it does not exist
- property model_dir: str
The full path to the model folder
- property model_exists: bool
True if a model of the type being loaded exists within the model folder location, otherwise False.
- property multiple_models_in_folder: list[str] | None
If there are multiple model types in the requested folder, or model types that don’t correspond to the requested plugin type, then returns the list of plugin names that exist in the folder, otherwise returns None
- save(optimizer: Optimizer | None = None, is_exit: bool = False) None
Backup and save the model and state file.
- Parameters:
optimizer (Optimizer | None) – The current optimizer in use for the model if it should be saved. Default: None
is_exit (bool) – True if the save request has come from an exit process request, otherwise False. Default: False
- Return type:
None
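The three documented save_optimizer values combine with the is_exit flag of save() to decide whether optimizer weights are written. A minimal sketch of that decision follows. The helper name should_save_optimizer is hypothetical and is not part of the IO class; it only restates the rules described in the parameter documentation.

```python
from typing import Literal

SavePolicy = Literal["never", "always", "exit"]


def should_save_optimizer(policy: SavePolicy, is_exit: bool) -> bool:
    """Hypothetical helper: decide whether optimizer weights get saved.

    Mirrors the documented meaning of the three policy values only.
    """
    if policy == "always":
        return True
    if policy == "exit":
        # Only save when the save request comes from an exit request
        return is_exit
    if policy == "never":
        return False
    raise ValueError(f"Unknown save_optimizer policy: {policy!r}")


# A regular periodic save under the "exit" policy skips the optimizer
periodic = should_save_optimizer("exit", is_exit=False)
# The final save on exit includes it
final = should_save_optimizer("exit", is_exit=True)
```

Under this reading, “exit” keeps periodic saves small and only pays the cost of writing optimizer state once, when training stops.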
- snapshot() None
Perform a model snapshot.
Notes
Snapshot function is called 1 iteration after the model was saved, so that it is built from the latest save, hence iteration being reduced by 1.
- Return type:
None
- class plugins.train.model._base.io.OptimizerMigrate(config: dict[str, Any], model_path: str)
Migrates weights from a keras optimizer to a torch optimizer’s state dict
- Parameters:
config (dict[str, T.Any])
model_path (str)
- convert() dict[str, Any] | None
Convert the keras optimizer from a keras model file into a torch optimizer state dict
- Returns:
The optimizer state dict for loading into a torch optimizer or None if no saved optimizer exists
- Return type:
dict[str, Any] | None
- class plugins.train.model._base.io.Weights(plugin: ModelBase)
Handling of freezing and loading model weights
- Parameters:
plugin (ModelBase) – The parent plugin class that owns the IO functions.
- freeze() None
If freeze has been selected in the cli arguments, then freeze those models indicated in the plugin’s configuration.
- Return type:
None
- load(model_exists: bool) None
Load weights for newly created models, or output warning for pre-existing models.
- Parameters:
model_exists (bool) – True if a model pre-exists and is being resumed, False if this is a new model
- Return type:
None
- plugins.train.model._base.io.get_all_sub_models(model: Model, models: list[Model] | None = None) list[Model]
For a given model, return all sub-models that occur (recursively) as children.
- Parameters:
model (Model) – A Keras model to scan for sub models
models (list[Model] | None) – Do not provide this parameter. It is used for recursion
- Returns:
A list of all keras.models.Model objects found within the given model. The provided model will always be returned in the first position
- Return type:
list[Model]
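The recursion described for get_all_sub_models can be illustrated without Keras. The sketch below uses stand-in classes (ToyModel and ToyLayer are hypothetical, as is the function name collect_sub_models) and assumes that sub-models appear among a model's layers. It shows why the models argument exists only for recursion and why the provided model ends up first.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToyLayer:
    """Stand-in for an ordinary (non-model) layer."""
    name: str


@dataclass
class ToyModel:
    """Stand-in for a model that can contain layers and other models."""
    name: str
    layers: list = field(default_factory=list)


def collect_sub_models(model: ToyModel,
                       models: list[ToyModel] | None = None) -> list[ToyModel]:
    """Return the model plus every nested model, depth first."""
    if models is None:
        # First call: start the accumulator with the provided model
        models = [model]
    else:
        # Recursive call: append to the shared accumulator
        models.append(model)
    for layer in model.layers:
        if isinstance(layer, ToyModel):
            collect_sub_models(layer, models)
    return models


inner = ToyModel("inner", [ToyLayer("conv")])
encoder = ToyModel("encoder", [ToyLayer("dense"), inner])
decoder = ToyModel("decoder", [ToyLayer("upscale")])
outer = ToyModel("outer", [encoder, decoder])

names = [found.name for found in collect_sub_models(outer)]
```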
Functions
|
For a given model, return all sub-models that occur (recursively) as children. |
Classes
|
Model saving and loading functions. |
|
Migrates weights from a keras optimizer to a torch optimizer's state dict |
|
Handling of freezing and loading model weights |
plugins.train.model._base.model Module
Base class for Models. ALL Models should at least inherit from this class.
See original for an annotated example for how to create model plugins.
- class plugins.train.model._base.model.ModelBase(model_dir: str, arguments: argparse.Namespace, predict: bool = False)
Base class that all model plugins should inherit from.
- Parameters:
model_dir (str) – The full path to the model save location
arguments (argparse.Namespace) – The arguments that were passed to the train or convert process as generated from Faceswap’s command line arguments
predict (bool) – True if the model is being loaded for inference, False if the model is being loaded for training. Default: False
- add_history(loss: np.ndarray) None
Add the current iteration’s loss history to _io.history.
Called from the trainer after each iteration, for tracking loss drop over time between save iterations.
- Parameters:
loss (np.ndarray) – The loss values for the A and B side for the current iteration. This should be the collated loss values for each side.
- Return type:
None
- build() None
Build the model and assign to model.
Within the defined strategy scope, either builds the model from scratch or loads an existing model if one exists.
If running inference, then the model is built only for the required side to perform the swap function, otherwise the model is then compiled with the optimizer and chosen loss function(s).
Finally, a model summary is outputted to the logger at verbose level.
- Return type:
None
- build_model(inputs: list[Input]) Model
Override for Model Specific autoencoder builds.
- Parameters:
inputs (list[Input]) – A list of keras.layers.Input tensors. This will be a list of 2 tensors (one for each side) each of shapes input_shape.
- Returns:
See Keras documentation for the correct structure, but note that parameter name is a required rather than an optional argument in Faceswap. You should assign this to the attribute self.name that is automatically generated from the plugin’s filename.
- Return type:
Model
- property coverage_ratio: float
The ratio of the training image to crop out and train on as defined in user configuration options.
NB: The coverage ratio is a raw float, but will be applied to integer pixel images.
To ensure consistent rounding and guaranteed even image size, the calculation for coverage should always be: \((original_size * coverage_ratio // 2) * 2\)
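The rounding rule above can be written as a small helper. The function name coverage_size is hypothetical; the arithmetic is the formula given in the property description.

```python
def coverage_size(original_size: int, coverage_ratio: float) -> int:
    """Apply the coverage ratio and force the result down to an even integer."""
    # Floor-divide by 2, then double: the result is always an even pixel count
    return int(original_size * coverage_ratio // 2) * 2


even_case = coverage_size(256, 0.6875)  # 176.0 is already even
odd_case = coverage_size(100, 0.75)     # 75.0 is rounded down to 74
```

Flooring to the nearest even number means a crop can be centred with the same margin on both sides of the image.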
- property freeze_layers: list[str]
Override to set plugin specific layers that can be frozen. Defaults to [“encoder”]
- input_shape: tuple[int, ...]
A tuple of ints defining the shape of the faces that the model takes as input. This should be overridden by model plugins in their __init__() function. If the input size is the same for both sides of the model, then this can be a single 3-dimensional tuple. If the inputs have different sizes for “A” and “B”, this should be a list of two 3-dimensional shape tuples, one for each side respectively.
- property input_shapes: list[tuple[None, int, int, int]]
A flattened list corresponding to all of the inputs to the model.
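To illustrate how the two permitted forms of input_shape relate to the list[tuple[None, int, int, int]] type of input_shapes, here is a rough sketch. The helper to_input_shapes is hypothetical, and it assumes one input per side with a leading None batch dimension. It does not reproduce how ModelBase derives the property.

```python
def to_input_shapes(input_shape):
    """Hypothetical: expand input_shape into one batched shape per side."""
    single = (isinstance(input_shape, tuple)
              and all(isinstance(dim, int) for dim in input_shape))
    if single:
        # Same shape for both sides "A" and "B"
        per_side = [input_shape, input_shape]
    else:
        # A list of two shape tuples, one for each side
        per_side = [tuple(shape) for shape in input_shape]
    # Prepend the unknown batch dimension to each shape
    return [(None, *shape) for shape in per_side]


same = to_input_shapes((64, 64, 3))
different = to_input_shapes([(64, 64, 3), (128, 128, 3)])
```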
- property iterations: int
The total number of iterations that the model has trained.
- property load_layers: list[str]
Override to set plugin specific layers that can be loaded. Defaults to [“encoder”]
- property model: Model
The compiled model for this plugin.
- property model_name: str
The name of the keras model. Generally this will be the same as name but some plugins will override this when they contain multiple architectures
- property name: str
The name of this model based on the plugin name.
- property output_shapes: list[tuple[None, int, int, int]]
A flattened list corresponding to all of the outputs of the model.
Classes
|
Base class that all model plugins should inherit from. |
plugins.train.model._base.settings Module
Settings for the model base plugins.
The objects in this module should not be called directly, but are called from
ModelBase
- Handles configuration of model plugins for:
Optimizer settings
General global model configuration settings
- class plugins.train.model._base.settings.Settings(arguments: Namespace, mixed_precision: bool, is_predict: bool)
Core training settings.
Sets backend settings prior to launching the model.
- Parameters:
arguments (Namespace) – The arguments that were passed to the train or convert process as generated from Faceswap’s command line arguments
mixed_precision (bool) – True if Mixed Precision training should be used otherwise False
is_predict (bool) – True if the model is being loaded for inference, False if the model is being loaded for training. Default: False
- check_model_precision(model: keras.models.Model, state: State) keras.models.Model
Check the model’s precision.
If this is a new model, then Rewrite an existing model’s training precision mode from mixed-float16 to float32 or vice versa.
This is not easy to do in keras, so we edit the model’s config to change the dtype policy for compatible layers. Create a new model from this config, then port the weights from the old model to the new model.
- Parameters:
model (keras.models.Model) – The original saved keras model to rewrite the dtype
state (State) – The State information for the model
- Return type:
The original model with the datatype updated
- get_mixed_precision_layers(build_func: Callable[[list[keras.layers.Layer]], keras.models.Model], inputs: list[keras.layers.Layer]) tuple[keras.models.Model, list[str]]
Get and store the mixed precision layers from a full precision enabled model.
- Parameters:
build_func (Callable[[list[keras.layers.Layer]], keras.models.Model]) – The function to be called to compile the newly created model
inputs (list[keras.layers.Layer]) – The inputs to the model to be compiled
- Returns:
model – The built model in fp32
names – The list of layer names within the full precision model that can be switched to mixed precision
- Return type:
tuple[keras.models.Model, list[str]]
- classmethod loss_scale_optimizer(optimizer: Optimizer) LossScaleOptimizer
Optimize loss scaling for mixed precision training.
- Parameters:
optimizer (Optimizer) – The optimizer instance to wrap
- Return type:
The original optimizer with loss scaling applied
- property use_mixed_precision: bool
True if mixed precision training has been enabled, otherwise False.
Classes
|
Core training settings. |
plugins.train.model._base.state Module
Handles the loading and saving of a model’s state file
- class plugins.train.model._base.state.State(model_dir: str, model_name: str, no_logs: bool)
Holds state information relating to the plugin’s saved model.
- Parameters:
model_dir (str) – The full path to the model save location
model_name (str) – The name of the model plugin
no_logs (bool) – True if Tensorboard logs should not be generated, otherwise False
- add_lr_finder(learning_rate: float) None
Add the optimal discovered learning rate from the learning rate finder
- Parameters:
learning_rate (float) – The discovered learning rate
- Return type:
None
- add_mixed_precision_layers(layers: list[str]) None
Add the list of model’s layers that are compatible for mixed precision to the state dictionary
- Parameters:
layers (list[str])
- Return type:
None
- add_session_batchsize(batch_size: int) None
Add the session batch size to the sessions dictionary.
- Parameters:
batch_size (int) – The batch size for the current training session
- Return type:
None
- property current_session: dict
The state dictionary for the current session_id.
- Type:
dict
- property filename: str
Full path to the state filename
- Type:
str
- increment_iterations() None
Increment iterations and session iterations by 1.
- Return type:
None
- property iterations: int
The total number of iterations that the model has trained.
- Type:
int
- lowest_avg_loss: float
The lowest average loss seen between save intervals.
- Type:
float
- property lr_finder: float
The value discovered from the learning rate finder. -1 if no value stored
- property mixed_precision_layers: list[str]
Layers that can be switched between mixed-float16 and float32.
- Type:
list
- property model_needs_rebuild: bool
True if mixed precision policy has changed so model needs to be rebuilt, otherwise False
- Type:
bool
- save() None
Save the state values to the serialized state file.
- Return type:
None
- property session_id: int
The current training session id.
- Type:
int
- property sessions: dict[int, dict[str, Any]]
The session information for each session in the state file
- Type:
dict[int, dict[str, Any]]
- update_session_config(key: str, value: Any) None
Update a configuration item of the currently loaded session.
- Parameters:
key (str) – The configuration item to update for the current session
value (any) – The value to update to
- Return type:
None
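The relationship between sessions, session_id, current_session and the update methods can be sketched with an in-memory stand-in. ToyState is hypothetical, as are the dictionary keys "iterations", "batchsize" and "config". The layout of the real state file is not described here, so treat this only as an illustration of the documented behaviour.

```python
from __future__ import annotations

from typing import Any


class ToyState:
    """In-memory stand-in for per-session training state."""

    def __init__(self) -> None:
        self.iterations = 0
        self.session_id = 1
        # One dictionary of information per session id
        self.sessions: dict[int, dict[str, Any]] = {
            1: {"iterations": 0, "config": {}}
        }

    @property
    def current_session(self) -> dict[str, Any]:
        """The state dictionary for the current session id."""
        return self.sessions[self.session_id]

    def add_session_batchsize(self, batch_size: int) -> None:
        self.current_session["batchsize"] = batch_size

    def increment_iterations(self) -> None:
        # Both the total and the session counter advance by 1
        self.iterations += 1
        self.current_session["iterations"] += 1

    def update_session_config(self, key: str, value: Any) -> None:
        self.current_session["config"][key] = value


state = ToyState()
state.add_session_batchsize(16)
state.increment_iterations()
state.increment_iterations()
state.update_session_config("learning_rate", 5e-5)
```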
Classes
|
Holds state information relating to the plugin's saved model. |
plugins.train.model._base.update Module
Updating legacy faceswap models to the current version
- class plugins.train.model._base.update.Legacy(model_path: str)
Handles the updating of Keras 2.x models to Keras 3.x
Generally Keras 2.x models will open in Keras 3.x. There are a couple of bugs in the Keras 3 legacy loading code which impact Faceswap models:
- When a model receives a shared functional model as an inbound node, the node index needs reducing by 1 (non-trivial to fix upstream)
- Keras 3 does not accept nested outputs, so Keras 2 FS models need to have the outputs flattened
- Parameters:
model_path (str) – Full path to the legacy Keras 2.x model h5 file to upgrade
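Flattening nested outputs, the second issue listed for this class, amounts to the following operation. This sketch works on plain Python lists of output names rather than on a saved Keras configuration, and the function name flatten_outputs is hypothetical. It is not the code that Legacy runs.

```python
def flatten_outputs(outputs):
    """Recursively flatten nested lists/tuples of outputs into one list."""
    flat = []
    for item in outputs:
        if isinstance(item, (list, tuple)):
            # Descend into the nested group, preserving order
            flat.extend(flatten_outputs(item))
        else:
            flat.append(item)
    return flat


# Keras 2 style: one nested list of outputs per side
nested = [["face_a", "mask_a"], ["face_b", "mask_b"]]
flattened = flatten_outputs(nested)
```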
- class plugins.train.model._base.update.PatchKerasConfig(model_path: str)
This class exists to patch breaking changes when moving from older keras 3.x models to newer versions
- Parameters:
model_path (str) – Full path to the keras model to be patched for the current version
- __call__() None
Update the keras configuration saved in a keras model file and save over the original model
- Return type:
None
Classes
|
Handles the updating of Keras 2.x models to Keras 3.x |
|
This class exists to patch breaking changes when moving from older keras 3.x models to newer versions |
plugins.train.model.original Module
Original Model Based on the original https://www.reddit.com/r/deepfakes/ code sample + contributions.
This model is heavily documented as it acts as a template that other model plugins can be developed from.
- class plugins.train.model.original.Model(*args, **kwargs)
Original Faceswap Model.
This is the original faceswap model and acts as a template for plugin development.
All plugins must define the following attribute override after calling the parent’s __init__() method:
input_shape (tuple or list): a tuple of ints defining the shape of the faces that the model takes as input. If the input size is the same for both sides, this can be a single 3 dimensional tuple. If the inputs have different sizes for “A” and “B” this should be a list of 2 3 dimensional shape tuples, 1 for each side.
Any additional attributes used exclusively by this model should be defined here, but make sure that you are not accidentally overriding any existing ModelBase attributes.
- Parameters:
- build_model(inputs)
Create the model’s structure.
This function is automatically called immediately after __init__() has been called if a new model is being created. It is ignored if an existing model is being loaded from disk as the model structure will be defined in the saved model file.
The model’s final structure is defined here.
For the original model, an encoder instance is defined, then the same instance is referenced twice, once for each input “A” and “B”, so that the same model is used for both inputs.
2 Decoders are then defined (one for each side) with the encoder instances passed in as input to the corresponding decoders.
The final output of the model should always call lib.model.nn_blocks.Conv2DOutput so that the correct data type is set for the final activation, to support Mixed Precision Training. Failure to do so is likely to lead to issues when Mixed Precision is enabled.
- Parameters:
inputs (list) – A list of input tensors for the model. This will be a list of 2 tensors of shape input_shape, the first for side “a”, the second for side “b”.
- Returns:
See Keras documentation for the correct structure, but note that parameter name is a required rather than an optional argument in Faceswap. You should assign this to the attribute self.name that is automatically generated from the plugin’s filename.
- Return type:
keras.models.Model
- decoder(side)
The original Faceswap Decoder Network.
The decoders for the original model have separate weights for each side “A” and “B”, so two instances are created in build_model(), one for each side.
- Parameters:
side (str) – Either “a” or “b”. This is used for naming the decoder model.
- Returns:
The Keras decoder model. This will be called twice, once for each side.
- Return type:
keras.models.Model
- encoder()
The original Faceswap Encoder Network.
The encoder for the original model has its weights shared between both the “A” and “B” side of the model, so only one instance is created in build_model(). However this same instance is then used twice (once for A and once for B) meaning that the weights get shared.
- Returns:
The Keras encoder model, for sharing between inputs from both sides.
- Return type:
keras.models.Model
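The shared-encoder, separate-decoder arrangement can be shown with a toy example that avoids Keras entirely. ToyEncoder, ToyDecoder and build_toy_model are hypothetical stand-ins with a single scalar weight each. The point is only that reusing one encoder instance shares its weights, while each side keeps its own decoder.

```python
class ToyEncoder:
    """Stand-in encoder with one scalar weight."""

    def __init__(self) -> None:
        self.weights = [0.5]

    def __call__(self, value: float) -> float:
        return value * self.weights[0]


class ToyDecoder:
    """Stand-in decoder, named after the side it belongs to."""

    def __init__(self, side: str) -> None:
        self.name = f"decoder_{side}"
        self.weights = [2.0]

    def __call__(self, value: float) -> float:
        return value * self.weights[0]


def build_toy_model():
    encoder = ToyEncoder()  # one instance, referenced by both sides
    decoders = {side: ToyDecoder(side) for side in ("a", "b")}

    def forward(side: str, value: float) -> float:
        return decoders[side](encoder(value))

    return encoder, decoders, forward


encoder, decoders, forward = build_toy_model()
encoder.weights[0] = 1.0        # changing the encoder affects both sides
decoders["a"].weights[0] = 3.0  # changing decoder "a" affects side A only
out_a = forward("a", 2.0)
out_b = forward("b", 2.0)
```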
Classes
|
Original Faceswap Model. |
trainer package
This package contains the training loop for Faceswap
plugins.train.trainer.base Module
Base Class for Faceswap Trainer plugins. All Trainer plugins should be inherited from this class.
At present there is only the original plugin, so that entirely
inherits from this class. If further plugins are developed, then common code should be kept here,
with “original” unique code split out to the original plugin.
- class plugins.train.trainer.base.TrainConfig(folders: list[str], batch_size: int, augment_color: bool, flip: bool, warp: bool, cache_landmarks: bool, lr_finder: bool = False, snapshot_interval: int = -1)
Configuration for training a model
- Parameters:
folders (list[str]) – List of folders to be used as inputs to the model. Folders are provided in processing order (eg: [A, B, …])
batch_size (int) – The batch size to load data from each of the loaders
augment_color (bool) – True to perform color augmentation otherwise False
flip (bool) – True to perform image flipping otherwise False
warp (bool) – False to disable warping, True to enable warping
cache_landmarks (bool) – True to cache landmarks from the other side for Warp to landmarks
lr_finder (bool) – True to use the learning rate finder. Default: False
snapshot_interval (int) – The number of iterations between snapshots. Default: -1 (Disabled)
- augment_color: bool
True to perform color augmentation otherwise False
- batch_size: int
The batch size to load data from each of the loaders
- cache_landmarks: bool
True to cache landmarks from the other side for Warp to landmarks
- flip: bool
True to perform image flipping otherwise False
- folders: list[str]
List of folders to be used as inputs to the model. Folders are provided in processing order (eg: [A, B, …])
- lr_finder: bool = False
True to use the learning rate finder
- snapshot_interval: int = -1
The number of iterations between snapshots
- warp: bool
False to disable warping, True to enable warping
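The documented signature corresponds to a dataclass with six required fields and two defaults. The sketch below mirrors that signature in a stand-in class named TrainConfigSketch (a hypothetical name, used so it is not mistaken for the real class). The folder names are placeholder values.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class TrainConfigSketch:
    """Stand-in mirroring the documented TrainConfig fields."""
    folders: list[str]
    batch_size: int
    augment_color: bool
    flip: bool
    warp: bool
    cache_landmarks: bool
    lr_finder: bool = False       # learning rate finder off by default
    snapshot_interval: int = -1   # -1 disables snapshots


config = TrainConfigSketch(
    folders=["faces_a", "faces_b"],  # placeholder folders, in order [A, B]
    batch_size=16,
    augment_color=True,
    flip=False,
    warp=True,
    cache_landmarks=False,
)
```

Only lr_finder and snapshot_interval may be omitted; leaving out any other field raises a TypeError at construction time.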
- class plugins.train.trainer.base.TrainerBase(model: ModelBase, config: TrainConfig)
A trainer plugin interface. It must implement the method “train_batch”, which takes the inputs to the model and the target images for the model output, and returns the loss per side.
- Parameters:
model (ModelBase) – The model plugin
config (TrainConfig) – The Training Configuration options
- batch_size
The batch size for each iteration to be trained through the model.
- config
Training configuration options
- abstractmethod get_sampler() type[RandomSampler | DistributedSampler]
Override to set the sampler that the Torch DataLoader should use
- Return type:
The sampler that the torch DataLoader should use
- loss_func: LossCollator
The selected loss functions for the model
- model
The model plugin to train the batch on
- register_loss(loss: LossCollator) None
Registers the selected loss functions to the underlying model nn.module
- Parameters:
loss (LossCollator) – The configured loss functions
- Return type:
None
- sampler
The data sampler that the data loader should use
- abstractmethod train_batch(inputs: list[torch.Tensor], targets: list[torch.Tensor], optimizer: Optimizer, meta: BatchMeta) list[BatchLoss]
Override to run a single forward and backwards pass through the model for a single batch
- Parameters:
inputs (list[torch.Tensor]) – The batch of input image tensors to the model of length(num inputs)
targets (list[torch.Tensor]) – List of len (num_outputs) of target images in shape (batch_size, num_inputs, height, width, 3) at all model output sizes as float32 0.0 - 1.0 range
optimizer (Optimizer) – The configured Optimizer to use
meta (BatchMeta) – The meta information for the batch
- Return type:
The loss for each input to the model in order (A, B, …)
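The abstract-method contract can be illustrated with a reduced stand-in that does not depend on torch. ToyTrainerBase, ToyTrainer and ToySampler are hypothetical. The loss here is a plain mean absolute error over lists of floats, chosen only to keep the sketch self-contained, and the real train_batch also receives an optimizer and batch metadata.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class ToySampler:
    """Placeholder for a data sampler class."""


class ToyTrainerBase(ABC):
    """Reduced stand-in for a trainer interface with two abstract methods."""

    def __init__(self, batch_size: int) -> None:
        self.batch_size = batch_size
        # Assumption: the base class stores whatever sampler the plugin selects
        self.sampler = self.get_sampler()

    @abstractmethod
    def get_sampler(self) -> type:
        """Return the sampler class the data loader should use."""

    @abstractmethod
    def train_batch(self, inputs: list[list[float]],
                    targets: list[list[float]]) -> list[float]:
        """Return one loss value per side, in order (A, B, ...)."""


class ToyTrainer(ToyTrainerBase):
    def get_sampler(self) -> type:
        return ToySampler

    def train_batch(self, inputs, targets):
        # Mean absolute error per side, standing in for a real training step
        return [
            sum(abs(pred - tgt) for pred, tgt in zip(side_in, side_tgt))
            / len(side_in)
            for side_in, side_tgt in zip(inputs, targets)
        ]


trainer = ToyTrainer(batch_size=2)
losses = trainer.train_batch([[0.0, 1.0], [0.5, 0.5]],
                             [[0.0, 0.0], [0.5, 1.0]])
```

Because both methods are abstract, instantiating ToyTrainerBase directly raises a TypeError, which is how the interface forces plugins to supply them.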
Classes
|
Configuration for training a model |
|
A trainer plugin interface. |
Variables
A standard |
plugins.train.trainer.distributed Module
Original Trainer
- class plugins.train.trainer.distributed.Trainer(model: ModelBase, config: TrainConfig)
Distributed training with torch.nn.DataParallel
- Parameters:
model (ModelBase) – The model that will be running this trainer
config (TrainConfig) – The Training Configuration options
- class plugins.train.trainer.distributed.WrappedModel(model: keras.Model)
A torch module that wraps a dual input Faceswap model with a single input version that is compatible with DataParallel training
- Parameters:
model (keras.Model) – The original faceswap model that is to be wrapped
- forward(inputs: list[Tensor], targets: list[Tensor], meta_dict: dict[str, list[Tensor]]) list[dict]
Run the forward pass per GPU
- Parameters:
inputs (list[Tensor]) – The batch of input image tensors to the model of length(num inputs)
targets (list[Tensor]) – List of len (num_outputs) of target images in shape (batch_size, num_inputs, height, width, 3) at all model output sizes as float32 0.0 - 1.0 range
meta_dict (dict[str, list[Tensor]]) – The meta information for the batch in dictionary form
- Return type:
The loss outputs for each side of the model for 1 GPU
Classes
|
Distributed training with torch.nn.DataParallel |
|
A torch module that wraps a dual input Faceswap model with a single input version that is compatible with DataParallel training |
plugins.train.trainer.original Module
Original Trainer
- class plugins.train.trainer.original.Trainer(model: ModelBase, config: TrainConfig)
Original trainer
- Parameters:
model (ModelBase)
config (TrainConfig)
- get_sampler() type[RandomSampler]
Obtain a standard random sampler
- Return type:
The Random sampler
- train_batch(inputs: list[torch.Tensor], targets: list[torch.Tensor], optimizer: Optimizer, meta: BatchMeta) list[BatchLoss]
Run a single forward and backwards pass through the model for a single batch
- Parameters:
inputs (list[torch.Tensor]) – The batch of input image tensors to the model of length(num inputs)
targets (list[torch.Tensor]) – List of len (num_outputs) of target images in shape (batch_size, num_inputs, height, width, 3) at all model output sizes as float32 0.0 - 1.0 range
optimizer (Optimizer) – The configured Optimizer to use
meta (BatchMeta) – The meta information for the batch
- Return type:
The loss for each input to the model in order (A, B, …)
Classes
|
Original trainer |
plugins.train.trainer.trainer_config Module
Default configurations for trainers
- class plugins.train.trainer.trainer_config.Augmentation(helptext: str)
trainer.augmentation section
- Parameters:
helptext (str)
- class plugins.train.trainer.trainer_config.Loader(helptext: str)
trainer.loader section
- Parameters:
helptext (str)
- plugins.train.trainer.trainer_config.get_defaults() dict[str, GlobalSection]
Obtain the default values for adding to the config.ini file
- Returns:
The option names and config items
- Return type:
defaults
Functions
Obtain the default values for adding to the config.ini file |
Classes
|
trainer.augmentation section |
|
trainer.loader section |