align package

The align package handles detected faces, their alignments and masks.

aligned_face module

Handles aligned faces and corresponding pose estimates

Module Summary

AlignedFace

Class to align a face.

get_matrix_scaling

Given a matrix, return the cv2 Interpolation method and inverse interpolation method for applying the matrix on an image.

transform_image

Perform transformation on an image, applying the given size and padding to the matrix.

Module

Aligner for faceswap.py

class lib.align.aligned_face.AlignedFace(landmarks: ndarray, image: ndarray | None = None, centering: Literal['face', 'head', 'legacy'] = 'face', size: int = 64, coverage_ratio: float = 1.0, dtype: str | None = None, is_aligned: bool = False, is_legacy: bool = False)

Bases: object

Class to align a face.

Holds the aligned landmarks and face image, as well as associated matrices and information about an aligned face.

Parameters:
  • landmarks (numpy.ndarray) – The original 68 point landmarks that pertain to the given image for this face

  • image (numpy.ndarray, optional) – The original frame that contains the face that is to be aligned. Pass None if the aligned face is not to be generated, and just the co-ordinates should be calculated.

  • centering (["legacy", "face", "head"], optional) – The type of extracted face that should be loaded. “legacy” places the nose in the center of the image (the original method for aligning). “face” aligns for the nose to be in the center of the face (top to bottom) but the center of the skull for left to right. “head” aligns for the center of the skull (in 3D space) being the center of the extracted image, with the crop holding the full head. Default: “face”

  • size (int, optional) – The size in pixels, of each edge of the final aligned face. Default: 64

  • coverage_ratio (float, optional) – The amount of the aligned image to return. A ratio of 1.0 will return the full contents of the aligned image. A ratio of 0.5 will return an image of the given size, but will crop to the central 50% of the image. Default: 1.0

  • dtype (str, optional) – Set a data type for the final face to be returned as. Passing None will return a face with the same data type as the original image. Default: None

  • is_aligned (bool, optional) – Indicates that the image is an aligned face rather than a frame. Default: False

  • is_legacy (bool, optional) – Only used if is_aligned is True. True indicates that the aligned image being loaded is a legacy extracted face rather than a current head extracted face. Default: False

property adjusted_matrix: ndarray

The 3x2 transformation matrix for extracting and aligning the core face area out of the original frame with padding and sizing applied.

Type:

numpy.ndarray

property average_distance: float

The average distance of the core landmarks (18-67) from the mean face that was used for aligning the image.

Type:

float

property centering: Literal['legacy', 'head', 'face']

The centering of the Aligned Face. One of “legacy”, “head”, “face”.

Type:

str

extract_face(image: ndarray | None) ndarray | None

Extract the face from a source image and populate face. If an image is not provided then None is returned.

Parameters:

image (numpy.ndarray or None) – The original frame to extract the face from. None if the face should not be extracted

Returns:

The extracted face at the given size, with the given coverage of the given dtype or None if no image has been provided.

Return type:

numpy.ndarray or None

property face: ndarray | None

The aligned face at the given size at the specified coverage in the given dtype. If an image has not been provided then the attribute will return None.

Type:

numpy.ndarray

get_cropped_roi(image_size: int, target_size: int, centering: Literal['face', 'head', 'legacy']) ndarray

Obtain the region of interest within an aligned face set to centered coverage for an alternative centering

Parameters:
  • image_size (int) – The size of the full head extracted image loaded from disk

  • target_size (int) – The size of the target centered face with coverage ratio applied in relation to the original image size

  • centering (["legacy", "face"]) – The type of centering to obtain the region of interest for. “legacy” places the nose in the center of the image (the original method for aligning). “face” aligns for the nose to be in the center of the face (top to bottom) but the center of the skull for left to right.

Returns:

The (left, top, right, bottom) location of the region of interest within an aligned face centered on the head for the given centering

Return type:

numpy.ndarray
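
As a rough illustration of the (left, top, right, bottom) layout described above, the sketch below builds a square region of interest around a center point. The helper name centered_roi and the explicit center argument are hypothetical; how the library derives the center from the pose offsets is not covered here.

```python
def centered_roi(center: tuple[int, int], target_size: int) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) for a square crop around ``center``.

    Hypothetical helper: padding by half of the target size on each side is
    an assumption of this sketch.
    """
    half = target_size // 2
    center_x, center_y = center
    return (center_x - half, center_y - half, center_x + half, center_y + half)


# A 384 pixel face crop centered inside a 512 pixel head image
roi = centered_roi((256, 256), 384)
left, top, right, bottom = roi
```

The box is expressed in the pixel space of the head-centered image, so it can be used directly as slice bounds.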

property interpolators: tuple[int, int]

The (interpolator, reverse interpolator) pair for the adjusted matrix.

Type:

tuple

property landmark_type: LandmarkType

The type of landmarks that generated this aligned face

Type:

LandmarkType

property landmarks: ndarray

The 68 point facial landmarks aligned to the extracted face box.

Type:

numpy.ndarray

property matrix: ndarray

The 3x2 transformation matrix for extracting and aligning the core face area out of the original frame, with no padding or sizing applied. The returned matrix is offset for the given centering.

Type:

numpy.ndarray

property normalized_landmarks: ndarray

The 68 point facial landmarks normalized to 0.0 - 1.0 as aligned by Umeyama.

Type:

numpy.ndarray

property original_roi: ndarray

The location of the extracted face box within the original frame.

Type:

numpy.ndarray

property padding: int

The amount of padding (in pixels) that is applied to each side of the extracted face image for the selected extract type.

Type:

int

property pose: PoseEstimate

The estimated pose in 3D space.

Type:

lib.align.PoseEstimate

property relative_eye_mouth_position: float

Value representing the relative position of the lowest eye/eye-brow point to the highest mouth point. Positive values indicate that eyes/eyebrows are aligned above the mouth, negative values indicate that eyes/eyebrows are misaligned below the mouth.

Type:

float
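
One possible reading of this value is sketched below. It assumes 68 (x, y) landmarks in image co-ordinates (y increases downwards), with the eyebrow, eye and mouth index ranges taken from LANDMARK_PARTS. The function and the synthetic data are illustrative only and are not the library's implementation.

```python
def relative_eye_mouth_position(landmarks: list[tuple[float, float]]) -> float:
    """Highest mouth point minus lowest eye/eyebrow point, along the y axis.

    Assumes eyebrows at indices 17-26, eyes at 36-47 and mouth at 48-67.
    """
    eye_indices = list(range(17, 27)) + list(range(36, 48))
    lowest_eyes = max(landmarks[idx][1] for idx in eye_indices)
    highest_mouth = min(landmarks[idx][1] for idx in range(48, 68))
    return highest_mouth - lowest_eyes


def synthetic_landmarks(eye_y: float, mouth_y: float) -> list[tuple[float, float]]:
    """Build 68 fake landmarks with eyes/brows and mouth at fixed heights."""
    points = []
    for idx in range(68):
        if 17 <= idx < 27 or 36 <= idx < 48:
            points.append((0.5, eye_y))
        elif idx >= 48:
            points.append((0.5, mouth_y))
        else:
            points.append((0.5, 0.5))
    return points


upright = relative_eye_mouth_position(synthetic_landmarks(eye_y=0.3, mouth_y=0.7))
inverted = relative_eye_mouth_position(synthetic_landmarks(eye_y=0.7, mouth_y=0.3))
```

A negative result, as for the inverted case, is the kind of value that can be used to flag a misaligned face.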

property size: int

The size (in pixels) of one side of the square extracted face image.

Type:

int

split_mask() ndarray

Remove the mask from the alpha channel of face and return the mask

Returns:

The mask that was stored in the face’s alpha channel

Return type:

numpy.ndarray

Raises:

AssertionError – If face does not contain a mask in the alpha channel

transform_points(points: ndarray, invert: bool = False) ndarray

Perform transformation on a series of (x, y) co-ordinates in world space into aligned face space.

Parameters:
  • points (numpy.ndarray) – The points to transform

  • invert (bool, optional) – True to reverse the transformation (i.e. transform the points into world space from aligned face space). Default: False

Returns:

The transformed points

Return type:

numpy.ndarray
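
The forward and inverse mapping can be sketched without any dependencies. The example assumes the affine matrix is stored as two rows of three values ([[a, b, tx], [c, d, ty]]), the layout OpenCV's warpAffine expects. The helper names are hypothetical and plain lists stand in for numpy arrays.

```python
def invert_affine(matrix):
    """Invert a 2 row x 3 column affine matrix given as nested lists."""
    (a, b, tx), (c, d, ty) = matrix
    det = a * d - b * c
    if det == 0:
        raise ValueError("Matrix is not invertible")
    ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det
    return [[ia, ib, -(ia * tx + ib * ty)],
            [ic, id_, -(ic * tx + id_ * ty)]]


def transform_points(points, matrix, invert=False):
    """Map (x, y) points through ``matrix``, or through its inverse."""
    mat = invert_affine(matrix) if invert else matrix
    (a, b, tx), (c, d, ty) = mat
    return [(a * x + b * y + tx, c * x + d * y + ty) for x, y in points]


# Halve the scale and shift: frame (world) space -> aligned face space
matrix = [[0.5, 0.0, -10.0], [0.0, 0.5, -20.0]]
frame_points = [(100.0, 200.0), (40.0, 60.0)]
aligned = transform_points(frame_points, matrix)
restored = transform_points(aligned, matrix, invert=True)
```

Passing invert=True should return the points to their original frame co-ordinates, up to floating point error.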

lib.align.aligned_face.get_adjusted_center(image_size: int, source_offset: ndarray, target_offset: ndarray, source_centering: Literal['face', 'head', 'legacy']) ndarray

Obtain the correct center of a face extracted image to translate between two different extract centerings.

Parameters:
  • image_size (int) – The size of the image at the given source_centering

  • source_offset (numpy.ndarray) – The pose offset to translate a base extracted face to source centering

  • target_offset (numpy.ndarray) – The pose offset to translate a base extracted face to target centering

  • source_centering (["face", "head", "legacy"]) – The centering of the source image

Returns:

The center point of the image at the given size for the target centering

Return type:

numpy.ndarray

lib.align.aligned_face.get_centered_size(source_centering: Literal['face', 'head', 'legacy'], target_centering: Literal['face', 'head', 'legacy'], size: int, coverage_ratio: float = 1.0) int

Obtain the size of a cropped face from an aligned image.

Given an image of certain dimensions, returns the dimensions of the sub-crop within that image for the requested centering at the requested coverage ratio

Notes

“legacy” places the nose in the center of the image (the original method for aligning). “face” aligns for the nose to be in the center of the face (top to bottom) but the center of the skull for left to right. “head” places the center in the middle of the skull in 3D space.

The ROI in relation to the source image is calculated by rounding the padding of one side to the nearest integer then applying this padding to the center of the crop, to ensure that any dimensions always have an even number of pixels.

Parameters:
  • source_centering (["head", "face", "legacy"]) – The centering that the original image is aligned at

  • target_centering (["head", "face", "legacy"]) – The centering that the sub-crop size should be obtained for

  • size (int) – The size of the source image to obtain the cropped size for

  • coverage_ratio (float, optional) – The coverage ratio to be applied to the target image. Default: 1.0

Returns:

The pixel size of a sub-crop image from a full head aligned image with the given coverage ratio

Return type:

int
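
The rounding rule in the notes above can be illustrated with a small worked example. The ratios are the EXTRACT_RATIOS values documented in the constants module. The formula itself is an assumption chosen to be consistent with the notes (round half of the size, then double it), and the function name centered_size is hypothetical.

```python
# Values as documented for lib.align.constants.EXTRACT_RATIOS
EXTRACT_RATIOS = {"face": 0.5, "head": 0.625, "legacy": 0.375}


def centered_size(source_centering, target_centering, size, coverage_ratio=1.0):
    """Sketch of a sub-crop size calculation that always returns an even number."""
    if source_centering == target_centering and coverage_ratio == 1.0:
        return size
    # Size of the un-padded core area inside the source image
    core_size = size - (size * EXTRACT_RATIOS[source_centering])
    # Core area re-padded for the target centering, then scaled by the coverage
    target = core_size / (1 - EXTRACT_RATIOS[target_centering]) * coverage_ratio
    # Round one half to the nearest integer and double it to keep the size even
    return 2 * round(target / 2)


face_in_head = centered_size("head", "face", 512)
legacy_in_head = centered_size("head", "legacy", 512)
half_coverage = centered_size("head", "face", 512, coverage_ratio=0.5)
```

With these ratios a 512 pixel head image yields a 384 pixel face crop at full coverage.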

lib.align.aligned_face.get_matrix_scaling(matrix: ndarray) tuple[int, int]

Given a matrix, return the cv2 Interpolation method and inverse interpolation method for applying the matrix on an image.

Parameters:

matrix (numpy.ndarray) – The transform matrix to return the interpolator for

Returns:

The interpolator and inverse interpolator for the given matrix. This will be (Cubic, Area) for an upscale matrix and (Area, Cubic) for a downscale matrix

Return type:

tuple
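
A minimal sketch of the selection rule follows. Estimating the scale from the matrix terms is an assumption of this example, not a statement of how the library does it. The two constants mirror the integer values of cv2.INTER_CUBIC and cv2.INTER_AREA so that the block runs without OpenCV installed.

```python
import math

# Integer values of the OpenCV flags cv2.INTER_CUBIC and cv2.INTER_AREA
INTER_CUBIC = 2
INTER_AREA = 3


def matrix_scaling(matrix):
    """Return (interpolator, inverse interpolator) for a 2x3 affine matrix."""
    (a, b, _), (c, d, _) = matrix
    x_scale = math.hypot(a, b)
    # Determinant divided by the x scale gives the scale along the other axis
    y_scale = (a * d - b * c) / x_scale if x_scale else 0.0
    average_scale = (x_scale + y_scale) / 2
    if average_scale >= 1.0:
        return INTER_CUBIC, INTER_AREA  # upscale
    return INTER_AREA, INTER_CUBIC  # downscale


upscale = matrix_scaling([[2.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
downscale = matrix_scaling([[0.5, 0.0, 0.0], [0.0, 0.5, 0.0]])
```

Area interpolation is generally preferred when shrinking an image and cubic when enlarging it, which is why the pair flips for the inverse transform.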

lib.align.aligned_face.transform_image(image: ndarray, matrix: ndarray, size: int, padding: int = 0) ndarray

Perform transformation on an image, applying the given size and padding to the matrix.

Parameters:
  • image (numpy.ndarray) – The image to transform

  • matrix (numpy.ndarray) – The transformation matrix to apply to the image

  • size (int) – The final size of the transformed image

  • padding (int, optional) – The amount of padding to apply to the final image. Default: 0

Returns:

The transformed image

Return type:

numpy.ndarray

aligned_mask module

Handles aligned storage and retrieval of Faceswap generated masks

Module Summary

BlurMask

Factory class to return the correct blur object for requested blur type.

LandmarksMask

Create a single channel mask from aligned landmark points.

Mask

Face Mask information and convenience methods

Module

Handles retrieval and storage of Faceswap aligned masks

class lib.align.aligned_mask.BlurMask(blur_type: Literal['gaussian', 'normalized'], mask: ndarray, kernel: int | float, is_ratio: bool = False, passes: int = 1)

Bases: object

Factory class to return the correct blur object for requested blur type.

Works for square images only. Currently supports Gaussian and Normalized Box Filters.

Parameters:
  • blur_type (["gaussian", "normalized"]) – The type of blur to use

  • mask (numpy.ndarray) – The mask to apply the blur to

  • kernel (int or float) – Either the kernel size (in pixels) or the size of the kernel as a ratio of mask size

  • is_ratio (bool, optional) – Whether the given kernel parameter is a ratio or not. If True then the actual kernel size will be calculated from the given ratio and the mask size. If False then the kernel size will be set directly from the kernel parameter. Default: False

  • passes (int, optional) – The number of passes to perform when blurring. Default: 1

Example

>>> print(mask.shape)
(128, 128, 1)
>>> new_mask = BlurMask("gaussian", mask, 3, is_ratio=False, passes=1).blurred
>>> print(new_mask.shape)
(128, 128, 1)

property blurred: ndarray

The final mask with blurring applied.

Type:

numpy.ndarray

class lib.align.aligned_mask.LandmarksMask(points: list[np.ndarray], storage_size: int = 128, storage_centering: CenteringType = 'face', dilation: float = 0.0)

Bases: Mask

Create a single channel mask from aligned landmark points.

Landmarks masks are created on the fly, so the stored centering and size should be the same as the aligned face that the mask will be applied to. As the masks are created on the fly, blur + dilation is applied to the mask at creation (prior to compression) rather than after decompression when requested.

Note

Threshold is not used for Landmarks mask as the mask is binary

Parameters:
  • points (list) – A list of landmark points that correspond to the given storage_size to create the mask. Each item in the list should be a numpy.ndarray that a filled convex polygon will be created from

  • storage_size (int, optional) – The size (in pixels) that the compressed mask should be stored at. Default: 128.

  • storage_centering (str, optional) – The centering to store the mask at. One of “legacy”, “face”, “head”. Default: “face”

  • dilation (float, optional) – The amount of dilation to apply to the mask, as a percentage of the mask size. Default: 0.0

generate_mask(affine_matrix: ndarray, interpolator: int) None

Generate the mask.

Creates the mask applying any requested dilation and blurring and assigns compressed mask to _mask

Parameters:
  • affine_matrix (numpy.ndarray) – The transformation matrix required to transform the mask to the original frame.

  • interpolator (int) – The CV2 interpolator required to transform this mask to its original frame

property mask: ndarray

Overrides the default mask property, creating the processed mask at first call and compressing it. The decompressed mask is returned from this property.

Type:

numpy.ndarray

class lib.align.aligned_mask.Mask(storage_size: int = 128, storage_centering: CenteringType = 'face')

Bases: object

Face Mask information and convenience methods

Holds a Faceswap mask as generated from plugins.extract.mask and the information required to transform it to its original frame.

Holds convenience methods to handle the warping, storing and retrieval of the mask.

Parameters:
  • storage_size (int, optional) – The size (in pixels) that the mask should be stored at. Default: 128.

  • storage_centering (str, optional) – The centering to store the mask at. One of “legacy”, “face”, “head”. Default: “face”

stored_size

The size, in pixels, of the stored mask across its height and width.

Type:

int

stored_centering

The centering that the mask is stored at. One of “legacy”, “face”, “head”

Type:

str

add(mask: ndarray, affine_matrix: ndarray, interpolator: int) None

Add a Faceswap mask to this Mask.

The mask should be the original output from plugins.extract.mask

Parameters:
  • mask (numpy.ndarray) – The mask that is to be added as output from plugins.extract.mask. It should be in the range 0.0 - 1.0, ideally with a dtype of float32

  • affine_matrix (numpy.ndarray) – The transformation matrix required to transform the mask to the original frame.

  • interpolator (int) – The CV2 interpolator required to transform this mask to its original frame

property affine_matrix: ndarray

The affine matrix to transpose the mask to a full frame.

Type:

numpy.ndarray

from_dict(mask_dict: MaskAlignmentsFileDict) None

Populates the Mask from a dictionary loaded from an alignments file.

Parameters:

mask_dict (dict) – A dictionary stored in an alignments file containing the keys mask, affine_matrix, interpolator, stored_size, stored_centering

get_full_frame_mask(width: int, height: int) ndarray

Return the stored mask in a full size frame of the given dimensions

Parameters:
  • width (int) – The width of the original frame that the mask was extracted from

  • height (int) – The height of the original frame that the mask was extracted from

Returns:

The mask affined to the original full frame of the given dimensions

Return type:

numpy.ndarray

property interpolator: int

The cv2 interpolator required to transpose the mask to a full frame.

Type:

int

property mask: ndarray

The mask at the size of stored_size with any requested blurring, threshold amount and centering applied.

Type:

numpy.ndarray

property original_roi: ndarray

The original region of interest of the mask in the source frame.

Type:

numpy.ndarray

replace_mask(mask: ndarray) None

Replace the existing _mask with the given mask.

Parameters:

mask (numpy.ndarray) – The mask that is to be added as output from plugins.extract.mask. It should be in the range 0.0 - 1.0 ideally with a dtype of float32

set_blur_and_threshold(blur_kernel: int = 0, blur_type: Literal['gaussian', 'normalized'] | None = 'gaussian', blur_passes: int = 1, threshold: int = 0) None

Set the internal blur kernel and threshold amount for returned masks

Parameters:
  • blur_kernel (int, optional) – The kernel size, in pixels to apply gaussian blurring to the mask. Set to 0 for no blurring. Should be odd, if an even number is passed in (outside of 0) then it is rounded up to the next odd number. Default: 0

  • blur_type (["gaussian", "normalized"], optional) – The blur type to use. gaussian or normalized box filter. Default: gaussian

  • blur_passes (int, optional) – The number of passes to perform when blurring. Default: 1

  • threshold (int, optional) – The threshold amount to minimize/maximize mask values to 0 and 100. Percentage value. Default: 0
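
The kernel and threshold rules above can be sketched as follows. The kernel rounding follows the parameter description. The threshold handling (clamping 8-bit values symmetrically at both ends) is an assumption of this sketch, and both helper names are hypothetical.

```python
def normalize_kernel(blur_kernel: int) -> int:
    """0 disables blurring; even sizes are bumped up to the next odd number."""
    if blur_kernel == 0 or blur_kernel % 2 == 1:
        return blur_kernel
    return blur_kernel + 1


def apply_threshold(values: list[int], threshold: int) -> list[int]:
    """Push low values to 0 and high values to 255.

    Assumption: the mask holds 8-bit values (0-255) and ``threshold`` is a
    percentage applied symmetrically at both ends of the range.
    """
    if threshold == 0:
        return list(values)
    low = threshold / 100 * 255
    high = 255 - low
    return [0 if val < low else 255 if val > high else val for val in values]


kernels = [normalize_kernel(size) for size in (0, 3, 4)]
clamped = apply_threshold([10, 128, 250], threshold=10)
```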

set_dilation(amount: float) None

Set the internal dilation object for returned masks

Parameters:

amount (float) – The amount of erosion/dilation to apply as a percentage of the total mask size. Negative values erode the mask. Positive values dilate the mask

set_sub_crop(source_offset: np.ndarray, target_offset: np.ndarray, centering: CenteringType, coverage_ratio: float = 1.0) None

Set the internal crop area of the mask to be returned.

This impacts the returned mask from mask if the requested mask is required for different face centering than what has been stored.

Parameters:
  • source_offset (numpy.ndarray) – The (x, y) offset for the mask at its stored centering

  • target_offset (numpy.ndarray) – The (x, y) offset for the mask at the requested target centering

  • centering (str) – The centering to set the sub crop area for. One of “legacy”, “face”, “head”

  • coverage_ratio (float, optional) – The coverage ratio to be applied to the target image. Default: 1.0

property stored_mask: ndarray

The mask at the size of stored_size as it is stored (i.e. with no blurring/centering applied).

Type:

numpy.ndarray

to_dict(is_png=False) MaskAlignmentsFileDict

Convert the mask to a dictionary for saving to an alignments file

Parameters:

is_png (bool) – True if the dictionary is being created for storage in a png header otherwise False. Default: False

Returns:

The Mask for saving to an alignments file. Contains the keys mask, affine_matrix, interpolator, stored_size, stored_centering

Return type:

dict

to_png_meta() MaskAlignmentsFileDict

Convert the mask to a dictionary supported by png itxt headers.

Returns:

The Mask for saving to an alignments file. Contains the keys mask, affine_matrix, interpolator, stored_size, stored_centering

Return type:

dict

alignments module

Handles alignments stored in a serialized alignments.fsa file

Module Summary

Alignments

The alignments file is a custom serialized .fsa file that holds information for each frame for a video or series of images.

Thumbnails

Thumbnail images stored in the alignments file.

Module

Alignments file functions for reading, writing and manipulating the data stored in a serialized alignments file.

class lib.align.alignments.AlignmentDict

Bases: TypedDict

Dictionary for holding all of the alignment information within a single alignment file

faces: list[lib.align.alignments.AlignmentFileDict]
video_meta: dict[str, float | int]

class lib.align.alignments.AlignmentFileDict

Bases: dict

Typed Dictionary for storing a single face’s alignment information in alignments files.

h: int
identity: dict[str, list[float]]
landmarks_xy: list[float] | ndarray
mask: dict[str, lib.align.alignments.MaskAlignmentsFileDict]
thumb: ndarray | None
w: int
x: int
y: int

class lib.align.alignments.Alignments(folder: str, filename: str = 'alignments')

Bases: object

The alignments file is a custom serialized .fsa file that holds information for each frame for a video or series of images.

Specifically, it holds a list of faces that appear in each frame. Each face contains information detailing their detected bounding box location within the frame, the 68 point facial landmarks and any masks that have been extracted.

Additionally it can also hold video meta information (timestamp and whether a frame is a key frame.)

Parameters:
  • folder (str) – The folder that contains the alignments .fsa file

  • filename (str, optional) – The filename of the .fsa alignments file. If not provided then the given folder will be checked for a default alignments file filename. Default: “alignments”

add_face(frame_name: str, face: AlignmentFileDict) int

Add a new face for the given frame_name in data and return its index.

Parameters:
  • frame_name (str) – The frame name to add the face to. This should be the base name of the frame, not the full path

  • face (dict) – The face information to add to the given frame_name, correctly formatted for storing in data

Returns:

The index of the newly added face within data for the given frame_name

Return type:

int

backup() None

Create a backup copy of the alignments file.

Creates a copy of the serialized alignments file appending a timestamp onto the end of the file name and storing in the same folder as the original file.
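
A minimal sketch of a timestamped backup is shown below. The timestamp format and the exact layout of the backup file name are assumptions, and backup_file is a hypothetical helper rather than the method itself.

```python
import os
import shutil
import tempfile
from datetime import datetime


def backup_file(path: str) -> str:
    """Copy ``path`` into the same folder with a timestamp added to its name.

    The "%Y%m%d_%H%M%S" format and "<name>_<timestamp><ext>" layout are
    assumptions of this sketch.
    """
    stem, ext = os.path.splitext(path)
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    destination = f"{stem}_{timestamp}{ext}"
    shutil.copyfile(path, destination)
    return destination


# Demonstrate on a throwaway file so nothing real is touched
with tempfile.TemporaryDirectory() as folder:
    original = os.path.join(folder, "alignments.fsa")
    with open(original, "wb") as handle:
        handle.write(b"example")
    copy = backup_file(original)
    copy_name = os.path.basename(copy)
    copy_exists = os.path.isfile(copy)
    same_folder = os.path.dirname(copy) == folder
```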

count_faces_in_frame(frame_name: str) int

Return number of faces that appear within data for the given frame_name.

Parameters:

frame_name (str) – The frame name to return the count for. This should be the base name of the frame, not the full path

Returns:

The number of faces that appear in the given frame_name

Return type:

int

property data: dict[str, lib.align.alignments.AlignmentDict]

The loaded alignments file in dictionary form.

Type:

dict

delete_face_at_index(frame_name: str, face_index: int) bool

Delete the face for the given frame_name at the given face index from data.

Parameters:
  • frame_name (str) – The frame name to remove the face from. This should be the base name of the frame, not the full path

  • face_index (int) – The index number of the face within the given frame_name to remove

Returns:

True if a face was successfully deleted otherwise False

Return type:

bool

property faces_count: int

The total number of faces that appear in the alignments data.

Type:

int

property file: str

The full path to the currently loaded alignments file.

Type:

str

filter_faces(filter_dict: dict[str, list[int]], filter_out: bool = False) None

Remove faces from data based on a given filter list.

Parameters:
  • filter_dict (dict) – Dictionary of source filenames as key with a list of face indices to filter as value.

  • filter_out (bool, optional) – True if faces should be removed from data when there is a corresponding match in the given filter_dict. False if faces should be kept in data when there is a corresponding match in the given filter_dict, but removed if there is no match. Default: False
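
The two filter modes can be illustrated on a plain dictionary. The sketch assumes data maps each frame name to a dictionary with a faces list, as in AlignmentDict. The free function and the string stand-ins for face dictionaries are for illustration only.

```python
import copy


def filter_faces(data, filter_dict, filter_out=False):
    """Filter ``data`` in place.

    data: {frame_name: {"faces": [face, ...]}}
    filter_dict: {frame_name: [face_index, ...]}
    """
    for frame_name, frame in data.items():
        matched = set(filter_dict.get(frame_name, []))
        # filter_out=True keeps unmatched faces, filter_out=False keeps matches
        frame["faces"] = [face for idx, face in enumerate(frame["faces"])
                          if (idx in matched) != filter_out]


source = {"a.png": {"faces": ["f0", "f1", "f2"]}, "b.png": {"faces": ["g0"]}}
selection = {"a.png": [0, 2]}

kept = copy.deepcopy(source)
filter_faces(kept, selection, filter_out=False)

removed = copy.deepcopy(source)
filter_faces(removed, selection, filter_out=True)
```

Note that with filter_out=False a frame that is missing from filter_dict loses all of its faces, since none of them match.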

frame_exists(frame_name: str) bool

Check whether a given frame_name exists within the alignments data.

Parameters:

frame_name (str) – The frame name to check. This should be the base name of the frame, not the full path

Returns:

True if the given frame_name exists within the alignments data otherwise False

Return type:

bool

frame_has_faces(frame_name: str) bool

Check whether a given frame_name exists within the alignments data and contains at least 1 face.

Parameters:

frame_name (str) – The frame name to check. This should be the base name of the frame, not the full path

Returns:

True if the given frame_name exists within the alignments data and has at least 1 face associated with it, otherwise False

Return type:

bool

frame_has_multiple_faces(frame_name: str) bool

Check whether a given frame_name exists within the alignments data and contains more than 1 face.

Parameters:

frame_name (str) – The frame name to check. This should be the base name of the frame, not the full path

Returns:

True if the given frame_name exists within the alignments data and has more than 1 face associated with it, otherwise False

Return type:

bool

property frames_count: int

The number of frames that appear in the alignments data.

Type:

int

get_faces_in_frame(frame_name: str) list[lib.align.alignments.AlignmentFileDict]

Obtain the faces from data associated with a given frame_name.

Parameters:

frame_name (str) – The frame name to return faces for. This should be the base name of the frame, not the full path

Returns:

The list of face dictionaries that appear within the requested frame_name

Return type:

list

property hashes_to_alignment: dict[str, lib.align.alignments.AlignmentFileDict]

The SHA1 hash of the face mapped to the alignment for the face that the hash corresponds to.

Notes

This method is deprecated and exists purely for updating legacy hash based alignments to new png header storage in lib.align.update_legacy_png_header.

Type:

dict

property hashes_to_frame: dict[str, dict[str, int]]

The SHA1 hash of the face mapped to the frame(s) and face index within the frame that the hash corresponds to.

Notes

This method is deprecated and exists purely for updating legacy hash based alignments to new png header storage in lib.align.update_legacy_png_header.

Type:

dict

property have_alignments_file: bool

True if an alignments file exists at location file otherwise False.

Type:

bool

mask_is_valid(mask_type: str) bool

Ensure the given mask_type is valid for the alignments data.

Every face in the alignments data must have the given mask type to successfully pass the test.

Parameters:

mask_type (str) – The mask type to check against the current alignments data

Returns:

True if all faces in the current alignments possess the given mask_type otherwise False

Return type:

bool

property mask_summary: dict[str, int]

The mask type names stored in the alignments data as key with the number of faces which possess the mask type as value.

Type:

dict
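
Both mask_is_valid and mask_summary can be pictured as simple passes over the faces. The sketch below assumes the AlignmentDict layout (a faces list per frame, each face holding a mask dictionary keyed by mask type). The mask type names and the empty placeholder values are illustrative only.

```python
from collections import Counter


def mask_summary(data):
    """Count how many faces hold each mask type."""
    counts = Counter()
    for frame in data.values():
        for face in frame["faces"]:
            counts.update(face.get("mask", {}).keys())
    return dict(counts)


def mask_is_valid(data, mask_type):
    """True only if every face has ``mask_type`` (also True when no faces exist)."""
    return all(mask_type in face.get("mask", {})
               for frame in data.values()
               for face in frame["faces"])


data = {
    "frame_0.png": {"faces": [{"mask": {"components": {}, "extended": {}}},
                              {"mask": {"components": {}}}]},
    "frame_1.png": {"faces": [{"mask": {"components": {}}}]},
}
summary = mask_summary(data)
components_ok = mask_is_valid(data, "components")
extended_ok = mask_is_valid(data, "extended")
```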

save() None

Write the contents of data and _meta to a serialized .fsa file at the location file.

save_video_meta_data(pts_time: list[float], keyframes: list[int]) None

Save video meta data to the alignments file.

If the alignments file does not have an entry for every frame (e.g. if Extract Every N was used) then the missing frames are added to the alignments file with no faces, so that the video meta data can be stored.

Parameters:
  • pts_time (list) – A list of presentation timestamps (float) in frame index order for every frame in the input video

  • keyframes (list) – A list of frame indices corresponding to the key frames in the input video
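
The behaviour described above, including the creation of empty entries for frames that have no alignments, might look like the following sketch. The frame_names argument and the pts_time / keyframe key names inside video_meta are assumptions made for illustration.

```python
def save_video_meta(data, frame_names, pts_time, keyframes):
    """Attach per-frame video meta data to ``data`` in place.

    frame_names is a hypothetical list of frame names in frame index order.
    """
    keyframe_set = set(keyframes)
    for idx, (name, pts) in enumerate(zip(frame_names, pts_time)):
        # Frames skipped at extract time get an entry with no faces
        entry = data.setdefault(name, {"faces": []})
        entry["video_meta"] = {"pts_time": pts, "keyframe": int(idx in keyframe_set)}


data = {"frame_0.png": {"faces": ["face"]}}
save_video_meta(data,
                frame_names=["frame_0.png", "frame_1.png"],
                pts_time=[0.0, 0.04],
                keyframes=[0])
```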

property thumbnails: Thumbnails

The low resolution thumbnail images that exist within the alignments file

Type:

Thumbnails

update_face(frame_name: str, face_index: int, face: AlignmentFileDict) None

Update the face for the given frame_name at the given face index in data.

Parameters:
  • frame_name (str) – The frame name to update the face for. This should be the base name of the frame, not the full path

  • face_index (int) – The index number of the face within the given frame_name to update

  • face (dict) – The face information to update to the given frame_name at the given face_index, correctly formatted for storing in data

update_from_dict(data: dict[str, lib.align.alignments.AlignmentDict]) None

Replace all alignments with the contents of the given dictionary

Parameters:

data (dict[str, AlignmentDict]) – The alignments, in correctly formatted dictionary form, to be populated into this Alignments

update_legacy_has_source(filename: str) None

Update legacy alignments files when we have the source filename available.

Updates here can only be performed when we have the source filename

Parameters:

filename (str) – The filename/folder of the original source images/video for the current alignments

property version: float

The alignments file version number.

Type:

float

property video_meta_data: dict[str, list[int] | list[float] | None]

The frame meta data stored in the alignments file. If data does not exist in the alignments file then None is returned for each key

Type:

dict

yield_faces() Generator[tuple[str, list[AlignmentFileDict], int, str], None, None]

Generator to obtain all faces with meta information from data. The results are yielded by frame.

Notes

The yielded order is non-deterministic.

Yields:
  • frame_name (str) – The frame name that the face belongs to. This is the base name of the frame, as it appears in data, not the full path

  • faces (list) – The list of face dict objects that exist for this frame

  • face_count (int) – The number of faces that exist within data for this frame

  • frame_fullname (str) – The full path (folder and filename) for the yielded frame
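
The shape of each yielded tuple can be sketched with a small generator over a plain dictionary. This follows the field descriptions above. The folder argument, and joining it to the frame name to build frame_fullname, are assumptions of this example.

```python
import os


def yield_faces(data, folder):
    """Yield (frame_name, faces, face_count, frame_fullname) for each frame."""
    for frame_name, frame in data.items():
        faces = frame["faces"]
        yield frame_name, faces, len(faces), os.path.join(folder, frame_name)


data = {"a.png": {"faces": [{"x": 0}, {"x": 10}]}, "b.png": {"faces": []}}
results = {name: (count, fullname)
           for name, _, count, fullname in yield_faces(data, "frames")}
```

Collecting the results into a dictionary keyed by frame name, as here, avoids relying on the yield order.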

class lib.align.alignments.MaskAlignmentsFileDict

Bases: TypedDict

Typed Dictionary for storing Masks.

affine_matrix: list[float] | np.ndarray
interpolator: int
mask: bytes
stored_centering: CenteringType
stored_size: int

class lib.align.alignments.PNGHeaderAlignmentsDict

Bases: TypedDict

Base Dictionary for storing a single face’s alignment information in alignments files and PNG headers.

h: int
identity: dict[str, list[float]]
landmarks_xy: list[float] | ndarray
mask: dict[str, lib.align.alignments.MaskAlignmentsFileDict]
w: int
x: int
y: int

class lib.align.alignments.PNGHeaderDict

Bases: TypedDict

Dictionary for storing all alignment and meta information in PNG Headers

alignments: PNGHeaderAlignmentsDict
source: PNGHeaderSourceDict

class lib.align.alignments.PNGHeaderSourceDict

Bases: TypedDict

Dictionary for storing additional meta information in PNG headers

alignments_version: float
face_index: int
original_filename: str
source_filename: str
source_frame_dims: tuple[int, int] | None
source_is_video: bool

constants module

Holds various constants for use in generating and manipulating aligned face images

Constants that are required across faceswap’s lib.align package

lib.align.constants.EXTRACT_RATIOS: dict[Literal['face', 'head', 'legacy'], float] = {'face': 0.5, 'head': 0.625, 'legacy': 0.375}

The amount of padding applied to each centering type when generating aligned faces

Type:

dict[Literal["legacy", "face", "head"], float]

lib.align.constants.LANDMARK_PARTS: dict[lib.align.constants.LandmarkType, dict[str, tuple[int, int, bool]]] = {<LandmarkType.LM_2D_68: 3>: {'mouth_outer': (48, 60, True), 'mouth_inner': (60, 68, True), 'right_eyebrow': (17, 22, False), 'left_eyebrow': (22, 27, False), 'right_eye': (36, 42, True), 'left_eye': (42, 48, True), 'nose': (27, 36, False), 'jaw': (0, 17, False), 'chin': (8, 11, False)}, <LandmarkType.LM_2D_4: 1>: {'face': (0, 4, True)}}

For each landmark type, stores the (start index, end index, is polygon) information about each part of the face.

Type:

dict[LandmarkType, dict[str, tuple[int, int, bool]]]
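
The (start index, end index, is polygon) tuples can be used as slice bounds, with the end index exclusive (mouth_inner ends at 68 for landmarks indexed 0 to 67). The sketch below copies the 68 point entries from the constant above. The split_parts helper and the dummy landmarks are illustrative only.

```python
# The LM_2D_68 entries of LANDMARK_PARTS, as documented above
LANDMARK_PARTS_2D_68 = {
    "mouth_outer": (48, 60, True),
    "mouth_inner": (60, 68, True),
    "right_eyebrow": (17, 22, False),
    "left_eyebrow": (22, 27, False),
    "right_eye": (36, 42, True),
    "left_eye": (42, 48, True),
    "nose": (27, 36, False),
    "jaw": (0, 17, False),
    "chin": (8, 11, False),
}


def split_parts(landmarks):
    """Map each part name to (points, is_polygon) by slicing the landmarks."""
    return {name: (landmarks[start:end], is_polygon)
            for name, (start, end, is_polygon) in LANDMARK_PARTS_2D_68.items()}


# Dummy landmarks where each point simply encodes its own index
landmarks = [(float(idx), float(idx)) for idx in range(68)]
parts = split_parts(landmarks)
```

Parts flagged as polygons (eyes and mouth) describe closed shapes, while the others are open lines.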

class lib.align.constants.LandmarkType(value)

Bases: Enum

Enumeration for the landmark types that Faceswap supports

LM_2D_4 = 1
LM_2D_51 = 2
LM_2D_68 = 3
LM_3D_26 = 4

classmethod from_shape(shape: tuple[int, ...]) LandmarkType

The landmark type for a given shape

Parameters:

shape (tuple[int, ...]) – The shape to get the landmark type for

Returns:

The enum for the given shape

Return type:

Type[LandmarkType]

Raises:

ValueError – If the requested shape is not valid
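
A stand-alone sketch of an enumeration with a from_shape lookup is shown below. The member names and values match those listed above. The shape-to-member mapping is an assumption inferred from the member names (for example, LM_2D_68 taken to mean 68 points with 2 co-ordinates each).

```python
from enum import Enum


class LandmarkType(Enum):
    """Stand-in enum mirroring the documented members."""
    LM_2D_4 = 1
    LM_2D_51 = 2
    LM_2D_68 = 3
    LM_3D_26 = 4

    @classmethod
    def from_shape(cls, shape):
        """Return the member for ``shape`` or raise ValueError."""
        # Assumed mapping: (number of points, number of co-ordinates)
        shapes = {(4, 2): cls.LM_2D_4, (51, 2): cls.LM_2D_51,
                  (68, 2): cls.LM_2D_68, (26, 3): cls.LM_3D_26}
        if shape not in shapes:
            raise ValueError(f"Unsupported landmark shape: {shape}")
        return shapes[shape]


found = LandmarkType.from_shape((68, 2))
try:
    LandmarkType.from_shape((5, 5))
    raised = False
except ValueError:
    raised = True
```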

detected_face module

Handles detected face objects and their associated masks.

Module Summary

DetectedFace

Detected face and landmark information

update_legacy_png_header

Update a legacy extracted face from pre v2.1 alignments by placing the alignment data for the face in the png exif header for the given filename with the given alignment data.

Module

Face and landmarks detection for faceswap.py

class lib.align.detected_face.DetectedFace(image: ndarray | None = None, left: int | None = None, width: int | None = None, top: int | None = None, height: int | None = None, landmarks_xy: ndarray | None = None, mask: dict[str, lib.align.aligned_mask.Mask] | None = None)

Bases: object

Detected face and landmark information

Holds information about a detected face, its location in a source image and the face’s 68 point landmarks.

Methods for aligning a face are also callable from here.

Parameters:
  • image (numpy.ndarray, optional) – Original frame that holds this face. Optional (not required if just storing coordinates)

  • left (int) – The left most point (in pixels) of the face’s bounding box as discovered in plugins.extract.detect

  • width (int) – The width (in pixels) of the face’s bounding box as discovered in plugins.extract.detect

  • top (int) – The top most point (in pixels) of the face’s bounding box as discovered in plugins.extract.detect

  • height (int) – The height (in pixels) of the face’s bounding box as discovered in plugins.extract.detect

  • landmarks_xy (list) – The 68 point landmarks as discovered in plugins.extract.align. Should be a list of 68 (x, y) tuples with each of the landmark co-ordinates.

  • mask (dict) – The generated mask(s) for the face as generated in plugins.extract.mask. Must be a dict of {name (str): Mask}.

image

This is a generic image placeholder that should not be relied on to be holding a particular image. It may hold the source frame that holds the face, a cropped face or a scaled image depending on the method using this object.

Type:

numpy.ndarray, optional

left

The left most point (in pixels) of the face’s bounding box as discovered in plugins.extract.detect

Type:

int

width

The width (in pixels) of the face’s bounding box as discovered in plugins.extract.detect

Type:

int

top

The top most point (in pixels) of the face’s bounding box as discovered in plugins.extract.detect

Type:

int

height

The height (in pixels) of the face’s bounding box as discovered in plugins.extract.detect

Type:

int

landmarks_xy

The 68 point landmarks as discovered in plugins.extract.align.

Type:

list

mask

The generated mask(s) for the face as generated in plugins.extract.mask. Is a dict of {name (str): Mask}.

Type:

dict

add_identity(name: str, embedding: ndarray) None

Add an identity embedding to this detected face. If an identity already exists for the given name it will be overwritten

Parameters:
  • name (str) – The name of the mechanism that calculated the identity

  • embedding (numpy.ndarray) – The identity embedding

add_landmarks_xy(landmarks: ndarray) None

Add landmarks to the detected face object. If landmarks already exist, they will be overwritten.

Parameters:

landmarks (numpy.ndarray) – The 68 point face landmarks to add for the face

add_mask(name: str, mask: np.ndarray, affine_matrix: np.ndarray, interpolator: int, storage_size: int = 128, storage_centering: CenteringType = 'face') None

Add a Mask to this detected face

The mask should be the original output from plugins.extract.mask. If a mask with this name already exists, it will be overwritten by the given mask.

Parameters:
  • name (str) – The name of the mask as defined by the plugins.extract.mask._base.name parameter.

  • mask (numpy.ndarray) – The mask that is to be added, as output from plugins.extract.mask. It should be in the range 0.0 - 1.0, ideally with a dtype of float32

  • affine_matrix (numpy.ndarray) – The transformation matrix required to transform the mask to the original frame.

  • interpolator (int) – The CV2 interpolator required to transform this mask to its original frame.

  • storage_size (int, optional) – The size the mask is to be stored at. Default: 128

  • storage_centering (str, optional) – The centering to store the mask at. One of “legacy”, “face”, “head”. Default: “face”

property aligned: AlignedFace

The aligned face connected to this detected face.

property bottom: int

Bottom point (in pixels) of face detection bounding box within the parent image

Type:

int
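
The bottom and right properties can be thought of as derived from the four stored bounding box attributes. The sketch below assumes the usual box arithmetic (bottom is top plus height, right is left plus width); the documentation does not spell this out, and BoxSketch is a hypothetical stand-in rather than the DetectedFace class.

```python
from dataclasses import dataclass


@dataclass
class BoxSketch:
    """Hypothetical stand-in holding only the bounding box attributes."""
    left: int
    width: int
    top: int
    height: int

    @property
    def right(self) -> int:
        # ASSUMPTION: the right edge is the left edge plus the width
        return self.left + self.width

    @property
    def bottom(self) -> int:
        # ASSUMPTION: the bottom edge is the top edge plus the height
        return self.top + self.height


box = BoxSketch(left=100, width=50, top=40, height=60)
```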

clear_all_identities() None

Remove all stored identity embeddings

from_alignment(alignment: AlignmentFileDict, image: ndarray | None = None, with_thumb: bool = False) None

Set the attributes of this class from an alignments file and optionally load the face into the image attribute.

Parameters:
  • alignment (dict) – A dictionary entry for a face from an alignments file containing the keys x, w, y, h, landmarks_xy. Optionally the key thumb will be provided. This is for use in the manual tool and contains the compressed jpg thumbnail of the face to be allocated to thumbnail. Optionally the key mask will be provided, but legacy alignments will not have this key.

  • image (numpy.ndarray, optional) – If an image is passed in, then the image attribute will be set to the cropped face based on the passed in bounding box co-ordinates

  • with_thumb (bool, optional) – Whether to load the jpg thumbnail into the detected face object, if provided. Default: False
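
To make the key layout concrete, the sketch below maps an alignments-file entry onto the attribute names documented for this class. The pairing of x, w, y, h with left, width, top, height is an assumption based on the names alone, and attributes_from_alignment is a hypothetical helper that does not exist in the library.

```python
def attributes_from_alignment(alignment):
    """Map alignments-file keys to the attribute names documented for DetectedFace."""
    return {
        # ASSUMPTION: x/w/y/h correspond to left/width/top/height
        "left": alignment["x"],
        "width": alignment["w"],
        "top": alignment["y"],
        "height": alignment["h"],
        "landmarks_xy": alignment["landmarks_xy"],
        # Legacy alignments may not carry a mask entry
        "mask": alignment.get("mask", {}),
    }


# Hypothetical entry with a shortened landmark list
entry = {"x": 10, "w": 64, "y": 20, "h": 48, "landmarks_xy": [(12.0, 25.0), (14.0, 30.0)]}
attrs = attributes_from_alignment(entry)
```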

from_png_meta(alignment: PNGHeaderAlignmentsDict) None

Set the attributes of this class from alignments stored in a png exif header.

Parameters:

alignment (dict) – A dictionary entry for a face from alignments stored in a png exif header containing the keys x, w, y, h, landmarks_xy and mask

get_landmark_mask(area: Literal['eye', 'face', 'mouth'], blur_kernel: int, dilation: float) ndarray

Add a lib.align.aligned_mask.LandmarksMask to this detected face

Landmark based masks are generated from the Aligned Face landmark points. An aligned face must be loaded. As the data comes from the already aligned face, no further mask cropping is required.

Parameters:
  • area (["face", "mouth", "eye"]) – The type of mask to obtain. “face” is a full face mask; the others are masks for those specific areas

  • blur_kernel (int) – The size of the kernel for blurring the mask edges

  • dilation (float) – The amount of dilation to apply to the mask, as a percentage of the mask size

Returns:

The generated landmarks mask for the selected area

Return type:

numpy.ndarray

Raises:

FaceSwapError – If the aligned face does not contain the correct landmarks to generate a landmark mask

get_training_masks() ndarray | None

Obtain the decompressed combined training masks.

Returns:

A 3D array containing the decompressed training masks as uint8 in the 0-255 range if training masks are present, otherwise None

Return type:

numpy.ndarray

property identity: dict[str, numpy.ndarray]

Identity mechanism as key, identity embedding as value.

Type:

dict

property landmarks_xy: ndarray

The landmarks for this detected face.

load_aligned(image: np.ndarray | None, size: int = 256, dtype: str | None = None, centering: CenteringType = 'head', coverage_ratio: float = 1.0, force: bool = False, is_aligned: bool = False, is_legacy: bool = False) None

Align a face from a given image.

Aligning a face is a relatively expensive task and is not required for all uses of the DetectedFace object, so call this function explicitly to load an aligned face.

This method plugs into lib.align.AlignedFace to perform face alignment based on this face’s landmarks_xy. If the face has already been aligned, then this function will return having performed no action.

Parameters:
  • image (numpy.ndarray) – The image that contains the face to be aligned

  • size (int) – The size of the output face in pixels

  • dtype (str, optional) – Optionally set a dtype for the final face to be formatted in. Default: None

  • centering (["legacy", "face", "head"], optional) – The type of extracted face that should be loaded. “legacy” places the nose in the center of the image (the original method for aligning). “face” aligns for the nose to be in the center of the face (top to bottom) but the center of the skull for left to right. “head” aligns for the center of the skull (in 3D space) being the center of the extracted image, with the crop holding the full head. Default: “head”

  • coverage_ratio (float, optional) – The amount of the aligned image to return. A ratio of 1.0 will return the full contents of the aligned image. A ratio of 0.5 will return an image of the given size, but will crop to the central 50% of the image. Default: 1.0

  • force (bool, optional) – Force an update of the aligned face, even if it is already loaded. Default: False

  • is_aligned (bool, optional) – Indicates that the image is an aligned face rather than a frame. Default: False

  • is_legacy (bool, optional) – Only used if is_aligned is True. True indicates that the aligned image being loaded is a legacy extracted face rather than a current head extracted face

Notes

This method must be executed to get access to an AlignedFace object

property right: int

Right point (in pixels) of face detection bounding box within the parent image

Type:

int

store_training_masks(masks: list[numpy.ndarray | None], delete_masks: bool = False) None

Concatenate and compress the given training masks and store for retrieval.

Parameters:
  • masks (list) – A list of training masks. All must be uint8 3D arrays of the same size in the 0-255 range

  • delete_masks (bool, optional) – True to delete any of the Mask objects owned by this detected face. Use to free up unrequired memory usage. Default: False
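
The concatenate-then-compress idea can be sketched with the standard library alone. The example below uses flat bytes objects in place of numpy arrays and zlib as the compressor; both choices are assumptions made to keep the sketch dependency free and are not taken from the library, and None entries in the mask list are not handled. store_masks and load_masks are hypothetical names.

```python
import zlib


def store_masks(masks):
    """Concatenate equally sized uint8 masks (as bytes) and compress them."""
    sizes = {len(mask) for mask in masks}
    if len(sizes) != 1:
        raise ValueError("All masks must be the same size")
    return zlib.compress(b"".join(masks)), sizes.pop()


def load_masks(blob, mask_size):
    """Decompress the blob and split it back into the individual masks."""
    raw = zlib.decompress(blob)
    return [raw[i:i + mask_size] for i in range(0, len(raw), mask_size)]


# Two hypothetical 4x4 masks flattened to 16 bytes each, values in 0-255
masks = [bytes([0] * 16), bytes([255] * 16)]
blob, size = store_masks(masks)
restored = load_masks(blob, size)
```

Storing one compressed blob rather than several arrays trades a little CPU time on retrieval for lower memory use while the masks are idle.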

to_alignment() AlignmentFileDict

Return the detected face formatted for an alignments file

Returns:

alignment – The alignment dict will be returned with the keys x, w, y, h, landmarks_xy, mask. The additional key thumb will be provided if the detected face object contains a thumbnail.

Return type:

dict

to_png_meta() PNGHeaderAlignmentsDict

Return the detected face formatted for insertion into a png itxt header.

Returns:

The alignments dict will be returned with the keys x, w, y, h, landmarks_xy and mask

Return type:

dict

lib.align.detected_face.update_legacy_png_header(filename: str, alignments: Alignments) PNGHeaderDict | None

Update a legacy extracted face from pre v2.1 alignments by placing the alignment data for the face in the png exif header for the given filename with the given alignment data.

If the given file is not a .png then a png is created and the original file is removed

Parameters:
  • filename (str) – The image file to update

  • alignments (lib.align.alignments.Alignments) – The alignments data that contains the information to store in the image header. This must be a v2.0 or less alignments file as later versions no longer store the face hash (not required)

Returns:

The metadata that has been applied to the given image

Return type:

dict

pose module

Handles pose estimates based on aligned face data

Holds estimated pose information for a faceswap aligned face

class lib.align.pose.PoseEstimate(landmarks: ndarray, landmarks_type: LandmarkType)

Bases: object

Estimates pose from a generic 3D head model for the given 2D face landmarks.

Parameters:
  • landmarks (numpy.ndarray) – The original 68 point landmarks aligned to 0.0 - 1.0 range

  • landmarks_type (LandmarkType) – The type of landmarks that are generating this face

References

Head Pose Estimation using OpenCV and Dlib - https://www.learnopencv.com/tag/solvepnp/

3D Model points - http://aifi.isr.uc.pt/Downloads/OpenGL/glAnthropometric3DModel.cpp

property offset: dict[CenteringType, np.ndarray]

The amount to offset a standard 0.0 - 1.0 Umeyama transformation matrix so that it is centered on the face (between the eyes) or the head (middle of skull) rather than the nose area.

Type:

dict

property pitch: float

The pitch of the aligned face in Euler angles

Type:

float

property roll: float

The roll of the aligned face in Euler angles

Type:

float

property xyz_2d: ndarray

Projected (x, y) coordinates for each x, y, z point at a constant distance from the adjusted center of the skull (0.5, 0.5) in 2D space.

Type:

numpy.ndarray

property yaw: float

The yaw of the aligned face in Euler angles

Type:

float

thumbnails module

Handles creation of jpg thumbnails for storage in alignment files/png headers

Handles the generation of thumbnail jpgs for storing inside an alignments file/png header

class lib.align.thumbnails.Thumbnails(alignments: Alignments)

Bases: object

Thumbnail images stored in the alignments file.

The thumbnails are stored as low resolution (64px), low quality jpg in the alignments file and are used for the Manual Alignments tool.

Parameters:

alignments (lib.align.alignments.Alignments) – The parent alignments class that these thumbs belong to

add_thumbnail(frame: str, face_index: int, thumb: ndarray) None

Add a thumbnail for the given face index for the given frame.

Parameters:
  • frame (str) – The name of the frame to add the thumbnail for

  • face_index (int) – The face index within the given frame to add the thumbnail for

  • thumb (numpy.ndarray) – The encoded jpg thumbnail at 64px to add to the alignments file

get_thumbnail_by_index(frame_index: int, face_index: int) ndarray

Obtain a jpg thumbnail from the given frame index for the given face index

Parameters:
  • frame_index (int) – The frame index that contains the thumbnail

  • face_index (int) – The face index within the frame to retrieve the thumbnail for

Returns:

The encoded jpg thumbnail

Return type:

numpy.ndarray

property has_thumbnails: bool

True if all faces in the alignments file contain thumbnail images otherwise False.

Type:

bool

updater module

Handles the update of alignments files to the latest version

Handles updating of an alignments file from an older version to the current version.

class lib.align.updater.FileStructure(alignments: Alignments)

Bases: _Updater

Alignments were structured: {frame_name: <list of faces>}. We need to be able to store information at the frame level, so new structure is: {frame_name: {faces: <list of faces>}}

test() bool

Test whether the alignments file is laid out in the old structure of {frame_name: [faces]}

Returns:

True if the file has legacy structure otherwise False

Return type:

bool

update() int

Update legacy alignments files from the format {frame_name: [faces]} to the format {frame_name: {faces: [faces]}}.

Returns:

The number of items that were updated

Return type:

int
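
The test and update pair described above can be sketched on a plain dictionary. This is an illustration of the structural change only, assuming the alignments data is an ordinary dict keyed by frame name; needs_update and update_structure are hypothetical names and the sample data is invented.

```python
def needs_update(data):
    """True if any frame still maps directly to a list of faces."""
    return any(isinstance(val, list) for val in data.values())


def update_structure(data):
    """Wrap legacy face lists in a dict in place, returning the number updated."""
    updated = 0
    for frame_name, val in data.items():
        if isinstance(val, list):
            # Replacing the value of an existing key is safe while iterating
            data[frame_name] = {"faces": val}
            updated += 1
    return updated


# Hypothetical legacy data: one frame with two faces, one frame with none
legacy = {"frame_0001.png": [{"x": 1}, {"x": 2}], "frame_0002.png": []}
was_legacy = needs_update(legacy)
count = update_structure(legacy)
```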

class lib.align.updater.IdentityAndVideoMeta(alignments: Alignments)

Bases: _Updater

Prior to version 2.3 the identity key did not exist and the video_meta key was not compulsory. These should now both always appear, but do not need to be populated.

test() bool

Identity Key was introduced in alignments version 2.3

Returns:

True if the identity key needs inserting, otherwise False

Return type:

bool

update() int

Add the video_meta and identity keys to the alignment file and leave empty

Returns:

The number of keys inserted

Return type:

int

class lib.align.updater.LandmarkRename(alignments: Alignments)

Bases: _Updater

Landmarks renamed from landmarksXY to landmarks_xy for PEP compliance

test() bool

Check for legacy landmarksXY keys.

Returns:

True if the alignments file contains legacy landmarksXY keys otherwise False

Return type:

bool

update() int

Update legacy landmarksXY keys to PEP compliant landmarks_xy keys.

Returns:

The number of landmarks keys that were changed

Return type:

int
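
A key rename of this kind can be sketched as follows. The example assumes the frame-level layout {frame_name: {faces: [faces]}} described for FileStructure is already in place; has_legacy_keys and rename_landmarks are hypothetical names and the sample data is invented.

```python
def has_legacy_keys(data):
    """True if any face still uses the legacy landmarksXY key."""
    return any("landmarksXY" in face
               for frame in data.values()
               for face in frame["faces"])


def rename_landmarks(data):
    """Rename landmarksXY to landmarks_xy in place, returning the number changed."""
    updated = 0
    for frame in data.values():
        for face in frame["faces"]:
            if "landmarksXY" in face:
                face["landmarks_xy"] = face.pop("landmarksXY")
                updated += 1
    return updated


# Hypothetical data: one legacy face and one already updated face
data = {"frame_0001.png": {"faces": [{"landmarksXY": [(1, 2)]},
                                     {"landmarks_xy": [(3, 4)]}]}}
had_legacy = has_legacy_keys(data)
changed = rename_landmarks(data)
```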

class lib.align.updater.Legacy(alignments: Alignments)

Bases: object

Legacy alignments properties that are no longer used, but are still required for backwards compatibility/upgrading reasons.

Parameters:

alignments (Alignments) – The alignments object that requires these legacy properties

property hashes_to_alignment: dict[str, AlignmentFileDict]

The SHA1 hash of the face mapped to the alignment for the face that the hash corresponds to.

Notes

This method is deprecated and exists purely for updating legacy hash based alignments to new png header storage in lib.align.update_legacy_png_header.

The first time this property is referenced, the dictionary will be created and cached. Subsequent references will be made to this cached dictionary.

Type:

dict

property hashes_to_frame: dict[str, dict[str, int]]

The SHA1 hash of the face mapped to the frame(s) and face index within the frame that the hash corresponds to. The structure of the dictionary is:

{SHA1_hash (str): {filename (str): face_index (int)}}.

Notes

This method is deprecated and exists purely for updating legacy hash based alignments to new png header storage in lib.align.update_legacy_png_header.

The first time this property is referenced, the dictionary will be created and cached. Subsequent references will be made to this cached dictionary.

Type:

dict
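
The build-once-then-cache behaviour noted above can be sketched with a lazily populated attribute. The example assumes each legacy face dict stores its SHA1 under a "hash" key and that the data uses the {frame_name: {faces: [faces]}} layout; neither detail is confirmed by this documentation, and LegacySketch is a hypothetical stand-in.

```python
class LegacySketch:
    """Hypothetical stand-in that builds the hash lookup once and caches it."""

    def __init__(self, data):
        self._data = data  # {frame_name: {"faces": [face dict, ...]}}
        self._hashes_to_frame = None

    @property
    def hashes_to_frame(self):
        """{SHA1_hash (str): {filename (str): face_index (int)}}, built on first access."""
        if self._hashes_to_frame is None:
            lookup = {}
            for frame_name, val in self._data.items():
                for idx, face in enumerate(val["faces"]):
                    # ASSUMPTION: legacy faces store their SHA1 under a "hash" key
                    lookup.setdefault(face["hash"], {})[frame_name] = idx
            self._hashes_to_frame = lookup
        return self._hashes_to_frame


# Hypothetical data with shortened hashes
legacy = LegacySketch({"a.png": {"faces": [{"hash": "abc"}, {"hash": "def"}]},
                       "b.png": {"faces": [{"hash": "abc"}]}})
first = legacy.hashes_to_frame
second = legacy.hashes_to_frame
```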

class lib.align.updater.ListToNumpy(alignments: Alignments)

Bases: _Updater

Landmarks stored as list instead of numpy array

test() bool

Check for legacy landmarks stored as list rather than numpy.ndarray.

Returns:

True if not all landmarks are numpy.ndarray otherwise False

Return type:

bool

update() int

Update landmarks stored as list to numpy.ndarray.

Returns:

The number of landmarks keys that were changed

Return type:

int

class lib.align.updater.MaskCentering(alignments: Alignments)

Bases: _Updater

Masks not containing the stored_centering parameter. Prior to this implementation, all masks were stored with face centering.

test() bool

Mask centering was introduced in alignments version 2.2

Returns:

True if mask centering requires updating, otherwise False

Return type:

bool

update() int

Add the mask key to the alignment file and update the centering of existing masks

Returns:

The number of masks that were updated

Return type:

int

class lib.align.updater.VideoExtension(alignments: Alignments, video_filename: str)

Bases: _Updater

Alignments files from video files used to have a dummy ‘.png’ extension for each of the keys. This has been changed to be the file extension of the original input video (for better identification of alignments files generated from video files).

Parameters:
  • alignments (Alignments) – The alignments object that is being tested and updated

  • video_filename (str) – The video filename that holds these alignments

test() bool

Requires update if the extension of the key in the alignment file is not the same as for the input video file

Returns:

True if the key extensions need updating otherwise False

Return type:

bool

update() int

Update alignments files that have been extracted from videos to have the key end in the video file extension rather than ‘.png’ (the old way)

Parameters:

video_filename (str) – The filename of the video file that created these alignments
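
The extension check and rewrite can be sketched with os.path.splitext. This is an illustration under the assumption that the alignments data is a plain dict keyed by frame name; needs_update and update_extensions are hypothetical names and the sample keys are invented.

```python
import os


def needs_update(data, video_filename):
    """True if any key has a different extension from the video file."""
    video_ext = os.path.splitext(video_filename)[1]
    return any(os.path.splitext(key)[1] != video_ext for key in data)


def update_extensions(data, video_filename):
    """Re-key entries in place to use the video's extension, returning the number changed."""
    video_ext = os.path.splitext(video_filename)[1]
    updated = 0
    # Iterate over a copy of the keys because the dict is modified in the loop
    for key in list(data):
        stem, ext = os.path.splitext(key)
        if ext != video_ext:
            data[stem + video_ext] = data.pop(key)
            updated += 1
    return updated


# Hypothetical alignments keyed with the old dummy extension
data = {"clip_000001.png": {"faces": []}, "clip_000002.png": {"faces": []}}
required = needs_update(data, "clip.mp4")
changed = update_extensions(data, "clip.mp4")
```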