image module

Handles loading and manipulation of images in Faceswap.

Module Summary

`FacesLoader`	Loads faces from a faces folder along with the face's Faceswap metadata.
`FfmpegReader`	Monkey patch imageio ffmpeg to use keyframes whilst seeking.
`ImageIO`	Perform disk IO for images or videos in a background thread.
`ImagesLoader`	Perform image loading from a folder of images or a video.
`ImagesSaver`	Perform image saving to a destination folder.
`SingleFrameLoader`	Allows direct access to a frame by filename or frame index.
`batch_convert_color`	Convert a batch of images from one color space to another.
`count_frames`	Count the number of frames in a video file.
`encode_image`	Encode an image.
`generate_thumbnail`	Generate a jpg thumbnail for the given image.
`hex_to_rgb`	Convert a hex number to its RGB counterpart.
`png_read_meta`	Read the Faceswap information stored in a png's iTXt field.
`png_write_meta`	Write Faceswap information to a png's iTXt field.
`read_image`	Read an image file from a file location.
`read_image_batch`	Load a batch of images from the given file locations.
`read_image_meta`	Read the Faceswap metadata stored in an extracted face's exif header.
`read_image_meta_batch`	Read the Faceswap metadata stored in a batch of extracted faces' exif headers.
`rgb_to_hex`	Convert an RGB tuple to its hex counterpart.

Module

Utilities for working with images and videos

class lib.image.FacesLoader(path, skip_list=None, count=None)

Bases: ImagesLoader

Loads faces from a faces folder along with the face’s Faceswap metadata.

Examples

Loading faces with their Faceswap metadata:

>>> loader = FacesLoader('/path/to/faces/folder')
>>> for filename, face, metadata in loader.load():
>>>     <do processing>

class lib.image.FfmpegReader(format, request)

Bases: Reader

Monkey patch imageio ffmpeg to use keyframes whilst seeking

get_frame_info(frame_pts=None, keyframes=None)

Store the source video’s keyframes in _frame_info for the current video, for use in initialize().

Parameters:

frame_pts (list, optional) – A list corresponding to the video frame count of the pts_time per frame. If this and keyframes are provided, then analyzing the video is skipped and the values from the given lists are used. Default: None
keyframes (list, optional) – A list containing the frame numbers of each key frame. If this and frame_pts are provided, then analyzing the video is skipped and the values from the given lists are used. Default: None

class lib.image.ImageIO(path, queue_size, args=None)

Bases: object

Perform disk IO for images or videos in a background thread.

This is the parent thread for ImagesLoader and ImagesSaver and should not be called directly.

Parameters:

path (str or list) – The path to load images from or save images to. For loading, this can be a folder which contains images, a video file or a list of image files. For saving, this must be an existing folder.
queue_size (int) – The number of images to hold in the internal buffer.
args (tuple, optional) – The arguments to be passed to the loader or saver thread. Default: None

See also

lib.image.ImagesLoader: Background Image Loader inheriting from this class.
lib.image.ImagesSaver: Background Image Saver inheriting from this class.

close(): Closes down and joins the internal threads

property location

The folder or video that was passed in as the path parameter.

Type:: str

class lib.image.ImagesLoader(path, queue_size=8, fast_count=True, skip_list=None, count=None)

Bases: ImageIO

Perform image loading from a folder of images or a video.

Images are loaded and returned in the order in which they appear in the folder or in the video, which ensures deterministic ordering. Loading occurs in a background thread that buffers up to queue_size images (8 by default) so that other processes do not need to wait on disk reads.

See also ImageIO for additional attributes.

Parameters:

path (str or list) – The path to load images from. This can be a folder which contains images, a video file or a list of image files.
queue_size (int, optional) – The number of images to hold in the internal buffer. Default: 8.
fast_count (bool, optional) – When loading from video, the video needs to be parsed frame by frame to get an accurate count. This can be done quite quickly without guaranteed accuracy, or slower with guaranteed accuracy. Set to True to count quickly, or False to count slower but accurately. Default: True.
skip_list (list, optional) – Optional list of frame/image indices to not load. Any indices provided here will be skipped when executing the load() function from the given location. Default: None
count (int, optional) – If the number of images that the loader will encounter is already known, it can be passed in here to skip the image counting step, which can save time at launch. Set to None if the count is not already known. Default: None

Examples

Loading from a video file:

>>> loader = ImagesLoader('/path/to/video.mp4')
>>> for filename, image in loader.load():
>>>     <do processing>
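
The background buffering described above can be illustrated with a minimal, standard-library sketch. This is not the ImagesLoader implementation: background_load, the sentinel object and the stand-in loader are hypothetical names chosen for illustration, and error handling inside the worker thread is omitted.

```python
import queue
import threading

_SENTINEL = object()  # hypothetical marker for "no more items"


def background_load(items, loader, queue_size=8):
    """Yield (name, loaded) pairs in input order, loading in a background thread."""
    # A bounded queue makes the producer block once queue_size items are waiting.
    buffer = queue.Queue(maxsize=queue_size)

    def _produce():
        for name in items:
            buffer.put((name, loader(name)))
        buffer.put(_SENTINEL)

    thread = threading.Thread(target=_produce, daemon=True)
    thread.start()
    while True:
        item = buffer.get()
        if item is _SENTINEL:
            break
        yield item
    thread.join()


# Stand-in "loader" that upper-cases the name instead of reading from disk.
results = list(background_load(["a.png", "b.png", "c.png"], str.upper))
```

Because a single producer fills the queue in input order, the consumer sees items in that same order, which is the property the deterministic-ordering note relies on.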

add_skip_list(skip_list)

Add a skip list to this ImagesLoader

Parameters:: skip_list (list) – A list of indices corresponding to the frame indices that should be skipped by the load() function.

property count: int

The number of images or video frames in the source location. This count includes any files that will ultimately be skipped if a skip_list has been provided. See also: process_count

Type:: int

property file_list: list[str]

A full list of files in the source location. This includes any files that will ultimately be skipped if a skip_list has been provided. If the input is a video, then this is a list of dummy filenames corresponding to those used in an alignments file.

Type:: list[str]

property fps

For an input folder of images, this will always return 25fps. If the input is a video, then the fps of the video will be returned.

Type:: float

property is_video

True if the input is a video, False if it is not

Type:: bool

load()

Generator for loading images from the given location

If FacesLoader is in use, then the Faceswap metadata stored in the image’s exif header is added as the final item in the output tuple.

Yields:

filename (str) – The filename of the loaded image.
image (numpy.ndarray) – The loaded image.
metadata (dict, (FacesLoader only)) – The Faceswap metadata associated with the loaded image.

property process_count: int

The number of images or video frames to be processed (i.e. the total count less any items that are to be skipped from the skip_list).

Type:: int
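
The relationship between count, skip_list and process_count can be sketched as follows. This is an illustrative stand-alone example, not the loader's code; iter_with_skip_list and the file names are hypothetical.

```python
def iter_with_skip_list(file_list, skip_list=None):
    """Yield filenames whose index is not in skip_list."""
    skip = set(skip_list or [])
    for index, filename in enumerate(file_list):
        if index in skip:
            continue
        yield filename


file_list = ["frame_0.png", "frame_1.png", "frame_2.png", "frame_3.png"]
skip_list = [1, 3]

count = len(file_list)                       # includes items that will be skipped
process_count = count - len(set(skip_list))  # items that will actually be yielded
loaded = list(iter_with_skip_list(file_list, skip_list))
```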

class lib.image.ImagesSaver(path, queue_size=8, as_bytes=False)

Bases: ImageIO

Perform image saving to a destination folder.

Images are saved in a background ThreadPoolExecutor to allow for concurrent saving. See also ImageIO for additional attributes.

Parameters:

path (str) – The folder to save images to. This must be an existing folder.
queue_size (int, optional) – The number of images to hold in the internal buffer. Default: 8.
as_bytes (bool, optional) – True if the image is already encoded to bytes, False if the image is a numpy.ndarray. Default: False.

Examples

>>> saver = ImagesSaver('/path/to/save/folder')
>>> for filename, image in <image_iterator>:
>>>     saver.save(filename, image)
>>> saver.close()

close(): Signal to the Save Threads that they should be closed and cleanly shutdown the saver

save(filename: str, image: bytes | ndarray, sub_folder: str | None = None) → None

Save the given image in the background thread

Ensure that close() is called once all save operations are complete.

Parameters:

filename (str) – The filename of the image to be saved. NB: Any folders passed in with the filename will be stripped and replaced with location.
image (bytes or numpy.ndarray) – The image to be saved. This should be encoded bytes if the saver was created with as_bytes set to True, otherwise a numpy.ndarray.
sub_folder (str, optional) – If the file should be saved in a subfolder in the output location, the subfolder should be provided here. None for no subfolder. Default: None
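
The filename handling and concurrent saving described above can be sketched with the standard library. This is an assumption-laden illustration rather than the ImagesSaver implementation: _write is a hypothetical helper, the images are already-encoded bytes, and creating the subfolder on demand is an assumption of this sketch.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def _write(location, filename, data, sub_folder=None):
    """Write encoded bytes into location, ignoring any folders in filename."""
    folder = location if sub_folder is None else os.path.join(location, sub_folder)
    os.makedirs(folder, exist_ok=True)  # assumption: create the subfolder if needed
    out_path = os.path.join(folder, os.path.basename(filename))
    with open(out_path, "wb") as handle:
        handle.write(data)
    return out_path


with tempfile.TemporaryDirectory() as location:
    # Leaving the executor context waits for all pending writes, much as
    # close() is expected to be called once all save operations are queued.
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [
            pool.submit(_write, location, "/some/other/dir/a.png", b"aaa"),
            pool.submit(_write, location, "b.png", b"bbb", "faces"),
        ]
        saved = [os.path.relpath(future.result(), location) for future in futures]
```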

class lib.image.SingleFrameLoader(path, video_meta_data=None)

Bases: ImagesLoader

Allows direct access to a frame by filename or frame index.

As we are interested in instant access to frames, there is no requirement to process in a background thread, as either way we need to wait for the frame to load.

Parameters:: video_meta_data (dict, optional) – Existing video meta information containing the pts_time and iskey flags for the given video. Used in conjunction with single_frame_reader for faster seeks. Providing this means that the video does not need to be scanned again. Set to None if the video is to be scanned. Default: None

image_from_index(index: int) → tuple[str, numpy.ndarray]

Return a single image from file_list for the given index.

Parameters:

index (int) – The index number (frame number) of the frame to retrieve. NB: The first frame is index 0

Returns:

filename (str) – The filename of the returned image
image (numpy.ndarray) – The image for the given index

Notes

Retrieving frames from video files can be slow as the whole video file needs to be iterated to retrieve the requested frame. If a frame has already been retrieved, then retrieving frames of a higher index will be quicker than retrieving frames of a lower index, as iteration needs to start from the beginning again when navigating backwards.

We do not use a background thread for this task, as it is assumed that requesting an image by index will be done when required.
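
The cost asymmetry between forward and backward access can be shown with a toy model. ForwardOnlyReader is a hypothetical stand-in that can only decode frames sequentially; it does not reflect how the underlying video reader is implemented.

```python
class ForwardOnlyReader:
    """Toy reader that can only decode frames in sequence."""

    def __init__(self, frames):
        self._frames = frames
        self._position = 0  # index of the next frame to decode
        self.decoded = 0    # running total of frames decoded

    def get(self, index):
        if index < self._position:
            self._position = 0  # moving backwards: restart from the first frame
        frame = None
        while self._position <= index:
            frame = self._frames[self._position]
            self._position += 1
            self.decoded += 1
        return frame


reader = ForwardOnlyReader([f"frame_{i}" for i in range(100)])

reader.get(10)
cost_first = reader.decoded                  # frames 0..10 decoded

reader.get(20)
cost_forward = reader.decoded - cost_first   # only frames 11..20 decoded

before = reader.decoded
reader.get(15)
cost_backward = reader.decoded - before      # restart: frames 0..15 decoded
```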

property video_meta_data

For videos, contains the key frame_pts, holding a list of time stamps for each frame, and the key keyframes, holding the frame index of each key frame.

Notes

Only populated if the input is a video and single frame reader is being used, otherwise returns None.

Type:: dict
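
One way such metadata could speed up seeking is to jump to the last key frame at or before the requested frame and decode forward from there. The sketch below assumes that strategy; nearest_keyframe and the sample values are hypothetical, and the documentation above does not state how the reader actually uses these lists.

```python
from bisect import bisect_right


def nearest_keyframe(keyframes, index):
    """Return the last key frame index at or before the requested frame index."""
    position = bisect_right(keyframes, index) - 1
    if position < 0:
        raise ValueError("No key frame at or before the requested index")
    return keyframes[position]


# Hypothetical metadata: 100 frames at 25fps with a key frame every 30 frames.
video_meta_data = {
    "frame_pts": [i / 25 for i in range(100)],
    "keyframes": [0, 30, 60, 90],
}

target = 75
start = nearest_keyframe(video_meta_data["keyframes"], target)
seek_time = video_meta_data["frame_pts"][start]  # timestamp to seek to
frames_to_decode = target - start + 1            # decode from key frame to target
```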

lib.image.batch_convert_color(batch, colorspace)

Convert a batch of images from one color space to another.

Converts a batch of images by reshaping the batch prior to conversion rather than iterating over the images. This leads to a significant speed up in the convert process.

Parameters:

batch (numpy.ndarray) – A batch of images.
colorspace (str) – The OpenCV Color Conversion Code suffix. For example for BGR to LAB this would be 'BGR2LAB'. See https://docs.opencv.org/4.1.1/d8/d01/group__imgproc__color__conversions.html for a full list of color codes.

Returns:

The batch converted to the requested color space.

Return type:

numpy.ndarray

Example

>>> images_bgr = numpy.array([image1, image2, image3])
>>> images_lab = batch_convert_color(images_bgr, "BGR2LAB")

Notes

This function is only compatible with color space conversions that have the same image shape for source and destination color spaces.

If you use batch_convert_color() with 8-bit images, the conversion will have some information lost. For many cases, this will not be noticeable but it is recommended to use 32-bit images in cases that need the full range of colors or that convert an image before an operation and then convert back.

lib.image.count_frames(filename, fast=False)

Count the number of frames in a video file

There is no guaranteed accurate way to get a count of video frames without iterating through a video and decoding every frame.

count_frames() can return an accurate count (albeit fairly slowly) or a possibly less accurate count, depending on the fast parameter. A progress bar is displayed.

Parameters:

filename (str) – Full path to the video to return the frame count from.
fast (bool, optional) – Whether to count the frames without decoding them. This is significantly faster but accuracy is not guaranteed. Default: False.

Returns:

The number of frames in the given video file.

Return type:

int

Example

>>> filename = "/path/to/video.mp4"
>>> frame_count = count_frames(filename)

lib.image.encode_image(image: np.ndarray, extension: str, encoding_args: tuple[int, ...] | None = None, metadata: PNGHeaderDict | dict[str, T.Any] | bytes | None = None) → bytes

Encode an image.

Parameters:

image (numpy.ndarray) – The image to be encoded in BGR channel order.
extension (str) – A compatible cv2 image file extension that the final image is to be saved to.
encoding_args (tuple[int, ...], optional) – Any encoding arguments to pass to cv2’s imencode function
metadata (dict or bytes, optional) – Metadata for the image. If provided, and the extension is png or tiff, this information will be written to the image header (the iTXt field for a png). Can be provided as a python dict or pre-encoded as bytes. Default: None

Returns:

encoded_image – The image encoded into the correct file format as bytes

Return type:

bytes

Example

>>> image_file = "/path/to/image.png"
>>> image = read_image(image_file)
>>> encoded_image = encode_image(image, ".jpg")

lib.image.generate_thumbnail(image, size=96, quality=60)

Generate a jpg thumbnail for the given image.

Parameters:

image (numpy.ndarray) – Three channel BGR image to convert to a jpg thumbnail
size (int) – The width and height, in pixels, that the thumbnail should be generated at
quality (int) – The jpg quality setting to use

Returns:

The given image encoded to a jpg at the given size and quality settings

Return type:

numpy.ndarray

lib.image.hex_to_rgb(hexcode)

Convert a hex number to its RGB counterpart.

Parameters:: hexcode (str) – The hex code to convert (e.g. “#0d25ac”)
Returns:: The hex code as a 3 integer (R, G, B) tuple
Return type:: tuple
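
A conversion of this kind can be sketched in a few lines. hex_to_rgb_sketch is a hypothetical re-implementation for illustration only; the validation it performs is an assumption and may differ from the library function.

```python
def hex_to_rgb_sketch(hexcode):
    """Convert a '#rrggbb' string to an (R, G, B) tuple of integers."""
    value = hexcode.lstrip("#")
    if len(value) != 6:
        raise ValueError(f"Expected a 6 digit hex code, got {hexcode!r}")
    # Each pair of hex digits encodes one channel.
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


rgb = hex_to_rgb_sketch("#0d25ac")
```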

lib.image.pack_to_itxt(metadata)

Pack the given metadata dictionary to a PNG iTXt header field.

Parameters:: metadata (dict or bytes) – The dictionary to write to the header. Can be pre-encoded as utf-8.
Returns:: A byte encoded PNG iTXt field, including chunk header and CRC
Return type:: bytes
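
The layout of an uncompressed iTXt chunk, as defined by the PNG specification, can be sketched as below. pack_itxt_sketch is hypothetical: the "faceswap" keyword and the use of JSON to serialize the dictionary are assumptions made for this example, not confirmed details of pack_to_itxt.

```python
import json
import struct
import zlib


def pack_itxt_sketch(metadata, keyword=b"faceswap"):
    """Build an uncompressed PNG iTXt chunk: length, type, data, CRC."""
    if isinstance(metadata, bytes):
        text = metadata  # already encoded
    else:
        text = json.dumps(metadata).encode("utf-8")  # assumption: JSON encoding
    data = (
        keyword + b"\x00"  # keyword and its null terminator
        + b"\x00\x00"      # compression flag (0) and compression method (0)
        + b"\x00"          # empty language tag, null terminated
        + b"\x00"          # empty translated keyword, null terminated
        + text
    )
    chunk_type = b"iTXt"
    crc = zlib.crc32(chunk_type + data)  # the CRC covers type and data, not length
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


chunk = pack_itxt_sketch({"source": "demo"})
```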

lib.image.png_read_meta(image)

Read the Faceswap information stored in a png’s iTXt field.

Parameters:: image (bytes) – The bytes encoded png file to read header data from
Returns:: The Faceswap information stored in the PNG header
Return type:: dict

Notes

This is a very stripped down, non-robust and non-secure header reader to fit a very specific task. OpenCV will not write any iTXt headers to the PNG file, so we make the assumption that the only iTXt header that exists is the one that Faceswap created for storing alignments.

lib.image.png_write_meta(image: bytes, data: PNGHeaderDict | dict[str, T.Any] | bytes) → bytes

Write Faceswap information to a png’s iTXt field.

Parameters:

image (bytes) – The bytes encoded png file to write header data to
data (dict or bytes) – The dictionary to write to the header. Can be pre-encoded as utf-8.

Notes

This is a fairly stripped down and non-robust header writer to fit a very specific task. OpenCV will not write any iTXt headers to the PNG file, so we make the assumption that the only iTXt header that exists is the one that we created for storing alignments.

References

PNG Specification: https://www.w3.org/TR/2003/REC-PNG-20031110/
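
Since the PNG specification requires IHDR to be the first chunk and allows iTXt anywhere between IHDR and IEND, one simple way to add a header is to splice the new chunk in directly after IHDR. The sketch below assumes that placement; insert_after_ihdr and chunk_types are hypothetical helpers, not the library's functions.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(chunk_type, data):
    """Assemble a PNG chunk: length, type, data, CRC."""
    crc = zlib.crc32(chunk_type + data)
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def insert_after_ihdr(png, chunk):
    """Return png with chunk spliced in immediately after the IHDR chunk."""
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("Not a PNG file")
    length, chunk_type = struct.unpack(">I4s", png[8:16])
    if chunk_type != b"IHDR":
        raise ValueError("IHDR must be the first chunk")
    split = 8 + 12 + length  # signature + (length, type, CRC) + IHDR data
    return png[:split] + chunk + png[split:]


def chunk_types(png):
    """List the chunk types in file order."""
    offset, types = len(PNG_SIGNATURE), []
    while offset + 8 <= len(png):
        length, chunk_type = struct.unpack(">I4s", png[offset:offset + 8])
        types.append(chunk_type)
        offset += 12 + length
    return types


# A 1x1 8-bit RGB image: one filter byte followed by one red pixel.
ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 2, 0, 0, 0)
idat = zlib.compress(b"\x00\xff\x00\x00")
base = PNG_SIGNATURE + _chunk(b"IHDR", ihdr) + _chunk(b"IDAT", idat) + _chunk(b"IEND", b"")

itxt = _chunk(b"iTXt", b"faceswap\x00\x00\x00\x00\x00{}")
updated = insert_after_ihdr(base, itxt)
order = chunk_types(updated)
```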

lib.image.read_image(filename, raise_error=False, with_metadata=False)

Read an image file from a file location.

Extends the functionality of cv2.imread() by ensuring that an image was actually loaded. Errors can be logged and ignored so that the process can continue on an image load failure.

Parameters:

filename (str) – Full path to the image to be loaded.
raise_error (bool, optional) – If True then any failures (including the returned image being None) will be raised. If False then an error message will be logged, but the error will not be raised. Default: False
with_metadata (bool, optional) – Only returns a value if the images loaded are extracted Faceswap faces. If True, then returns the Faceswap metadata stored within a face image’s .png exif header. Default: False

Returns:

If with_metadata is False, then returns a numpy.ndarray of the image in BGR channel order. If with_metadata is True, then returns a tuple of (numpy.ndarray of the image in BGR channel order, dict of the face’s Faceswap metadata).

Return type:

numpy.ndarray or tuple

Example

>>> image_file = "/path/to/image.png"
>>> try:
>>>     image = read_image(image_file, raise_error=True, with_metadata=False)
>>> except Exception as err:
>>>     raise ValueError("There was an error") from err

lib.image.read_image_batch(filenames, with_metadata=False)

Load a batch of images from the given file locations.

Leverages multi-threading to load multiple images from disk at the same time leading to vastly reduced image read times.

Parameters:

filenames (list) – A list of str full paths to the images to be loaded.
with_metadata (bool, optional) – Only returns a value if the images loaded are extracted Faceswap faces. If True, then returns the Faceswap metadata stored within a face image’s .png exif header. Default: False

Returns:

The batch of images in BGR channel order returned in the order of filenames

Return type:

numpy.ndarray

Notes

As the images are compiled into a batch, they must be all of the same dimensions.

Example

>>> image_filenames = ["/path/to/image_1.png", "/path/to/image_2.png", "/path/to/image_3.png"]
>>> images = read_image_batch(image_filenames)

lib.image.read_image_meta(filename)

Read the Faceswap metadata stored in an extracted face’s exif header.

Parameters:: filename (str) – Full path to the image to retrieve the meta information for.
Returns:: The output dictionary will contain the width and height of the png image as well as any itxt information.
Return type:: dict

Example

>>> image_file = "/path/to/image.png"
>>> metadata = read_image_meta(image_file)
>>> width = metadata["width"]
>>> height = metadata["height"]
>>> faceswap_info = metadata["itxt"]

lib.image.read_image_meta_batch(filenames)

Read the Faceswap metadata stored in a batch of extracted faces’ exif headers.

Leverages multi-threading to load multiple images from disk at the same time leading to vastly reduced image read times. Creates a generator to retrieve filenames with their metadata as they are calculated.

Notes

The order of returned values is non-deterministic, so items will most likely not be returned in the same order as the filenames.

Parameters:: filenames (list) – A list of str full paths to the images to be loaded.
Yields:: tuple – (filename (str), metadata (dict) )

Example

>>> image_filenames = ["/path/to/image_1.png", "/path/to/image_2.png", "/path/to/image_3.png"]
>>> for filename, meta in read_image_meta_batch(image_filenames):
>>>         <do something>
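
The non-deterministic ordering follows naturally if results are yielded as each worker finishes. The sketch below shows that pattern with the standard library; read_meta_batch_sketch and fake_reader are hypothetical stand-ins, and it is an assumption that the library uses a comparable approach. Keying the results by filename restores the input order when it is needed.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def read_meta_batch_sketch(filenames, reader, max_workers=4):
    """Yield (filename, metadata) pairs as each read completes."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(reader, name): name for name in filenames}
        for future in as_completed(futures):  # completion order, not input order
            yield futures[future], future.result()


def fake_reader(filename):
    """Stand-in for a metadata read; returns a small dictionary."""
    return {"width": 64, "height": 64, "name_length": len(filename)}


filenames = ["/path/to/image_1.png", "/path/to/image_2.png", "/path/to/image_3.png"]
by_name = dict(read_meta_batch_sketch(filenames, fake_reader))

# Rebuild the original ordering from the filename keys.
ordered = [by_name[name] for name in filenames]
```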

lib.image.rgb_to_hex(rgb)

Convert an RGB tuple to its hex counterpart.

Parameters:: rgb (tuple) – The (R, G, B) integer values to convert (e.g. (0, 255, 255))
Returns:: The 6 digit hex code with leading # applied
Return type:: str
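
The reverse conversion can be sketched with string formatting. rgb_to_hex_sketch is a hypothetical re-implementation; the range check and the lowercase output are assumptions of this example rather than documented behavior.

```python
def rgb_to_hex_sketch(rgb):
    """Convert an (R, G, B) tuple of 0-255 integers to a '#rrggbb' string."""
    red, green, blue = rgb
    for channel in (red, green, blue):
        if not 0 <= channel <= 255:
            raise ValueError(f"Channel value out of range: {channel}")
    # Two zero-padded hex digits per channel.
    return "#{:02x}{:02x}{:02x}".format(red, green, blue)


hexcode = rgb_to_hex_sketch((0, 255, 255))
```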

lib.image.tiff_read_meta(image: bytes) → dict[str, Any]: Read information stored in a Tiff’s Image Description field

lib.image.tiff_write_meta(image: bytes, data: PNGHeaderDict | dict[str, T.Any] | bytes) → bytes

Write Faceswap information to a tiff’s image_description field.

Parameters:

image (bytes) – The bytes encoded tiff file to write header data to
data (dict or bytes) – The data to write to the image-description field. If provided as a dict, then it should be a json serializable object, otherwise it should be data encoded as ascii bytes

Notes

This handles a very specific task of adding, and populating, an ImageDescription field in a Tiff file generated by OpenCV. For any other use cases it will likely fail.

lib.image.update_existing_metadata(filename, metadata)

Update the png header metadata for an existing .png extracted face file on the filesystem.

Parameters:

filename (str) – The full path to the face to be updated
metadata (dict or bytes) – The dictionary to write to the header. Can be pre-encoded as utf-8.