The goal of this project is to reimplement Ghost with face detection and recognition models other than InsightFace's RetinaFace and ArcFace, in order to allow a more permissive license than the InsightFace one. It includes a full rewrite of the original Ghost repository code, integrating PyTorch Lightning to speed up training and using datasets other than VGGFace2.
Here is the ethics charter written by the original authors of Ghost, which still holds today:
"Deepfake stands for a face swapping algorithm where the source and target can be an image or a video. Researchers have investigated sophisticated generative adversarial networks (GAN), autoencoders, and other approaches to establish precise and robust algorithms for face swapping. However, the achieved results are far from perfect in terms of human and visual evaluation. In this study, we propose a new one-shot pipeline for image-to-image and image-to-video face swap solutions - GHOST (Generative High-fidelity One Shot Transfer).
Deep fake synthesis methods have been improved a lot in quality in recent years. The research solutions were wrapped in easy-to-use API, software and different plugins for people with a little technical knowledge. As a result, almost anyone is able to make a deepfake image or video by just doing a short list of simple operations. At the same time, a lot of people with malicious intent are able to use this technology in order to produce harmful content. High distribution of such a content over the web leads to caution, disfavor and other negative feedback to deepfake synthesis or face swap research.
As a group of researchers, we are not trying to denigrate celebrities and statesmen or to demean anyone. We are computer vision researchers, we are engineers, we are activists, we are hobbyists, we are human beings. To this end, we feel that it's time to come out with a standard statement of what this technology is and isn't as far as us researchers are concerned.
- GHOST is not for creating inappropriate content.
- GHOST is not for changing faces without consent or with the intent of hiding its use.
- GHOST is not for any illicit, unethical, or questionable purposes.
- GHOST exists to experiment and discover AI techniques, for social or political commentary, for movies, and for any number of ethical and reasonable uses.
We are very troubled by the fact that GHOST can be used for unethical and disreputable things. However, we support the development of tools and techniques that can be used ethically as well as provide education and experience in AI for anyone who wants to learn it hands-on. Now and further, we take a zero-tolerance approach and total disregard to anyone using this software for any unethical purposes and will actively discourage any such uses."
We understand the unethical potential of GhostV2 and are committed to protecting against such behavior. The repository has been modified to prevent the processing of inappropriate content, including nudity, graphic content, and sensitive content. Collaboration with websites that promote the use of unauthorized software is strictly prohibited. Those who intend to engage in such activities will be subject to repercussions, such as being reported to authorities for violating the law.
- Clone this repository
```bash
git clone https://github.com/dimitribarbot/ghostv2.git
cd ghostv2
```
- Install dependent packages
```bash
pip install -r requirements.txt
```
- Download weights
To download only the models needed for inference, run this script from the root folder of the repository:
```bash
sh download_inference_models.sh
```
To download all the models needed for inference, dataset preprocessing and training, run this script from the root folder of the repository:
```bash
sh download_all_models.sh
```
For the moment, face swap only works for single images containing a single face (if there are multiple faces, the first one is used, with faces sorted by left eye and then right eye coordinates).
Run inference using our GhostV2 pretrained model by specifying the path to a source file containing the face to be swapped and the path to the target file into which it will be swapped. The output image will be created at the given output file path:
```bash
python inference.py --source_file_path={PATH_TO_IMAGE} --target_file_path={PATH_TO_IMAGE} --output_file_path={PATH_TO_IMAGE}
```
Note that an NSFW filter has been added to prevent the creation of malicious content.
By default, after main model inference, an enhancement step using the GFPGAN v1.4 model is performed, followed by a face paste-back step. For this last step, we provide multiple options:
- `ghost`: adapted from GhostV1, this version uses FaceAlignment to get facial landmarks in the source and target images in order to paste the output face into the target image. This option is the default.
- `facexlib_with_parser`: the code was largely inspired by facexlib. This version uses face-parsing.PyTorch internally to parse the output face and paste it into the target image.
- `facexlib_without_parser`: the code was largely inspired by facexlib. This version only uses code to paste the output face into the target image.
- `insightface`: the code was largely inspired by insightface. This version only uses code to paste the output face into the target image.
- `basic`: this option directly uses the output of the main model inference to paste the output face into the target image.
- `none`: no paste back is done; the returned image is the swapped face only (256x256), not the face swapped into the target image.
Optionally, after paste back, an extra step may be performed when choosing either the `ghost` or `facexlib_with_parser` paste-back option: we propose to inpaint face edges using the SDXL inpainting model to improve the output results.
All command line optional parameters can be found in this argument file.
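For example, to select the `facexlib_with_parser` variant instead of the default `ghost` one, the call might look like the following sketch. The flag used to choose the paste-back variant is an assumption here (check the argument file above for the exact name); the source, target and output flags are the documented ones:
```bash
# --paste_back_mode is a hypothetical flag name used for illustration only;
# see the inference argument file for the actual parameter.
python inference.py \
  --source_file_path={PATH_TO_IMAGE} \
  --target_file_path={PATH_TO_IMAGE} \
  --output_file_path={PATH_TO_IMAGE} \
  --paste_back_mode=facexlib_with_parser
```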
It is still possible to run inference with the original version of Ghost for comparison. To do that, first run the download_all_models.sh script and then run the inference script with the following parameters:
```bash
--G_path=./weights/GhostV1/G_unet_2blocks.safetensors
--face_embeddings=arcface
--align_mode=insightface_v1
```
Note however that the ArcFace model used internally follows the InsightFace license.
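Putting these together, a full comparison run with the original GhostV1 generator looks like this:
```bash
python inference.py \
  --source_file_path={PATH_TO_IMAGE} \
  --target_file_path={PATH_TO_IMAGE} \
  --output_file_path={PATH_TO_IMAGE} \
  --G_path=./weights/GhostV1/G_unet_2blocks.safetensors \
  --face_embeddings=arcface \
  --align_mode=insightface_v1
```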
It is possible to replicate the source/target matrix of the Image Swap Results section by running the following script:
```bash
python demo.py
```
Internally, it uses the same command line parameters as for inference. Options can be found in this argument file.
We provide scripts to prepare the datasets used for training. We mainly use two datasets for our training stage:
- Laion Face dataset: this dataset contains 50 million images with faces. For our pretrained model, we only downloaded the first part out of the 32 parts it contains.
- Lagenda dataset: originally used for age and gender recognition tasks, this dataset is well suited for our face swap task. It can be used to train a model faster than with the Laion-Face dataset.
We experimented a lot with dataset preprocessing and came up with the following solution:
- We exclude images that are too small and images containing faces that are too small (a minimal sketch of this filter is shown after this list),
- We use FaceAlignment and Live Portrait landmark models and code to exclude faces which are not fully visible,
- We use Live Portrait to generate various versions of the same face with random facial expressions,
- We optionally use GFPGAN v1.2 to enhance face quality.
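As a rough illustration of the size filtering step, here is a minimal sketch that uses facenet-pytorch's MTCNN as a stand-in detector; the actual preprocessing scripts rely on RetinaFace, FaceAlignment and Live Portrait, and the thresholds below are purely illustrative:
```python
# Illustrative sketch of the "exclude images/faces that are too small" filter.
# MTCNN is used here as a stand-in detector; thresholds are arbitrary examples.
from pathlib import Path

from PIL import Image
from facenet_pytorch import MTCNN

MIN_IMAGE_SIDE = 256  # illustrative value, not the repository's setting
MIN_FACE_SIDE = 128   # illustrative value, not the repository's setting

detector = MTCNN(keep_all=True)

def keep_image(path: Path) -> bool:
    image = Image.open(path).convert("RGB")
    # Exclude images that are too small.
    if min(image.size) < MIN_IMAGE_SIDE:
        return False
    boxes, _ = detector.detect(image)
    if boxes is None:
        return False
    # Keep the image only if at least one detected face is large enough.
    return any(min(x2 - x1, y2 - y1) >= MIN_FACE_SIDE for x1, y1, x2, y2 in boxes)

if __name__ == "__main__":
    kept = [p for p in Path("./raw_images").glob("*.jpg") if keep_image(p)]
    print(f"{len(kept)} images pass the size filters")
```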
Specific arguments for the Laion-Face dataset preprocessing, such as the dataset location, can be found in this argument file. Specific arguments for the Lagenda dataset preprocessing, such as the dataset location, can be found in this argument file.
N.B.: For the Laion Face dataset, you may want to download it using the explanations given here.
We tried several alignment techniques while preprocessing the datasets, and we found that the latest version of the InsightFace alignment code gives the best results. The list of the distinct alignment techniques and other preprocessing parameters can be found in this argument file.
It is also possible to compare the various alignment modes by running the following command:
For a single image:
```bash
python align.py --source_image={PATH_TO_IMAGE} --aligned_folder={OUTPUT_PATH} --align_mode={ALIGN_OPTION}
```
Or for an entire folder:
```bash
python align.py --source_folder={PATH_TO_IMAGES} --aligned_folder={OUTPUT_PATH} --align_mode={ALIGN_OPTION}
```
All alignment command line parameters can be found in this argument file.
You may want to convert your images to a common .jpg or .png format. To do this for a single file or recursively on a large number of images, you can use the following script:
For a single image:
```bash
python convert.py --source_image={PATH_TO_IMAGE} --output_folder={OUTPUT_PATH} --output_extension={EXTENSION_OPTION}
```
Or for an entire folder:
```bash
python convert.py --source_folder={PATH_TO_IMAGES} --output_folder={OUTPUT_PATH} --output_extension={EXTENSION_OPTION}
```
Where EXTENSION_OPTION is either .png or .jpg.
All conversion command line parameters can be found in this argument file.
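For context, the conversion itself essentially boils down to re-saving each image with the target extension, as in this illustrative sketch (the repository's convert.py handles folders, recursion and the command line options; the paths below are hypothetical):
```python
# Illustrative sketch of a single-file format conversion with Pillow.
from pathlib import Path

from PIL import Image

source = Path("input/photo.webp")   # hypothetical input path
output_folder = Path("converted")   # hypothetical output folder
output_folder.mkdir(exist_ok=True)

image = Image.open(source).convert("RGB")
image.save(output_folder / (source.stem + ".jpg"))
```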
To train GhostV2, you can run the following script:
```bash
python train.py
```
We provide a lot of different options for training.
Internally, we detect faces using the Pytorch RetinaFace model. We then compute face embeddings using one of the available face recognition models:
- The original ArcFace model, used by the initial version of Ghost (beware, the model is available for non-commercial research purposes only),
- AdaFace, a competing model to ArcFace,
- CVLFace, by the author of AdaFace, which offers various face recognition models,
- Facenet Pytorch, the PyTorch version of David Sandberg's TensorFlow facenet.
By default, we use ViT AdaFace, which appears to give the best results, especially in terms of identity preservation.
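As an illustration of what the embedding step computes, here is a minimal sketch using the Facenet Pytorch back-end listed above (the training code wires this differently; the random tensor stands in for an aligned 160x160 face crop):
```python
# Minimal sketch of computing a 512-d face embedding with facenet-pytorch,
# one of the recognition back-ends listed above. Preprocessing is illustrative.
import torch
from facenet_pytorch import InceptionResnetV1

model = InceptionResnetV1(pretrained="vggface2").eval()

# Stand-in for an aligned 160x160 face crop, normalized to [-1, 1].
face = torch.rand(1, 3, 160, 160) * 2 - 1

with torch.no_grad():
    embedding = model(face)  # shape: (1, 512)

print(embedding.shape)
```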
More information regarding each option can be found in this argument file. If you want to use wandb logging for your experiments, you should first log in to wandb with `wandb login`.
N.B.: The `--example_images_path` parameter must point to a folder containing test images cropped using the same alignment method as the one used to generate your training dataset.
It is possible to calculate the distance between embeddings computed using distinct face recognition models, distinct face alignment modes, or both. This is useful if you want to know whether you can replace the face recognition model or face alignment algorithm used with a given face swap model.
To do this, you can run the following script:
```bash
python embedding_distance.py
```
And play with the `--source_face_embeddings`, `--target_face_embeddings`, `--source_crop_size`, `--target_crop_size`, `--source_align_mode` and `--target_align_mode` parameters.
All command line parameters can be found in this argument file.
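For context, the comparison essentially amounts to measuring a distance (for example a cosine distance) between two embeddings of the same face produced by different models or alignment modes, as in this illustrative snippet (not the script's actual code):
```python
# Illustrative cosine distance between two embeddings of the same face,
# e.g. one from ArcFace and one from AdaFace (random tensors used here).
import torch
import torch.nn.functional as F

source_embedding = torch.rand(1, 512)
target_embedding = torch.rand(1, 512)

cosine_similarity = F.cosine_similarity(source_embedding, target_embedding)
cosine_distance = 1.0 - cosine_similarity
print(cosine_distance.item())
```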
Our pretrained model was trained on a single RTX 4090 card using FP16 mixed precision and the Laion-Face dataset preprocessed as explained in the Dataset Preprocessing section above (around 300,000 faces, each one with 10 distinct facial expressions, using insightface_v2 as the alignment algorithm), together with the CVL ViT face embedding model.
It consisted of two phases:
- A first run of 4 epochs (~20 hours) with a batch size of 32 and default parameters set in the training arguments file (no scheduler),
- A second run of 1 epoch (~20 hours as well) with a batch size of 16 (due to the 24 GB memory limit of the RTX 4090 card), with `--eye_detector_loss` enabled, `--weight_id=70` and `--weight_eyes=1200`, and using the G and D files of the previous run. We also used a scheduler for both the G and D models by setting the `--use_scheduler` flag and the default scheduler parameters of the training arguments file (an illustrative command is sketched after the tips below).
- In case of finetuning, you can vary the loss coefficients to make the output look more similar to the source identity or, vice versa, to preserve the features and attributes of the target face.
- You can change the backbone of the attribute encoder and the number of AAD ResBlk blocks using the `--backbone` and `--num_blocks` parameters.
- During the finetuning stage, you can use our pretrained weights for the generator and discriminator, located in the `weights` folder. We provide weights for models with a U-Net backbone and 2 blocks in the AAD ResBlk.
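As an illustration, a second-phase or finetuning run along the lines described above might be launched as follows. The loss, scheduler, backbone and block flags are the documented ones; the batch size flag, the checkpoint flags and the `unet` backbone value are assumptions, so check the training argument file for the exact names:
```bash
# --batch_size, --G_path, --D_path and the "unet" value are assumed here for
# illustration only; the other flags are documented in the text above.
python train.py \
  --batch_size=16 \
  --eye_detector_loss \
  --weight_id=70 \
  --weight_eyes=1200 \
  --use_scheduler \
  --backbone=unet \
  --num_blocks=2 \
  --G_path={PATH_TO_PREVIOUS_G_CHECKPOINT} \
  --D_path={PATH_TO_PREVIOUS_D_CHECKPOINT}
```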
The outputs are not as good as InsightFace's, and post-processing is needed to achieve the best results.
Currently, we propose two optional post-processing steps:
- Face restoration using GFPGAN v1.4 (or v1.2),
- Face edge inpainting using the diffusers SDXL inpainting model.
Here are comparisons with and without post-processing:
- Without face restoration and without face edge inpainting:
- Without face edge inpainting but with face restoration:
- Without face restoration but with face edge inpainting:
- With face restoration and face edge inpainting:
This project can still be improved. Here is a list of known open topics:
- Add video face swap as in the original Ghost repository.
- Use Pytorch Lightning CLI to train the model using various configurations.
- Create an onnx version of the pretrained model.
- And of course, improve the face swap result!
The pretrained models and source code of this repository are under the BSD 3-Clause license.
| file | source | license |
|---|---|---|
| GhostV2 Discriminator | dimitribarbot/ghostv2 | |
| GhostV2 Generator | dimitribarbot/ghostv2 | |
The models and code used in this repository are:
The datasets used in this project are:
| dataset | source | license |
|---|---|---|
| LAION-Face | FacePerceiver/LAION-Face | |
| Lagenda | WildChlamydia/Lagenda | |
Thanks to everyone who makes this project possible!