Face Swapping

AI-based face swapping is one of the most popular app concepts. It follows earlier mobile crazes such as Draw Something, Dubsmash, Flappy Bird, FatBooth and (in the earliest days of Apple’s App Store) virtual pint-drinking and lightsaber apps. We can say that face swapping has been a mobile craze since 2019.

These applications normally work as follows:

They let us switch faces. For example, we can select any funny picture and swap its face with someone else's in real time.
Unlike with static pictures, face swapping can also be done on recorded video or even on a live stream.

Techniques for Face Swapping

A number of face swapping techniques are available. The basic ones are described below:

1. Landmark Detection

Landmark detection is the process of identifying points of interest in an image of a human face. The dlib library, commonly used together with OpenCV, performs facial landmark detection and extracts a set of characteristic facial landmarks covering the whole face. The method is iterative: the iterations are numbered from 0 for the initial estimate up to 10 for the final one. By the last iteration the estimated landmarks have evolved to match the ‘ground truth’, so the iterations can stop. Once the landmarks have been identified, another image can be blended over the face.
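As a minimal sketch of this step, the snippet below runs dlib's face detector and shape predictor and collects the landmarks as plain (x, y) tuples. It assumes dlib is installed and that a pretrained shape predictor file is available locally. The predictor_path argument and the helper names shape_to_points and detect_landmarks are illustrative and do not come from the original post.

```python
def shape_to_points(shape):
    """Convert a dlib full_object_detection into a list of (x, y) tuples."""
    return [(shape.part(i).x, shape.part(i).y) for i in range(shape.num_parts)]


def detect_landmarks(gray_image, predictor_path):
    """Return one list of (x, y) landmarks per face found in a grayscale image.

    predictor_path is an assumption: it should point to a pretrained dlib
    shape predictor file (for example a 68-point model) on the local disk.
    """
    import dlib  # imported lazily so shape_to_points is usable without dlib

    detector = dlib.get_frontal_face_detector()
    predictor = dlib.shape_predictor(predictor_path)
    faces = detector(gray_image, 1)  # 1 = upsample the image once before detecting
    return [shape_to_points(predictor(gray_image, face)) for face in faces]
```

The resulting point lists can be passed directly to the convex hull and triangulation steps described next.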

2. Convex Hull

The convex hull of a set of points is the smallest convex polygon that encloses all of the points in the set. The vertices of the hull are a subset of the given points, and every given point lies inside or on the polygon they form. To find the face border in an image, we need to change the structure holding the landmark points a bit. The structure is first passed to the convex hull function with returnPoints set to false, which means that the output is a list of indexes into the points rather than the points themselves. Brzęczkowski then showed the face border in the image in blue using the find_convex_hull.py script.
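To illustrate what an index-based hull looks like, here is a small pure-Python sketch using Andrew's monotone chain algorithm. It is not the find_convex_hull.py script mentioned above, and it does not use OpenCV. It only mirrors the behaviour of cv2.convexHull with returnPoints=False by returning indexes into the input list. The function names are illustrative.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull_indices(points):
    """Return the indexes of the hull vertices of a list of (x, y) points.

    Vertices come out counter-clockwise in standard x/y axes
    (which looks clockwise in image coordinates, where y points down).
    """
    order = sorted(range(len(points)), key=lambda i: points[i])
    if len(order) <= 2:
        return order

    def half_hull(indices):
        hull = []
        for i in indices:
            # Drop the last vertex while it would make a non-left turn.
            while len(hull) >= 2 and cross(points[hull[-2]], points[hull[-1]], points[i]) <= 0:
                hull.pop()
            hull.append(i)
        return hull

    lower = half_hull(order)
    upper = half_hull(reversed(order))
    # The last index of each half is the first index of the other half.
    return lower[:-1] + upper[:-1]


landmarks = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
hull = convex_hull_indices(landmarks)
border = [landmarks[i] for i in hull]  # the interior point (1, 1) is excluded
```

Returning indexes rather than coordinates is convenient for face swapping, because the same indexes can be used to pick the corresponding landmarks in the second face.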

3. Triangulation

This technique is based on triangulating the facial landmarks [1]. A 2D image does not contain 3D information, so we make an assumption about the image to approximate the 3D structure of the face. One simple way is to triangulate using the facial landmarks as corners and to assume that the content of each triangle is planar (forms a plane in 3D), so that the warp between corresponding triangles in two images is affine. Forming a triangular mesh over a 2D image is simple, but we want a triangulation that is fast to compute and “efficient”. One such triangulation is obtained by drawing the dual of the Voronoi diagram, i.e., connecting every pair of neighboring sites in the Voronoi diagram. This is called the Delaunay triangulation and can be constructed in O(n log n) time. We want the triangulation to be consistent with the image boundary so that textured regions do not fade into the background while warping. Delaunay triangulation tries to maximize the smallest angle in each triangle.
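The planar-triangle assumption means that each pair of corresponding triangles is related by a single 2x3 affine matrix, fully determined by the three corner correspondences. The sketch below computes that matrix with NumPy. The function names are illustrative; OpenCV's cv2.getAffineTransform computes the same matrix from three point pairs.

```python
import numpy as np


def affine_from_triangles(src_tri, dst_tri):
    """Return the 2x3 affine matrix that maps src_tri corners onto dst_tri.

    src_tri, dst_tri: three (x, y) corners each, listed in matching order.
    """
    src = np.asarray(src_tri, dtype=float)
    dst = np.asarray(dst_tri, dtype=float)
    # Homogeneous source corners: one row [x, y, 1] per vertex.
    src_h = np.hstack([src, np.ones((3, 1))])
    # Solve src_h @ M.T = dst for M.T; raises LinAlgError for a degenerate triangle.
    return np.linalg.solve(src_h, dst).T


def apply_affine(matrix, points):
    """Apply a 2x3 affine matrix to an array of (x, y) points."""
    pts = np.asarray(points, dtype=float)
    return pts @ matrix[:, :2].T + matrix[:, 2]


src = [(0, 0), (1, 0), (0, 1)]
dst = [(2, 3), (4, 3), (2, 6)]
M = affine_from_triangles(src, dst)
warped_corners = apply_affine(M, src)  # lands on the dst corners
```

In a full face swap this is repeated for every triangle of the mesh, and every pixel inside a source triangle is moved with that triangle's matrix.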

These techniques have a number of issues: for example, the angle of the face is not detectable and closing is difficult. Instead, we have used the Position Map Regression Network (PRNet) [2] for face swapping.

The main features are:

End-to-End: The method can directly regress the 3D facial structure and dense alignment from a single image, bypassing 3DMM (3D Morphable Model) fitting.
Multi-task: By regressing a position map, the 3D geometry can be obtained along with its semantic meaning. Thus, we can effortlessly complete the tasks of dense alignment, monocular 3D face reconstruction, pose estimation, etc.
Faster than real-time: The method can run at over 100 fps (with a GTX 1080) to regress a position map.
Robust: Tested on facial images in unconstrained conditions. The method is robust to poses, illuminations and occlusions.

Tools Used:

Python 2.7 (numpy, skimage, scipy)
TensorFlow >= 1.4
Optional:
dlib (for detecting faces; you do not have to install it if you can provide bounding box information)
opencv2 (for showing results)

Applications

Face Alignment
3D Face Reconstruction
3D Pose Estimation
Depth image
Texture Editing

Results

The following image shows the input and the output.

NOTE: The input images are obtained from myntra.com

References:

[1] https://cmsc733.github.io/2019/proj/p2/

[2] https://github.com/YadiraF/PRNet
