1. Introduction

Caricature is the art of producing an exaggerated artistic rendering of an image, most often a human face, while preserving the subject's identity. Caricatures can be used on social media, for example as profile images, where they let users express sentiment. A caricature draws the facial features of a person with extreme distortion. It differs from an ordinary sketch, which preserves facial structure and features to a large extent. A caricature allows variations and additions that account for expression, point of view, and appearance while keeping an artistic style. With the growing use of social media platforms, caricature meets an increasing demand. Every generated caricature should be visually appealing, and its style should be consistent with that of ordinary cartoons. A caricature should exhibit the following qualities:

Exaggeration: Different parts of the input face should be deformed in a reasonable way to exaggerate its prominent characteristics, while keeping the identity intact;

Diversity: Given an input face, diverse caricatures with different styles should be generated.
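
To make the exaggeration quality more concrete, here is a minimal sketch of one simple way to exaggerate a face: move each facial landmark further away from an average face. This is an illustrative assumption, not the method used in the product. The function name `exaggerate_landmarks`, the `strength` parameter, and the toy landmark coordinates are all hypothetical.

```python
import numpy as np

def exaggerate_landmarks(landmarks, mean_landmarks, strength=1.5):
    """Push each landmark away from the mean face by a constant factor.

    strength = 1 returns the input unchanged; strength > 1 exaggerates
    whatever makes this face differ from the average one.
    """
    landmarks = np.asarray(landmarks, dtype=float)
    mean_landmarks = np.asarray(mean_landmarks, dtype=float)
    return mean_landmarks + strength * (landmarks - mean_landmarks)

# Toy (x, y) landmarks: left eye, right eye, nose tip, chin (made-up values).
mean_face = np.array([[30.0, 40.0], [70.0, 40.0], [50.0, 60.0], [50.0, 80.0]])
face = np.array([[28.0, 40.0], [72.0, 40.0], [50.0, 66.0], [50.0, 80.0]])

caricature_points = exaggerate_landmarks(face, mean_face, strength=2.0)
```

In this toy face the eyes are slightly wider apart and the nose slightly longer than average, so both traits are amplified, while the chin, which matches the average, stays where it is.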

Deep learning plays an important role in caricature development. It is nevertheless a difficult task, because extreme distortions are hard to manage. The most challenging part is introducing distortions to a face image while keeping the integrity of the face. Despite the countless distortions in a caricature, humans are adept at recognizing its characteristics and at verifying whether a caricature and a visual image show the same face. Deep learning models do well at face verification and recognition [1, 2], but for caricatures the difficulty stems from the distortions, which vary from one image to another. Although caricatures are distorted views of real faces, they still retain idiosyncratic characteristics that help humans verify and identify the people depicted. If captured accurately, these distinctive characteristics, present in both modalities, can aid caricature verification and identification. In our product, we tackle both problems. Caricature-visual verification is the task of deciding whether two input images of different modalities correspond to the same identity. Caricature recognition aims to identify the person in the caricature image. While traditional face recognition has been investigated extensively, very little work has been done on caricatures.
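
The difference between verification and identification can be sketched as follows, assuming some model has already mapped each image to a feature vector. The vectors, the gallery names, the cosine-similarity measure, and the 0.5 threshold are all hypothetical choices made for illustration, not details of the product.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def verify(caricature_vec, photo_vec, threshold=0.5):
    """Verification: do the two images show the same identity?"""
    return cosine_similarity(caricature_vec, photo_vec) >= threshold

def identify(caricature_vec, gallery):
    """Identification: name of the gallery photo most similar to the caricature."""
    return max(gallery, key=lambda name: cosine_similarity(caricature_vec, gallery[name]))

# Made-up feature vectors standing in for the output of a feature extractor.
gallery = {
    "person_a": [0.9, 0.1, 0.2],
    "person_b": [0.1, 0.8, 0.3],
}
caricature = [0.8, 0.2, 0.1]

same_as_a = verify(caricature, gallery["person_a"])
same_as_b = verify(caricature, gallery["person_b"])
best_match = identify(caricature, gallery)
```

Verification answers a yes/no question about one pair of images, while identification searches a set of known identities for the best match.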

To date, all caricature recognition approaches require manual feature extraction. The literature below reflects current work in this area. The authors of [3] manually defined facial key points on the images. One of the earlier works on caricature identification [4] explores various machine learning models on a small dataset of 196 pairs of face images and caricatures. The authors of [5] used SVM classifiers on a manually extracted combination of low-level features, computed with histograms of gradients followed by principal component analysis, and performed recognition using canonical correlation analysis (CCA). The authors of [3] also applied an SVM to features extracted from a relatively small dataset of 200 pairs of caricatures and visual images. In addition, a genetic algorithm combined with logistic regression has been used to find optimal weights that reduce the distance between the two types of features (caricature and photograph features).

We propose a new image fusion mechanism to encourage the model to focus on both the global and local appearance of the generated caricatures, and pay more attention to the key facial parts.

Alpes Product Functioning: Instead of a caricature, we will first see how to achieve a cartoon effect using regular image processing.

An example output for former US president Barack Obama:

[Image: Screenshot_2020-04-10-19-53-23-04 (1).png]
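
As a rough illustration of a cartoon effect built from regular image processing, the sketch below flattens colors into a few levels and paints dark outlines where the brightness changes sharply. It is a simplified stand-in that uses only NumPy and is not the product's actual pipeline; the function name `cartoonize`, its parameters, and the synthetic test image are assumptions made for this example. A smoothing step, which a practical pipeline would usually include, is omitted for brevity.

```python
import numpy as np

def cartoonize(image, levels=4, edge_threshold=40.0):
    """Flatten colors and draw dark outlines on an H x W x 3 uint8 image."""
    img = np.asarray(image, dtype=np.float64)

    # 1. Color quantization: snap each channel to the centre of one of `levels` bins.
    step = 256.0 / levels
    flat = np.floor(img / step) * step + step / 2.0

    # 2. Edge detection: gradient magnitude of the grayscale image.
    gray = img.mean(axis=2)
    gy, gx = np.gradient(gray)
    edges = np.hypot(gx, gy) > edge_threshold

    # 3. Combine: paint edge pixels black on top of the flattened colors.
    flat[edges] = 0.0
    return np.clip(flat, 0, 255).astype(np.uint8)

# Synthetic test image: dark left half, bright right half.
demo = np.zeros((8, 8, 3), dtype=np.uint8)
demo[:, :4] = 60
demo[:, 4:] = 200

result = cartoonize(demo)
```

Fewer `levels` give flatter, more poster-like colors, and a lower `edge_threshold` draws more outlines.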


[1] D. P. Kingma and M. Welling. Auto-encoding variational Bayes. In International Conference on Learning Representations (ICLR), 2014.

[2] H. Koshimizu, M. Tominaga, T. Fujiwara, and K. Murakami. On kansei facial image processing for computerized facial caricaturing system picasso. In IEEE International Conference on Systems, Man, and Cybernetics (SMC), volume 6, pages 294–299. IEEE, 1999.

[3] M. Arjovsky, S. Chintala, and L. Bottou. Wasserstein generative adversarial networks. In International Conference on Machine Learning (ICML), pages 214–223, 2017.

[4] M. I. Belghazi, A. Baratin, S. Rajeshwar, S. Ozair, Y. Bengio, D. Hjelm, and A. Courville. Mutual information neural estimation. In International Conference on Machine Learning (ICML), pages 530–539, 2018.

[5] T. Kim, M. Cha, H. Kim, J. K. Lee, and J. Kim. Learning to discover cross-domain relations with generative adversarial networks. In International Conference on Machine Learning (ICML), pages 1857–1865, 2017.
