Abstract – Inferring expressions from a sequence of images is a challenging problem that drives much recent research in image processing. Communicating intention through facial expressions is easy for a human being, but judging emotions is difficult for a machine. Facial expression analysis is an intensive research field that aims at better human-machine interaction. In this paper we review the work of various authors and their methodological approaches to expression analysis in single images and in image sequences.
Keywords: Face detection, Facial expression.
I. INTRODUCTION
A facial expression is produced by movements of the facial muscles beneath the skin; combinations of these muscular movements yield the different expressions of the human face and convey a person's emotional state. Facial expression analysis is an interesting research topic, and among the various channels of non-verbal communication facial expressions are the most important, since they are mainly responsible for conveying individual emotions. An intelligent user interface should interpret not only the face movements but also the user's emotional state. The challenging issue is to equip machines with emotional intelligence so that we can have intelligent machines useful in manifold applications. Current intelligent and biometric systems must therefore be able to understand human expressions, and should be flexible, robust and compatible with real-world application domains such as animation, psychiatry, telecommunications, behavioural science and medical science. Research over the last decade has identified six universal expressions: happiness, sadness, disgust, surprise, fear and anger.

Knowing the emotional state of a person enables better human-machine interaction and helps create smart visual interfaces, so that interactive computers can offer advice according to the user's mood. For example, if the person is sad the system can automatically tell a joke. Sensitive music jukeboxes can be developed that play a song based on the user's current state of mind and expression. Other applications include entertainment systems for children, mobile device login, video conferencing, distance learning, educational software, automobile monitoring systems that alert a drowsy driver to wake up, social robots and many more.
In this review paper we organise the recognition methodology into three main sections. The foremost step in any expression detection system is to track and detect the face, so the first section reviews the face detection work of various research publications. Since facial features are the next parameter on the basis of which emotions can be detected, researchers' contributions towards facial feature extraction are discussed in the second section. The last section analyses how emotions are classified using various classification techniques.
II. LITERATURE REVIEW
Facial expression recognition has three main modules. Face detection is the first and fundamental step for expression analysis. It is followed by feature identification and extraction, and finally a classifier maps the extracted features to one of the basic expressions.
A. Face tracking and detection
Human beings can detect faces easily under almost any circumstances: they can recognise a face from a large distance and in bad illumination, and the human brain can even infer parts of the face that are not visible. This is difficult for a machine. Distractions such as glasses and facial hair, the scale and orientation of the face, noise and occlusion add complexity to face detection.

According to Froba and Kublbeck [2], a face can be detected by matching edge orientations and extracting edge-orientation features; they used a pipeline of multiple matching stages to detect the face. In [3], Suzuki and Shibata used a multiple-clue image perception algorithm to locate the human face, and proposed two edge-feature approaches based on the distribution of projected principal edges and on cell distribution. Huang and Huang [4] used the Canny edge detector to track the face in an image; they obtained a rough estimate of the face by focusing on pixel intensities between the vertical edges and the lips, which finally mark the face boundary. Face detection from edges is efficient and time-saving when the face is captured within a controlled environment.

Another way to find the face is to analyse the texture of images within a sequence. Texture analysis uses the brightness of the main facial features, such as the eyes and lips, relative to other facial components. According to Iwano et al. [5], the height and length of the eyes and mouth are estimated, which allows a rectangular box to be fitted over the face. In [6], Kobayashi and Hara obtained the average brightness of the image and then calculated a base value, the average brightness over the image sequence; by cross-correlating against this base value they identify the position of the eyes, after which the whole face is located by estimating the relative positions of the other features.

Other researchers used colour to detect the face. In [7], Malassiotis et al. used colour sensors to obtain depth information; 3D images were illuminated using a structured-light approach, which increased the accuracy of the system. In [8], Yang et al. used a binary partition tree (BPT) derived from a fuzzy membership function: pixels were classified and watershed segmentation applied to determine the skin region of the face, and valley detection and entropy thresholds were used to refine these skin regions. Pantic and Rothkrantz [9] used the HSV colour model to determine the contour of the face and a histogram approach to detect the vertical and horizontal boundaries of the head; they conducted their experiments on dual-view facial images. Viola and Jones [10] worked on real-time object detection: Haar-like features computed over rectangular pixel regions were used for feature detection, and the AdaBoost algorithm combined with a cascade of classifiers further improved the system's performance, although the system did not allow rotations of the head. In [1], Essa and Pentland detected the face by creating eigenfaces with principal component analysis (PCA) on facial images; for image sequences, spatial and template filtering were applied to extract the face frame, and their method allowed rotation of the head as well.
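Several of the detectors surveyed above are available in open-source form; in particular, OpenCV ships a pretrained Haar-cascade implementation of the Viola-Jones detector [10]. The following minimal sketch illustrates its use; the cascade file name and the detection parameters are illustrative defaults, not settings taken from the cited paper.

```python
# Minimal sketch of Viola-Jones face detection using OpenCV's pretrained
# Haar cascade. Parameter values are illustrative defaults, not settings
# from the cited paper.
import cv2

# Load the frontal-face cascade bundled with OpenCV.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_faces(image_bgr):
    """Return face rectangles as (x, y, w, h) tuples."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    gray = cv2.equalizeHist(gray)  # mitigate uneven illumination
    return cascade.detectMultiScale(
        gray, scaleFactor=1.1, minNeighbors=5, minSize=(30, 30))

# Usage on a hypothetical input image:
frame = cv2.imread("subject.png")
for (x, y, w, h) in detect_faces(frame):
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
```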
B. Information Extraction for emotion identification
Pantic and Rothkrantz [9] use an expert system that works on two views of an image, a frontal view and a side view. From the frontal view they extract 19 facial points from 25 features, and from the side view they extract 10 feature points using the curvature of the profile contour. Multiple feature detectors are used to locate the main facial features such as the nose, mouth and eyes. Pantic and Rothkrantz did not consider facial hair and glasses in their feature analysis. Kobayashi and Hara [6] worked with still images captured through a monochrome camera. Each image was normalised so that the eyes are 20 pixels apart; after estimating the distance between the eyes, the lengths of lines drawn perpendicular from the eyes were measured. The brightness is also normalised, and the resulting data are fed to a neural network to determine the expression of the image. A limitation of Kobayashi and Hara's method is that the subject must be about 1 meter from the camera.

This was some authors' work on still images, but many researchers worked on sequences of images to extract features and took research on facial expressions to new heights. Essa and Pentland [1] use an eigenspace approach to determine the positions of vital parts of the face such as the eyes, mouth and nose. A feature space is built using PCA, and the eigenvectors are analysed to locate each facial element; the distance of each image from this feature space is then calculated. An optical-flow computation is used in which the mean velocity vector represents a normalised form of the images in the sequence; the covariance matrix between frames is computed and used with a Kalman filter to predict the motion vector field of the face in frontal-view image sequences. Otsuka and Ohya [25] also applied an optical-flow algorithm and then used the Fourier transform to compute the velocity vector field in both the horizontal and vertical directions of the face image; a fifteen-dimensional feature vector is formed from these calculations. Only one symmetric half of the face, the right eye and the mouth, is considered, as the method does not work for the left eye.

Iwai et al. [11] identified nineteen points on the face for analysis, of which 7 points represent the overall contour of the face and 12 points are used to identify the facial expression. The facial points can be considered as nodes of a graph whose edges are weighted by the Euclidean distances between the nodes, so that each edge weight represents a characteristic of the facial features. The facial points are first located in the first frame and then used to find the corresponding points in the other frames of the image sequence. The system was designed in two layers, a memory layer and an input layer: the initial frame of the sequence resides in the memory layer, and the other frames are fed through the input layer and matched against it to identify the facial points. Black and Yacoob [24] used parameterised models for expression analysis from image sequences: a planar model for rigid motions of the face, and an affine model with curvature for non-rigid motions, which carefully examines the local areas of the eyes and mouth. Brightness and regression schemes are employed and the frames are stabilised. Here the head and face features were initially selected manually and then tracked automatically.
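As a generic illustration of the graph representation attributed to Iwai et al. [11] above (not the authors' implementation), the sketch below converts an array of tracked facial points into a vector of pairwise Euclidean edge weights that can serve as a per-frame feature.

```python
# Sketch: pairwise Euclidean distances between tracked facial points,
# in the spirit of the graph representation of Iwai et al. [11].
# The 19-point layout is assumed here purely for illustration.
import numpy as np

def distance_features(points):
    """points: (N, 2) array of (x, y) landmarks.
    Returns the N*(N-1)/2 upper-triangular pairwise distances."""
    diffs = points[:, None, :] - points[None, :, :]  # (N, N, 2) differences
    dists = np.sqrt((diffs ** 2).sum(axis=-1))       # full distance matrix
    iu = np.triu_indices(len(points), k=1)           # drop diagonal/duplicates
    return dists[iu]

landmarks = np.random.rand(19, 2) * 100  # stand-in for 19 tracked points
features = distance_features(landmarks)  # 19 * 18 / 2 = 171 edge weights
```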
According to Cohn et al. [23], the facial points are marked manually on the first frame of an image sequence, and an optical-flow approach is then applied to locate the facial points in the subsequent frames. Any change in the position of a facial point across the frames is measured by subtracting its normalised position in the first frame. Initial frames and target frames of the sequence are used to calculate change vectors, which are then used to determine the expression of the image sequence.
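Pyramidal Lucas-Kanade is one common way to realise this kind of point tracking. The sketch below uses OpenCV's implementation to propagate manually marked first-frame points through a sequence and compute their displacements; it is a generic illustration under these assumptions, not the original system of [23].

```python
# Sketch: propagate first-frame facial points through an image sequence
# with pyramidal Lucas-Kanade optical flow, then measure displacements
# relative to frame 0 (a generic stand-in for the approach in [23]).
import cv2
import numpy as np

def track_points(frames, initial_points):
    """frames: list of grayscale images; initial_points: (N, 1, 2) float32
    array of hand-marked points on frames[0]."""
    positions = [initial_points]
    for prev, curr in zip(frames, frames[1:]):
        next_pts, status, err = cv2.calcOpticalFlowPyrLK(
            prev, curr, positions[-1], None,
            winSize=(15, 15), maxLevel=3)  # illustrative parameters
        positions.append(next_pts)
    # Change vectors: displacement of each point from its frame-0 position.
    displacements = [p - initial_points for p in positions]
    return positions, displacements
```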
C. Emotion Classification
With the increasing importance of better human-machine interaction, facial expression recognition has become the research topic of many past and current researchers. In [1], Essa and Pentland used a posed dataset and extracted features with a 3D model based on motion and muscle; to classify facial expressions they used a Euclidean-norm classifier. In [11], Wang and Iwai used posed data from eight subjects and extracted features with a B-spline curve model, a geometry-based feature extraction approach, again classifying with a Euclidean norm. In [12], Chang et al. used a geometry-based feature extraction approach built on an active model of shape variation, with a probabilistic classifier for expression detection. In [13], Valstar and Pantic used affine transforms to capture facial geometry dynamics on the MMI and Cohn-Kanade datasets, with a support vector machine (SVM) as the classifier. In [14], Patras and Pantic used a rule-based classifier over fifteen facial points detected in the MMI dataset [15]. In [16], Donato et al. used appearance-based Gabor wavelet filters to extract features from a posed dataset of twenty-four subjects and recognised non-dynamic expressions using a nearest-neighbour classifier. In [17], Zhao and Pietikäinen extracted features using an appearance-based LBP approach on the Cohn-Kanade database with in-plane image transformations; the dynamic features were classified using an SVM. In [18], Littlewort et al. used Gabor wavelet filters to extract features from the Cohn-Kanade (CK) dataset and recognised non-dynamic features using an SVM. In [19], Wu used motion-energy-based Gabor filters for feature extraction and recognised expressions from the Cohn-Kanade database using a support vector classifier. In [20], Tian et al. used both geometric and transient feature extraction on the Cohn-Kanade database and recognised expressions using neural networks. In [21], Matthews et al. used an active appearance model to recognise expressions on acted data with a nearest-neighbour classifier; the approach works on both 2D and 3D shape models, and in total 20 subjects were studied. In [22], Zhou and De la Torre worked on a hybrid feature extraction technique and used a multi-dimensional assignment algorithm to detect expressions.
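Many of the systems above pair an appearance feature with an SVM, e.g. LBP in [17] and Gabor filters in [18], [19]. The sketch below illustrates that generic recipe with scikit-image's LBP and scikit-learn's SVC on toy data; the parameter choices and the random data are assumptions for illustration, not the settings of any cited paper.

```python
# Sketch of the generic LBP-histogram + SVM recipe used by several
# surveyed systems (e.g. [17]); parameters and toy data are illustrative.
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.svm import SVC

P, R = 8, 1  # 8 sampling points on a radius-1 circle

def lbp_histogram(gray_face):
    """Uniform-LBP histogram of a cropped grayscale face image."""
    lbp = local_binary_pattern(gray_face, P, R, method="uniform")
    # Uniform LBP yields P + 2 distinct codes; bin them into a histogram.
    hist, _ = np.histogram(lbp, bins=P + 2, range=(0, P + 2), density=True)
    return hist

# Toy data: random "faces" labelled 0..5 for the six basic expressions.
rng = np.random.default_rng(0)
faces = (rng.random((60, 48, 48)) * 255).astype(np.uint8)
labels = rng.integers(0, 6, size=60)

X = np.stack([lbp_histogram(f) for f in faces])
clf = SVC(kernel="rbf").fit(X, labels)
predictions = clf.predict(X[:5])
```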
III. ACKNOWLEDGMENT
I would like to express my gratitude to Ms. Shilpi Gupta and Mr. Abhishek Singhal, Faculty, Computer Science & Engineering Department, Amity University, for their guidance, proofreading and help in completing this work successfully.
REFERENCES
[1] I. Essa and A. Pentland, "Coding, analysis, interpretation, and recognition of facial expressions," IEEE Trans. Pattern Anal. Mach. Intell., vol. 19, no. 7, pp. 757-763, Jul. 1997.
[2] B. Froba and C. Kublbeck, "Robust face detection at video frame rate based on edge orientation features," in Proc. Int. Conf. Autom. Face Gesture Recog., pp. 327-332, 2002.
[3] Y. Suzuki and T. Shibata, "Multiple-clue face detection algorithm using edge-based feature vectors," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., vol. 5, pp. V-737-740, 2004.
[4] C.L. Huang and Y.M. Huang, "Facial Expression Recognition Using Model-Based Feature Extraction and Action Parameters Classification," Journal of Visual Communication and Image Representation, vol. 8, no. 3, pp. 278-290, 1997.
[5] M. Yoneyama, Y. Iwano, A. Ohtake, and K. Shirai, "Facial Expressions Recognition Using Discrete Hopfield Neural Networks," in Proc. Int. Conf. Information Processing, vol. 3, pp. 117-120, 1997.
[6] H. Kobayashi and F. Hara, "Facial Interaction between Animated 3D Face Robot and Human Beings," in Proc. Int. Conf. Systems, Man, and Cybernetics, pp. 3732-3737, 1997.
[7] F. Tsalakanidou, S. Malassiotis, and M.G. Strintzis, "Face localization and authentication using color and depth images," IEEE Trans. Image Process., vol. 14, no. 2, pp. 152-168, 2005.
[8] Z. Liu, J. Yang, and N.S. Peng, "An efficient face segmentation algorithm based on binary partition tree," Signal Processing: Image Communication, vol. 20, 2005.
[9] M. Pantic and L.J.M. Rothkrantz, "Expert System for Automatic Analysis of Facial Expression," Image and Vision Computing, vol. 18, no. 11, pp. 881-905, 2000.
[10] P. Viola and M. Jones, "Robust real-time object detection," Cambridge Research Laboratory Technical Report Series, pp. 1-30, Feb. 2001.
[11] M. Wang, Y. Iwai, and M. Yachida, "Expression recognition from time-sequential facial images by use of expression change model," in Proc. IEEE Int. Conf. Autom. Face Gesture Recog., Apr. 1998, pp. 324-329.
[12] Y. Chang, C. Hu, R. Feris, and M. Turk, "Manifold based analysis of facial expression," Image Vis. Comput., vol. 24, no. 6, pp. 605-614, Jun. 2006.
[13] M. Pantic and M.F. Valstar, "Combined support vector machines and hidden Markov models for modeling facial action temporal dynamics," in Proc. IEEE Int. Conf. Comput. Vis. Pattern Recog. Workshop Human Comput. Interact., Oct. 2007, pp. 118-127.
[14] I. Patras and M. Pantic, "Dynamics of facial expression: Recognition of facial actions and their temporal segments from face profile image sequences," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 36, no. 2, pp. 433-449, Apr. 2006.
[15] M. Valstar, M. Pantic, R. Rademaker, and L. Maat, "Web-based database for facial expression analysis," in Proc. IEEE Int. Conf. Multimedia Expo., 2005, p. 5.
[16] G. Donato, J. Hager, P. Ekman, M. Bartlett, and T. Sejnowski, "Classifying facial actions," IEEE Trans. Pattern Anal. Mach. Intell., vol. 21, no. 10, pp. 974-989, Oct. 1999.
[17] G. Zhao and M. Pietikäinen, "Dynamic texture recognition using local binary patterns with an application to facial expressions," IEEE Trans. Pattern Anal. Mach. Intell., vol. 29, no. 6, pp. 915-928, Jun. 2007.
[18] G. Littlewort, M. Bartlett, M. Frank, C. Lainscsek, I. Fasel, and J. Movellan, "Fully automatic facial action recognition in spontaneous behavior," in Proc. IEEE Int. Conf. Autom. Face Gesture Recog., 2006, pp. 223-230.
[19] T. Wu, M. Bartlett, and J. Movellan, "Facial expression recognition using Gabor motion energy filters," in Proc. IEEE Int. Conf. Comput. Vis. Pattern Recog. Workshop Human Commun. Behav. Anal., Jun. 2010, pp. 42-47.
[20] Y.-I. Tian, T. Kanade, and J. Cohn, "Recognizing action units for facial expression analysis," IEEE Trans. Pattern Anal. Mach. Intell., vol. 23, no. 2, pp. 97-115, Feb. 2001.
[21] I. Matthews, S. Lucey, C. Hu, Z. Ambadar, F. de la Torre, and J. Cohn, "AAM derived face representations for robust facial action recognition," in Proc. IEEE Int. Conf. Autom. Face Gesture Recog., 2006, pp. 155-160.
[22] F. Zhou, F. De la Torre, and J. Cohn, "Unsupervised discovery of facial events," in Proc. IEEE Conf. Comput. Vis. Pattern Recog., Jun. 2010, pp. 2574-2581.
[23] J.F. Cohn, A.J. Zlochower, J.J. Lien, and T. Kanade, "Feature-Point Tracking by Optical Flow Discriminates Subtle Differences in Facial Expression," in Proc. Int. Conf. Autom. Face Gesture Recog., pp. 396-401, 1998.
[24] M.J. Black and Y. Yacoob, "Recognizing Facial Expressions in Image Sequences Using Local Parameterized Models of Image Motion," Int'l J. Computer Vision, vol. 25, no. 1, pp. 23-48, 1997.
[25] T. Otsuka and J. Ohya, "Recognition of Facial Expressions Using HMM with Continuous Output Probabilities," in Proc. Int. Workshop Robot and Human Communication, pp. 323-328, 1996.
[26] M. Pantic and L.J.M. Rothkrantz, "Automatic analysis of facial expressions: The state of the art," IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, pp. 1424-1445, Dec. 2000.
[27] Adekunle Micheal Adeshina, Siong-Hoe Lau, and Chu-Kiong Loo, "Real-Time Facial Expression Recognitions: A Review," in Proc. Conf. Innovative Technologies in Intelligent Systems and Industrial Applications, Jul. 2009.