Essay: Face tracking

Published on: July 24, 2019

Abstract: Face tracking is an important and challenging research area in computer vision and image processing. Tracking is the process of identifying the position of a moving object in a video. Detecting the face and then tracking the facial features in a moving video is considerably more challenging. The facial features tracked can be the eyes, nose and mouth. This paper describes how to track the face and then the facial features in video surveillance systems. Such tracking is also used to improve image and video quality, and its main purpose is security.


Tracking is the problem of estimating the trajectory of an object in the image plane as it moves around a scene [1]. Object tracking is an important task within the field of computer vision: it is the problem of estimating the position and other relevant information of moving objects in an image sequence. It involves detecting the moving objects of interest in a frame of a video sequence and tracking those objects in the subsequent frames [2]. Its applications include video surveillance, human-machine interfaces and robot perception. In general, object tracking requires high accuracy and involves many transcendental functions and high-precision floating-point operations, so the target tracking technology used in industry is mostly based on CPU or GPU software programming [3]. The tracking algorithm predicts the future positions of multiple moving objects from their historical locations and the current visual features. Object tracking systems have been applied in a great number of fields, such as human-computer interaction, security and surveillance, and augmented reality [4].

Nowadays, for various security reasons, surveillance systems have become more widespread and more necessary. A large class of video applications requires objects to be detected, recognized and tracked in a given scene in order to extract semantic information about scene activity and human behavior [5]. Facial feature tracking is a crucial problem in computer vision because of its wide range of applications in psychological facial expression analysis and human-computer interfaces. Advances in face video processing and compression have made face-to-face communication practical in real-world applications. Even after decades of work, however, robust and realistic real-time face tracking still poses a big challenge. The difficulty lies in a number of issues, including real-time facial feature tracking under a variety of imaging conditions (e.g., skin color, pose change, self-occlusion and deformation of multiple non-rigid features). In this paper, we focus on facial feature tracking. To detect the face in the image, we use a face detector based on Haar-like features. This face detector is fast and robust to changes in illumination [6].

Face detection is a computer technology that identifies human faces in digital images. The term may also refer to the psychological process by which humans locate faces in visual scenes in real life. Face detection is a specific case of object-class detection, in which the task is to find the locations and sizes of all objects in an image that belong to a given class; examples include torsos, cars and pedestrians. Face-detection algorithms focus on finding human faces [7] and typically determine the face region. The human face plays a vital role in social interaction and in conveying one's identity. Face detection is an important early step in many computer vision and face processing systems. Modern applications include autofocus cameras, security systems, people-counting systems and lecture attendance systems. Face detection can be combined with almost any smart system: because the face is a biometric identifier, such systems offer stronger protection and are less vulnerable to intrusion.

Fig: Proposed face tracking algorithm

In this paper, attention is focused on detecting and tracking facial features in videos. The approach combines face detection with a tracking algorithm. Tracking only frontal views of people's faces means tracking moving objects that come toward the camera observing them. Such motion shows unidirectional trajectory patterns and occurs in restricted scenarios such as corridors. However, as a person moves across the camera's field of view, both frontal and profile views of the face are likely to be captured [8].


(a) Eye Tracking: Some eye-tracking techniques rely on measuring the electrical potentials generated by the moving eye (electro-oculography) or on a metal coil in a magnetic field. Such methods are relatively cumbersome and uncomfortable for the subject (e.g., because electrodes have to be attached to the head or a coil has to be placed on the cornea). A new generation of eye trackers is now available, based on the non-invasive recording of images of a subject's eye using infrared-sensitive video technology and relying on software processing to determine the position of the subject's eyes relative to the head. Since these trackers are video based, there is no need for direct contact with the subject's eyes, which makes them much more convenient for routine eye movement recording during longer sessions. By combining the information on eye position with a measure of head position, the gaze position on a display can be estimated, allowing the creation of gaze-dependent displays [9].

(i) Eyelink Gaze tracker: The Eyelink Gazetracker (SR Research Ltd., Mississauga, Ontario, Canada) is one of these video-based eye trackers and is used in research fields such as psychology, ophthalmology, neurology and ergonomics. The Eyelink uses two high-speed cameras (CCD sensors) to track both eyes simultaneously. A third camera (also a CCD sensor) tracks four infrared markers mounted on the visual stimulus display, so that head motion can be measured and gaze position can be computed. The cameras produce images at a sampling rate of 250 Hz (4-msec temporal resolution). The Eyelink is used with a PC with dedicated hardware for the image processing necessary to determine gaze position [9].

Fig: Eye gaze tracking

Akhilesh Raj, Abishalini Sivaraman, Chandreyee Bhowmick and Nishchal K. Verma addressed the problem of computer-vision-based tracking of moving objects. Various video datasets were considered, and the most suitable algorithm for tracking the moving target was selected based on the features of each video. They developed three tracking algorithms using three different object detection methods, namely background subtraction, template matching and Speeded Up Robust Features (SURF) [10].

Ahmad Delforouzi and Marcin Grzegorzek proposed a new structure for SURF-based object tracking that uses training-based matching to address the challenges of object tracking in 360-degree videos. The proposed tracker is able to detect out-of-plane rotation and occlusion during tracking and adapts itself to handle them [11].

Liu Wancun, Tang Wenyan, Zhang Liguo, Zhang Xiaolin and Li Jiafu proposed a novel multi-scale behavior learning approach to analyze the motion pattern of an object's location and size. Experimental results show that the proposed method significantly reduces the IDS and considerably improves performance [12].

Diego Ayala and Danilo Chavez proposed an integrated vision system for the detection, location and tracking of a colored object. It uses a microprocessor to acquire image data, process it and perform actions according to the interpreted data [13].

Weisheng Li and David Powers proposed tracking multiple objects using motion vectors extracted from compressed video. The system applies statistical and clustering techniques to the motion vectors to track multiple objects in real video [14].

Zhangping He, Zhendong Zhang and Cheolkon Jung proposed fast Fourier transform networks for object tracking, called FFTNet, which are based on CF. FFTNet takes full advantage of CF, which offer high computational efficiency and competitive tracking performance [15].

Zhu Teng, Junliang Xing, Qiang Wang, Congyan Lang, Songhe Feng and Yi Jin combined a Temporal Net and a Spatial Net into a tracker that is effective for object tracking; it was evaluated on four benchmarks: OTB50, OTB100, VOT2014 and VOT2016 [16].

Yoanes Bandung, Kusprasapta Mutijarsa and Luki B. Subekti proposed integrating object tracking technology into a video conference system. The integration aims to provide better captured video content that automatically focuses on key objects or individuals in a learning activity, such as the whiteboard, teacher or students. The system can eliminate the need for a camera operator and improve the quality of distance learning services [17].

B. Maga and Mr. K. Jayasakthi Velmurgan reviewed various methods for static and moving object detection as well as for tracking moving objects. They proposed a new approach for efficient object tracking using kernel-based and feature-based tracking methods, which work better for detecting multiple objects [18].

Sun Xiaoyan and Chang Faliang proposed an adapted particle filter tracker with online learning and an inheriting selective model. Feature learning and feature inheriting help the particle filter improve the efficiency and robustness of tracking, so the method can track the target quickly and accurately [19].

Wei Han, Guang-Bin Huang and Dongshun Cui proposed a graph-learning-based tracking framework to handle object deformation and occlusion. They show that the method improves tracking robustness under large deformation and occlusion and outperforms state-of-the-art algorithms. The algorithm optimizes the graph similarity matrix until two disconnected subgraphs separate the foreground and background nodes [20].

Asti Putri Rahmadini, Prima Kristalina and Amang Sudarsono proposed a fingerprint method for mapping the observation area, which is suitable for indoor applications such as crowdsourced tracking of moving objects in an indoor environment. The KNN algorithm is used with the fingerprint method to improve the accuracy of the estimated position [21].

Wenming Cao, Yuhong Li, Zhihai He, Guitao Cao and Zhiquan He proposed a short-term tracking method that is more robust than ordinary methods for single-target tracking under occlusion. They present a weight-based keypoint matching tracker for occlusion, which applies geometrical similarity to supplement virtual keypoints and fuzzy logic to estimate the degree of occlusion [22].

Rani Aishwarya S N, Vivek Maik and Chithravathi B proposed a novel approach in which the KCF filter is integrated with a Kalman filter. The integrated Kalman-based KCF (KKCF) tracker outperforms the traditional KCF in outlier and failure cases, which are corrected by the Kalman filter. The main aim was to track moving objects more accurately and faster than other approaches [23].

Miaobin Cen and Cheolkon Jung proposed the complex form of local orientation plane (Comp-LOP) for object tracking. The proposed approach outperforms state-of-the-art trackers on large benchmark data sets [24].

Xuebai Zhang and Shyan-Ming Yuan tracked three key advertising elements (product, brand and endorser), represented by three eye movement indicators: transformed fixation time (TFT), transformed fixation number (TFN) and average gaze duration (AGD). The results indicated that three items are related to attitude toward the ad (product-related AGD, brand-related AGD and endorser-related TFT), to attitude toward the brand (brand-related TFN and AGD, endorser-related TFT), and to purchase intention (product-related AGD, brand-related TFN and endorser-related TFN). However, only two items are related to recall (product-related AGD and brand-related TFN). The data were collected from 61 participants, each shown six video ads, via eye tracking and questionnaires [25].

Subarna Tripathi and Brian Guenter presented a novel, automatic eye gaze tracking scheme inspired by smooth pursuit eye motion while playing mobile games or watching virtual reality content. The algorithm continuously calibrates an eye tracking system for a head-mounted display. It finds correspondences between corneal motion and screen-space motion and uses these to generate Gaussian process regression models. A combination of those models provides a continuous mapping from corneal position to screen-space position. Accuracy is nearly as good as that achieved with an explicit calibration step [26].

Ramona-Georgiana Vanghele and Dumitru Stanomir targeted only a small part of this research area, namely gaze detection, using properties of images and video sequences such as brightness, contrast and RGB color representation. The method uses contour sharpening and the selection of elements in the image [27].

Ashwani K. Thakur, Tejan Gunjal, Aditya Jawalkar and Aparna More used the eye to control the cursor, an application that can be described as a virtual mouse and that can support many further applications, using the Viola-Jones algorithm. The system provides fast, real-time results, and the accuracy of eye tracking was found to be approximately one degree of visual angle. It is specially designed to let people with disabilities use a computer or control a wheelchair [28].

Dionis A. Padilla, Joseph Aaron B. Adriano and Jessie R. Balbin implemented an eye tracking system on a Field Programmable Gate Array (FPGA), specifically for a text-typing application. The system was built using state machines implemented in Verilog, a hardware description language, with MATLAB used for verification. The study comprises several processes, such as a thinning algorithm and the Hough transform [29].

Kang Wang and Qiang Ji proposed a 3D model-based gaze estimation method with a single web camera, which enables fast and portable eye gaze tracking. The key idea is to leverage the proposed 3D eye-face model, from which 3D eye gaze can be estimated from observed 2D facial landmarks. The proposed system includes a 3D deformable eye-face model that is learned offline from multiple training subjects. A real-time eye tracking system running at 30 FPS also validates the effectiveness and efficiency of the proposed method [30].

Alexandru Pasarica, Radu Gabriel Bozomitu, Hariton Costin, Casian Miron and Cristian Rotariu presented a dwell-time selection method for a human-computer interface and analyzed its precision using a testing interface implemented in Matlab, which allows the quantification of clicks/selections for different areas of the screen and of the selection time required [31].

Radu Jianu and Sayeed Safayet Alam introduced the DOI approach and made the contributions necessary to apply it in practice: three concrete examples of novel eye-tracking experiments enabled by the DOI approach in distinct domains (computer science, architecture instruction and construction safety) using three different types of interactive visual content (2D, 3D, HTML); a formal DOI data model that builds on the generic EAV (entity-attribute-value) model, exemplified in the context of the three applications; and a formal range of possible and probable questions that can be asked of DOI data [32].

Chandrika K R, Amudha J and Sithu D focused on understanding the visual attention of subjects with and without programming skills in order to identify the eye tracking traits required for source code review. During source code review, subjects with programming skills show eye tracking features such as better code coverage and attention span on error lines and comments [33].

Qing Mi, Jacky Keung, Jianglin Huang and Yan Xiao designed an empirical experiment in which eye tracking technology is used to quantitatively reflect developers' cognitive effort and mental processes when they encounter the inconsistency issue. The study concerns programming style, stylistic inconsistency, eye tracking technology, code readability and program comprehension [34].

Wen-Chung Kao, Jui-Che Tsai and Yi-Chin Chiu proposed a parallel computing architecture for realizing a high-precision algorithm on a multicore microprocessor. The experimental results show that the proposed architecture could be applied to the design of a high-speed VLGT with a frame rate higher than 700 frames/s [35].

Jiancheng Zou, Honggen Zhang and Tengfan Weng used a new approach that combines image gradient information with threshold segmentation. Gradient detection and threshold segmentation are carried out in the region of interest, and the pupil and reflection spot are extracted directly. The centroid method is then used to calculate the center coordinates more accurately. The algorithm was used to develop a human eye tracking system that achieves real-time eye tracking while maintaining accuracy [36].

Anjith George and Aurobinda Routray proposed a method that uses geometrical features of the eye. In the first stage, a fast convolution-based approach is used to obtain the coarse location of the iris centre (IC). The IC location is further refined in the second stage using boundary tracing and ellipse fitting. The algorithm has been evaluated on public databases such as BioID and Gi4E and is found to outperform state-of-the-art methods [37].

(b) Nose Tracking: The nose feature is defined as the point on the nose surface that is closest to the camera. This point is termed the tip of the nose. Because of the symmetry and convex shape of the nose, the nose feature is always visible to the camera, and it stays almost the same during rotations of the head. It also does not change much as the head moves toward and away from the camera. Thus, the nose tip defined above can always be located. This is a very important property of the nose that does not hold for any other facial feature [38].

Shadman Sakib Khan, Md. Samiul Haque Sunny, M. Shifat Hossain, Eklas Hossain and Mohiuddin Ahmad implemented a Human Computer Interface (HCI) with digital image processing as a new method to control personal computers with high efficiency. The three main characteristics of this interface are nose-tracking cursor control, auto brightness control and display control based on the presence and detection of a valid human face. The proposed system is low cost and exhibits inherent security and power-saving capabilities [39].

Weiwei Zhang, Yi L. Murphey, Tianyu Wang and Qijie Xu proposed a yawning detection system that consists of a face detector, a nose detector, a nose tracker and a yawning detector. Deep learning algorithms are developed for detecting the driver's face area and nose location. A nose tracking algorithm that combines a Kalman filter with a dedicated open-source TLD (Track-Learning-Detection) tracker is developed to achieve robust tracking results under dynamic driving conditions [40].

S. Waphare, D. Gharpure, A. Shaligram and B. Botre proposed the implementation of two novel algorithms, named Surge-Spiralx and Surge-Castx, on a sniffer robot for odor plume tracking in a laminar wind environment. The algorithms have shown very good performance in terms of success ratio, with the Surge-Spiralx algorithm having less distance overhead [41].

Martin Böhme, Martin Haker, Thomas Martinetz and Erhardt Barth proposed a facial feature detector for time-of-flight (TOF) cameras that extends previous work by combining a nose detector based on geometric features with a face detector. The goal is to prevent false detections outside the area of the face. They used a very simple classifier based on an axis-aligned bounding box in feature space: pixels whose feature values fall within the box are classified as nose pixels, and all other pixels are classified as "non-nose" [42].

(c) Mouth Tracking: The term mouth tracking is used to include lip tracking, as the lips are a component of the mouth, which contains other vital cues describing the mouth (i.e., tongue, teeth, oral cavity). The lips, however, act as an invaluable feature for tracking the mouth, as in many cases the labial area gives a very good line of demarcation between the mouth and the face background [43].

Chris Fortuna, Christophe Giraud-Carrier and Joshua West proposed the HTM algorithm, which is designed to run continuously, actively counting bites throughout the day. They take a novel machine learning approach that customizes the system to each individual user and report an average accuracy of 91.8%, well above the state of the art at the time. They used a small set of 5 motion features and a Naïve Bayes model [44].

Sunil S. Morade and B. Suprava Patnaik proposed a novel active-contour-guided geometrical feature extraction approach for lip reading. Three active contour approaches (snake, region scalable fitting energy and localized active contour model) are adopted for salient geometrical feature calculation. A joint feature model, obtained by combining inner area, height and width, is proposed [45].

Luca Cappelletta and Naomi Harte presented a semi-automatic system based on nostril detection. The system is designed to work on ordinary frontal videos and to be able to recover from brief nostril occlusion. Using the nostril position, a motion-compensated Accumulated Difference Image (ADI) is generated. This ADI is less noisy than the non-compensated one, which leads to better mouth region tracking [46].

Jie Cheng and Peisen Huang proposed a novel approach for real-time mouth tracking and 3D reconstruction. The method comprises two successive processing stages. In the first stage, an AdaBoost learning algorithm and a Kalman filter are used to detect and track the mouth region in real time under a complex background. In the second stage, the resulting 2D position of the mouth is used to determine the region where the 3D shape is reconstructed by means of digital fringe projection and a modified Fourier transform method. The main contribution of this paper is the real-time dense 3D reconstruction of the mouth region, which can be useful in many applications, such as lip reading, biometrics and 3D animation [47].

Zhilin Wu, Petar S. Aleksic and Aggelos K. Katsaggelos proposed an approach consisting of a Gradient Vector Flow (GVF) snake with a parabolic template as an additional external force. Based on the results of the outer lip tracking, the inner lip is tracked using a similarity function and a temporal smoothness constraint [48].


Viola-Jones Algorithm
The Viola-Jones algorithm analyzes the input image by means of a sub-window capable of detecting features. This window is scaled to detect faces of different sizes in the image. Viola and Jones developed a scale-invariant detector that runs through the image many times, each time with a different size. Being scale invariant, the detector requires the same number of calculations per window regardless of its size. The system architecture of Viola-Jones is based on a cascade of detectors. The first stages consist of simple detectors that reject only those windows that do not contain faces. In the following stages the complexity of the detectors is increased to analyze the features in more detail. A face is detected only if the window passes through the entire cascade.

The Viola-Jones face detection algorithm scans the detector several times over the same image, each time with a new size. The detector identifies the non-face areas of an image and discards them, which leaves the face areas. To reject non-face areas, Viola-Jones takes advantage of cascading.
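The cascade and the repeated scan at growing detector sizes can be sketched as follows. This is a minimal illustration, not the authors' implementation: the stage functions are hypothetical callables that return True when a window may contain a face, and the parameter names `base`, `scale_step` and `stride` are assumptions introduced for the example.

```python
from typing import Callable, List, Sequence, Tuple

Window = Sequence[Sequence[float]]   # grayscale patch as rows of pixel values
Stage = Callable[[Window], bool]     # hypothetical stage: True means "may be a face"


def passes_cascade(window: Window, stages: List[Stage]) -> bool:
    """Accept a window only if every stage accepts it; stop at the first rejection."""
    for stage in stages:
        if not stage(window):
            return False  # rejected early, so later (costlier) stages never run
    return True


def scan(image: Window, stages: List[Stage], base: int = 24,
         scale_step: float = 1.25, stride: int = 4) -> List[Tuple[int, int, int]]:
    """Slide a square detector of growing size over the image.

    Returns the accepted windows as (x, y, size) tuples.
    """
    height, width = len(image), len(image[0])
    hits = []
    size = base
    while size <= min(height, width):
        for y in range(0, height - size + 1, stride):
            for x in range(0, width - size + 1, stride):
                patch = [row[x:x + size] for row in image[y:y + size]]
                if passes_cascade(patch, stages):
                    hits.append((x, y, size))
        # Grow the detector; max() guarantees progress for small sizes.
        size = max(size + 1, int(size * scale_step))
    return hits
```

Because most windows in a typical frame are rejected by the first one or two stages, the expensive later stages run only on a small fraction of the windows.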

The Viola-Jones algorithm is designed for real-time detection of faces in an image. Its real-time performance is achieved by using Haar-like features computed rapidly by means of integral images, feature selection using the AdaBoost (Adaptive Boosting) algorithm, and face detection with a conditional cascade [49].

A. Feature calculation
Starting from common features of faces, such as the region around the eyes being darker than the cheeks or the region of the nose being brighter than the eyes, five Haar masks were chosen for computing the features, evaluated at different positions and sizes. Haar features are calculated as the difference between the sum of the pixels in the white region and the sum of the pixels in the black region. In this way, it is possible to detect contrast differences.

Figure: Haar masks used (Type 1, Type 2, Type 3, Type 4, Type 5)

Fig: Type 2 Haar feature, showing the intensity difference between the pixels of the eye region and those of the cheek region

If we consider the mask M from the figure, the Haar feature associated with the image I behind the mask is defined by:
$$\sum_{1 \le i \le N} \sum_{1 \le j \le N} \left( I(i,j)_{\text{white}} - I(i,j)_{\text{black}} \right)$$
The features are extracted for windows of 24×24 pixels, which are moved over the image in which we want to detect faces. For such a window, the Haar masks are scaled and moved, resulting in 162,336 features. To reduce the computation time of the Haar features, which varies with the size and type of the feature, the integral image is used. The figure illustrates how the integral image is obtained from an original image and how the sum of the pixels within a rectangular region (region D) is computed using the integral image [49].

For the location (i, j), the integral image II contains the sum of the pixels above and to the left of (i, j), inclusive:

$$II(i,j) = \sum_{1 \le s \le i} \sum_{1 \le t \le j} I(s,t), \quad 1 \le i \le N, \; 1 \le j \le N$$

The sum of the pixels within a rectangle can be computed with four array references. The value of the integral image at location 1 is the sum of the pixels in rectangle A. The value at location 2 is A + C, at location 3 is A + B, and at location 4 is A + B + C + D. The sum within D can therefore be computed as 4 + 1 - (2 + 3).
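The integral image and the four-reference rectangle sum can be sketched in a few lines. The function names are illustrative, and the table is padded with an extra zero row and column (an implementation convenience, not part of the definition above) so that rectangles touching the image border need no special case. The two-rectangle feature at the end is a hypothetical example with the black region on top and the white region below, computed as white minus black.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) table; ii[i][j] is the sum of img[0:i][0:j]."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]  # extra zero row and column
    for i in range(h):
        row_sum = 0
        for j in range(w):
            row_sum += img[i][j]
            ii[i + 1][j + 1] = ii[i][j + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle using four lookups: 4 + 1 - (2 + 3)."""
    bottom, right = top + height, left + width
    return ii[bottom][right] + ii[top][left] - ii[top][right] - ii[bottom][left]


def haar_two_rect(ii, top, left, height, width):
    """Two-rectangle feature: white (lower half) minus black (upper half)."""
    half = height // 2
    black = rect_sum(ii, top, left, half, width)
    white = rect_sum(ii, top + half, left, half, width)
    return white - black


img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
print(rect_sum(ii, 1, 1, 2, 2))       # 28 (5 + 6 + 8 + 9)
print(haar_two_rect(ii, 0, 0, 2, 3))  # 9  (15 - 6)
```

Once the table is built, every rectangle sum costs the same four lookups, which is why a Haar feature is equally cheap at any scale.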

B. Feature selection using AdaBoost algorithm
As the number of Haar features for a window of 24 × 24 pixels is d = 162,336, and many of them are redundant, the AdaBoost algorithm is used to select a smaller number of features. The basic idea is to build a complex classifier (decision rule) as a weighted linear combination of weak classifiers. Every feature f is considered a weak classifier, defined by:

$$h(x, f, p, \theta) = \begin{cases} 1, & \text{if } p\,f(x) < p\,\theta \\ 0, & \text{otherwise} \end{cases}$$
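The weak classifier and its weighted combination can be illustrated as follows. The weak rule follows the definition above, with polarity p and threshold θ. The strong rule is a sketch under an assumption: it accepts when the weighted vote reaches half of the total weight, the decision threshold usually quoted for Viola-Jones, which this text does not state. The weights `alphas` stand in for values that AdaBoost training would produce.

```python
def weak_classify(feature_value: float, polarity: int, threshold: float) -> int:
    """h(x, f, p, theta): 1 if p * f(x) < p * theta, else 0."""
    return 1 if polarity * feature_value < polarity * threshold else 0


def strong_classify(feature_values, weak_params, alphas) -> int:
    """Weighted vote of weak classifiers.

    feature_values: f(x) for each selected feature
    weak_params:    (polarity, threshold) for each weak classifier
    alphas:         weight of each weak classifier (assumed to come from training)
    """
    score = sum(
        alpha * weak_classify(value, polarity, threshold)
        for value, (polarity, threshold), alpha
        in zip(feature_values, weak_params, alphas)
    )
    # Assumed decision rule: accept if the vote reaches half the total weight.
    return 1 if score >= 0.5 * sum(alphas) else 0


# Three weak classifiers vote 1, 0 and 1; weighted score 1.25 against a cutoff of 0.875.
print(strong_classify([3, 10, 1], [(1, 5), (1, 5), (-1, 0)], [1.0, 0.5, 0.25]))  # 1
```

The polarity p only flips the direction of the inequality, so the same rule covers features that are larger on faces and features that are smaller on faces.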

Kanade-Lucas-Tomasi (KLT) Algorithm
The Kanade-Lucas-Tomasi algorithm is used for feature tracking and is one of the most important algorithms for this task. It was introduced by Lucas and Kanade, and their work was later extended by Tomasi and Kanade. The algorithm detects sparse feature points that have enough texture to be tracked reliably. Here it is used to track human faces continuously across video frames. It works by finding the parameters that minimize the dissimilarity between feature points under a translational motion model [50].

Once the face has been detected, the system needs to track it. To track the face detected by the Viola-Jones algorithm, we use another widely used algorithm, the Kanade-Lucas-Tomasi (KLT) algorithm, which tracks the face based on feature points of the detected face. Viola-Jones could be run on every frame to follow the face, but this is computationally costly, and it cannot follow the face when it is tilted or turned [50].

Once the face detector locates the face, KLT tracks the feature points across the video frames, which ensures accurate tracking of the face. The workflow is as follows: first the face is detected; then the facial features to track are identified; then a tracker is initialized for those points; and finally the face is tracked. Points can be lost because of lighting conditions, out-of-plane rotation or articulated motion. To track over a long period of time, points need to be reacquired periodically [50].
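The detect, track and reacquire workflow can be sketched as a simple loop. The three callables `detect_face`, `find_points` and `track_points` are hypothetical placeholders for the face detector, the feature point detector and the KLT point tracker, and the `min_points` threshold is an assumed parameter; none of these names come from the source.

```python
def track_with_redetection(frames, detect_face, find_points, track_points,
                           min_points=10):
    """Run detection when too few points survive, otherwise keep tracking.

    Returns the number of live feature points after each frame.
    """
    points = []
    previous = None
    history = []
    for frame in frames:
        if len(points) < min_points:
            # Too few points (or first frame): detect the face and reacquire points.
            box = detect_face(frame)
            points = find_points(frame, box) if box is not None else []
        else:
            # Enough points: follow them from the previous frame to this one.
            points = track_points(previous, frame, points)
        history.append(len(points))
        previous = frame
    return history
```

With this structure the costly face detector runs only on the first frame and whenever tracking has degraded, while the cheaper point tracker handles the frames in between.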

The cascade object detector locates the face in the video frame. By default it uses the Viola-Jones algorithm and a trained classification model to detect faces in the input video frame. The input to the face detector is a video frame. The face detector reads the frame, runs the Viola-Jones face detection algorithm and returns the bounding box values around the face. The bounding box contains four values: the x coordinate, the y coordinate, the width and the height; we use a rectangle to draw these values around the face. The bounding box may contain different values for a different shape, depending on the shape one uses to outline the object. The feature points are then tracked [50].

The call points = detectMinEigenFeatures(rgb2gray(videoFrame), 'ROI', box) converts the given video frame into a grayscale image and passes it to the detectMinEigenFeatures function, which uses the minimum eigenvalue algorithm to find feature points. The detected points are then plotted on the face in the input video frame. points is an object that contains information about the feature points detected in the 2D grayscale input image. The tracker is then initialized to track the points; vision.PointTracker uses KLT to track feature points [50].
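The idea behind the minimum eigenvalue criterion can be illustrated with a small sketch. This is not the MATLAB implementation; it is a simplified version that assumes central-difference gradients and a square summation window, and the function names are illustrative. A point is a good feature when the smaller eigenvalue of the 2×2 gradient matrix is large, which means the intensity varies in two directions.

```python
import math


def gradient_matrix(img, cx, cy, radius=1):
    """Sum Ix^2, Ix*Iy and Iy^2 over a square window centred on (cx, cy)."""
    gxx = gxy = gyy = 0.0
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            ix = (img[y][x + 1] - img[y][x - 1]) / 2.0  # central difference in x
            iy = (img[y + 1][x] - img[y - 1][x]) / 2.0  # central difference in y
            gxx += ix * ix
            gxy += ix * iy
            gyy += iy * iy
    return gxx, gxy, gyy


def min_eigenvalue(gxx, gxy, gyy):
    """Smaller eigenvalue of the symmetric matrix [[gxx, gxy], [gxy, gyy]]."""
    half_trace = (gxx + gyy) / 2.0
    root = math.sqrt(((gxx - gyy) / 2.0) ** 2 + gxy ** 2)
    return half_trace - root


# A corner (intensity step in both x and y) scores higher than a straight edge.
corner = [[1 if x >= 2 and y >= 2 else 0 for x in range(5)] for y in range(5)]
edge = [[1 if x >= 2 else 0 for x in range(5)] for y in range(5)]
print(min_eigenvalue(*gradient_matrix(corner, 2, 2)))  # 0.75
print(min_eigenvalue(*gradient_matrix(edge, 2, 2)))    # 0.0
```

A straight edge has gradients in only one direction, so one eigenvalue is zero and the point cannot be tracked along the edge; the corner has two non-zero eigenvalues and is kept.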

The Iterative Lucas-Kanade Scheme:
Location of the point U on the image at pyramid level L:
$$U^L = \frac{U}{2^L}$$

Spatial gradient matrix:
$$G = \sum \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}$$

The standard Lucas-Kanade scheme is used to compute the flow d^L at level L.

Guess for the next pyramid level L-1:
$$g^{L-1} = 2\,(g^L + d^L)$$

Finally,
$$d = g^0 + d^0$$
$$V = U + d$$
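One Lucas-Kanade update at a single pyramid level can be sketched as follows. The code builds the spatial gradient matrix G over a window, builds a mismatch vector b from the difference between the two frames, and solves the 2×2 system G d = b. The vector b and the sign convention (the second frame is the first frame shifted by d) come from the standard formulation of the method and are assumptions here, since the scheme above does not spell them out. The initial guess is taken as zero and only one iteration is performed; the two helper functions mirror the pyramid relations U^L = U / 2^L and g^{L-1} = 2 (g^L + d^L).

```python
def lucas_kanade_step(img_a, img_b, cx, cy, radius=2):
    """Estimate the displacement (dx, dy) of the window centred on (cx, cy).

    img_a is the first frame and img_b the second; a zero initial guess is assumed.
    """
    gxx = gxy = gyy = bx = by = 0.0
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            ix = (img_a[y][x + 1] - img_a[y][x - 1]) / 2.0  # spatial gradient in x
            iy = (img_a[y + 1][x] - img_a[y - 1][x]) / 2.0  # spatial gradient in y
            diff = img_a[y][x] - img_b[y][x]                # frame difference
            gxx += ix * ix
            gxy += ix * iy
            gyy += iy * iy
            bx += diff * ix
            by += diff * iy
    det = gxx * gyy - gxy * gxy
    if abs(det) < 1e-12:
        raise ValueError("gradient matrix is singular: not enough texture")
    # Solve G d = b with the closed-form inverse of a 2x2 matrix.
    dx = (gyy * bx - gxy * by) / det
    dy = (gxx * by - gxy * bx) / det
    return dx, dy


def to_level(point, level):
    """Coordinates of a point at pyramid level L: U^L = U / 2^L."""
    return (point[0] / 2 ** level, point[1] / 2 ** level)


def propagate_guess(guess, flow):
    """Guess for the next finer level: g^(L-1) = 2 * (g^L + d^L)."""
    return (2 * (guess[0] + flow[0]), 2 * (guess[1] + flow[1]))
```

In the full pyramidal scheme this step is iterated at each level, starting from the coarsest one, and the result is passed down with `propagate_guess` until level 0 is reached; the singular-matrix check corresponds to rejecting points without enough texture.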


The method is based on face geometrical configuration. A face contains eyebrows, eyes, nose and mouth; a face image is sym-metric in the left and right directions; eyes are below two eyebrows; nose lies between and below two eyes; lips lie below nose; the contour of a human head can be near by an ellipse, and so on. By using the facial components as well as positional relationship between them we can locate the faces easily [51]. These real-time tracking techniques can be used to build non-intrusive vision-based user interfaces. We have used the described tracking techniques to build a system that estimates a user’s head pose, and to obtain the visual elgation for an eye-gaze tracker and a lip-reading system. These applications will be described below [52].

(i) Head Pose Tracking: – A person’s gaze direction is determined by two factors: the orientation of the head and the orientation of the eyes. But whereas the eye orientations determine the exact direction of the user’s gaze, the head orientation determines the overall gaze direction. Since we know the geometry of a face, estimating the orientation of the head is a pose estimation problem. In fact, the head pose can be estimated by finding correspondences between a number of head model points and their locations in the camera image. We have developed a system to generate the head pose using a full perspective model . The system tracks six non-coplanar facial features (eyes, lip-corners and nostrils) in real-time and generate the head pose using an algorithm proposed by DeMenthon & Davis [52].

(ii) Eye-Gaze Monitoring: – An eye-gaze monitoring system is a communication system that is very useful for people with severe physical disabilities, who can use it to perform their daily activities. The Eyegaze System is a direct-select, vision-controlled communication and control system. Eyegaze Systems are being used in homes, offices, schools, hospitals, and long-term care facilities. By looking at control keys displayed on a screen, a person can synthesize speech, control his or her environment (lights, appliances, etc.), type, operate a telephone, run computer software, operate a computer mouse and access the internet and e-mail. Eyegaze Systems are being used to write books, attend school and enhance the quality of life of people with disabilities all over the world [53].

(iii) Lip Reading: – It has been established that visual information can enhance the accuracy of speech recognition for both humans and computers. However, many other lip-reading systems require the user to keep still or to put special marks on his or her face [54].

Year | Author(s) | Method | Application area

2018 | Xuebai Zhang, Shyan-Ming Yuan | TFN, AGD, TFT, Dynamic AOI | Video advertising
2018 | Nii Longdon Sowah, Qingbo Wu | Bayesian constraint and Gaussian pattern classes algorithm | People with same color of clothing, pose changes and exit and entry of
2018 | Neilay Khasnabish, Ketan P. Detroja | Particle filter | Visual objects satisfactorily under different scenarios with less no. of particles
2018 | Woo-Seok Yang, Myung-Hyun Chun, Gun-Woo Jang, Jong-Hwan Baek, Sang-Hoon Kim | PID algorithm (Proportional, Integral, Derivative) | Aerial photography
2017 | Yoanes Bandung, Kusprasapta Mutijarsa, Luki B. Subekti | TLD (Tracking-Learning-Detection) | Video conference
2017 | B. Maga, Mr. K. Jayasakthi | Kernel feature based tracking method | Animation and painting, video surveillance
2017 | Sun Xiaoyan, Chang Faliang | Adaboost feature, Multiple Instance Learning (MIL) | Video surveillance, gesture recognition
2017 | Sehr Nadeem, Anis Rahman, Asad A. Butt | Linear SVM and Lab color | Visual navigation, video surveillance
2017 | Wei Han, Guang-Bin Huang, Dongshun Cui | Graph learning based tracking framework | Video surveillance, robotics and autonomous driving
2017 | Subarna Tripathi, Brian Guenter | Auto-calibration method using smooth pursuit eye movement | Head mounted displays
2017 | R. G. Bozomitu, A. Păsărică, V. Cehan, C. Rotariu, H. Costin | Pupil detection | Disabled neuromotor patients, marketing, virtual reality, computer games, and data security
2017 | Dionis A. Padilla, Joseph Aaron B. Adriano, Jessie R. Balbin | Thinning algorithm, Hough transform | Text typing
2017 | Kang Wang, Qiang Ji | 3D model-based gaze estimation methods | Natural head method and real time eye gaze
2017 | Ahmad Delforouzi, Marcin Grzegorzek | SURF-based object tracking | 360 degree video
2017 | Yan Li, Xianbin Cao, Junying Liu, Baochang Zhang | Near-online multi-target tracking (NOMT) framework | ATC (air traffic control) surveillance video
2017 | Weisheng Li, David Powers | Mode reduction, K-means | Multiple object tracking
2017 | Shadman Sakib Khan, Md. Samiul Haque Sunny, M. Shifat Hossain, Mohiuddin Ahmad | Viola-Jones algorithm | Assistive technologies
2017 | Felipe Vera, Víctor D. Cortés, Gabriel Iturra, Juan D. Velásquez, Pedro Maldonado | Akori platform (Advanced kernel for ocular research and web intelligence) | Web usage mining
2017 | Radu Jianu, Sayeed Safayet Alam | Visual analysis model | Pilot studies, preliminary data, and design and requirement discussions
2017 | Alexandru Pasarica, Radu Gabriel Bozomitu, Hariton Costin, Casian Miron, Cristian Rotariu | Circular Hough transform (CHT) | Neuro-motor disabilities
2016 | Akhilesh Raj, Chandreyee Bhowmick, Nishchal K. Verma, Abishalini Sivaraman | Unscented Kalman Filter, SURF | Video communication, traffic control, medical imaging and human-machine interface
2016 | Chris Fortuna, Christophe Giraud-Carrier, Joshua West | Naïve Bayes model | Hand to mouth
2016 | Hsiau Wen Lin, Yi-Hong Lin | Haar-like features | Locating the position of each eyeball
2016 | Anjith George, Aurobinda Routray | RANSAC (Random sample consensus) | Inner eye corner
2015 | Weiwei Zhang, Yi L. Murphey, Tianyu Wang, Qijie Xu | Kalman filter with a dedicated open-source TLD (Track-Learning-Detection) | Driver yawning states
2015 | Zhang Naizhong, Wen Jing, Wang Jun | YCbCr color space | Hands-free mouse
2013 | Sunil S. Morade, B. Suprava Patnaik | Region scalable fitting energy (RCFE), Localized active contour model (LACM), Snake method | Biomedical application
2010 | S. Waphare, D. Gharpure, A. Shaligram, B. Botre | Surge-spiralx and Surge-castx algorithms | 3 nose strategy
2010 | Jie Cheng, Peisen Huang | AdaBoost learning, Kalman filter | Digital fringe projection
2010 | Luca Cappelletta, Naomi Harte | Kanade-Lucas-Tomasi (KLT) tracker | The combination of a motion compensated ADI and nostril tracking gives a smooth track of the speaker's mouth


MATLAB (matrix laboratory) is a multiparadigm numerical computing environment and fourth-generation programming language. Developed by MathWorks, MATLAB allows matrix manipulations, plotting of functions and data, implementation of algorithms, creation of user interfaces, and interfacing with programs written in other languages, including C, C++, Java, and Fortran. Although MATLAB is intended primarily for numerical computing, an optional toolbox uses the MuPAD symbolic engine, allowing access to symbolic computing capabilities. An additional package, Simulink, adds graphical multi-domain simulation and Model-Based Design for dynamic and embedded systems. In 2004, MATLAB had around one million users across industry and academia. MATLAB users come from various cultures of engineering, science, and economics. MATLAB is widely used in academic and research institutions as well as industrial enterprises [55].


Computer Vision System Toolbox provides algorithms, functions, and apps for the design and simulation of computer vision and video processing systems. You can perform object detection and tracking, feature detection and extraction, feature matching, stereo vision, camera calibration, and motion detection tasks. The system toolbox also provides tools for video processing, including video file I/O, video display, object annotation, drawing graphics, and compositing. Algorithms are available as MATLAB functions, System objects and Simulink blocks [55].

Key Features:
High-level language for numerical computation, visualization, and application development.
Interactive environment for iterative exploration, design, and problem solving.
Mathematical functions for linear algebra, statistics, Fourier analysis, filtering, optimization, numerical integration, and solving ordinary differential equations.
Built-in graphics for visualizing data and tools for creating custom plots.
Development tools for improving code quality and maintainability and maximizing performance.
Tools for building applications with custom graphical interfaces.
Functions for integrating MATLAB based algorithms with external applications and languages such as C, Java, .NET, and Microsoft Excel.

Applications of MATLAB:

(i). A very large library of built-in algorithms for image processing and computer vision applications.
(ii). MATLAB allows you to test algorithms instantly without recompilation. You can type something at the command line or execute a section in the editor and immediately see the results, greatly facilitating algorithm development.
(iii). The MATLAB Desktop environment, which allows you to work interactively with your data, helps you to keep track of files and variables, and simplifies common programming/debugging tasks.
(iv). The capability to read in a wide variety of both common and domain-specific image formats.
(v). The capability to call external libraries, such as OpenCV.
(vi). Clearly written documentation with many examples, as well as online resources such as web seminars.
(vii). Bi-annual updates with new algorithms, features, and performance enhancements.
(viii). If you are already using MATLAB for other purposes, such as simulation, optimization, statistics, or data analysis, then there is a very quick learning curve for using it in image processing.
(ix). The capability to process both still images and video.
(x). Technical support from a well-staffed, professional organization.
(xi). A large user community with lots of free code and knowledge sharing.
(xii). The capability to auto-generate C code, using MATLAB Coder, for a large subset of image processing and mathematical functions, which you could then use in other environments, such as embedded systems or as a component in other software.


The detection and tracking of faces and facial features in video surveillance is a fundamental and challenging problem in computer vision. This paper describes object tracking and the tracking of facial features such as the eyes, nose and mouth in video surveillance. It also includes a comparative study of facial feature tracking methods and their applications in various fields. Future work includes pose estimation and 2D head tracking using the motion of the face across subsequent video frames, which can be enhanced further. The main challenge is improving robustness to changing environmental conditions, facial expressions and occlusions.


About this essay:

This essay was submitted to us by a student in order to help you with your studies.

If you use part of this page in your own work, you need to provide a citation, as follows:

Essay Sauce, Face tracking. Available from:<> [Accessed 08-04-20].
