Essay: Face tracking

Essay details:

  • Subject area(s): Computer science essays
  • Published on: July 24, 2019


Abstract: Face tracking is an important and challenging research area in the fields of computer vision and image processing. Tracking is the process of identifying the position of a moving object in a video. Detecting the face is itself a challenging task, and tracking its facial features through a moving video is harder still. The facial features tracked can include the eyes, nose and mouth. This paper describes how to track the face, and then the facial features, in video surveillance systems. Such tracking is also used to improve image and video quality, and its main purpose is security.


Tracking is the problem of estimating the trajectory of an object in the image plane as it moves around a scene [1]. Object tracking is an important task within the field of computer vision: the problem of estimating the position and other relevant information of moving objects in an image sequence. It involves the detection of visible moving objects in a frame of a video sequence and the tracking of those objects in subsequent frames [2]. Object tracking is essential in computer vision, with applications in video surveillance, human-machine interfaces and robot perception. In general, object tracking requires high accuracy and involves many transcendental functions and high-precision floating-point operations, so the tracking technology used in industry is mostly based on CPU or GPU software programming [3]. A tracking algorithm predicts the future positions of multiple moving objects from their historical locations and current visual features. Object tracking systems have been applied in a great number of fields, such as human-computer interaction, security and surveillance, and augmented reality [4].
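
The prediction step described above can be sketched with a minimal constant-velocity motion model; this is an illustrative stand-in for the Kalman-style predictors used in practice, and the function name and state layout are assumptions, not taken from any cited work:

```python
# Minimal constant-velocity motion model: predict the next position of a
# tracked object from its two most recent observed positions.
# Illustrative sketch only, not the predictor of any cited tracker.

def predict_next(prev, curr):
    """Predict the next (x, y) position assuming constant velocity."""
    vx = curr[0] - prev[0]  # per-frame displacement in x
    vy = curr[1] - prev[1]  # per-frame displacement in y
    return (curr[0] + vx, curr[1] + vy)

# Example: a face centre observed at (100, 50), then at (104, 52).
print(predict_next((100, 50), (104, 52)))  # -> (108, 54)
```

Real trackers replace this fixed motion model with a filter (e.g. a Kalman filter) that also weighs measurement noise, but the predict step has the same shape.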

Nowadays, for various security reasons, surveillance systems have become more popular and more necessary. A large class of video applications requires objects to be detected, recognized and tracked in a given scene in order to extract semantic information about scene activity and human behavior [5]. Facial feature tracking is a crucial problem in computer vision due to its wide range of applications in psychological facial expression analysis and human-computer interfaces. Today, advances in face video processing and compression have made face-to-face communication practical in real-world applications. Yet after decades of work, robust and realistic real-time face tracking still poses a big challenge. The difficulty lies in a number of concerns, including real-time facial feature tracking under a variety of imaging conditions (e.g., skin color, pose change, self-occlusion and multiple non-rigid feature deformations). In this paper, we focus our work on facial feature tracking. To detect the face in the image, we have used a face detector based on Haar-like features. This face detector is fast and robust to varying illumination conditions [6].

Face detection is a computer technology that locates human faces in digital pictures. Face detection may also refer to the mental process by which humans locate faces in visual scenes in real life. Face detection is a special case of object-class detection, in which the main task is to find the sizes and locations of all objects in a picture belonging to a given class; examples include torsos, cars and pedestrians. Face-detection algorithms focus on finding human faces [7] and typically determine the proper face area. The human face plays a vital role in social interaction and in conveying one's identity, so face detection is a relevant early step in many computer systems and face processing systems. Some modern applications that use face detection are autofocus cameras, practical security systems, people-counting systems, lecture attendance systems, etc. Face detection can be combined with almost any smart system: since the face is a biometric identifier, it ensures superior protection and resistance to intrusion.

Fig.: Proposed face tracking algorithm

In this paper, attention is focused on detecting and tracking facial features in videos. The approach is designed to combine face detection with a tracking algorithm. Tracking entirely frontal views of people's faces means tracking moving objects coming toward the camera observing them. These unidirectional patterns of people's trajectories occur in restricted scenarios such as corridors. However, while a person moves across the camera's field of view, both the frontal and the profile views of the face are likely to be captured [8].


(a) Eye Tracking: Eye-tracking techniques exist that rely on measuring electrical potentials generated by the moving eye (electro-oculography) or on a metal coil in a magnetic field, but such methods are relatively cumbersome and uncomfortable for the subject (e.g., because electrodes have to be attached to the head or a coil has to be placed on the cornea). A new generation of eye trackers is now available, based on remote recording of images of a subject's eye using infrared-sensitive video technology and relying on software processing to determine the position of the subject's eyes relative to the head. Since these trackers are video based, there is no need for direct contact with the subject's eyes, making them much more convenient for routine eye movement recording during longer sessions. By combining the information on eye position with a measure of head position, an estimate of gaze position on a display can be obtained, allowing the creation of gaze-dependent displays [9].

(i) EyeLink Gaze Tracker: The EyeLink Gaze Tracker (SR Research Ltd., Mississauga, Ontario, Canada) is one of these video-based eye trackers and is used in such research fields as psychology, ophthalmology, neurology and ergonomics. The EyeLink uses two high-speed cameras (CCD sensors) to track both eyes simultaneously. A third camera (also a CCD sensor) tracks four infrared markers mounted on the visual stimulus display, so that head motion can be measured and gaze position computed. The cameras produce images at a sampling rate of 250 Hz (4-msec temporal resolution). The EyeLink is used with a PC with dedicated hardware that performs the image processing necessary to resolve gaze position [9].

Fig.: Eye gazing

Akhilesh Raj, Abishalini Sivaraman, Chandreyee Bhowmick and Nishchal K. Verma addressed the problem of computer-vision-based tracking of a moving object. Various video datasets were considered, and the most suitable algorithm was selected for tracking the moving target based on different features of the videos. They developed three trackers using three different object detection algorithms, namely background subtraction, template matching and Speeded-Up Robust Features (SURF) [10].

Ahmad Delforouzi and Marcin Grzegorzek proposed a new structure of SURF-based object tracking that uses train-based matching to address the challenge of object tracking in 360-degree videos. The proposed tracker is able to detect out-of-plane rotation and occlusion during tracking and adapt itself to handle them [11].

Liu Wancun, Tang Wenyan, Zhang Liguo, Zhang Xiaolin and Li Jiafu proposed a novel multi-scale behavior learning approach to analyze the motion pattern of an object's location and size. Experimental results validate that the proposed method reduces identity switches (IDS) significantly and improves performance considerably [12].

Diego Ayala and Danilo Chavez proposed the development of an integrated vision system for the detection, location and tracking of a colored object; it uses a microprocessor to acquire image data, process it, and perform actions according to the interpreted data [13].

Weisheng Li and David Powers proposed tracking multiple objects dynamically using motion vectors extracted from compressed video. The system applies statistical and clustering techniques to the motion vectors to track multiple objects in real video [14].

Zhangping He, Zhendong Zhang and Cheolkon Jung proposed fast Fourier transform networks for object tracking, called FFTNet, which is based on correlation filters (CF). FFTNet takes full advantage of CF, which offer high computational efficiency and competitive tracking performance [15].

Zhu Teng, Junliang Xing, Qiang Wang, Congyan Lang, Songhe Feng and Yi Jin combined a Temporal Net and a Spatial Net, a combination that is effective in object tracking; it is evaluated on four benchmarks, including OTB50, OTB100, VOT2014 and VOT2016 [16].

Yoanes Bandung, Kusprasapta Mutijarsa and Luki B. Subekti proposed integrating object tracking technology into a video conference system. This integration aims to provide better captured video content that can be automatically focused on key objects or individuals in a learning activity, such as the whiteboard, the teacher or the students. The system can eliminate the need for a camera operator and improve the quality of distance learning services [17].

B. Maga and K. Jayasakthi Velmurgan reviewed various methods for static and moving object detection as well as for tracking moving objects. A new approach is provided for efficient object tracking using kernel-based and feature-based tracking methods, which work better for detecting multiple objects [18].

Sun Xiaoyan and Chang Faliang proposed an adaptive particle filter tracker with online learning and an inheriting selective model. Feature learning and feature inheritance help the particle filter improve the efficiency and robustness of tracking, so the method can track the target quickly and accurately [19].

Wei Han, Guang-Bin Huang and Dongshun Cui proposed a graph-learning-based tracking framework to handle object deformation and occlusion. They show that this method can improve tracking robustness under large deformation and occlusion and outperform state-of-the-art algorithms. The algorithm optimizes the graph similarity matrix until two disconnected subgraphs separate the foreground and background nodes [20].

Asti Putri Rahmadini, Prima Kristalina and Amang Sudarsono proposed a fingerprinting method for mapping the observation area, which is suitable for indoor applications such as crowd-sourced tracking of moving objects in indoor environments. A KNN algorithm is used with the fingerprinting method to improve the accuracy of the moving position estimate [21].

Wenming Cao, Yuhong Li, Zhihai He, Guitao Cao and Zhiquan He proposed a short-term tracking method that is more robust than ordinary methods for single-target tracking under occlusion. They built a weight-based key-point matching tracker for occlusion, in which geometrical similarity is applied to supplement virtual key points and fuzzy logic is used to estimate the degree of occlusion [22].

Rani Aishwarya S N, Vivek Maik and Chithravathi B proposed a novel approach in which the KCF tracker is improved by integrating it with a Kalman filter. The integrated Kalman-based KCF (KKCF) tracker outperforms the traditional KCF by handling outlier and failure cases, which are corrected by the Kalman filter. The main aim was to track moving objects more accurately and faster than other approaches [23].

Miaobin Cen and Cheolkon Jung proposed a composite local orientation plane (Comp-LOP) for object tracking. The proposed approach outperforms state-of-the-art trackers on large benchmark datasets and delivers good performance in object tracking [24].

Xuebai Zhang and Shyan-Ming Yuan tracked three key advertising elements (product, brand and endorser), represented by three eye-movement indicators: transformed fixation time (TFT), transformed fixation number (TFN) and average gaze duration (AGD). The results indicated that the three elements are related to attitude toward the ad (product-related AGD, brand-related AGD and endorser-related TFT), attitude toward the brand (brand-related TFN and AGD, endorser-related TFT), and purchase intention (product-related AGD, brand-related TFN and endorser-related TFN). However, only two of the items are related to recall (product-related AGD and brand-related TFN). The data were collected from 61 participants, each shown six video ads, via eye tracking and questionnaires [25].

Subarna Tripathi and Brian Guenter presented a novel automatic eye gaze tracking scheme inspired by smooth-pursuit eye motion while playing mobile games or watching virtual reality content. Their algorithm continuously refines an eye tracking system for a head-mounted display. It compares corneal motion with screen-space motion and uses these to fit Gaussian Process Regression models; a combination of those models provides a continuous mapping from corneal position to screen-space position. Accuracy is nearly as good as that produced by an explicit calibration step [26].

Ramona-Georgiana Vanghele and Dumitru Stanomir targeted only a small part of this research area, namely gaze detection, using properties of images and video sequences such as brightness, contrast and RGB color representation. The method uses techniques of contour sharpening and of selecting elements in the image [27].

Ashwani K. Thakur, Tejan Gunjal, Aditya Jawalkar and Aparna More proposed an approach that lets the eye control the cursor, an application that can be called a virtual mouse and can drive many further applications, using the Viola-Jones algorithm. The system provides fast, real-time results, with the accuracy of eye tracking found to be approximately one degree of visual angle. It is especially designed for handicapped people to use a computer or to control a wheelchair [28].

Dionis A. Padilla, Joseph Aaron B. Adriano and Jessie R. Balbin implemented an eye tracking system, specifically a text-typing application, on a Field Programmable Gate Array (FPGA). The system was built using state machines implemented in Verilog, a hardware description language, with MATLAB used for verification. The study comprises several algorithms and processes, including a thinning algorithm and the Hough transform [29].

Kang Wang and Qiang Ji proposed a 3D model-based gaze estimation method with a single web camera, which enables fast and portable eye gaze tracking. The key idea is to leverage a proposed 3D eye-face model, from which 3D eye gaze can be determined from observed 2D facial landmarks. The system includes a 3D deformable eye-face model that is learned offline from multiple training subjects. A real-time eye tracking system running at 30 FPS also validates the effectiveness and capability of the proposed method [30].

Alexandru Pasarica, Radu Gabriel Bozomitu, Hariton Costin, Casian Miron and Cristian Rotariu presented a selection method and an analysis of its precision, using a testing interface implemented in MATLAB that quantifies the clicks/selections for different areas of the screen and the selection time required. The tools involved are a human-computer interface and dwell time, which define the selection method implemented and analyzed [31].

Radu Jianu and Sayeed Safayet Alam introduced the DOI approach and made the contributions necessary to apply it in practice: three concrete examples of novel eye-tracking experiments enabled by the DOI approach in distinct domains (computer science, architecture instruction and construction safety) using three different types of interactive visual content (2D, 3D, HTML); a formal DOI data model that builds on the generic EAV (entity-attribute-value) model, exemplified in the context of the three applications; and a formal range of possible and probable questions that can be asked of DOI data [32].

Chandrika K R, Amudha J and Sithu D focused on understanding the visual attention of subjects with and without programming skills and on identifying the eye tracking traits required for source code review. During source code review, subjects with programming skills are expected to exhibit certain eye tracking features, such as better code coverage and longer attention spans on error lines and comments [33].

Qing Mi, Jacky Keung, Jianglin Huang and Yan Xiao, to bridge the research gap, designed an empirical experiment in which eye tracking technology is introduced to quantitatively reflect developers' cognitive effort and mental processes when encountering stylistic inconsistency. The topics involved are programming style, stylistic inconsistency, eye tracking technology, code readability and program comprehension [34].

Wen-Chung Kao, Jui-Che Tsai and Yi-Chin Chiu proposed a parallel computing architecture for realizing a high-precision algorithm on a multicore microprocessor. The empirical results show the proposed architecture can be applied to the design of a high-speed VLGT with a frame rate higher than 700 frames/s [35].

Jiancheng Zou, Honggen Zhang and Tengfan Weng used a new approach that combines image gradient information with threshold segmentation. Gradient detection and threshold segmentation are carried out in the region of interest, and the pupil and reflection spot are extracted directly. The paper uses the centroid method to calculate the center coordinates more accurately. The algorithm was used to develop a human eye tracking system that achieves real-time eye tracking while ensuring accuracy [36].

Anjith George and Aurobinda Routray proposed a method that uses geometrical features of the eye. In the first stage, a fast convolution-based approach is used to obtain the coarse location of the iris centre (IC). The IC location is further refined in the second stage using boundary tracing and ellipse fitting. The algorithm has been evaluated on public databases such as BioID and Gi4E and is found to outperform state-of-the-art methods [37].

(b) Nose Tracking: The nose feature is defined as the point on the nose surface that is closest to the camera; this point is termed the tip of the nose. Due to the symmetry and convex shape of the nose, the nose feature is always visible to the camera, and it stays almost the same during rotations of the head. It also does not change much as the head moves toward or away from the camera. Thus, the nose tip defined above can always be located. This is a very important property of the nose which does not hold for any other facial feature [38].
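
Under this definition, locating the nose tip in a depth image reduces to finding the pixel closest to the camera. A minimal sketch follows; the depth-map layout and function name are illustrative assumptions, not taken from the cited work, and a real system would first restrict the search to a detected face region and smooth the depth values against sensor noise:

```python
# Locate the nose tip as the closest point to the camera in a depth map.
# Illustrative sketch only: assumes the map already covers the face region.

def nose_tip(depth):
    """Return (row, col) of the minimum-depth pixel in a 2D depth map."""
    best = None
    best_d = float("inf")
    for r, row in enumerate(depth):
        for c, d in enumerate(row):
            if d < best_d:
                best_d = d
                best = (r, c)
    return best

# Toy 3x3 depth map (values in metres); the smallest value is the nose tip.
depth_map = [
    [0.80, 0.78, 0.80],
    [0.77, 0.62, 0.76],
    [0.81, 0.79, 0.82],
]
print(nose_tip(depth_map))  # -> (1, 1)
```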

Shadman Sakib Khan, Md. Samiul Haque Sunny, M. Shifat Hossain, Eklas Hossain and Mohiuddin Ahmad implemented a Human-Computer Interface (HCI) with digital image processing and a new method to control personal computers with high efficiency. The three main characteristics of this interface are nose-tracking cursor control, automatic brightness control, and display control based on the presence and detection of a valid human face. The proposed system is low cost and exhibits inherent security and power-saving capabilities [39].

Weiwei Zhang, Yi L. Murphey, Tianyu Wang and Qijie Xu proposed a yawning detection system that consists of a face detector, a nose detector, a nose tracker and a yawning detector. Deep learning algorithms are refined for detecting the driver's face area and nose location. A nose tracking algorithm that combines a Kalman filter with a dedicated open-source TLD (Tracking-Learning-Detection) tracker is developed to achieve robust tracking results under dynamic driving conditions [40].

S. Waphare, D. Gharpure, A. Shaligram and B. Botre proposed the implementation of two novel algorithms, named Surge-Spiralx and Surge-Castx, on a sniffer robot for odor plume tracking in a laminar wind environment. The algorithms have shown very good performance in terms of success ratio, with the Surge-Spiralx algorithm having less distance overhead [41].

Martin Böhme, Martin Haker, Thomas Martinetz and Erhardt Barth proposed a facial feature detector for time-of-flight (TOF) cameras that extends previous work by combining a nose detector based on geometric features with a face detector. The goal is to prevent false detections outside the area of the face. They used a very simple classifier based on an axis-aligned bounding box in feature space: pixels whose feature values fall within the box are classified as nose pixels, and all other pixels are classified as "non-nose" [42].

(c) Mouth Tracking: The term mouth tracking is used here to include lip tracking, as the lips are one component of the mouth, which contains other vital cues describing it (i.e., tongue, teeth, oral cavity). The lips, however, act as an invaluable feature for tracking the mouth, as in many cases the labial area gives a very good line of demarcation between the mouth and the face background [43].

Chris Fortuna, Christophe Giraud-Carrier and Joshua West proposed an HTM algorithm designed to run continuously, actively counting bites throughout the day. They take a novel machine learning approach to customize the system to each individual user, achieving an average accuracy of 91.8%, well above the current state of the art. They used a small set of five motion features and a Naïve Bayes model [44].

Sunil S. Morade and Suprava Patnaik proposed a novel active-contour-guided geometrical feature extraction approach for lip reading. Three active contour approaches, the snake, the region-scalable fitting energy method and the localized active contour model, are adopted for salient geometrical feature calculation. A joint feature model, obtained by combining inner area, height and width, has been proposed [45].

Luca Cappelletta and Naomi Harte presented a semi-automatic system based on nostril detection. The system is designed to work on ordinary frontal videos and to be able to recover from brief nostril occlusion. Using the nostril position, a motion-compensated Accumulated Difference Image (ADI) is generated. This ADI is less noisy than the non-compensated one, which leads to better mouth region tracking [46].

Jie Cheng and Peisen Huang proposed a novel approach for real-time mouth tracking and 3D reconstruction. The method comprises two successive processing stages. In the first stage, an AdaBoost learning algorithm and a Kalman filter are used to detect and track the mouth region in real time against a complex background. In the second stage, the resulting 2D position of the mouth is used to determine the region where the 3D shape is reconstructed, using digital fringe projection and a modified Fourier transform method. The main contribution of this paper is the real-time dense 3D reconstruction of the mouth region, which can be useful in many applications such as lip reading, biometrics and 3D animation [47].

Zhilin Wu, Petar S. Aleksic and Aggelos K. Katsaggelos proposed an approach consisting of a Gradient Vector Flow (GVF) snake with a parabolic template as an additional external force. Based on the results of outer lip tracking, the inner lip is tracked using a similarity function and a temporal smoothness constraint [48].


Viola Jones Algorithm
The Viola-Jones algorithm is based on analyzing the input image by means of a sub-window capable of detecting features. This window is scaled to detect faces of different sizes in the image. Viola and Jones developed a scale-invariant detector that runs through the image many times, each time with a particular size. Being scale invariant, the detector requires the same number of calculations regardless of the size of the image. The system architecture of Viola-Jones is based on a cascade of detectors. The first stages consist of simple detectors that reject only those windows which do not contain faces. In the following stages, the complexity of the detectors is increased to analyze the features in more detail. A face is detected only if it passes through the entire cascade.

The Viola-Jones face detection algorithm runs the detector several times through the same image, each time with a new size. The detector identifies non-face areas in an image and discards them, which leaves the face areas detected. To reject non-face areas quickly, Viola-Jones takes advantage of cascading.

The Viola-Jones algorithm is designed for real-time detection of faces in an image. Its real-time performance is achieved by using Haar-type features, computed rapidly using integral images, feature selection using the AdaBoost algorithm (Adaptive Boosting), and face detection with a conditional cascade [49].
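
The cascade idea above can be sketched as a chain of increasingly strict stage classifiers, where a window survives only if every stage accepts it. In this sketch the stage scores and thresholds are illustrative placeholders, not the trained AdaBoost stage classifiers of the actual detector:

```python
# Sketch of a Viola-Jones style cascade: a window is accepted as a face
# only if it passes every stage; most non-face windows are rejected early,
# which is what makes the cascade fast.
# The stage scores and thresholds here are illustrative placeholders.

def cascade_classify(stage_scores, thresholds):
    """Return True if the window's score passes every stage threshold."""
    for score, t in zip(stage_scores, thresholds):
        if score < t:
            return False  # rejected early: later stages never run
    return True

thresholds = [0.2, 0.5, 0.8]    # later stages are stricter
face_like = [0.9, 0.7, 0.85]    # passes all three stages
background = [0.9, 0.3, 0.1]    # rejected at the second stage
print(cascade_classify(face_like, thresholds))   # -> True
print(cascade_classify(background, thresholds))  # -> False
```

The design point is that the cheap early stages discard the vast majority of windows, so the expensive later stages run on only a few candidates.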

A. Feature calculation
Starting from common features of faces, such as the region around the eyes being darker than the cheeks or the region of the nose being brighter than that of the eyes, five Haar masks were chosen for estimating the features, measured at different positions and sizes. Haar features are calculated as the difference between the sum of the pixels in the white region and the sum of the pixels in the black region. In this way, it is possible to detect contrast differences.
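
The white-minus-black sums above become cheap with an integral image, in which the sum over any rectangle takes only four table lookups. A minimal sketch of a two-rectangle feature follows; the helper names and the toy image are illustrative assumptions, not from the cited material:

```python
# Compute a two-rectangle Haar-like feature using an integral image.
# ii[r][c] holds the sum of all pixels above and to the left of (r, c),
# so the sum over any rectangle needs only four lookups.

def integral_image(img):
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        for c in range(w):
            ii[r + 1][c + 1] = (img[r][c] + ii[r][c + 1]
                                + ii[r + 1][c] - ii[r][c])
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of the rectangle with the given top-left corner and size."""
    return (ii[top + height][left + width] - ii[top][left + width]
            - ii[top + height][left] + ii[top][left])

def haar_two_rect(ii, top, left, height, width):
    """White (left half) minus black (right half) of a 2-rect feature."""
    half = width // 2
    white = rect_sum(ii, top, left, height, half)
    black = rect_sum(ii, top, left + half, height, half)
    return white - black

# 2x4 image: bright left half, dark right half -> strong positive response.
img = [[9, 9, 1, 1],
       [9, 9, 1, 1]]
ii = integral_image(img)
print(haar_two_rect(ii, 0, 0, 2, 4))  # -> 32 (white 36 minus black 4)
```

Because every rectangle sum costs four lookups regardless of its size, features at all scales and positions are equally cheap, which is what makes the many passes of the detector affordable.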

Figure: Haar masks used (Types 1-5)

