Abstract—One of the main goals of a safe driving system is to protect the driver, passenger(s), car, and surrounding environment against accidents caused by external and internal factors. Driver fatigue, one of the major internal factors, is a leading cause of traffic accidents according to a National Highway Traffic Safety Administration (NHTSA) survey. It is therefore necessary to build a driver fatigue monitoring system. We propose a technique based on optical imaging from a digital camera installed on the dashboard. The camera detects and tracks the driver's face. From the face, non-contact photoplethysmography (PPG) can be applied to obtain multiple physiological signals such as heart rate (HR), heart rate variability (HRV), and respiratory rate (RR). These physiological signals can be used to measure the fatigue level. Changes in facial features such as the eyes, head, and mouth can also be used to observe driver fatigue. We propose the supervised descent method (SDM) with the scale-invariant feature transform (SIFT) to extract the facial features. To classify the fatigue level from these multiple parameters, a support vector machine (SVM) will be employed.
Keywords—fatigue detection; safety driving; non-contact PPG; SVM
I. INTRODUCTION
Safe driving systems are currently “magnetizing” various technologies for avoiding collisions between a vehicle and the objects around it [1]-[5]. Several technologies have been developed for safe driving systems, such as a hazard alert system using infrastructure-to-vehicle communication [6], a lateral control assistance system [7], and driver strain estimation [8].
All of these safe driving technologies are built to protect the driver, passenger(s), car, and surrounding environment against accidents caused by various external and internal factors. One of the dominant internal factors is driver fatigue, which causes hundreds of thousands of car accidents every year according to the National Highway Traffic Safety Administration (NHTSA) [9]. Thus, several fatigue monitoring technologies have been built into safe driving systems [10]-[23].
Fatigue itself is a condition in which performance capabilities are temporarily impaired by continual activity demands that exceed the ongoing capacity to restore those capabilities [24]. A collision has a high probability of occurring when the driver of a vehicle suffers from psychological fatigue, a temporary impairment of information acquisition and processing capabilities [25]. The effects of physiological fatigue manifest in four categories: motor (behavior), cognitive (perception and information processing), physiological (regulation of the vegetative and nervous systems), and subjective (experience). Except for the subjective category, all of these categories are typically amenable to observation and measurement [26].
According to these categories of fatigue effects, several fatigue monitoring approaches have been built. In the cognitive category, indirect behaviors of the vehicle, such as steering wheel movement and lateral position, have been used to develop fatigue detection [7], [10]-[13]. The driver's perception and alertness are reduced in a fatigued condition, and this can be observed in the indirect behaviors of the vehicle. These methods are not intrusive, but they are constrained by limitations such as geometric characteristics, road states, vehicle type, etc. [27]. Monitoring changes in the physical condition of facial features can be used for fatigue detection based on the motor category. When people are fatigued, they show visual behaviors that are easily observable from changes in their facial features, such as the eyes, head, and face [14]-[18]. However, monitoring the movement of facial features using a helmet, special contact lens, or any other sensor that requires physical contact is hard to implement [28]. The most promising fatigue detection technologies are based on the physiological category, using signals such as brain signals, HR, HRV, and RR [29]. However, these technologies generally require sensors attached to the skin, such as electrodes, which may cause undesirable skin irritation, discomfort, and soreness [30], [31].
Accordingly, in this paper we propose a novel fatigue detection system based on the cognitive and physiological categories. To overcome the drawbacks of existing technologies in both categories, we propose a non-contact method that uses optical imaging from a camera to obtain the driver's state. Furthermore, our proposed method combines several fatigue detection parameters using a single camera and classifies the fatigue level from those parameters using an SVM.
II. RELATED WORK
Research and development of fatigue detection techniques has been carried out in various settings, especially for safe driving assistance systems. Some techniques are developed using image processing to obtain the driver's state and avoid the drawbacks of sensors that need physical contact with the driver. Other techniques focus on measuring driver performance. An overview of these existing methods for fatigue detection follows:
A. Fatigue Monitoring based-on Driver Performance
Development of driver fatigue monitoring based on driver performance has mostly focused on two types. The first type is based on lane tracking. The second type is based on the distance between the vehicle and other vehicles nearby, and is usually combined with the first type.
Reference [10] presented fatigue detection based on lane departure detection, calculated from the distance between the vehicle and the lane marker. To detect the lane, they applied image processing algorithms to images captured by a camera attached near the rear-view mirror and focused on the road. When the vehicle runs on the wrong side for a period of time (lane departure), driver fatigue is detected. In other research [11], driver fatigue monitoring was based on a system identification model, using the vehicle's lateral position as the input and the steering wheel position as the output. The system developed a model and continually updated its parameters during driving; the changes in the model parameters can be used as driver fatigue indicators. The research team in [12] utilized a functional neuro-fuzzy network (FNFN) to estimate the distance to the vehicle in front. If that distance is too short, the system gives an alarm to the driver, which indicates that the driver is in a fatigued condition. The authors of [13] proposed a driver fatigue monitoring system using the driver's pedal control pattern with respect to the driver's front-view situation. The driver's fatigue level can be obtained from the response patterns of the driver's pedal and the front-view environment.
Monitoring driver fatigue through driver performance is auspicious. These methods recognize driver fatigue indirectly. Even though they are promising, they have limitations such as geometric characteristics, road states, vehicle type, etc. [27].
B. Fatigue Monitoring based-on Driver State
Some visual behaviors can be observed when people are in a fatigued state. Their state can be seen in changes of facial features such as the eyes, head, and face. To extract information from these facial features, image processing of the driver's face image is required.
In [14], a driver fatigue detection system was based on the driver's eye state. The system used the HSI color model to detect the driver's face from the camera image, then located the driver's eye position within the detected face using the Sobel edge operator. The last step converted the detected eye region back to the HSI model to distinguish eyeball pixels and determine whether the eyes were open or closed. If the eyes stay closed for more than 5 consecutive frames, the driver is judged to be fatigued. This method is vulnerable to lighting changes. Research in [15] focused on fatigue detection based on eyelid distance. The driver's face was located using skin-color characteristics, and the eyes were then tracked using dynamic templates. If the eyelid distance is near zero, the eye is concluded to be closed. The way it judged driver fatigue is the same as in [14]. Meanwhile, Q. Ji, Z. Zhu, and P. Lan developed driver fatigue monitoring utilizing information fusion with Bayesian networks (BN) over extracted visual cues and contextual information. The extracted visual cues included eyelid movement, head movement, gaze, and facial expression, while the contextual information contained physical fitness, sleep history, time of day, and temperature. A probabilistic model was developed to model human fatigue and to predict fatigue from this information. Furthermore, this method was strengthened by using an IR camera, so it could work even in dark conditions [16]. Other researchers have also tried to use another facial feature, the mouth. L. Li, Y. Chen, and Z. Li developed yawning detection for monitoring driver fatigue. The mouth was located using Haar-like features, and the ratio of the detected mouth's height and width was used to detect yawning [17]. As in [16], Xiao Fan, Baocai Yin, and Yanfeng Sun used a BN to determine the confidence level of driver fatigue from two visual cues: the eye and the mouth, detected using local binary pattern (LBP) features and Gabor features with LDA, respectively [18]. The disadvantage of mouth-based methods is that not all fatigued drivers yawn, and some people tend to close or cover the mouth when yawning.
C. Fatigue Monitoring based-on Driver Physiology
A promising approach to fatigue detection is the analysis of physiological signals [29]. Physiological signals that can be analyzed for fatigue detection include the electroencephalogram (EEG), electrooculography (EOG), HRV, etc.
Research by S.K.L. Lal, A. Craig, P. Boord, L. Kirkup, and H. Nguyen showed substantial relationships between an EEG algorithm for detecting fatigue and drowsiness under simulated conditions [21]. Another research team, Noguchi, R. Nopsuwanchai, M. Ohsuga, and Y. Kamakura, showed the possibility of assessing a driver's arousal level by using EOG waveforms to analyze blinking image sequences [22]. G. Li and W. Y. Chung analyzed driver fatigue using HRV wavelets and classified it with an SVM [23].
As mentioned before, methods that use physiological signals promise the best accuracy. However, most of them rely on electrodes that have to be attached to the driver. This is uncomfortable and may cause undesirable skin irritation and soreness [30], [31].
III. PROPOSED SYSTEM
In this paper we propose a system for fatigue detection that fuses several parameters from the driver's state and physiological signals. To obtain those parameters, our proposed system performs optical image processing on the driver's face. From the driver's facial features we can obtain fatigue-related information from the visual behavior of the eyes, mouth, and head pose. Beyond visual behavior, it is also possible to obtain physiological parameters such as HR, HRV, and RR from facial image sequences [32]. The driver's facial images will be captured by an IR camera installed inside the car, facing the driver. An illustration of the IR camera's placement can be seen in Figure 1. Every fatigue indicator derived from these parameters will be fed to the SVM to classify the fatigue level. A detailed explanation of our proposed system follows:
Figure 1. An illustration of an infrared camera as a fatigue detection sensor on a safe driving car
A. Face Detection and Facial Features Extraction
First of all, our system requires the driver's face image to process all the other parts. For initialization, we will use the Viola-Jones object detection method with a boosted cascade classifier for frontal faces. The Viola-Jones detector returns x- and y-coordinates along with the height and width that define a rectangle around the user's face image [33].
To locate key points of facial features in the captured face image, we plan to use the scale-invariant feature transform (SIFT). SIFT key points of facial features from a set of reference facial images are extracted and stored in a database [34]. We use 49 key points to locate the face shape, nose, eyes, eyebrows, and mouth.
To improve face image alignment using SIFT, we propose to use the SDM to minimize a non-linear least squares (NLS) function, since the SIFT feature mapping is non-linear. The SDM learns descent directions that minimize the mean of NLS functions sampled at different points. When applied to face detection and tracking, the SDM minimizes the NLS objective using the learned descent directions without computing the Jacobian or the Hessian. Thus, face detection and facial feature extraction can be done very quickly [35].
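The SDM update can be sketched as a cascade of learned linear maps applied to the current landmark estimate (a minimal illustration; `phi` stands in for the SIFT features extracted around each landmark, and the toy identity feature in the example below is purely hypothetical):

```python
import numpy as np

def sdm_update(x0, phi, descent_maps):
    """Refine a landmark vector by a cascade of learned descent maps.

    x0: initial landmark coordinates, shape (2L,)
    phi: feature extractor mapping landmarks to a feature vector
    descent_maps: list of (R_k, b_k) pairs learned offline from training data
    """
    x = x0.copy()
    for R, b in descent_maps:
        # One generic descent step: no Jacobian or Hessian is computed online.
        x = x + R @ phi(x) + b
    return x
```

With identity features phi(x) = x and a single learned map R = -0.5 I, b = 0, each step halves the estimate, mimicking a contraction toward the optimum.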
B. Fatigue Detection on Driver’s State
After the facial features have been extracted from an image sequence of the face, we can analyze fatigue in the eyes, mouth, and head pose with several methods. The methods vary depending on the facial feature being analyzed. The methods we propose are as follows:
1) Fatigue analysis on the eyes: After the eye regions have been detected, we obtain points on the top and bottom of the eyelids. When the eyes close, the distance between those points decreases. We consider the eyes closed if the eyelid distance decreases by more than 80%. In theory the distance should be 0, but eyelashes can affect the distance calculation. Fatigue can then be detected from the eye blink rate [36]: fatigue is indicated when the blink rate is lower than 8 blinks/min [37].
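These two thresholds can be sketched as simple predicates (the function names and structure are illustrative, not from a reference implementation):

```python
def eye_closed(eyelid_dist, baseline_dist, drop=0.8):
    """Eye counted as closed when the eyelid distance has dropped by
    more than `drop` (80%) relative to the open-eye baseline distance."""
    return eyelid_dist < (1.0 - drop) * baseline_dist

def fatigued_by_blink_rate(blinks, minutes, min_rate=8):
    """Fatigue flagged when the blink rate falls below 8 blinks/min."""
    return blinks / minutes < min_rate
```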
2) Fatigue analysis on the mouth: Driver fatigue can be detected from mouth behavior when the mouth is yawning. Yawning is detected from the ratio of the mouth's height to its width. The width is obtained from the distance between the mouth corner points, and the height from the top and bottom of the lips. We set a threshold of 0.5 for the height-to-width ratio. If the ratio exceeds 0.5 for more than 20 consecutive frames, the mouth is concluded to be yawning.
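The consecutive-frame rule can be sketched as follows (illustrative code of our own, not a reference implementation):

```python
def yawning(height_width_ratios, threshold=0.5, min_frames=20):
    """Yawning detected when the mouth height/width ratio stays above
    `threshold` for more than `min_frames` consecutive frames."""
    run = 0
    for ratio in height_width_ratios:
        run = run + 1 if ratio > threshold else 0  # reset on a closed mouth
        if run > min_frames:
            return True
    return False
```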
3) Fatigue analysis on head pose: Another visual behavior of fatigued people is nodding. Nodding can be obtained from head pose estimation, especially the head pitch rotation. The head pose is estimated by calculating the positions of the eyes, nose point, and mouth. Nodding and head-shaking movements are similar; to distinguish them, we adopt the method of Shinjiro Kawato and Jun Ohya [38]. First, each frame is categorized into a “stable”, “transient”, or “extreme” state. If the head pitch changes by less than 20% within 1 second, the current frame is stable. If the head pitch reaches the maximum degree of pitch rotation, it is in the extreme state. Otherwise, the frame is in a transient state. The last frame and the previous one have no state. When the state changes from non-stable to stable, the nodding evaluation process is triggered, so the detection has a two-frame delay. If there are more than two extreme states between the current stable state and the previous one, and all adjacent extreme states differ by more than 20% in pitch rotation, the system assumes that head-shaking rather than nodding has occurred.
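The trigger-on-stable logic can be sketched as follows (a simplified reading of the state machine above; frame-state classification from the pitch thresholds is assumed to happen upstream, and the exact nod/shake rule here is our own simplification):

```python
def evaluate_segment(states):
    """Classify the head movement between two consecutive stable states.

    `states` is the list of frame states ("transient"/"extreme") observed
    since the previous stable state. Simplified rule: a couple of extreme
    frames indicate a nod; more than two suggest head-shaking instead.
    """
    extremes = states.count("extreme")
    if extremes == 0:
        return "none"
    return "nod" if extremes <= 2 else "shake"

def detect_on_stable(frame_states):
    """Trigger the evaluation each time the state returns to 'stable'."""
    events, segment = [], []
    for state in frame_states:
        if state == "stable":
            if segment:
                events.append(evaluate_segment(segment))
                segment = []
        else:
            segment.append(state)
    return events
```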
C. Fatigue Detection on Driver’s Physiological
To obtain physiological signals such as HR, HRV, and RR from a sequence of facial images, we have to extract the blood volume pulse (BVP) underlying the surface of the facial skin. Non-contact PPG can be applied to extract the BVP. Conventional PPG measures BVP based on the variations of transmitted or reflected light on the surface of the skin using a pair of optical components, such as an IR light source and a photodiode [39]. In our proposed system, we replace this pair of optical components with an IR camera. By capturing the sequence of facial images with the camera, the image sensor collects the reflected light signal along with noise from various artifacts. The corresponding variations in the brightness of the facial skin area indicate cardiovascular events, so a physiological signal can be formed as a time series. To recover the BVP from these signals, we first adopt the method of Fang Zhao, Meng Li, Yi Qian, and Joe Z. Tsien to reconstruct a state-space matrix equivalent to the original state space composed of all the dynamic variables, from only a single signal [40]. Then we perform blind signal separation (BSS) to obtain the underlying signal. Instead of using independent component analysis (ICA) for BSS as in [40], we use principal component analysis (PCA), which is less computationally complex and able to recover BVP in camera-based non-contact PPG, making it a reasonable choice to reduce analysis time and complexity [41].
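As a sketch, PCA over the per-frame brightness traces of the skin region can be computed with a plain SVD (illustrative code under the assumption that the input is a frames-by-channels matrix; it is not our full pipeline):

```python
import numpy as np

def pca_sources(traces):
    """traces: array of shape (n_frames, n_channels), e.g. the mean
    brightness of the facial skin region per frame and channel.

    Returns the principal components ordered by explained variance
    and the singular values; the BVP typically appears in one of the
    leading components.
    """
    X = traces - traces.mean(axis=0)       # remove the per-channel mean
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt.T, s                     # components, singular values
```

The component with a dominant spectral peak in the cardiac band would then be selected as the recovered BVP.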
To obtain the HRV from the BVP, we first filter the signal using a Butterworth filter with window size 15 and a 0.75~4 Hz passband. To locate the BVP peak points, the signal is interpolated with a cubic spline function at a sampling frequency of 256 Hz. Finally, we detect every peak in the interpolated signal and calculate the peak-to-peak distances to form the HRV.
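The filter/interpolate/peak-pick chain could be sketched with SciPy as follows (an illustrative pipeline under the stated passband; the filter order and the minimum peak spacing are our assumptions, not specified parameters):

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import butter, filtfilt, find_peaks

def hrv_from_bvp(bvp, fs, fs_interp=256.0):
    """Band-pass the BVP (0.75-4 Hz, roughly 45-240 bpm), resample it
    with a cubic spline at 256 Hz, and return the peak-to-peak
    inter-beat intervals in seconds (the HRV series)."""
    b, a = butter(3, [0.75, 4.0], btype="band", fs=fs)  # order 3 assumed
    filtered = filtfilt(b, a, bvp)                      # zero-phase filtering
    t = np.arange(len(bvp)) / fs
    ti = np.arange(t[0], t[-1], 1.0 / fs_interp)
    interp = CubicSpline(t, filtered)(ti)
    # Peaks at least 0.25 s apart (upper bound of the 4 Hz passband).
    peaks, _ = find_peaks(interp, distance=int(0.25 * fs_interp))
    return np.diff(ti[peaks])
```

For a synthetic 1.2 Hz pulse sampled at 30 Hz, the recovered inter-beat intervals should come out close to 1/1.2 ≈ 0.83 s.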
The RR can be estimated from the HRV power